feat: Add UnstructuredCSVLoader for CSV files (#5844)

### Summary

Adds an `UnstructuredCSVLoader` for loading CSVs. One advantage of using
`UnstructuredCSVLoader` relative to the standard `CSVLoader` is that if
you use `UnstructuredCSVLoader` in `"elements"` mode, an HTML
representation of the table will be available in the metadata.
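
To give a sense of what an HTML representation of a table looks like, here is a minimal standard-library sketch. It is not the code `unstructured` runs, and the markup the loader actually stores in the metadata may differ. The helper name `csv_to_html_table` is hypothetical, and the input simply mirrors the `stanley-cups.csv` fixture added in this commit.

```python
import csv
import io
from html import escape

# Same rows as the stanley-cups.csv fixture added in this commit.
CSV_TEXT = """Stanley Cups,,
Team,Location,Stanley Cups
Blues,STL,1
Flyers,PHI,2
Maple Leafs,TOR,13
"""


def csv_to_html_table(text: str) -> str:
    """Hypothetical helper: render CSV text as a minimal HTML table."""
    rows = csv.reader(io.StringIO(text))
    body = "".join(
        # One <tr> per CSV row, one <td> per cell; escape() guards against
        # cells that contain characters such as "<" or "&".
        "<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table>{body}</table>"


html_table = csv_to_html_table(CSV_TEXT)
print(html_table)
```

With the loader itself, the equivalent would presumably be to construct `UnstructuredCSVLoader` with `mode="elements"` and read the HTML string from each document's `metadata`. The exact metadata key is not stated in this commit message, so check the loader's documentation before relying on one.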

#### Who can review?

@hwchase17
@eyurtsev
Matt Robinson
2023-06-07 22:18:01 -04:00
committed by GitHub
parent 0b4a51930c
commit 11fec7d4d1
5 changed files with 248 additions and 7 deletions


@@ -0,0 +1,15 @@
import os
from pathlib import Path

from langchain.document_loaders import UnstructuredCSVLoader

EXAMPLE_DIRECTORY = file_path = Path(__file__).parent.parent / "examples"


def test_unstructured_csv_loader() -> None:
    """Test unstructured loader."""
    file_path = os.path.join(EXAMPLE_DIRECTORY, "stanley-cups.csv")
    loader = UnstructuredCSVLoader(str(file_path))
    docs = loader.load()

    assert len(docs) == 1


@@ -0,0 +1,5 @@
Stanley Cups,,
Team,Location,Stanley Cups
Blues,STL,1
Flyers,PHI,2
Maple Leafs,TOR,13