community[patch]: Load list of files using UnstructuredFileLoader (#16216)

- **Description:** Updated the `_get_elements()` method of the
`UnstructuredFileLoader` class to check whether `self.file_path` is a
single file path or a list of file paths. If it is a list, the method
iterates over the paths, calls `partition` for each one, and extends
the `elements` list with the results. If `self.file_path` is not a
list, it calls `partition` as before.
  
- **Issue:** Fixed #15607
- **Dependencies:** NA
- **Twitter handle:** NA

Co-authored-by: H161961 <Raunak.Raunak@Honeywell.com>
Raunak
2024-01-24 09:07:37 +05:30
committed by GitHub
parent 019b6ebe8d
commit 476bf8b763
3 changed files with 74 additions and 3 deletions


@@ -170,7 +170,13 @@ class UnstructuredFileLoader(UnstructuredBaseLoader):
     def _get_elements(self) -> List:
         from unstructured.partition.auto import partition
 
-        return partition(filename=self.file_path, **self.unstructured_kwargs)
+        if isinstance(self.file_path, list):
+            elements = []
+            for file in self.file_path:
+                elements.extend(partition(filename=file, **self.unstructured_kwargs))
+            return elements
+        else:
+            return partition(filename=self.file_path, **self.unstructured_kwargs)
 
     def _get_metadata(self) -> dict:
         return {"source": self.file_path}
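
The branching in the patched `_get_elements()` can be sketched in isolation as follows. This is a minimal illustration, not the loader itself: `fake_partition` is a hypothetical stand-in for `unstructured.partition.auto.partition` so the example runs without the `unstructured` package, and `get_elements` is a free function invented here to mirror the method body.

```python
from typing import Any, Callable, List, Union


def fake_partition(filename: str, **kwargs: Any) -> List[str]:
    """Hypothetical stand-in for partition(): returns two fake elements per file."""
    return [f"{filename}:title", f"{filename}:body"]


def get_elements(
    file_path: Union[str, List[str]],
    partition: Callable[..., List[str]] = fake_partition,
    **unstructured_kwargs: Any,
) -> List[str]:
    """Mirror the list-vs-single-path branching from the patch."""
    if isinstance(file_path, list):
        elements: List[str] = []
        for file in file_path:
            # Each file is partitioned separately; results are flattened in order.
            elements.extend(partition(filename=file, **unstructured_kwargs))
        return elements
    # Single path: same call as before the patch.
    return partition(filename=file_path, **unstructured_kwargs)


single = get_elements("a.txt")
many = get_elements(["a.txt", "b.pdf"])
```

Note that the hunk above leaves `_get_metadata()` unchanged, so when a list is passed the `source` value appears to be the whole list rather than the individual file each element came from.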