mirror of
https://github.com/hwchase17/langchain.git
synced 2025-05-11 10:06:05 +00:00
### Summary

Adds an `UnstructuredURLLoader` that supports loading data from a list of URLs.

### Testing

```python
from langchain.document_loaders import UnstructuredURLLoader

urls = [
    "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-8-2023",
    "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-9-2023"
]

loader = UnstructuredURLLoader(urls=urls)
raw_documents = loader.load()
```
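For readers unfamiliar with the pattern, the idea of "one document per URL" can be sketched with the standard library alone. This is not the code added by the commit: `Document`, `fetch_text`, and `load_urls` are hypothetical names used only for illustration, and the sketch returns the raw response body without the HTML parsing a real loader would perform.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List
from urllib.request import urlopen


@dataclass
class Document:
    """Hypothetical stand-in for a loaded document: text plus metadata."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def fetch_text(url: str) -> str:
    """Download a URL and decode the body as UTF-8 (requires network access)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def load_urls(
    urls: List[str],
    fetch: Callable[[str], str] = fetch_text,
) -> List[Document]:
    """Return one Document per URL, recording the source URL in its metadata."""
    return [
        Document(page_content=fetch(url), metadata={"source": url})
        for url in urls
    ]
```

Passing the fetch function as a parameter is a design choice made for this sketch: it lets the loading logic be exercised offline with a stub instead of live requests.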
- example_data
- azlyrics.ipynb
- college_confidential.ipynb
- directory_loader.ipynb
- email.ipynb
- everynote.ipynb
- gcs_directory.ipynb
- gcs_file.ipynb
- googledrive.ipynb
- gutenberg.ipynb
- html.ipynb
- imsdb.ipynb
- microsoft_word.ipynb
- notion.ipynb
- obsidian.ipynb
- pdf.ipynb
- powerpoint.ipynb
- readthedocs_documentation.ipynb
- roam.ipynb
- s3_directory.ipynb
- s3_file.ipynb
- unstructured_file.ipynb
- url.ipynb
- web_base.ipynb
- youtube.ipynb