blob_loaders
community: Bump ruff version to 0.9 ( #29206 )
2025-02-08 01:21:10 +00:00
parsers
community: add custom model for OpenAIWhisperParser ( #29831 )
2025-02-16 21:26:07 -05:00
__init__.py
community[patch]: Refactoring PDF loaders: 01 prepare ( #29062 )
2025-01-07 11:00:04 -05:00
acreom.py
community[patch]: Add missing annotations ( #24890 )
2024-07-31 18:13:44 +00:00
airbyte_json.py
community: better support of pathlib paths in document loaders ( #18396 )
2024-03-26 11:51:52 -04:00
airbyte.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
airtable.py
docs: fix kwargs docstring ( #25010 )
2024-08-02 19:54:54 -07:00
apify_dataset.py
docs: update apify integration ( #29553 )
2025-02-12 20:02:55 -08:00
arcgis_loader.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
arxiv.py
docs: Arxiv docs update ( #23871 )
2024-07-05 11:43:51 -04:00
assemblyai.py
community[patch]: docstrings update ( #20301 )
2024-04-11 16:23:27 -04:00
astradb.py
multiple: update removal targets ( #25361 )
2024-08-14 09:50:39 -04:00
async_html.py
community[patch]: Release 0.2.11 ( #24989 )
2024-08-02 20:08:44 +00:00
athena.py
community: make AthenaLoader profile_name optional and fix type hint ( #24958 )
2024-08-05 14:28:58 +00:00
azlyrics.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
azure_ai_data.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
azure_blob_storage_container.py
community[patch]: type ignore fixes ( #18395 )
2024-03-01 11:21:02 -08:00
azure_blob_storage_file.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
baiducloud_bos_directory.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
baiducloud_bos_file.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
base_o365.py
community: update base_o365.py ( #29657 )
2025-02-07 08:43:29 -05:00
base.py
core: Move document loader interfaces to core ( #17723 )
2024-03-06 13:59:00 -05:00
bibtex.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
bigquery.py
multiple: update removal targets ( #25361 )
2024-08-14 09:50:39 -04:00
bilibili.py
community[patch]: docstrings update ( #20301 )
2024-04-11 16:23:27 -04:00
blackboard.py
community: add flag to toggle progress bar ( #24463 )
2024-07-20 13:18:02 +00:00
blockchain.py
community: add supported blockchains to Blockchain Document Loader ( #25428 )
2024-08-23 14:39:42 +00:00
brave_search.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
browserbase.py
community: updated Browserbase loader ( #21757 )
2024-05-16 08:21:23 -07:00
browserless.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
cassandra.py
community[minor]: Add Cassandra ByteStore ( #22064 )
2024-05-23 10:46:23 -04:00
chatgpt.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
chm.py
community: Bump ruff version to 0.9 ( #29206 )
2025-02-08 01:21:10 +00:00
chromium.py
community[minor]: add user agent for web scraping loaders ( #22480 )
2024-06-05 15:20:34 +00:00
college_confidential.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
concurrent.py
community[patch]: import flattening fix ( #20110 )
2024-04-10 13:01:19 -04:00
confluence.py
community: ConfluenceLoader: add a filter method for attachments ( #29882 )
2025-02-19 18:20:45 -05:00
conllu.py
community: better support of pathlib paths in document loaders ( #18396 )
2024-03-26 11:51:52 -04:00
couchbase.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
csv_loader.py
community: Bump ruff version to 0.9 ( #29206 )
2025-02-08 01:21:10 +00:00
cube_semantic.py
community: add missing format specifier in error log in CubeSemanticLoader ( #29172 )
2025-01-13 09:32:57 -05:00
datadog_logs.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
dataframe.py
Update dataframe.py ( #28871 )
2024-12-22 19:16:16 -05:00
dedoc.py
community[minor]: added new document loaders based on dedoc library ( #24303 )
2024-07-23 02:04:53 +00:00
diffbot.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
directory.py
community: glob multiple patterns when using DirectoryLoader ( #22852 )
2024-06-18 09:24:50 -07:00
discord.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
doc_intelligence.py
community: Bump ruff version to 0.9 ( #29206 )
2025-02-08 01:21:10 +00:00
docugami.py
multiple: pydantic 2 compatibility, v0.3 ( #26443 )
2024-09-13 14:38:45 -07:00
docusaurus.py
infra: update mypy 1.10, ruff 0.5 ( #23721 )
2024-07-03 10:33:27 -07:00
dropbox.py
community: Bump ruff version to 0.9 ( #29206 )
2025-02-08 01:21:10 +00:00
duckdb_loader.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
email.py
all: test 3.13 ci ( #27197 )
2024-10-25 12:56:58 -07:00
epub.py
community: add init for unstructured file loader ( #29101 )
2025-01-13 09:26:00 -05:00
etherscan.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
evernote.py
infra: update mypy 1.10, ruff 0.5 ( #23721 )
2024-07-03 10:33:27 -07:00
excel.py
community: add init for unstructured file loader ( #29101 )
2025-01-13 09:26:00 -05:00
facebook_chat.py
community: better support of pathlib paths in document loaders ( #18396 )
2024-03-26 11:51:52 -04:00
fauna.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
figma.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
firecrawl.py
Community: Updated Firecrawl Document Loader to v1 ( #26548 )
2024-10-15 13:13:28 +00:00
gcs_directory.py
multiple: update removal targets ( #25361 )
2024-08-14 09:50:39 -04:00
gcs_file.py
multiple: update removal targets ( #25361 )
2024-08-14 09:50:39 -04:00
generic.py
community[patch]: import flattening fix ( #20110 )
2024-04-10 13:01:19 -04:00
geodataframe.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
git.py
all: test 3.13 ci ( #27197 )
2024-10-25 12:56:58 -07:00
gitbook.py
community: add flag to toggle progress bar ( #24463 )
2024-07-20 13:18:02 +00:00
github.py
multiple: pydantic 2 compatibility, v0.3 ( #26443 )
2024-09-13 14:38:45 -07:00
glue_catalog.py
community[minor]: Add glue catalog loader ( #20220 )
2024-04-16 11:39:23 -04:00
google_speech_to_text.py
multiple: update removal targets ( #25361 )
2024-08-14 09:50:39 -04:00
googledrive.py
multiple: pydantic 2 compatibility, v0.3 ( #26443 )
2024-09-13 14:38:45 -07:00
gutenberg.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
helpers.py
community: better support of pathlib paths in document loaders ( #18396 )
2024-03-26 11:51:52 -04:00
hn.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
html_bs.py
multiple: pydantic 2 compatibility, v0.3 ( #26443 )
2024-09-13 14:38:45 -07:00
html.py
community: add init for UnstructuredHTMLLoader to solve pathlib paths ( #29091 )
2025-01-08 10:19:27 -05:00
hugging_face_dataset.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
hugging_face_model.py
community[patch]: Add missing annotations ( #24890 )
2024-07-31 18:13:44 +00:00
ifixit.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
image_captions.py
all: test 3.13 ci ( #27197 )
2024-10-25 12:56:58 -07:00
image.py
community: add init for unstructured file loader ( #29101 )
2025-01-13 09:26:00 -05:00
imsdb.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
iugu.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
joplin.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
json_loader.py
community[minor]: Fix json._validate_metadata_func() ( #22842 )
2024-12-13 21:24:20 +00:00
kinetica_loader.py
community[patch]: Kinetica Integrations handled error in querying; quotes in table names; updated gpudb API ( #22724 )
2024-06-11 10:01:26 -04:00
lakefs.py
docs: docstrings langchain_community update ( #14889 )
2023-12-19 08:58:24 -05:00
larksuite.py
community[minor]: Add LarkSuite wiki document loader. ( #21016 )
2024-04-29 10:37:50 -04:00
llmsherpa.py
community[minor]: add support for llmsherpa ( #19741 )
2024-03-29 16:04:57 -07:00
markdown.py
community: add init for unstructured file loader ( #29101 )
2025-01-13 09:26:00 -05:00
mastodon.py
Merge pull request #18671
2024-03-06 13:23:14 -05:00
max_compute.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
mediawikidump.py
community: Bump ruff version to 0.9 ( #29206 )
2025-02-08 01:21:10 +00:00
merge.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
mhtml.py
community[patch]: upgrade to recent version of mypy ( #21616 )
2024-05-13 14:55:07 -04:00
mintbase.py
community[minor]: add mintbase loader to langchain ( #20089 )
2024-04-30 04:11:56 +00:00
modern_treasury.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
mongodb.py
community: Bump ruff version to 0.9 ( #29206 )
2025-02-08 01:21:10 +00:00
needle.py
community: add Needle retriever and document loader integration ( #28157 )
2024-12-03 22:06:25 +00:00
news.py
infra: update mypy 1.10, ruff 0.5 ( #23721 )
2024-07-03 10:33:27 -07:00
notebook.py
infra: update mypy 1.10, ruff 0.5 ( #23721 )
2024-07-03 10:33:27 -07:00
notion.py
community: better support of pathlib paths in document loaders ( #18396 )
2024-03-26 11:51:52 -04:00
notiondb.py
community: Bump ruff version to 0.9 ( #29206 )
2025-02-08 01:21:10 +00:00
nuclia.py
community: Bump ruff version to 0.9 ( #29206 )
2025-02-08 01:21:10 +00:00
obs_directory.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
obs_file.py
add mode arg to OBSFileLoader.load() method ( #29246 )
2025-01-16 11:09:04 -05:00
obsidian.py
community[patch]: Add missing annotations ( #24890 )
2024-07-31 18:13:44 +00:00
odt.py
community: add init for unstructured file loader ( #29101 )
2025-01-13 09:26:00 -05:00
onedrive_file.py
multiple: pydantic 2 compatibility, v0.3 ( #26443 )
2024-09-13 14:38:45 -07:00
onedrive.py
community: Allow other than default parsers in SharePointLoader and OneDriveLoader ( #27716 )
2024-11-06 17:44:34 -05:00
onenote.py
community[patch]: Fix validation error in SettingsConfigDict across multiple Langchain modules ( #26852 )
2024-09-25 10:02:14 -04:00
open_city_data.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
oracleadb_loader.py
community: Bump ruff version to 0.9 ( #29206 )
2025-02-08 01:21:10 +00:00
oracleai.py
community[minor]: Oraclevs integration ( #21123 )
2024-05-04 03:15:35 +00:00
org_mode.py
community: add init for unstructured file loader ( #29101 )
2025-01-13 09:26:00 -05:00
pdf.py
community[minor]: 05 - Refactoring PyPDFium2 parser ( #29625 )
2025-02-07 21:31:12 -05:00
pebblo.py
community[minor]: [Pebblo] Enhance PebbloSafeLoader to take anonymize flag ( #26812 )
2024-09-25 09:33:06 -04:00
polars_dataframe.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
powerpoint.py
community: add init for unstructured file loader ( #29101 )
2025-01-13 09:26:00 -05:00
psychic.py
multiple: Remove unnecessary Ruff suppression comments ( #21050 )
2024-04-30 17:13:48 +00:00
pubmed.py
community[patch]: upgrade to recent version of mypy ( #21616 )
2024-05-13 14:55:07 -04:00
pyspark_dataframe.py
community: Bump ruff version to 0.9 ( #29206 )
2025-02-08 01:21:10 +00:00
python.py
community: better support of pathlib paths in document loaders ( #18396 )
2024-03-26 11:51:52 -04:00
quip.py
community: Bump ruff version to 0.9 ( #29206 )
2025-02-08 01:21:10 +00:00
readthedocs.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
recursive_url_loader.py
docs: fix broken Appearance of langchain_community/document_loaders/recursive_url_loader API Reference ( #29305 )
2025-01-20 10:56:59 -05:00
reddit.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
roam.py
community: better support of pathlib paths in document loaders ( #18396 )
2024-03-26 11:51:52 -04:00
rocksetdb.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
rspace.py
community: Bump ruff version to 0.9 ( #29206 )
2025-02-08 01:21:10 +00:00
rss.py
multiple: Remove unnecessary Ruff suppression comments ( #21050 )
2024-04-30 17:13:48 +00:00
rst.py
community: add init for unstructured file loader ( #29101 )
2025-01-13 09:26:00 -05:00
rtf.py
community: add init for unstructured file loader ( #29101 )
2025-01-13 09:26:00 -05:00
s3_directory.py
community[patch]: Skip nested directories when using S3DirectoryLoader ( #17829 )
2024-03-08 16:50:58 -08:00
s3_file.py
community[patch]: support unstructured_kwargs for s3 loader ( #15473 )
2024-03-27 22:03:48 +00:00
scrapfly.py
infra: update mypy 1.10, ruff 0.5 ( #23721 )
2024-07-03 10:33:27 -07:00
scrapingant.py
community[minor]: Add ScrapingAnt Loader Community Integration ( #24514 )
2024-07-24 21:11:43 -04:00
sharepoint.py
community: Allow other than default parsers in SharePointLoader and OneDriveLoader ( #27716 )
2024-11-06 17:44:34 -05:00
sitemap.py
community[patch]: SitemapLoader restrict depth of parsing sitemap (CVE-2024-2965) ( #22903 )
2024-06-14 13:04:40 -04:00
slack_directory.py
community: better support of pathlib paths in document loaders ( #18396 )
2024-03-26 11:51:52 -04:00
snowflake_loader.py
community[patch]: upgrade to recent version of mypy ( #21616 )
2024-05-13 14:55:07 -04:00
spider.py
doc list not empty ( #21208 )
2024-05-20 08:24:06 -07:00
spreedly.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
sql_database.py
community[patch]: restore compatibility with SQLAlchemy 1.x ( #22546 )
2024-06-19 17:58:57 +00:00
srt.py
community: better support of pathlib paths in document loaders ( #18396 )
2024-03-26 11:51:52 -04:00
stripe.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
surrealdb.py
community[patch]: SurrealDB fix for asyncio ( #16092 )
2024-01-23 19:46:19 -08:00
telegram.py
community: better support of pathlib paths in document loaders ( #18396 )
2024-03-26 11:51:52 -04:00
tencent_cos_directory.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
tencent_cos_file.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
tensorflow_datasets.py
infra: update mypy 1.10, ruff 0.5 ( #23721 )
2024-07-03 10:33:27 -07:00
text.py
community: better support of pathlib paths in document loaders ( #18396 )
2024-03-26 11:51:52 -04:00
tidb.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
tomarkdown.py
community[patch]: Update URL to the 2markdown API ( #24546 )
2024-07-23 14:27:55 +00:00
toml.py
community: Use default load() implementation in doc loaders ( #18385 )
2024-03-01 14:46:52 -05:00
trello.py
community: Implement lazy_load() for TrelloLoader ( #18658 )
2024-03-06 13:04:36 -05:00
tsv.py
community: add init for unstructured file loader ( #29101 )
2025-01-13 09:26:00 -05:00
twitter.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
unstructured.py
multiple: update removal targets ( #25361 )
2024-08-14 09:50:39 -04:00
url_playwright.py
infra: update mypy 1.10, ruff 0.5 ( #23721 )
2024-07-03 10:33:27 -07:00
url_selenium.py
infra: update mypy 1.10, ruff 0.5 ( #23721 )
2024-07-03 10:33:27 -07:00
url.py
infra: update mypy 1.10, ruff 0.5 ( #23721 )
2024-07-03 10:33:27 -07:00
vsdx.py
community[patch]: import flattening fix ( #20110 )
2024-04-10 13:01:19 -04:00
weather.py
infra: update mypy 1.10, ruff 0.5 ( #23721 )
2024-07-03 10:33:27 -07:00
web_base.py
community: Corrected aload func to be asynchronous from webBaseLoader ( #28337 )
2024-12-20 14:42:52 -05:00
whatsapp_chat.py
community: Implement lazy_load() for WhatsAppChatLoader ( #18677 )
2024-03-06 13:03:46 -05:00
wikipedia.py
community[patch]: upgrade to recent version of mypy ( #21616 )
2024-05-13 14:55:07 -04:00
word_document.py
community: add init for unstructured file loader ( #29101 )
2025-01-13 09:26:00 -05:00
xml.py
all: test 3.13 ci ( #27197 )
2024-10-25 12:56:58 -07:00
xorbits.py
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community ( #14463 )
2023-12-11 13:53:30 -08:00
youtube.py
community:Fix for Pydantic model validator of GoogleApiYoutubeLoader ( #29694 )
2025-02-10 08:57:58 -05:00
yuque.py
community[minor]: add Yuque document loader ( #17924 )
2024-03-05 15:54:07 -08:00