langchain/libs/community/tests/unit_tests/document_loaders
Philippe PRADOS 2921597c71
community[patch]: Refactoring PDF loaders: 01 prepare (#29062)
- **Refactoring PDF loaders step 1**: "community: Refactoring PDF
loaders to standardize approaches"

- **Description:** Declare CloudBlobLoader in `__init__.py`. `file_path` is
typed as `Union[str, PurePath]` everywhere
- **Twitter handle:** pprados
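
As an illustration of the `file_path` typing mentioned in the description, here is a minimal sketch of a loader constructor that accepts either a string or a path object. `ExampleLoader` is a hypothetical class invented for this example, and normalizing to `str` is an assumption; the actual loaders touched by the PR may store or convert the value differently.

```python
from pathlib import PurePath, PurePosixPath
from typing import Union


class ExampleLoader:
    """Hypothetical loader, not part of langchain_community."""

    def __init__(self, file_path: Union[str, PurePath]) -> None:
        # Assumption: normalize to str so downstream code sees one type,
        # whether the caller passed a string or a path object.
        self.file_path = str(file_path)


# Both call styles end up with the same stored value.
from_str = ExampleLoader("docs/sample.pdf")
from_path = ExampleLoader(PurePosixPath("docs/sample.pdf"))
```

Accepting `PurePath` rather than `Path` keeps the annotation open to any path flavor, since `Path` is a subclass of `PurePath`.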

This is one part of a larger Pull Request (PR) that is too large to be
submitted all at once.
This specific part focuses on preparing the update of all parsers.

For more details, see [PR
28970](https://github.com/langchain-ai/langchain/pull/28970).

@eyurtsev it's the start of a PR series.
2025-01-07 11:00:04 -05:00
blob_loaders multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
loaders multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
parsers Langchain_Community: SQL LanguageParser (#28430) 2024-12-19 20:30:57 +00:00
sample_documents
test_docs community: Fix CSVLoader columns is None (#20701) 2024-05-22 12:57:46 -07:00
__init__.py
test_airbyte.py
test_arcgis_loader.py
test_assemblyai.py
test_bibtex.py
test_bshtml.py
test_confluence.py community: support Confluence cookies (#28760) 2024-12-17 12:16:36 -05:00
test_couchbase.py
test_csv_loader.py community[patch]: added content_columns option to CSVLoader (#23809) 2024-09-02 20:25:53 +00:00
test_cube_semantic.py
test_detect_encoding.py
test_directory_loader.py community: Fix CSVLoader columns is None (#20701) 2024-05-22 12:57:46 -07:00
test_directory.py community: glob multiple patterns when using DirectoryLoader (#22852) 2024-06-18 09:24:50 -07:00
test_evernote_loader.py
test_generic_loader.py infra: update mypy 1.10, ruff 0.5 (#23721) 2024-07-03 10:33:27 -07:00
test_git.py
test_github.py multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
test_hugging_face_model.py
test_hugging_face.py
test_imports.py community[patch]: Refactoring PDF loaders: 01 prepare (#29062) 2025-01-07 11:00:04 -05:00
test_json_loader.py community: fixes json loader not getting texts with json standard (#27327) 2024-12-12 19:33:45 +00:00
test_lakefs.py community[patch]: Add missing annotations (#24890) 2024-07-31 18:13:44 +00:00
test_mediawikidump.py
test_mhtml.py
test_mongodb.py community: Enhance MongoDBLoader with flexible metadata and optimized field extraction (#23376) 2024-09-17 10:23:17 -04:00
test_needle.py community: add Needle retriever and document loader integration (#28157) 2024-12-03 22:06:25 +00:00
test_notebook.py
test_notiondb_loader.py community: Correctly handle multi-element rich text (#25762) 2024-12-16 20:20:27 +00:00
test_obsidian.py
test_onenote.py multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
test_oracleadb.py
test_pdf.py community[patch]: add to pypdf tests and run in CI (#26663) 2024-09-19 14:45:49 +00:00
test_pebblo.py [community] Added PebbloTextLoader for loading text data in PebbloSafeLoader (#26582) 2024-09-19 09:59:04 -04:00
test_psychic.py community[patch]: Add missing annotations (#24890) 2024-07-31 18:13:44 +00:00
test_readthedoc.py
test_recursive_url_loader.py community[patch]: recursive url loader fix and unit tests (#22521) 2024-06-05 17:56:20 -07:00
test_rspace_loader.py community[patch]: Add missing annotations (#24890) 2024-07-31 18:13:44 +00:00
test_rss.py
test_trello.py
test_web_base.py community: Corrected aload func to be asynchronous from webBaseLoader (#28337) 2024-12-20 14:42:52 -05:00
test_youtube.py community[patch]: Load YouTube transcripts (captions) as fixed-duration chunks with start times (#21710) 2024-06-11 17:44:36 +00:00