langchain/libs/community/tests/integration_tests/document_loaders
Philippe PRADOS f3fb5a9c68
community[minor]: Fix json._validate_metadata_func() (#22842)
JSONLoader, in _validate_metadata_func(), checks the consistency of the
_metadata_func() function. To do this, it invokes the function and makes
sure it receives a dictionary in response. However, this validation call
does not pass the same arguments as the later, real calls (see line 100).
This raises an error if, for example, the function reads a key such as
seq_num from its second argument:
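```python
from typing import Any, Dict

from langchain_community.document_loaders import JSONLoader

# `url` and `file_path` are assumed to be defined by the caller.


def generate_metadata(json_node: Dict[str, Any], kwargs: Dict[str, Any]) -> Dict[str, Any]:
    return {
        "source": url,
        "row": kwargs["seq_num"],
        "question": json_node.get("question"),
    }


loader = JSONLoader(
    file_path=file_path,
    content_key="answer",
    jq_schema=".[]",
    metadata_func=generate_metadata,
    text_content=False,
)
```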
To avoid this, the validation must call the function the same way the
real calls do. This patch does just that.
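
The idea can be sketched as follows. This is a minimal, standalone
illustration and not the actual patch: validate_metadata_func, its
parameters, and the sample record are hypothetical names, and the sketch
assumes that real calls pass a dictionary holding "source" and "seq_num"
as the second argument.

```python
from typing import Any, Callable, Dict

MetadataFunc = Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]]


def validate_metadata_func(
    metadata_func: MetadataFunc, sample: Dict[str, Any], source: str
) -> None:
    """Hypothetical check: call metadata_func the way a real load would."""
    # Assumption: real calls supply "source" and "seq_num" in the second dict.
    additional_fields = {"source": source, "seq_num": 0}
    result = metadata_func(sample, additional_fields)
    if not isinstance(result, dict):
        raise ValueError(
            f"Expected metadata_func to return a dict, got {type(result).__name__}"
        )


def generate_metadata(json_node: Dict[str, Any], kwargs: Dict[str, Any]) -> Dict[str, Any]:
    # Reads a key that only exists when the caller supplies it.
    return {"row": kwargs["seq_num"], "question": json_node.get("question")}


sample = {"question": "What is jq?", "answer": "A JSON processor."}

# Validating with an empty dict breaks a function that is otherwise valid.
try:
    generate_metadata(sample, {})
    empty_dict_ok = True
except KeyError:
    empty_dict_ok = False

# Validating with representative fields accepts the same function.
validate_metadata_func(generate_metadata, sample, source="example.json")
```

Passing representative fields keeps the check focused on the return type,
so a function that depends on per-record fields is not rejected.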

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-12-13 21:24:20 +00:00
parsers infra: update mypy 1.10, ruff 0.5 (#23721) 2024-07-03 10:33:27 -07:00
__init__.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_arxiv.py community[patch]: Skip unexpected 404 HTTP Error in Arxiv download (#21042) 2024-04-30 18:29:22 +00:00
test_astradb.py infra: update mypy 1.10, ruff 0.5 (#23721) 2024-07-03 10:33:27 -07:00
test_bigquery.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_bilibili.py community[patch]: fix bugs for bilibili Loader (#18036) 2024-03-28 16:39:38 -07:00
test_blockchain.py infra: add print rule to ruff (#16221) 2024-02-09 16:13:30 -08:00
test_cassandra.py infra: update mypy 1.10, ruff 0.5 (#23721) 2024-07-03 10:33:27 -07:00
test_confluence.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_couchbase.py infra: add print rule to ruff (#16221) 2024-02-09 16:13:30 -08:00
test_csv_loader.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_dataframe.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_dedoc.py community[minor]: added new document loaders based on dedoc library (#24303) 2024-07-23 02:04:53 +00:00
test_docusaurus.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_duckdb.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_email.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_etherscan.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_excel.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_facebook_chat.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_fauna.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_figma.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_geodataframe.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_gitbook.py infra: add print rule to ruff (#16221) 2024-02-09 16:13:30 -08:00
test_github.py community[patch]: Add Pagination to GitHubIssuesLoader for Efficient GitHub Issues Retrieval (#16934) 2024-02-12 18:30:36 -08:00
test_google_speech_to_text.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_ifixit.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_joplin.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_json_loader.py community[minor]: Fix json._validate_metadata_func() (#22842) 2024-12-13 21:24:20 +00:00
test_lakefs.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_language.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_larksuite.py community[minor]: Add LarkSuite wiki document loader. (#21016) 2024-04-29 10:37:50 -04:00
test_llmsherpa.py community[minor]: add support for llmsherpa (#19741) 2024-03-29 16:04:57 -07:00
test_mastodon.py infra: update mypy 1.10, ruff 0.5 (#23721) 2024-07-03 10:33:27 -07:00
test_max_compute.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_modern_treasury.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_news.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_nuclia.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_odt.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_oracleds.py community[minor]: Oraclevs integration (#21123) 2024-05-04 03:15:35 +00:00
test_org_mode.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_pdf.py community[patch]: add to pypdf tests and run in CI (#26663) 2024-09-19 14:45:49 +00:00
test_polars_dataframe.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_pubmed.py infra: add print rule to ruff (#16221) 2024-02-09 16:13:30 -08:00
test_pyspark_dataframe_loader.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_python.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_quip.py community[patch]: Add missing annotations (#24890) 2024-07-31 18:13:44 +00:00
test_recursive_url_loader.py community[patch]: fix integrated test case test_recursive_url_loader.py assertions (issue-20919) (#20920) 2024-04-26 10:00:08 -04:00
test_rocksetdb.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_rss.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_rst.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_sitemap.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_slack.py infra: update mypy 1.10, ruff 0.5 (#23721) 2024-07-03 10:33:27 -07:00
test_spreedly.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_sql_database.py infra: update mypy 1.10, ruff 0.5 (#23721) 2024-07-03 10:33:27 -07:00
test_stripe.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_telegram.py infra: add print rule to ruff (#16221) 2024-02-09 16:13:30 -08:00
test_tensorflow_datasets.py multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
test_tidb.py community[minor]: Add tidb loader support (#17788) 2024-02-21 16:42:33 -08:00
test_tsv.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_unstructured.py community[patch]: Load list of files using UnstructuredFileLoader (#16216) 2024-01-23 19:37:37 -08:00
test_url_playwright.py infra: update mypy 1.10, ruff 0.5 (#23721) 2024-07-03 10:33:27 -07:00
test_url.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_whatsapp_chat.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_wikipedia.py infra: update mypy 1.10, ruff 0.5 (#23721) 2024-07-03 10:33:27 -07:00
test_xml.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
test_xorbits.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00