langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-08-06 19:48:26 +00:00

Author	SHA1	Message	Date
Bagatur	11df1b2b8d	core[patch]: Release 0.3.9 (#27117 )	2024-10-04 18:35:33 +00:00
Scott Hurrey	558fb4d66d	box: Add citation support to langchain_box.retrievers.BoxRetriever when used with Box AI (#27012 ) Description: Box AI can return responses, but it can also be configured to return citations. This change allows the developer to decide if they want the answer, the citations, or both. Regardless of the combination, this is returned as a single List[Document] object. Dependencies: Updated to the latest Box Python SDK, v1.5.1 Twitter handle: BoxPlatform Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-04 18:32:34 +00:00
Bagatur	1e768a9ec7	anthropic[patch]: correctly handle tool msg with empty list (#27109 )	2024-10-04 11:30:50 -07:00
Bagatur	4935a14314	core,integrations[minor]: Dont error on fields in model_kwargs (#27110 ) Given the current erroring behavior, every time we've moved a kwarg from model_kwargs and made it its own field that was a breaking change. Updating this behavior to support the old instantiations / serializations. Assuming build_extra_kwargs was not something that itself is being used externally and needs to be kept backwards compatible	2024-10-04 11:30:27 -07:00
Bagatur	0495b7f441	anthropic[patch]: add usage_metadata details (#27087 ) fixes https://github.com/langchain-ai/langchain/pull/27087	2024-10-04 08:46:49 -07:00
Erick Friis	e8e5d67a8d	openai: fix None token detail (#27091 ) happens in Azure	2024-10-04 01:25:38 +00:00
Erick Friis	ab4dab9a0c	core: fix batch race condition in FakeListChatModel (#26924 ) fixed #26273	2024-10-03 23:14:31 +00:00
Bagatur	87fc5ce688	core[patch]: exclude model cache from ser (#27086 )	2024-10-03 22:00:31 +00:00
Bagatur	c09da53978	openai[patch]: add usage metadata details (#27080 )	2024-10-03 14:01:03 -07:00
Bagatur	546dc44da5	core[patch]: add UsageMetadata details (#27072 )	2024-10-03 20:36:17 +00:00
Tibor Reiss	47a9199fa6	community[patch]: Fix missing protected_namespaces (#27076 ) Fixes #26861 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-10-03 20:12:11 +00:00
Bagatur	2a54448a0a	langchain[patch]: Release 0.3.2 (#27073 )	2024-10-03 18:13:23 +00:00
Bharat Ramanathan	103e573f9b	community[patch]: chore warn deprecate the wandb callback handler (#27062 ) - Description: This PR deprecates the wandb callback handler in favor of the new [WeaveTracer](https://weave-docs.wandb.ai/guides/integrations/langchain#using-weavetracer) in W&B - Dependencies: No dependencies, just a deprecation warning. - Twitter handle: @parambharat @baskaryan	2024-10-03 11:59:20 -04:00
Eugene Yurtsev	635c55c039	core[patch]: Release 0.3.8 (#27046 ) 0.3.8 release for core	2024-10-02 16:58:38 +00:00
Eugene Yurtsev	74bf620e97	core[patch]: Support injected tool args that are arbitrary types (#27045 ) This adds support for injected tool args that are arbitrary types when used with pydantic 2. We'll need to add similar logic on the v1 path, and potentially mirror the config from the original model when we're doing the subset.	2024-10-02 12:50:58 -04:00
Bagatur	099235da01	Revert "huggingface[patch]: make HuggingFaceEndpoint serializable (#27027)" (#27032 ) This reverts commit `b5e28d3a6d`.	2024-10-01 21:26:38 +00:00
Bagatur	5f2e93ffea	huggingface[patch]: xfail test (#27031 )	2024-10-01 21:14:07 +00:00
Bagatur	b5e28d3a6d	huggingface[patch]: make HuggingFaceEndpoint serializable (#27027 )	2024-10-01 13:16:10 -07:00
ccurme	9d10151123	core[patch]: fix init of RunnableAssign (#26903 ) Example in API ref currently raises ValidationError. Resolves https://github.com/langchain-ai/langchain/issues/26862	2024-10-01 14:21:54 -04:00
Erick Friis	95a87291fd	community: deprecate community ollama integrations (#26733 )	2024-10-01 09:18:07 -07:00
ZhangShenao	e317d457cf	Bug-Fix[Community] Fix `FastEmbedEmbeddings` (#26764 ) #26759 - Fix https://github.com/langchain-ai/langchain/issues/26759 - Change `model` param from private to public, which may not be initiated. - Add test case	2024-09-30 21:23:08 -04:00
Erick Friis	a8e1577f85	milvus: mv to external repo (#26920 )	2024-10-01 00:38:30 +00:00
Erick Friis	35f6393144	unstructured: mv to external repo (#26923 )	2024-09-30 17:38:21 -07:00
Erick Friis	7ecd720120	multiple: update docs urls to latest 2 (#26837 )	2024-09-30 17:37:07 -07:00
Tomaz Bratanic	446144e7c6	Update neo4j vector procedures (#26775 )	2024-09-30 14:45:09 -07:00
federico-pisanu	2538963945	core[patch]: improve index/aindex api when batch_size<n_docs (#25754 ) - Description: prevent the index function from re-indexing the entire source document even if nothing has changed. - Issue: #22135 I worked on a solution to this issue that is a compromise between being cheap and being fast. In the previous code, when batch_size is greater than the number of docs from a certain source, almost the entire source is deleted (all documents from that source except for the documents in the first batch). My solution deletes documents from the vector store and record manager only if at least one document has changed for that source. Hope this can help! --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-09-30 20:57:41 +00:00
Eugene Yurtsev	7fde2791dc	core[patch]: Add kwargs to Runnable (#27008 ) Fixes #26685 --------- Co-authored-by: Tibor Reiss <tibor.reiss@gmail.com>	2024-09-30 16:45:29 -04:00
Christophe Bornet	2a6abd3f0a	community[patch]: Add docstring for Links (#25969 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-09-30 20:33:50 +00:00
Mohammad Mohtashim	e12f570ead	Merge pull request #26794 * [chore]: Agent Observation should be cast to string to avoid errors * Merge branch 'master' into fix_observation_type_streaming * [chore]: Using Json.dumps * [chore]: Exact same logic as when casting agent observation to string	2024-09-30 15:54:51 -04:00
Bagatur	34bd718fe1	core[patch]: Release 0.3.7 (#27004 )	2024-09-30 18:52:42 +00:00
Bagatur	248be02259	core[patch]: fix structured prompt template format (#27003 ) template_format is an init argument on ChatPromptTemplate but not an attribute on the object, so it was getting shoved into StructuredPrompt.structured_output_kwargs	2024-09-30 11:47:46 -07:00
Bagatur	0078493a80	fireworks[patch]: allow tool_choice with multiple tools (#26999 ) https://docs.fireworks.ai/api-reference/post-chatcompletions	2024-09-30 11:28:43 -07:00
Bagatur	c7120d87dd	groq[patch]: support tool_choice=any/required (#27000 ) https://console.groq.com/docs/api-reference#chat-create	2024-09-30 11:28:35 -07:00
Christophe Bornet	db8845a62a	core: Add ruff rules for pycodestyle Warning (W) (#26964 ) All auto-fixes.	2024-09-30 09:31:43 -04:00
Bagatur	9404e7af9d	openai[patch]: exclude http client (#26891 ) httpx clients aren't serializable	2024-09-29 11:16:27 -07:00
Ben Chambers	29bf89db25	community: Add conversions from GVS to networkx (#26906 ) These allow converting linked documents (such as those used with GraphVectorStore) to networkx for rendering and/or in-memory graph algorithms such as community detection.	2024-09-27 16:48:55 -04:00
Christophe Bornet	7809b31b95	core[patch]: Add ruff rules for flake8-simplify (SIM) (#26848 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-09-27 20:13:23 +00:00
Eugene Yurtsev	de0b48c41a	docs: Upgrade examples with RunnableWithMessageHistory to langgraph memory (#26855 ) This PR updates the documentation examples that used RunnableWithMessageHistory to show how to achieve the same implementation with langgraph memory. Some of the underlying PRs (not all of them): - docs[patch]: update chatbot tutorial and migration guide (#26780) - docs[patch]: update chatbot memory how-to (#26790) - docs[patch]: update chatbot tools how-to (#26816) - docs: update chat history in rag how-to (#26821) - docs: update trim messages notebook (#26793) - docs: clean up imports in how to guide for rag qa with chat history (#26825) - docs[patch]: update conversational rag tutorial (#26814) --------- Co-authored-by: ccurme <chester.curme@gmail.com> Co-authored-by: Vadym Barda <vadym@langchain.dev> Co-authored-by: mercyspirit <ziying.qiu@gmail.com> Co-authored-by: aqiu7 <aqiu7@gatech.edu> Co-authored-by: John <43506685+Coniferish@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com> Co-authored-by: Subhrajyoty Roy <subhrajyotyroy@gmail.com> Co-authored-by: Rajendra Kadam <raj.725@outlook.com> Co-authored-by: Christophe Bornet <cbornet@hotmail.com> Co-authored-by: Devin Gaffney <itsme@devingaffney.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-09-27 20:04:30 +00:00
Christophe Bornet	f4e738bb40	core: Add ruff rules for PIE (#26939 ) All auto-fixes.	2024-09-27 12:08:35 -04:00
ccurme	39987ebd91	openai[patch]: update deprecation target in API ref (#26921 )	2024-09-27 08:42:31 -04:00
Subhrajyoty Roy	7f37fd8b80	community[patch]: callback before yield for cloudflare (#26927 ) Description: Moves yield to after callback for `_stream` function for the cloudflare workersai model in the community llm package Issue: #16913	2024-09-27 08:42:01 -04:00
Youshin Kim	2d9a09dfa4	Fix typo in mlflow code example in mlflow.py (#26931 ) - [x] PR title: Fix typo in code example in mlflow.py - In libs/community/langchain_community/chat_models/mlflow.py	2024-09-27 12:41:39 +00:00
Subhrajyoty Roy	7037ba0f06	community[patch]: callback before yield for mlx pipeline (#26928 ) Description: Moves yield to after callback for `_stream` function for the MLX pipeline model in the community llm package Issue: #16913	2024-09-27 08:41:34 -04:00
Subhrajyoty Roy	adcfecdb67	community[patch]: callback before yield for textgen (#26929 ) Description: Moves callback to before yield for `_stream` and `_astream` function for the textgen model in the community llm package Issue: #16913	2024-09-27 08:41:13 -04:00
Subhrajyoty Roy	5f2cc4ecb2	community[patch]: callback before yield for titan takeoff (#26930 ) Description: Moves yield to after callback for `_stream` function for the titan takeoff model in the community llm package Issue: #16913	2024-09-27 08:40:22 -04:00
Emmanuel Sciara	c6350d636e	core[fix]: using async rate limiter methods in async code (#26914 ) Description: Replaced blocking (sync) rate_limiter code in async methods. Issue: #26913 Dependencies: N/A Twitter handle: no need 🤗	2024-09-26 20:44:28 +00:00
Abhi Agarwal	696114e145	community: add sqlite-vec vectorstore (#25003 ) Description: Adds a vector store integration with [sqlite-vec](https://alexgarcia.xyz/sqlite-vec/), the successor to sqlite-vss that is a single C file with no external dependencies. Pretty straightforward, just copy-pasted the sqlite-vss integration and made a few tweaks and added integration tests. Only question is whether all documentation should be directed away from sqlite-vss if it is de facto deprecated (cc @asg017). --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: philippe-oger <philippe.oger@adevinta.com>	2024-09-26 17:37:10 +00:00
Erick Friis	8bc12df2eb	voyageai: new models (#26907 ) Co-authored-by: fzowl <zoltan@voyageai.com> Co-authored-by: fzowl <160063452+fzowl@users.noreply.github.com>	2024-09-26 17:07:10 +00:00
Subhrajyoty Roy	ba467f1a36	community[patch]: callback before yield for gigachat (#26881 ) Description: Moves yield to after callback for `_stream` and `_astream` function for the gigachat model in the community llm package Issue: #16913	2024-09-26 12:47:28 -04:00
Subhrajyoty Roy	11e703a97e	community[patch]: callback before yield for google palm (#26882 ) Description: Moves yield to after callback for `_stream` function for the google palm model in the community package Issue: #16913	2024-09-26 12:47:05 -04:00
Julius Stopforth	121e79b1f0	core: Fix `IndexError` when `trim_messages` invoked with empty list (#26896 ) This prevents `trim_messages` from raising an `IndexError` when invoked with `include_system=True`, `strategy="last"`, and an empty message list. Fixes #26895 Dependencies: none	2024-09-26 11:29:58 -04:00
ccurme	7091a1a798	openai[patch]: increase token limit in azure integration tests (#26901 ) `test_json_mode` occasionally runs into this	2024-09-26 14:31:33 +00:00
Erick Friis	2ea5f60cc5	experimental: migrate to external repo (#26879 ) security scanners can't distinguish monorepo sources from each other. this will resolve issues for folks trying to use e.g. langchain-core but getting security issues from experimental flagged!	2024-09-25 19:02:19 -07:00
Erick Friis	6f3c8313ba	community: bump langchain version (#26876 )	2024-09-25 12:58:24 -07:00
Erick Friis	e068407f18	community: bump core version (#26875 )	2024-09-25 12:57:16 -07:00
Eugene Yurtsev	25cb44c9ee	0.3.1 release community (#26872 ) Release for 0.3.1 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-25 19:38:53 +00:00
Erick Friis	9a31ad6f60	langchain: release 0.3.1 (#26868 )	2024-09-25 11:43:54 -07:00
Erick Friis	ef2ab26113	core: release 0.3.6 (#26863 )	2024-09-25 11:05:53 -07:00
Bagatur	eaffa92c1d	openai[patch]: Release 0.2.1 (#26858 )	2024-09-25 15:55:49 +00:00
Rajendra Kadam	51c4393298	community[patch]: Fix validation error in SettingsConfigDict across multiple Langchain modules (#26852 ) - Description: This pull request addresses the validation error in `SettingsConfigDict` due to extra fields in the `.env` file. The issue is prevalent across multiple Langchain modules. This fix ensures that extra fields in the `.env` file are ignored, preventing validation errors. Changes include: - Applied fixes to modules using `SettingsConfigDict`. - Issue: NA, similar https://github.com/langchain-ai/langchain/issues/26850 - Dependencies: NA	2024-09-25 10:02:14 -04:00
Christophe Bornet	3a1b9259a7	core: Add ruff rules for comprehensions (C4) (#26829 )	2024-09-25 09:34:17 -04:00
Rajendra Kadam	7e5a9c317f	community[minor]: [Pebblo] Enhance PebbloSafeLoader to take anonymize flag (#26812 ) - Description: The flag is named `anonymize_snippets`. When set to true, the Pebblo server will anonymize snippets by redacting all personally identifiable information (PII) from the snippets going into VectorDB and the generated reports - Issue: NA - Dependencies: NA - docs: Updated	2024-09-25 09:33:06 -04:00
Rajendra Kadam	92003b3724	community[patch]: [SharePointLoader] Fix validation error in _O365Settings due to extra fields in .env file (#26851 ) Description: Fix validation error in _O365Settings by ignoring extra fields in .env file Issue: https://github.com/langchain-ai/langchain/issues/26850 Dependencies: NA	2024-09-25 09:31:59 -04:00
Subhrajyoty Roy	b61fb98466	community[patch]: callback before yield for friendli (#26842 ) Description: Moves yield to after callback for `_stream` and `_astream` function for the friendli model in the community package Issue: #16913	2024-09-25 09:31:12 -04:00
ccurme	13acf9e6b0	langchain[patch]: add deprecation warnings (#26853 )	2024-09-25 09:26:44 -04:00
William FH	82b5b77940	[Core] Add more interops tests (#26841 ) To test that the client propagates both ways	2024-09-24 20:18:20 -07:00
William FH	9b6ac41442	[Core] Inherit tracing metadata & tags (#26838 )	2024-09-24 19:33:12 -07:00
Erick Friis	425c0f381f	experimental: release 0.3.1 (#26830 )	2024-09-24 15:03:05 -07:00
John	6c3ea262c8	partners/unstructured: release 0.1.5 (#26831 ) Description: update package version to support loading URLs #26670 Issue: #26697	2024-09-24 15:02:53 -07:00
mercyspirit	0414be4b80	experimental[major]: CVE-2024-46946 fix (#26783 ) Description: Resolve CVE-2024-46946 by switching out sympify with parse_expr with a very specific allowed set of operations. https://nvd.nist.gov/vuln/detail/cve-2024-46946 Sympify uses eval which makes it vulnerable to code execution. parse_expr is limited to specific expressions. --------- Co-authored-by: aqiu7 <aqiu7@gatech.edu> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-09-24 21:37:56 +00:00
Subhrajyoty Roy	b1da532522	community[patch]: callback before yield for deepsparse llm (#26822 ) Description: Moves yield to after callback for `_stream` and `_astream` function for the deepsparse model in the community package Issue: #16913	2024-09-24 13:55:52 -04:00
Nuno Campos	de70a64e3a	core: Run LangChainTracer inline (#26797 ) - this flag ensures the tracer always runs in the same thread as the run being traced for both sync and async runs - pro: less chance for ordering bugs and other oddities - blocking the event loop is not a concern given all code in the tracer holds the GIL anyway	2024-09-24 08:31:18 -07:00
Jorge Piedrahita Ortiz	408a930d55	community: Add Sambanova Cloud Chat model community integration (#26333 ) Description: : Add SambaNova Cloud Chat model community integration Includes - chat model integration (following Standardize ChatModel docstrings) - tests - docs usage notebook (following Standardize ChatModel integration docs) https://cloud.sambanova.ai/ --------- Co-authored-by: luisfucros <luisfucros@gmail.com> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-09-24 14:11:32 +00:00
Tom	2b83c7c3ab	community[patch]: Fix `tool_calls` parsing when streaming from DeepInfra (#26813 ) - Description: This PR fixes the response parsing logic for `ChatDeepInfra`, more specifically `_convert_delta_to_message_chunk()`, which is invoked when streaming via `ChatDeepInfra`. - Issue: Streaming from DeepInfra via `ChatDeepInfra` is currently broken because the response parsing logic doesn't handle that `tool_calls` can be `None`. (There is no GitHub issue for this problem yet.)	2024-09-24 13:47:36 +00:00
Subhrajyoty Roy	997d95c8f8	community[patch]: callback before yield for bedrock llm (#26804 ) Description: Moves yield to after callback for `_prepare_input_and_invoke_stream` and `_aprepare_input_and_invoke_stream` for bedrock llm in community package. Issue: #16913	2024-09-24 12:14:59 +00:00
ccurme	2a4c5713cd	openai[patch]: fix azure integration tests (#26791 )	2024-09-23 17:49:15 -04:00
Mohammad Mohtashim	154a5ff7ca	core[patch]: On Chain Start Fix for `Chain` Class (#26593 ) - Issue: #26588 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-09-23 19:30:59 +00:00
ccurme	bba7af903b	core[patch]: set default on Blob (#26787 ) Resolves https://github.com/langchain-ai/langchain/issues/26781	2024-09-23 18:55:56 +00:00
ccurme	97b27f0930	langchain[patch]: fix extended tests (#26788 ) Broken by addition of `disabled_params`	2024-09-23 18:52:09 +00:00
Bagatur	e1e4f88b3e	openai[patch]: enable Azure structured output, parallel_tool_calls=Fa… (#26599 ) …lse, tool_choice=required response_format=json_schema, tool_choice=required, parallel_tool_calls are all supported for gpt-4o on azure.	2024-09-22 22:25:22 -07:00
Gabriel Altay	bb40a0fb32	Remove pydantic restricted namespaces from HuggingFaceInferenceAPIEmbeddings (#26744 ) Without this `model_config`, importing this package produces warnings about "model_name" having conflicts with protected namespace "model_". --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-09-22 08:05:37 -04:00
Gor Hayrapetyan	f97ac92f00	community[patch]: Handle empty PR body in get_pull_request in Github utility (#26739 ) Description: When the PR body is empty, the `get_pull_request` method fails with the exception below. Issue: ``` TypeError('expected string or buffer')Traceback (most recent call last): File ".../.venv/lib/python3.9/site-packages/langchain_core/tools/base.py", line 661, in run response = context.run(self._run, tool_args, tool_kwargs) File ".../.venv/lib/python3.9/site-packages/langchain_community/tools/github/tool.py", line 52, in _run return self.api_wrapper.run(self.mode, query) File ".../.venv/lib/python3.9/site-packages/langchain_community/utilities/github.py", line 816, in run return json.dumps(self.get_pull_request(int(query))) File ".../.venv/lib/python3.9/site-packages/langchain_community/utilities/github.py", line 495, in get_pull_request add_to_dict(response_dict, "body", pull.body) File ".../.venv/lib/python3.9/site-packages/langchain_community/utilities/github.py", line 487, in add_to_dict tokens = get_tokens(value) File ".../.venv/lib/python3.9/site-packages/langchain_community/utilities/github.py", line 483, in get_tokens return len(tiktoken.get_encoding("cl100k_base").encode(text)) File "....venv/lib/python3.9/site-packages/tiktoken/core.py", line 116, in encode if match := _special_token_regex(disallowed_special).search(text): TypeError: expected string or buffer ``` Twitter: __gorros__	2024-09-22 01:56:24 +00:00
Erick Friis	238a31bbd9	core: release 0.3.5 (#26737 )	2024-09-21 00:26:39 +00:00
William FH	55af6fbd02	[LangChainTracer] Omit Chunk (#26602 ) in events / new llm token	2024-09-20 17:10:34 -07:00
Anton Dubovik	3e2cb4e8a4	openai: embeddings: supported chunk_size when check_embedding_ctx_length is disabled (#23767 ) Chunking of the input array controlled by `self.chunk_size` is being ignored when `self.check_embedding_ctx_length` is disabled. Effectively, the chunk size is assumed to be equal to 1 in such a case. This is surprising. The PR takes into account `self.chunk_size` passed by the user. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-20 16:58:45 -07:00
William FH	864020e592	[Tracer] add project name to run from tracer (#26736 )	2024-09-20 16:48:37 -07:00
Nithish Raghunandanan	2d21274bf6	couchbase: Add ttl support to caches & chat_message_history (#26214 ) Description: Add support to delete documents automatically from the caches & chat message history by adding a new optional parameter, `ttl`. --------- Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-20 23:44:29 +00:00
Krishna Kulkarni	c6c508ee96	Refining Skip Count Calculation by Filtering Documents with `session_id` (#26020 ) In the previous implementation, `skip_count` was counting all the documents in the collection. Instead, we want to filter the documents by `session_id` and calculate `skip_count` by subtracting `history_size` from the filtered count. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-09-20 23:40:56 +00:00
Tibor Reiss	a8b24135a2	fix[experimental]: Fix text splitter with gradient (#26629 ) Fixes #26221 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-20 23:35:50 +00:00
Alejandro Rodríguez	4ac9a6f52c	core: fix "template" not allowed as prompt param (#26060 ) - Description: fix "template" not allowed as prompt param - Issue: #26058 - Dependencies: none --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-20 23:33:06 +00:00
Christophe Bornet	58f339a67c	community: Fix links in GraphVectorStore pydoc (#25959 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-20 23:17:53 +00:00
Christophe Bornet	e49c413977	core: Add docstring for GraphVectorStoreRetriever (#26224 ) Co-authored-by: Erick Friis <erickfriis@gmail.com>	2024-09-20 23:16:37 +00:00
Lucain	a2023a1e96	huggingface; fix huggingface_endpoint.py (initialize clients only with supported kwargs) (#26378 ) ## Description By default, `HuggingFaceEndpoint` instantiates both the `InferenceClient` and the `AsyncInferenceClient` with the `"server_kwargs"` passed as input. This is an issue as both clients might not support exactly the same kwargs. This has been highlighted in https://github.com/huggingface/huggingface_hub/issues/2522 by @morgandiverrez with the `trust_env` parameter. In order to make `langchain` integration future-proof, I do think it's wiser to forward only the supported parameters to each client. Parameters that are not supported are simply ignored with a warning to the user. From a `huggingface_hub` maintenance perspective, this allows us much more flexibility as we are not constrained to support the exact same kwargs in both clients. ## Issue https://github.com/huggingface/huggingface_hub/issues/2522 ## Dependencies None ## Twitter https://x.com/Wauplin --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-20 16:05:24 -07:00
ccurme	f2285376a5	community[patch]: add web loader tests (#26728 )	2024-09-20 18:29:54 -04:00
Erick Friis	4a2745064a	core: release 0.3.4 (#26729 )	2024-09-20 14:47:15 -07:00
Nuno Campos	345edeb1f0	core: In astream_events propagate cancellation reason to inner task (#26727 )	2024-09-20 14:42:10 -07:00
Erick Friis	465e43cd43	core: release 0.3.3 (#26713 )	2024-09-20 13:54:19 -07:00
Eugene Yurtsev	4fc69d61ad	core[patch]: Fix defusedxml import (#26718 ) Fix defusedxml import. Haven't investigated what's actually going on under the hood -- defusedxml probably does some weird things in __init__	2024-09-20 16:53:24 -04:00
Eugene Yurtsev	79b224f6f3	core/langchain: fix version used in deprecation (#26724 ) In core, the deprecation version should be 0.3.3 instead of 0.3.4; in langchain, it should be 0.3.1 instead of 0.3.4.	2024-09-20 16:47:18 -04:00
Eugene Yurtsev	91f4711e53	core[patch],langchain[patch]: deprecate memory and entity abstractions and implementations (#26717 ) This PR deprecates the old memory, entity abstractions and implementations	2024-09-20 15:06:25 -04:00
William FH	19ce95d3c9	Avoid copying runs (#26689 ) Also, re-unify run trees. Use a single shared client.	2024-09-20 10:57:41 -07:00
Eric	90031b1b3e	support epsilla cloud vector database in langchain (#26065 ) Description - support epsilla cloud in langchain --------- Co-authored-by: Leonid Ganeline <leo.gan.57@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-09-20 17:14:23 +00:00
ZhangShenao	baef7639fd	Improvement[text-splitter] Fix import of `ExperimentalMarkdownSyntaxTextSplitter` (#26703 ) #26028 Export `ExperimentalMarkdownSyntaxTextSplitter` in __init__ Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-20 17:04:31 +00:00
stein1988	91594928c5	fix:fix ChatZhipuAI tool call bug (#26693 ) Description: The ZhipuAI API response looks as follows: {'id': '20240920132549e379a9152a6a4d7c', 'created': 1726809949, 'model': 'glm-4-flash', 'choices': [{'index': 0, 'finish_reason': 'tool_calls', 'delta': {'role': 'assistant', 'tool_calls': [{'id': 'call_20240920132549e379a9152a6a4d7c', 'index': 0, 'type': 'function', 'function': {'name': 'get_datetime_offline', 'arguments': '{}'}}]}}]} so the key in tool_calls = dct.get("tool_call", None) in _convert_delta_to_message_chunk should be "tool_calls"	2024-09-20 13:06:42 +00:00
Bagatur	f7bb3640f1	core[patch]: support js chat model namespaces (#26688 )	2024-09-19 16:14:20 -07:00
Bagatur	c453b76579	core[patch]: Release 0.3.2 (#26686 )	2024-09-19 14:58:45 -07:00
Piyush Jain	f087ab43fd	core[patch]: Fix load of ChatBedrock (#26679 ) Complementary PR to master for #26643. Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-09-19 21:57:20 +00:00
Bagatur	409f35363b	core[patch]: support load from path for default namespaces (#26675 )	2024-09-19 14:47:27 -07:00
ccurme	eef18dec44	unstructured[patch]: support loading URLs (#26670 ) `unstructured.partition.auto.partition` supports a `url` kwarg, but `url` in `UnstructuredLoader.__init__` is reserved for the server URL. Here we add a `web_url` kwarg that is passed to the partition kwargs: ```python self.unstructured_kwargs["url"] = web_url ```	2024-09-19 11:40:25 -07:00
Erick Friis	311f861547	core, community: move graph vectorstores to community (#26678 ) remove beta namespace from core, add to community	2024-09-19 11:38:14 -07:00
Serena Ruan	c77c28e631	[community] Fix WorkspaceClient error with pydantic validation (#26649 ) Fix error like <img width="1167" alt="image" src="https://github.com/user-attachments/assets/2e219b26-ec7e-48ef-8111-e0ff2f5ac4c0"> After the fix: <img width="584" alt="image" src="https://github.com/user-attachments/assets/48f36fe7-628c-48b6-81b2-7fe741e4ca85"> --------- Signed-off-by: serena-ruan <serena.rxy@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-19 18:25:33 +00:00
ccurme	7d49ee9741	unstructured[patch]: add to integration tests (#26666 ) - Add to tests on parsed content; - Add tests for async + lazy loading; - Add a test for `strategy="hi_res"`.	2024-09-19 13:43:34 -04:00
ccurme	f91bdd12d2	community[patch]: add to pypdf tests and run in CI (#26663 )	2024-09-19 14:45:49 +00:00
Rajendra Kadam	60dc19da30	[community] Added PebbloTextLoader for loading text data in PebbloSafeLoader (#26582 ) - Description: Added PebbloTextLoader for loading text in PebbloSafeLoader. - Since PebbloSafeLoader wraps document loaders, this new loader enables direct loading of text into Documents using PebbloSafeLoader. - Issue: NA - Dependencies: NA - [x] Tests: Added/Updated tests	2024-09-19 09:59:04 -04:00
Jorge Piedrahita Ortiz	55b641b761	community: fix error in sambastudio embeddings (#26260 ) fix error in samba studio embeddings result unpacking	2024-09-19 09:57:04 -04:00
Jorge Piedrahita Ortiz	37b72023fe	community: remove sambaverse (#26265 ) Removing the Sambaverse LLM model and references, given that it is not available after Sep/10/2024 <img width="1781" alt="image" src="https://github.com/user-attachments/assets/4dcdb5f7-5264-4a03-b8e5-95c88304e059">	2024-09-19 09:56:30 -04:00
Martin Triska	3fc0ea510e	community : [bugfix] Use document ids as keys in AzureSearch vectorstore (#25486 ) # Description [Vector store base class](`4cdaca67dc/libs/core/langchain_core/vectorstores/base.py (L65)`) currently expects `ids` to be passed in and that is what it passes along to the AzureSearch vector store when attempting to `add_texts()`. However AzureSearch expects `keys` to be passed in. When they are not present, AzureSearch `add_embeddings()` makes up new uuids. This is a problem when trying to run indexing. [Indexing code expects](`b297af5482/libs/core/langchain_core/indexing/api.py (L371)`) the documents to be uploaded using provided ids. Currently AzureSearch ignores `ids` passed from `indexing` and makes up new ones. Later when `indexer` attempts to delete removed file, it uses the `id` it had stored when uploading the document, however it was uploaded under different `id`. Twitter handle: @martintriska1	2024-09-19 09:37:18 -04:00
Tomaz Bratanic	a8561bc303	Fix async parsing for llm graph transformer (#26650 )	2024-09-19 09:15:33 -04:00
Erik	4e0a6ebe7d	community: Add warning when page_content is empty (#25955 ) Page content is sometimes empty when PyMuPDF cannot find text on a page. For example, this can happen when the text of the PDF is not copyable "by hand". In that case an OCR solution is needed, which is not integrated here. This warning tells the user that some pages are lost during this process. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-19 05:22:09 +00:00
Christophe Bornet	fd21ffe293	core: Add N(naming) ruff rules (#25362 ) Public classes/functions are not renamed and rule is ignored for them. Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-19 05:09:39 +00:00
Daniel Cooke	7835c0651f	langchain_chroma: Pass through kwargs to Chroma collection.delete (#25970 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-19 04:21:24 +00:00
Tibor Reiss	85caaa773f	docs[community]: Fix raw string in docstring (#26350 ) Fixes #26212: replaced the raw string with backslashes. Alternative: raw-stringify the full docstring. --------- Co-authored-by: Erick Friis <erickfriis@gmail.com>	2024-09-19 04:18:56 +00:00
Erick Friis	8fb643a6e8	partners/box: release 0.2.1 (#26644 )	2024-09-19 04:02:06 +00:00
Tomaz Bratanic	03b9aca55d	community: Retry retriable errors in Neo4j (#26211 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-19 04:01:07 +00:00
Scott Hurrey	acbb4e4701	box: Add searchoptions for BoxRetriever, documentation for BoxRetriever as agent tool (#26181 ) Added search options for BoxRetriever and added documentation to demonstrate how to use BoxRetriever as an agent tool - @BoxPlatform	2024-09-18 21:00:06 -07:00
Erick Friis	9909354cd0	core: use ruff.target-version instead (#26634 ) tested on one of the replacement cases and seems to work! ![ScreenShot 2024-09-18 at 02 02 43PM](https://github.com/user-attachments/assets/7170975a-2542-43ed-a203-d4126c6a2c81)	2024-09-18 21:06:14 +00:00
Erick Friis	84b831356c	core: remove [project] tag from pyproject (#26633 ) makes core incompatible with uv installs	2024-09-18 20:39:49 +00:00
Christophe Bornet	a47b332841	core: Put Python version as a project requirement so it is considered by ruff (#26608 ) Ruff doesn't know about the python version in `[tool.poetry.dependencies]`. It can get it from `project.requires-python`.
Notes:
* poetry seems to have issues getting the python constraints from `requires-python` and using `python` in per dependency constraints. So I had to duplicate the info. I will open an issue on poetry.
* `inspect.isclass()` doesn't work correctly with `GenericAlias` (`list[...]`, `dict[..., ...]`) on Python <3.11 so I added some `not isinstance(type, GenericAlias)` checks:

Python 3.11
```pycon
>>> import inspect
>>> inspect.isclass(list)
True
>>> inspect.isclass(list[str])
False
```
Python 3.9
```pycon
>>> import inspect
>>> inspect.isclass(list)
True
>>> inspect.isclass(list[str])
True
```
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-09-18 14:37:57 +00:00
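The `not isinstance(type, GenericAlias)` guard mentioned in the entry above might look roughly like the following. This is a sketch under assumptions: `is_plain_class` is a hypothetical name, and the actual checks added in the PR may be placed and written differently.

```python
import inspect
from types import GenericAlias  # available on Python 3.9+


def is_plain_class(tp: object) -> bool:
    """Hypothetical helper: True for real classes, False for aliases such as list[str]."""
    # On Python < 3.11, inspect.isclass(list[str]) returns True, so the
    # extra isinstance check is what makes the result version-independent.
    return inspect.isclass(tp) and not isinstance(tp, GenericAlias)


print(is_plain_class(list))
print(is_plain_class(list[str]))
```

With the guard in place, the helper gives the same answer on every supported Python version, regardless of how `inspect.isclass` treats parameterized generics.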
ZhangShenao	c3b3f46cb8	Improvement[Community] Improve api doc of `BeautifulSoupTransformer` (#26423 ) - Add missing args Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-17 22:00:07 +00:00
ogawa	e2245fac82	community[patch]: o1-preview and o1-mini costs (#26411 ) updated OpenAI cost definitions according to the following: https://openai.com/api/pricing/ Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-17 21:59:46 +00:00
ZhangShenao	1a8e9023de	Improvement[Community] Improve `streamlit_callback_handler` (#26373 ) - add decorator for static methods Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-17 21:54:37 +00:00
Bagatur	1a62f9850f	anthropic[patch]: Release 0.2.1 (#26592 )	2024-09-17 14:44:21 -07:00
Bagatur	5ced41bf50	anthropic[patch]: fix tool call and tool res image_url handling (#26587 ) Co-authored-by: ccurme <chester.curme@gmail.com>	2024-09-17 14:30:07 -07:00
Christophe Bornet	c6bdd6f482	community: Fix references in link extractors docstrings (#26314 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-17 21:26:25 +00:00
Christophe Bornet	3a99467ccb	core[patch]: Add ruff rule UP006(use PEP585 annotations) (#26574 ) * Added rule `UP006` now that Pydantic is v2+ --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-09-17 21:22:50 +00:00
wlleiiwang	2ef4c9466f	community: modify document links for tencent vectordb (#26316 ) - modify document links for create a tencent vectordb database instance. Co-authored-by: wlleiiwang <wlleiiwang@tencent.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-17 21:11:10 +00:00
Erick Friis	194adc485c	docs: pypi readme image links (#26590 )	2024-09-17 20:41:34 +00:00
Bagatur	97b05d70e6	docs: anthropic api ref nit (#26591 )	2024-09-17 20:39:53 +00:00
Bagatur	e1d113ea84	core,openai,grow,fw[patch]: deprecate bind_functions, update chat model api ref (#26584 )	2024-09-17 11:32:39 -07:00
ccurme	7c05f71e0f	milvus[patch]: fix vectorstore integration tests (#26583 ) Resolves https://github.com/langchain-ai/langchain/issues/26564	2024-09-17 14:17:05 -04:00
Bagatur	145a49cca2	core[patch]: Release 0.3.1 (#26581 )	2024-09-17 17:34:09 +00:00
Nuno Campos	5fc44989bf	core[patch]: Fix "argument of type 'NoneType' is not iterable" error in LangChainTracer (#26576 ) --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-17 10:29:46 -07:00
Isaac Francisco	06cde06a20	core[minor]: remove beta from RemoveMessage (#26579 )	2024-09-17 17:09:58 +00:00
RUO	0a177ec2cc	community: Enhance MongoDBLoader with flexible metadata and optimized field extraction (#23376 )
### Description:
This pull request significantly enhances the MongodbLoader class in the LangChain community package by adding robust metadata customization and improved field extraction capabilities. The updated class now allows users to specify additional metadata fields through the metadata_names parameter, enabling the extraction of both top-level and deeply nested document attributes as metadata. This flexibility is crucial for users who need to include detailed contextual information without altering the database schema. Moreover, the include_db_collection_in_metadata flag offers optional inclusion of database and collection names in the metadata, allowing for even greater customization depending on the user's needs. The loader's field extraction logic has been refined to handle missing or nested fields more gracefully. It now employs a safe access mechanism that avoids the KeyError previously encountered when a specified nested field was absent in a document. This update ensures that the loader can handle diverse and complex data structures without failure, making it more resilient and user-friendly.
### Issue:
This pull request addresses a critical issue where the MongodbLoader class in the LangChain community package could throw a KeyError when attempting to access nested fields that may not exist in some documents. The previous implementation did not handle the absence of specified nested fields gracefully, leading to runtime errors and interruptions in data processing workflows. This enhancement ensures robust error handling by safely accessing nested document fields, using default values for missing data, thus preventing KeyError and ensuring smoother operation across various data structures in MongoDB. This improvement is crucial for users working with diverse and complex data sets, ensuring the loader can adapt to documents with varying structures without failing.
### Dependencies:
Requires motor for asynchronous MongoDB interaction.
### Twitter handle:
N/A
### Add tests and docs
Tests: Unit tests have been added to verify that the metadata inclusion toggle works as expected and that the field extraction correctly handles nested fields.
Docs: An example notebook demonstrating the use of the enhanced MongodbLoader is included in the docs/docs/integrations directory. This notebook includes setup instructions, example usage, and outputs. (Here is the notebook link: [colab link](https://colab.research.google.com/drive/1tp7nyUnzZa3dxEFF4Kc3KS7ACuNF6jzH?usp=sharing))
Lint and test
Before submitting, I ran make format, make lint, and make test as per the contribution guidelines. All tests pass, and the code style adheres to the LangChain standards.
```python
import unittest
from unittest.mock import patch, MagicMock
import asyncio
from langchain_community.document_loaders.mongodb import MongodbLoader


class TestMongodbLoader(unittest.TestCase):
    def setUp(self):
        """Setup the MongodbLoader test environment by mocking the motor client
        and database collection interactions."""
        # Mocking the AsyncIOMotorClient
        self.mock_client = MagicMock()
        self.mock_db = MagicMock()
        self.mock_collection = MagicMock()
        self.mock_client.get_database.return_value = self.mock_db
        self.mock_db.get_collection.return_value = self.mock_collection

        # Initialize the MongodbLoader with test data
        self.loader = MongodbLoader(
            connection_string="mongodb://localhost:27017",
            db_name="testdb",
            collection_name="testcol"
        )

    @patch('langchain_community.document_loaders.mongodb.AsyncIOMotorClient', return_value=MagicMock())
    def test_constructor(self, mock_motor_client):
        """Test if the constructor properly initializes with the correct database
        and collection names."""
        loader = MongodbLoader(
            connection_string="mongodb://localhost:27017",
            db_name="testdb",
            collection_name="testcol"
        )
        self.assertEqual(loader.db_name, "testdb")
        self.assertEqual(loader.collection_name, "testcol")

    def test_aload(self):
        """Test the aload method to ensure it correctly queries and processes documents."""
        # Setup mock data and responses for the database operations
        self.mock_collection.count_documents.return_value = asyncio.Future()
        self.mock_collection.count_documents.return_value.set_result(1)
        self.mock_collection.find.return_value = [
            {"_id": "1", "content": "Test document content"}
        ]

        # Run the aload method and check responses
        loop = asyncio.get_event_loop()
        results = loop.run_until_complete(self.loader.aload())
        self.assertEqual(len(results), 1)
        self.assertEqual(results[0].page_content, "Test document content")

    def test_construct_projection(self):
        """Verify that the projection dictionary is constructed correctly based on field names."""
        self.loader.field_names = ['content', 'author']
        self.loader.metadata_names = ['timestamp']
        expected_projection = {'content': 1, 'author': 1, 'timestamp': 1}
        projection = self.loader._construct_projection()
        self.assertEqual(projection, expected_projection)


if __name__ == '__main__':
    unittest.main()
```
### Additional Example for Documentation
Sample Data:
```json
[
  {
    "_id": "1",
    "title": "Artificial Intelligence in Medicine",
    "content": "AI is transforming the medical industry by providing personalized medicine solutions.",
    "author": { "name": "John Doe", "email": "john.doe@example.com" },
    "tags": ["AI", "Healthcare", "Innovation"]
  },
  {
    "_id": "2",
    "title": "Data Science in Sports",
    "content": "Data science provides insights into player performance and strategic planning in sports.",
    "author": { "name": "Jane Smith", "email": "jane.smith@example.com" },
    "tags": ["Data Science", "Sports", "Analytics"]
  }
]
```
Example Code:
```python
loader = MongodbLoader(
    connection_string="mongodb://localhost:27017",
    db_name="example_db",
    collection_name="articles",
    filter_criteria={"tags": "AI"},
    field_names=["title", "content"],
    metadata_names=["author.name", "author.email"],
    include_db_collection_in_metadata=True
)
documents = loader.load()
for doc in documents:
    print("Page Content:", doc.page_content)
    print("Metadata:", doc.metadata)
```
Expected Output:
```
Page Content: Artificial Intelligence in Medicine AI is transforming the medical industry by providing personalized medicine solutions.
Metadata: {'author_name': 'John Doe', 'author_email': 'john.doe@example.com', 'database': 'example_db', 'collection': 'articles'}
```
Thank you.
--------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-09-17 10:23:17 -04:00
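The "safe access mechanism" for nested fields described in the MongodbLoader entry above could be sketched as follows. This is an illustration only, not the loader's actual implementation: `get_nested` is a hypothetical helper, and the underscore-joined metadata keys are an assumption modeled on the expected output quoted in the PR description.

```python
from typing import Any


def get_nested(document: dict, dotted_path: str, default: Any = None) -> Any:
    """Hypothetical helper: follow "a.b.c" through nested dicts.

    Returns `default` instead of raising KeyError when any key is missing.
    """
    current: Any = document
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


doc = {
    "title": "Artificial Intelligence in Medicine",
    "author": {"name": "John Doe", "email": "john.doe@example.com"},
}

# "author.phone" is absent; it falls back to the default rather than failing.
metadata = {
    path.replace(".", "_"): get_nested(doc, path, default="")
    for path in ["author.name", "author.email", "author.phone"]
}
print(metadata)
```

Walking the path one key at a time means a missing intermediate level (for example, a document with no `author` at all) is handled the same way as a missing leaf.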
Bagatur	d8952b8e8c	langchain[patch]: infer mistral provider in init_chat_model (#26557 )	2024-09-17 00:35:54 +00:00
Bagatur	99abd254fb	docs: clean up init_chat_model (#26551 )	2024-09-16 22:08:22 +00:00
Tomaz Bratanic	3bcd641bc1	Add check for prompt based approach in llm graph transformer (#26519 )	2024-09-16 15:01:09 -07:00
Eugene Yurtsev	c2588b334f	unstructured: release 0.1.4 (#26540 ) Release to work with langchain 0.3	2024-09-16 17:38:38 +00:00
Eugene Yurtsev	8b985a42e9	milvus: 0.1.6 release (#26538 ) Release to work with langchain 0.3	2024-09-16 13:33:09 -04:00
Eugene Yurtsev	5b4206acd8	box: 0.2.0 release (#26539 ) Release to work with langchain 0.3	2024-09-16 13:32:59 -04:00
ccurme	0592c29e9b	qdrant[patch]: release 0.1.4 (#26534 ) `langchain-qdrant` imports pydantic but was importing pydantic proper before 0.3 release: `042e84170b/libs/partners/qdrant/langchain_qdrant/sparse_embeddings.py (L5-L8)`	2024-09-16 13:04:12 -04:00
Eugene Yurtsev	88891477eb	langchain-cli: release 0.0.31 (#26533 ) langchain-cli 0.0.31 release	2024-09-16 12:57:24 -04:00
ccurme	88bc15d69b	standard-tests[patch]: add async test for structured output (#26527 )	2024-09-16 11:15:23 -04:00
Erick Friis	1ab181f514	voyageai: release 0.1.2 (#26512 )	2024-09-16 03:11:15 +00:00
Erick Friis	ee4e11379f	nomic: release 0.1.3, core 0.3 compat but not required (#26511 )	2024-09-15 20:10:25 -07:00
Erick Friis	4131be63af	multiple: 0.3.0 not dev version (#26502 )	2024-09-15 18:26:50 +00:00
Eugene Yurtsev	77ccb4b1cf	cli[patch]: Update the migration script message (#26490 ) Update the migration script message	2024-09-14 14:40:35 -04:00
Bagatur	b47f4cfe51	mongodb[minor]: Release 0.2.0 (#26484 )	2024-09-13 19:17:36 -07:00
Bagatur	4e6620ecdd	chroma[patch]: Release 0.1.4 (#26470 )	2024-09-13 17:31:34 -07:00
Bagatur	543a80569c	prompty[minor]: Release 0.1.0 (#26481 )	2024-09-13 23:32:01 +00:00
ccurme	9c88037dbc	huggingface[patch]: xfail test (#26479 )	2024-09-13 23:16:06 +00:00
Bagatur	a2bfa41216	azure-dynamic-sessions[minor]: Release 0.2.0 (#26478 )	2024-09-13 23:09:48 +00:00
ccurme	8abc7ff55a	experimental: release 0.3 (#26477 )	2024-09-13 23:07:35 +00:00
Bagatur	6abb23ca97	exa[minor]: Release 0.2.0 (#26476 )	2024-09-13 23:04:10 +00:00
ccurme	900115a568	community: release 0.3 (#26472 )	2024-09-13 22:55:56 +00:00
Bagatur	17b397ef93	pinecone[minor]: Release 0.2.0 (#26474 )	2024-09-13 22:55:35 +00:00
Erick Friis	ca304ae046	robocorp: rm package (now langchain-sema4) (#26471 )	2024-09-13 15:54:00 -07:00
Erick Friis	537f6924dc	partners/ollama: release 0.2.0 (#26468 )	2024-09-13 15:48:48 -07:00
Erick Friis	995dfc6b05	partners/fireworks: release 0.2.0 (#26467 )	2024-09-13 22:48:16 +00:00
Erick Friis	832bc834b1	partners/anthropic: release 0.2.0 (#26469 ) 0.3.0 version was a mistake! not released - bumping version back to 0.2.0 here	2024-09-13 22:47:09 +00:00
Erick Friis	6997731729	partners/anthropic: release 0.3.0 (#26466 )	2024-09-13 22:44:11 +00:00
Bagatur	64bfe1ff23	groq[minor]: Release 0.2.0 (#26465 )	2024-09-13 22:43:11 +00:00
Erick Friis	58c7414e10	langchain: release 0.3.0 (#26462 )	2024-09-13 22:40:37 +00:00
ccurme	125c9896a8	huggingface: release 0.1 (#26463 )	2024-09-13 22:39:49 +00:00
Bagatur	f7ae12fa1f	openai[minor]: Release 0.2.0 (#26464 )	2024-09-13 15:38:10 -07:00
ccurme	d1462badaf	text-splitters: release 0.3 (#26460 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-13 22:31:06 +00:00
ccurme	9b30bdceb6	mistralai: release 0.2 (#26458 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-09-13 18:27:51 -04:00
Erick Friis	d46ab19954	core: release 0.3.0 (#26453 )	2024-09-13 21:45:45 +00:00
Erick Friis	c2a3021bb0	multiple: pydantic 2 compatibility, v0.3 (#26443 ) Signed-off-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Dan O'Donovan <dan.odonovan@gmail.com> Co-authored-by: Tom Daniel Grande <tomdgrande@gmail.com> Co-authored-by: Grande <Tom.Daniel.Grande@statsbygg.no> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: ccurme <chester.curme@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com> Co-authored-by: ZhangShenao <15201440436@163.com> Co-authored-by: Friso H. Kingma <fhkingma@gmail.com> Co-authored-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Nuno Campos <nuno@langchain.dev> Co-authored-by: Morgante Pell <morgantep@google.com>	2024-09-13 14:38:45 -07:00
Bagatur	d9813bdbbc	openai[patch]: Release 0.1.25 (#26439 )	2024-09-13 12:00:01 -07:00
liuhetian	7fc9e99e21	openai[patch]: get output_type when using with_structured_output (#26307 ) - This allows pydantic to correctly resolve annotations necessary when using openai new param `json_schema` Resolves issue: #26250 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-09-13 11:42:01 -07:00
Bagatur	0f2b32ffa9	core[patch]: Release 0.2.40 (#26435 )	2024-09-13 09:57:09 -07:00
Bagatur	e32adad17a	community[patch]: Release 0.2.17 (#26432 )	2024-09-13 09:56:39 -07:00
langchain-infra	8a02fd9c01	core: add additional import mappings to loads (#26406 ) Support using additional import mapping. This allows users to override old mappings/add new imports to the loads function.	2024-09-13 09:39:58 -07:00
Erick Friis	1d98937e8d	partners/openai: release 0.1.24 (#26417 )	2024-09-12 21:54:13 -07:00
Harrison Chase	28ad244e77	community, openai: support nested dicts (#26414 ) needed for thinking tokens --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-12 21:47:47 -07:00
Erick Friis	c0dd293f10	partners/groq: release 0.1.10 (#26393 )	2024-09-12 17:41:11 +00:00
Erick Friis	54c85087e2	groq: add back streaming tool calls (#26391 ) api no longer throws an error https://console.groq.com/docs/tool-use#streaming	2024-09-12 10:29:45 -07:00
Bagatur	feb351737c	core[patch]: fix empty OpenAI tools when strict=True (#26287 ) Fix #26232 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-09-11 16:06:03 -07:00
ccurme	398718e1cb	core[patch]: fix regression in convert_to_openai_tool with instances of Tool (#26327 )
```python
from langchain_core.tools import Tool
from langchain_core.utils.function_calling import convert_to_openai_tool

def my_function(x: int) -> int:
    return x + 2

tool = Tool(
    name="tool_name",
    func=my_function,
    description="test description",
)
convert_to_openai_tool(tool)
```
Current:
```
{'type': 'function', 'function': {'name': 'tool_name', 'description': 'test description', 'parameters': {'type': 'object', 'properties': {'args': {'type': 'array', 'items': {}}, 'config': {'type': 'object', 'properties': {'tags': {'type': 'array', 'items': {'type': 'string'}}, 'metadata': {'type': 'object'}, 'callbacks': {'anyOf': [{'type': 'array', 'items': {}}, {}]}, 'run_name': {'type': 'string'}, 'max_concurrency': {'type': 'integer'}, 'recursion_limit': {'type': 'integer'}, 'configurable': {'type': 'object'}, 'run_id': {'type': 'string', 'format': 'uuid'}}}, 'kwargs': {'type': 'object'}}, 'required': ['config']}}}
```
Here:
```
{'type': 'function', 'function': {'name': 'tool_name', 'description': 'test description', 'parameters': {'properties': {'__arg1': {'title': '__arg1', 'type': 'string'}}, 'required': ['__arg1'], 'type': 'object'}}}
```
	2024-09-11 15:51:10 -04:00
이규민	7feae62ad7	core[patch]: Support non ASCII characters in tool output if user doesn't output string (#26319 ) ### Simple modification core: add support for non-English characters. The target issue is #26315. The same issue exists in langgraph - https://github.com/langchain-ai/langgraph/issues/1504	2024-09-11 15:21:00 +00:00
William FH	b993172702	Keyword-like runnable config (#26295 )	2024-09-11 07:44:47 -07:00
Bagatur	17659ca2cd	core[patch]: Release 0.2.39 (#26279 )	2024-09-10 20:11:27 +00:00
Nuno Campos	212c688ee0	core[minor]: Remove serialized manifest from tracing requests for non-llm runs (#26270 ) - This takes a long time to compute, isn't used, and is currently called on every invocation of every chain/retriever/etc	2024-09-10 12:58:24 -07:00
ccurme	979232257b	huggingface[patch]: add integration tests for embeddings (#26272 )	2024-09-10 14:57:16 -04:00
ccurme	4ffd27c4d0	huggingface[patch]: add integration tests (#26269 ) Add standard tests for ChatHuggingFace. About half of these fail currently.	2024-09-10 18:31:51 +00:00
Christophe Bornet	9cf7ae0a52	community: Add docstring for HtmlLinkExtractor (#26213 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-10 00:27:37 +00:00
Christophe Bornet	56580b5fff	community: Add docstring for GLiNERLinkExtractor (#26218 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-10 00:27:23 +00:00
Christophe Bornet	e235a572a0	community: Add docstring for KeybertLinkExtractor (#26210 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-10 00:26:29 +00:00
Vadym Barda	bab9de581c	core[patch]: wrap mermaid node names w/ markdown in <p> tag (#26235 ) This fixes the issue where `__start__` and `__end__` node labels are being interpreted as markdown, as of the most recent Mermaid update	2024-09-09 20:11:00 -04:00
Tomaz Bratanic	181e4fc0e0	Add session expired retry to neo4j graph (#26182 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-08 11:40:43 -07:00
Sebastian Cherny	b3c7ed4913	Adding bind_tools in ChatOctoAI (#26168 ) The object extends from langchain_community.chat_models.openai.ChatOpenAI which doesn't have `bind_tools` defined. I tried extending from `langchain_openai.ChatOpenAI` in https://github.com/langchain-ai/langchain/pull/25975 but that PR got closed because this is not correct. So adding our own `bind_tools` (which for now copying from ChatOpenAI is good enough) will solve the tool calling issue we are having now. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-08 18:38:43 +00:00
John	97a8e365ec	partners/unstructured: update unstructured client version (#26105 ) Users are having version conflicts with `unstructured-client` as described here: https://unstructuredw-kbe4326.slack.com/archives/C06JJHC9G4U/p1725557970546199?thread_ts=1725035247.162819&cid=C06JJHC9G4U This PR fixes that issue and should update the version to "0.1.3" as well for a clean-slate version for users to install Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-08 18:32:34 +00:00
Vadym Barda	1b3bd52e0e	core[patch]: fix edge labels for mermaid graphs (#26201 )	2024-09-08 14:35:25 +00:00
Marcelo Machado	9bd4f1dfa8	docs: small improvement ChatOllama setup description (#26043 ) Small improvement on ChatOllama description --------- Co-authored-by: Marcelo Machado <mmachado@ibm.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-08 00:15:05 +00:00
Erick Friis	6e82d2184b	partners/mongodb: release 0.1.9 (#26193 )	2024-09-07 23:20:25 +00:00
William FH	262e19b15d	infra: Clear cache for env-var checks (#26073 )	2024-09-06 21:29:29 +00:00
ChengZi	a03141ac51	partners[milvus]: fix integration test issues (#26136 ) fix some integration test issues: https://github.com/langchain-ai/langchain/actions/runs/10688447230/job/29628412258 Signed-off-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-06 16:52:36 +00:00
Erick Friis	5c1ebd3086	partners/unstructured: release 0.1.3 (#26119 )	2024-09-06 16:22:53 +00:00
Bagatur	1241a004cb	fmt	2024-09-04 11:44:59 -07:00
Bagatur	4ba14ae9e5	fmt	2024-09-04 11:34:59 -07:00
Bagatur	dba308447d	fmt	2024-09-04 11:28:04 -07:00
Bagatur	fdf6fbde18	fmt	2024-09-04 11:12:11 -07:00
Bagatur	576574c82c	fmt	2024-09-04 11:05:36 -07:00
Bagatur	7bf54636ff	make	2024-09-04 10:24:42 -07:00
Bagatur	3ec93c2817	standard-tests[patch]: add Ser/Des test	2024-09-04 10:24:06 -07:00
Friso H. Kingma	af11fbfbf6	langchain_openai: Make sure the response from the async client in the astream method of ChatOpenAI is properly awaited in case of "include_response_headers=True" (#26031 ) - Description: This is a one line change. The `self.async_client.with_raw_response.create(payload)` call is not properly awaited within the `_astream` method. In `_agenerate` this is done already, but likely forgotten in the other method. - Issue: Not applicable - Dependencies: No dependencies required. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-09-04 13:26:48 +00:00
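The missing `await` described in af11fbfbf6 can be illustrated with a minimal sketch. `FakeRawResponse` and `create` below are hypothetical stand-ins, not the OpenAI client; they only mimic an async call whose result exposes headers and a `parse()` method.

```python
import asyncio


class FakeRawResponse:
    """Hypothetical stand-in for a raw response carrying headers and a body."""

    headers = {"x-request-id": "abc"}

    def parse(self):
        return ["chunk-1", "chunk-2"]


async def create(payload):
    # Hypothetical stand-in for an async client call.
    return FakeRawResponse()


async def stream_with_headers(payload):
    # Without `await`, `raw` would be a coroutine object, which has
    # neither `.headers` nor `.parse()`.
    raw = await create(payload)
    return raw.headers, raw.parse()


headers, chunks = asyncio.run(stream_with_headers({"model": "demo"}))
```

Dropping the `await` in `stream_with_headers` would make the attribute access on `raw` fail with an `AttributeError`.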
ZhangShenao	c812237217	Improvement[Community] Improve args description in api doc of `DocArrayInMemorySearch` (#26024 ) - Add missing arg - Remove redundant arg	2024-09-04 09:26:26 -04:00
Tomaz Bratanic	c649b449d7	Add the option to ignore structured output method to LLM graph transf… (#26013 ) Open source models like Llama3.1 have function calling, but it's not that great. Therefore, we introduce the option to ignore model's function calling and just use the prompt-based approach	2024-09-04 09:15:43 -04:00
Bagatur	4b99426a4f	openai[patch]: add back azure embeddings api_version alias	2024-09-03 17:25:03 -07:00
Eugene Yurtsev	bc3b851f08	openai[patch]: Upgrade @root_validators in preparation for pydantic 2 migration (#25491 ) * Upgrade @root_validator in openai pkg * Ran notebooks for all but AzureAI embeddings --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-09-03 14:42:24 -07:00
Tom Daniel Grande	0207dc1431	community: delta in openai choice can be None, creates handler for that (#25954 ) Description: adds a handler for when delta choice is None Issue: Fixes #25951 Dependencies: Not applicable Co-authored-by: Grande <Tom.Daniel.Grande@statsbygg.no> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-03 20:30:03 +00:00
Bagatur	9eb9ff52c0	experimental[patch]: Release 0.0.65 (#25987 )	2024-09-03 19:15:48 +00:00
Bagatur	bc3b02651c	standard-tests[patch]: test init from env vars (#25983 )	2024-09-03 19:05:39 +00:00
Bagatur	0af447c90b	community[patch]: Release 0.2.16 (#25982 )	2024-09-03 18:34:18 +00:00
Dan O'Donovan	f49da71e87	community[patch]: change default Neo4j username/password (#25226 ) Description: Change the default Neo4j username/password (when not supplied as environment variable or in code) from `None` to `""`. Neo4j has an option to [disable auth](https://neo4j.com/docs/operations-manual/current/configuration/configuration-settings/#config_dbms.security.auth_enabled) which is helpful when developing. When auth is disabled, the username / password through the `neo4j` module should be `""` (ie an empty string). Empty strings get marked as false in `langchain_core.utils.env.get_from_dict_or_env` -- changing this code / behaviour would have a wide impact and is undesirable. In order to both _allow_ access to Neo4j with auth disabled and _not_ impact `langchain_core` this patch is presented. The downside would be that if a user forgets to set NEO4J_USERNAME or NEO4J_PASSWORD they would see an invalid credentials error rather than missing credentials error. This could be mitigated but would result in a less elegant patch! Issue: Fix issue where langchain cannot communicate with Neo4j if Neo4j auth is disabled.	2024-09-03 11:24:18 -07:00
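The reasoning in f49da71e87 can be sketched as follows. `lookup` is a hypothetical, simplified stand-in for a dict-or-environment helper that treats empty strings as missing; it is not the `langchain_core` implementation. It shows why a default of `""` lets an intentionally empty credential through while a default of `None` does not.

```python
from typing import Mapping, Optional


def lookup(
    data: Mapping[str, str],
    key: str,
    env: Mapping[str, str],
    env_key: str,
    default: Optional[str] = None,
) -> str:
    """Hypothetical helper: empty strings count as 'not provided'."""
    if data.get(key):
        return data[key]
    if env.get(env_key):
        return env[env_key]
    if default is not None:
        return default
    raise ValueError(f"Did not find {key}; set {env_key} or pass `{key}`.")


# Auth disabled: the caller passes an empty username on purpose.
username = lookup({"username": ""}, "username", {}, "NEO4J_USERNAME", default="")

try:
    lookup({"username": ""}, "username", {}, "NEO4J_USERNAME", default=None)
    raised = False
except ValueError:
    raised = True
```

With `default=""` the empty value survives the lookup; the trade-off noted in the commit message is that a forgotten variable no longer raises a missing-credentials error here.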
Bagatur	035d8cf51b	milvus[patch]: Release 0.1.5 (#25981 )	2024-09-03 18:19:51 +00:00
Bagatur	1dfc8c01af	langchain[patch]: Release 0.2.16 (#25977 )	2024-09-03 18:10:21 +00:00
Bagatur	fb642e1e27	text-splitters[patch]: Release 0.2.4 (#25979 )	2024-09-03 18:09:43 +00:00
Bagatur	7457949619	mistralai[patch]: Release 0.1.13 (#25978 )	2024-09-03 18:03:15 +00:00
Bagatur	0c69c9fb3f	core[patch]: Release 0.2.38 (#25974 )	2024-09-03 17:31:41 +00:00
Eugene Yurtsev	fa8402ea09	core[minor]: Add support for multiple env keys for secrets_from_env (#25971 ) - Add support to look up secret using more than one env variable - Add overload to help mypy Needed for https://github.com/langchain-ai/langchain/pull/25491	2024-09-03 11:39:54 -04:00
Maximilian Schulz	fdeaff4149	`langchain-mistralai` - make base URL possible to set via env variable for `ChatMistralAI` (#25956 ) Description: Similar to other packages (`langchain_openai`, `langchain_anthropic`) it would be beneficial if the `ChatMistralAI` model could fetch the API base URL from the environment. This PR allows this via the following order: - provided value - then whatever `MISTRAL_API_URL` is set to - then whatever `MISTRAL_BASE_URL` is set to - if `None`, then default is `"https://api.mistral.com/v1"` - [x] Add tests and docs: Added unit tests, docs I feel are unnecessary, as this is just aligning with other packages that do the same? --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-09-03 14:32:35 +00:00
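The lookup order listed in fdeaff4149 can be sketched like this. `resolve_base_url` is a hypothetical helper written for illustration, not the `ChatMistralAI` code; the default URL is passed in by the caller, and the URLs in the demo are placeholders.

```python
from typing import Mapping, Optional


def resolve_base_url(
    explicit: Optional[str],
    env: Mapping[str, str],
    default: str,
) -> str:
    """Hypothetical helper: explicit value, then env variables, then default."""
    if explicit is not None:
        return explicit
    # Environment variable names as given in the commit message, in order.
    for key in ("MISTRAL_API_URL", "MISTRAL_BASE_URL"):
        value = env.get(key)
        if value:
            return value
    return default


DEFAULT = "https://default.example.invalid/v1"  # placeholder default
env = {
    "MISTRAL_API_URL": "https://a.example.invalid",
    "MISTRAL_BASE_URL": "https://b.example.invalid",
}

from_arg = resolve_base_url("https://arg.example.invalid", env, DEFAULT)
from_env = resolve_base_url(None, env, DEFAULT)
from_second = resolve_base_url(None, {"MISTRAL_BASE_URL": "https://b.example.invalid"}, DEFAULT)
from_default = resolve_base_url(None, {}, DEFAULT)
```

In a real program `env` would typically be `os.environ`; a plain mapping keeps the sketch easy to test.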
Jorge Piedrahita Ortiz	c7154a4045	community: sambastudio llms api v2 support (#25063 ) - Description: SambaStudio GenericV2 API support	2024-09-03 10:18:15 -04:00
ZhangShenao	8d784db107	docs: Add missing args in api doc of `WebResearchRetriever` (#25949 ) Add missing args in api doc of `WebResearchRetriever`	2024-09-03 01:24:23 -07:00
Bagatur	da113f6363	docs: ChatOpenAI.with_structured_output nits (#25952 )	2024-09-03 08:20:58 +00:00
Bagatur	5b99bb2437	docs: fix bullet list spacing (#25950 ) Fix #25935	2024-09-03 08:12:58 +00:00
Isaac Francisco	4833375200	community[patch]: added option to change how duckduckgosearchresults tool converts api outputs into string (#22580 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-09-02 22:42:19 +00:00
JonZeolla	78ff51ce83	community[patch]: update the default hf bge embeddings (#22627 ) Description: This updates the langchain_community > huggingface > default bge embeddings ([the current default recommends this change](https://huggingface.co/BAAI/bge-large-en)) Issue: None Dependencies: None Twitter handle: @jonzeolla --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-09-02 22:10:21 +00:00
Leonid Ganeline	150251fd49	docs: `integrations` reference updates 13 (#25711 ) Added missed provider pages and links. Fixed inconsistent formatting. Added arxiv references to docstrings. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-09-02 22:08:50 +00:00
Bagatur	933bc0d6ff	core[patch]: support additional kwargs on StructuredPrompt (#25645 )	2024-09-02 14:55:26 -07:00
Yash Parmar	51dae57357	community[minor]: jina search tools integrating (jina reader) (#23339 ) - PR title: "community: add Jina Search tool" - Description: Added the Jina Search tool for querying the Jina search API. This includes the implementation of the JinaSearchAPIWrapper and the JinaSearch tool, along with a Jupyter notebook example demonstrating its usage. - Issue: N/A - Dependencies: N/A - Twitter handle: [Twitter handle](https://x.com/yashp3020?t=7wM0gQ7XjGciFoh9xaBtqA&s=09) --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-09-02 14:52:14 -07:00
Matthew DeGenaro	66828f4ecc	text-splitters[patch]: Modified SpacyTextSplitter to fully keep whitespace when strip_whitespace is false (#23272 ) Previously, regardless of whether or not strip_whitespace was set to true or false, the strip text method in the SpacyTextSplitter class used `sent.text` to get the sentence. I modified this to include a ternary such that if strip_whitespace is false, it uses `sent.text_with_ws` I also modified the project.toml to include the spacy pipeline package and to lock the numpy version, as higher versions break spacy. - Issue: N/a - Dependencies: None	2024-09-02 21:15:56 +00:00
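The ternary described in 66828f4ecc can be sketched without loading a spaCy pipeline. `FakeSentence` is a hypothetical stand-in that carries only the two attributes the commit message mentions, `text` and `text_with_ws`; `sentence_texts` is likewise an illustrative helper, not the `SpacyTextSplitter` code.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FakeSentence:
    """Hypothetical stand-in for a spaCy sentence span."""

    text: str          # sentence without trailing whitespace
    text_with_ws: str  # sentence including trailing whitespace


def sentence_texts(sentences: Iterable[FakeSentence], strip_whitespace: bool) -> List[str]:
    # Keep the original whitespace only when stripping is disabled.
    return [s.text if strip_whitespace else s.text_with_ws for s in sentences]


sents = [FakeSentence("Hello there.", "Hello there. "), FakeSentence("Bye.", "Bye.")]
stripped = sentence_texts(sents, strip_whitespace=True)
kept = sentence_texts(sents, strip_whitespace=False)
```

Joining `kept` reproduces the original text exactly, which is the behaviour the change is after when `strip_whitespace` is false.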
Qingchuan Hao	3145995ed9	community[patch]: BingSearchResults returns raw snippets as artifact(#23304 ) Returns an array of results which is more specific and easier for later use. Tested locally: ``` resp = tool.invoke("what's the weather like in Shanghai?") for item in resp: print(item) ``` returns ``` {'snippet': '<b>Shanghai</b>, <b>Shanghai</b>, China <b>Weather</b> Forecast, with current conditions, wind, air quality, and what to expect for the next 3 days.', 'title': 'Shanghai, Shanghai, China Weather Forecast \| AccuWeather', 'link': 'https://www.accuweather.com/en/cn/shanghai/106577/weather-forecast/106577'} {'snippet': '5. 99 / 87 °F. 6. 99 / 86 °F. 7. Detailed forecast for 14 days. Need some help? Current <b>weather</b> <b>in Shanghai</b> and forecast for today, tomorrow, and next 14 days.', 'title': 'Weather for Shanghai, Shanghai Municipality, China - timeanddate.com', 'link': 'https://www.timeanddate.com/weather/china/shanghai'} {'snippet': '<b>Shanghai</b> - <b>Weather</b> warnings issued 14-day forecast. <b>Weather</b> warnings issued. Forecast - <b>Shanghai</b>. Day by day forecast. Last updated Friday at 01:05. Tonight, ... Temperature feels <b>like</b> 34 ...', 'title': 'Shanghai - BBC Weather', 'link': 'https://www.bbc.com/weather/1796236'} {'snippet': 'Current <b>weather</b> <b>in Shanghai</b>, <b>Shanghai</b>, China. Check current conditions <b>in Shanghai</b>, <b>Shanghai</b>, China with radar, hourly, and more.', 'title': 'Shanghai, Shanghai, China Current Weather \| AccuWeather', 'link': 'https://www.accuweather.com/en/cn/shanghai/106577/current-weather/106577'} 13-Day Beijing, Xi'an, Chengdu, <b>Shanghai</b> Chinese Language and Culture Immersion Tour. <b>Shanghai</b> in September. Average daily temperature range: 23–29°C (73–84°F) Average rainy days: 10. Average sunny days: 20. September ushers in pleasant autumn <b>weather</b>, making it one of the best months to visit <b>Shanghai</b>. 
<b>Weather</b> in <b>Shanghai</b>: Climate, Seasons, and Average Monthly Temperature. <b>Shanghai</b> has a subtropical maritime monsoon climate, meaning high humidity and lots of rain. Hot muggy summers, cool falls, cold winters with little snow, and warm springs are the norm. Midsummer through early fall is the best time to visit <b>Shanghai</b>. <b>Shanghai</b>, <b>Shanghai</b>, China <b>Weather</b> Forecast, with current conditions, wind, air quality, and what to expect for the next 3 days. 1165. 45.9. 121. Winter, from December to February, is quite cold: the average January temperature is 5 °C (41 °F). There may be cold periods, with highs around 5 °C (41 °F) or below, and occasionally, even snow can fall. The temperature dropped to -10 °C (14 °F) in January 1977 and to -7 °C (19.5 °F) in January 2016. 5. 99 / 87 °F. 6. 99 / 86 °F. 7. Detailed forecast for 14 days. Need some help? Current <b>weather</b> in <b>Shanghai</b> and forecast for today, tomorrow, and next 14 days. Everything you need to know about today's <b>weather</b> in <b>Shanghai</b>, <b>Shanghai</b>, China. High/Low, Precipitation Chances, Sunrise/Sunset, and today's Temperature History. <b>Shanghai</b> - <b>Weather</b> warnings issued 14-day forecast. <b>Weather</b> warnings issued. Forecast - <b>Shanghai</b>. Day by day forecast. Last updated Friday at 01:05. Tonight, ... Temperature feels <b>like</b> 34 ... <b>Shanghai</b> 14 Day Extended Forecast. <b>Weather</b> Today <b>Weather</b> Hourly 14 Day Forecast Yesterday/Past <b>Weather</b> Climate (Averages) Currently: 84 °F. Passing clouds. (<b>Weather</b> station: <b>Shanghai</b> Hongqiao Airport, China). See more current <b>weather</b>. Current <b>weather</b> in <b>Shanghai</b>, <b>Shanghai</b>, China. Check current conditions in <b>Shanghai</b>, <b>Shanghai</b>, China with radar, hourly, and more. <b>Shanghai</b> <b>Weather</b> Forecasts. 
<b>Weather Underground</b> provides local & long-range <b>weather</b> forecasts, weatherreports, maps & tropical <b>weather</b> conditions for the <b>Shanghai</b> area. ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-09-02 21:11:32 +00:00
Alexander KIRILOV	6a8f8a56ac	community[patch]: added content_columns option to CSVLoader (#23809 ) Description: Adding a new option to the CSVLoader that allows us to explicitly specify the columns that are used for generating the Document content. Currently these are implicitly set as "all fields not part of the metadata_columns". In some cases however it is useful to have a field both as a metadata and as part of the document content.	2024-09-02 20:25:53 +00:00
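The column selection described in 6a8f8a56ac can be sketched with the standard `csv` module. `rows_to_documents` is a hypothetical function, and the `key: value` content layout and the tuple return type are assumptions made for illustration; this is not the `CSVLoader` implementation.

```python
import csv
import io
from typing import Dict, List, Sequence, Tuple


def rows_to_documents(
    csv_text: str,
    metadata_columns: Sequence[str] = (),
    content_columns: Sequence[str] = (),
) -> List[Tuple[str, Dict[str, str]]]:
    """Hypothetical sketch: return (content, metadata) pairs, one per row."""
    documents = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if content_columns:
            # Explicit choice: a column may appear in content and metadata.
            keys = [k for k in row if k in content_columns]
        else:
            # Implicit default: every column that is not metadata.
            keys = [k for k in row if k not in metadata_columns]
        content = "\n".join(f"{k}: {row[k]}" for k in keys)
        metadata = {k: row[k] for k in metadata_columns}
        documents.append((content, metadata))
    return documents


data = "id,title,body\n1,Intro,Hello\n"
default_docs = rows_to_documents(data, metadata_columns=("id", "title"))
both_docs = rows_to_documents(
    data, metadata_columns=("id", "title"), content_columns=("title", "body")
)
```

In `both_docs` the `title` column ends up in both the content and the metadata, which is the case the commit message calls out.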
Bruno Alvisio	ab527027ac	community: Resolve refs recursively when generating openai_fn from OpenAPI spec (#19002 ) - Description: This PR is intended to improve the generation of payloads for OpenAI functions when converting from an OpenAPI spec file. The solution is to recursively resolve `$refs`. Currently when converting OpenAPI specs into OpenAI functions using `openapi_spec_to_openai_fn`, if the schemas have nested references, the generated functions contain `$ref` that causes the LLM to generate payloads with an incorrect schema. For example, for the for OpenAPI spec: ``` text = """ { "openapi": "3.0.3", "info": { "title": "Swagger Petstore - OpenAPI 3.0", "termsOfService": "http://swagger.io/terms/", "contact": { "email": "apiteam@swagger.io" }, "license": { "name": "Apache 2.0", "url": "http://www.apache.org/licenses/LICENSE-2.0.html" }, "version": "1.0.11" }, "externalDocs": { "description": "Find out more about Swagger", "url": "http://swagger.io" }, "servers": [ { "url": "https://petstore3.swagger.io/api/v3" } ], "tags": [ { "name": "pet", "description": "Everything about your Pets", "externalDocs": { "description": "Find out more", "url": "http://swagger.io" } }, { "name": "store", "description": "Access to Petstore orders", "externalDocs": { "description": "Find out more about our store", "url": "http://swagger.io" } }, { "name": "user", "description": "Operations about user" } ], "paths": { "/pet": { "post": { "tags": [ "pet" ], "summary": "Add a new pet to the store", "description": "Add a new pet to the store", "operationId": "addPet", "requestBody": { "description": "Create a new pet in the store", "content": { "application/json": { "schema": { "$ref": "#/components/schemas/Pet" } } }, "required": true }, "responses": { "200": { "description": "Successful operation", "content": { "application/json": { "schema": { "$ref": "#/components/schemas/Pet" } } } } } } } }, "components": { "schemas": { "Tag": { "type": "object", "properties": { "id": { 
"type": "integer", "format": "int64" }, "model_type": { "type": "number" } } }, "Category": { "type": "object", "required": [ "model", "year", "age" ], "properties": { "year": { "type": "integer", "format": "int64", "example": 1 }, "model": { "type": "string", "example": "Ford" }, "age": { "type": "integer", "example": 42 } } }, "Pet": { "required": [ "name" ], "type": "object", "properties": { "id": { "type": "integer", "format": "int64", "example": 10 }, "name": { "type": "string", "example": "doggie" }, "category": { "$ref": "#/components/schemas/Category" }, "tags": { "type": "array", "items": { "$ref": "#/components/schemas/Tag" } }, "status": { "type": "string", "description": "pet status in the store", "enum": [ "available", "pending", "sold" ] } } } } } } """ ``` Executing: ``` spec = OpenAPISpec.from_text(text) pet_openai_functions, pet_callables = openapi_spec_to_openai_fn(spec) response = model.invoke("Create a pet named Scott", functions=pet_openai_functions) ``` `pet_open_functions` contains unresolved `$refs`: ``` [ { "name": "addPet", "description": "Add a new pet to the store", "parameters": { "type": "object", "properties": { "json": { "properties": { "id": { "type": "integer", "schema_format": "int64", "example": 10 }, "name": { "type": "string", "example": "doggie" }, "category": { "ref": "#/components/schemas/Category" }, "tags": { "items": { "ref": "#/components/schemas/Tag" }, "type": "array" }, "status": { "type": "string", "enum": [ "available", "pending", "sold" ], "description": "pet status in the store" } }, "type": "object", "required": [ "name", "photoUrls" ] } } } } ] ``` and the generated JSON has an incorrect schema (e.g. 
category is filled with `id` and `name` instead of `model`, `year` and `age`): ``` { "id": 1, "name": "Scott", "category": { "id": 1, "name": "Dogs" }, "tags": [ { "id": 1, "name": "tag1" } ], "status": "available" } ``` With this change, `pet_openai_functions` becomes: ``` [ { "name": "addPet", "description": "Add a new pet to the store", "parameters": { "type": "object", "properties": { "json": { "properties": { "id": { "type": "integer", "schema_format": "int64", "example": 10 }, "name": { "type": "string", "example": "doggie" }, "category": { "properties": { "year": { "type": "integer", "schema_format": "int64", "example": 1 }, "model": { "type": "string", "example": "Ford" }, "age": { "type": "integer", "example": 42 } }, "type": "object", "required": [ "model", "year", "age" ] }, "tags": { "items": { "properties": { "id": { "type": "integer", "schema_format": "int64" }, "model_type": { "type": "number" } }, "type": "object" }, "type": "array" }, "status": { "type": "string", "enum": [ "available", "pending", "sold" ], "description": "pet status in the store" } }, "type": "object", "required": [ "name" ] } } } } ] ``` and the JSON generated by the LLM is: ``` { "id": 1, "name": "Scott", "category": { "year": 2022, "model": "Dog", "age": 42 }, "tags": [ { "id": 1, "model_type": 1 } ], "status": "available" } ``` which has the intended schema. - Twitter handle: @brunoalvisio --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-09-02 13:17:39 -07:00
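The idea behind ab527027ac, replacing nested `$ref` entries with the schemas they point to, can be sketched as a small recursive walk. `resolve_refs` is a hypothetical helper and not the code from the PR; it handles only local references of the form `#/a/b/c`, ignores JSON Pointer escaping, and leaves cyclic references unresolved.

```python
from typing import Any, Tuple


def resolve_refs(node: Any, root: dict, seen: Tuple[str, ...] = ()) -> Any:
    """Hypothetical sketch: copy `node`, inlining every local `$ref`."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#/"):
            if ref in seen:
                # Cycle detected: keep the reference instead of recursing forever.
                return dict(node)
            target: Any = root
            for part in ref[2:].split("/"):
                target = target[part]
            # The target may itself contain references, so recurse into it.
            return resolve_refs(target, root, seen + (ref,))
        return {key: resolve_refs(value, root, seen) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_refs(item, root, seen) for item in node]
    return node


spec = {
    "components": {
        "schemas": {
            "Category": {"type": "object", "properties": {"model": {"type": "string"}}},
            "Pet": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "category": {"$ref": "#/components/schemas/Category"},
                },
            },
        }
    }
}

pet = resolve_refs(spec["components"]["schemas"]["Pet"], spec)
```

After the call, `pet["properties"]["category"]` holds the inlined `Category` schema rather than a pointer to it.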
Nuno Campos	464dae8ac2	core: Include global variables in variables found by get_function_nonlocals (#25936 )	2024-09-02 11:49:25 -07:00
Luiz F. G. dos Santos	36bbdc776e	community: fix bug to support for `file_search` tool from OpenAI (#25927 ) - Description: The function `_is_assistants_builtin_tool` didn't have support for `file_search` from OpenAI. This was creating conflict and blocking the usage of such. OpenAI Assistant changed from `retrieval` to `file_search`. The following code ``` agent = OpenAIAssistantV2Runnable.create_assistant( name="Data Analysis Assistant", instructions=prompt[0].content, tools={'type': 'file_search'}, model=self.chat_config.connection.deployment_name, client=llm, as_agent=True, tool_resources={ "file_search": { "vector_store_ids": vector_store_id } } ) ``` Was throwing the following error ``` Traceback (most recent call last): File "/Users/l.guedesdossantos/Documents/codes/shellai-nlp-backend/app/chat/chat_decorators.py", line 500, in get_response return await super().get_response(post, context) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/l.guedesdossantos/Documents/codes/shellai-nlp-backend/app/chat/chat_decorators.py", line 96, in get_response response = await self.inner_chat.get_response(post, context) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/l.guedesdossantos/Documents/codes/shellai-nlp-backend/app/chat/chat_decorators.py", line 96, in get_response response = await self.inner_chat.get_response(post, context) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/l.guedesdossantos/Documents/codes/shellai-nlp-backend/app/chat/chat_decorators.py", line 96, in get_response response = await self.inner_chat.get_response(post, context) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ [Previous line repeated 4 more times] File "/Users/l.guedesdossantos/Documents/codes/shellai-nlp-backend/app/chat/azure_open_ai_chat.py", line 147, in get_response chain = chain_factory.get_chain(prompts, post.conversation.id, overrides, context) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File 
"/Users/l.guedesdossantos/Documents/codes/shellai-nlp-backend/app/llm_connections/chains.py", line 1324, in get_chain agent = OpenAIAssistantV2Runnable.create_assistant( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/l.guedesdossantos/anaconda3/envs/shell-e/lib/python3.11/site-packages/langchain_community/agents/openai_assistant/base.py", line 256, in create_assistant tools=[_get_assistants_tool(tool) for tool in tools], # type: ignore ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/l.guedesdossantos/anaconda3/envs/shell-e/lib/python3.11/site-packages/langchain_community/agents/openai_assistant/base.py", line 256, in <listcomp> tools=[_get_assistants_tool(tool) for tool in tools], # type: ignore ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/l.guedesdossantos/anaconda3/envs/shell-e/lib/python3.11/site-packages/langchain_community/agents/openai_assistant/base.py", line 119, in _get_assistants_tool return convert_to_openai_tool(tool) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/l.guedesdossantos/anaconda3/envs/shell-e/lib/python3.11/site-packages/langchain_core/utils/function_calling.py", line 255, in convert_to_openai_tool function = convert_to_openai_function(tool) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/l.guedesdossantos/anaconda3/envs/shell-e/lib/python3.11/site-packages/langchain_core/utils/function_calling.py", line 230, in convert_to_openai_function raise ValueError( ValueError: Unsupported function {'type': 'file_search'} Functions must be passed in as Dict, pydantic.BaseModel, or Callable. If they're a dict they must either be in OpenAI function format or valid JSON schema with top-level 'title' and 'description' keys. ``` With the proposed changes, this is fixed and the function will have support for `file_search`. This was the only place missing the support for `file_search`. 
Reference doc https://platform.openai.com/docs/assistants/tools/file-search - Twitter handle: luizf0992 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-09-02 18:21:51 +00:00
xander-art	6cd452d985	Feature/update hunyuan (#25779 ) Description: - Add system templates and user templates in integration testing - initialize the response id field value to request_id - Adjust the default model to hunyuan-pro - Remove the default values of Temperature and TopP - Add SystemMessage All the integration tests have passed. 1. Execute integration tests for the first time (screenshot) 2. Run the integration test a second time (screenshot) Issue: None Dependencies: None Twitter handle: None --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-09-02 12:55:08 +00:00
Yuwen Hu	566e9ba164	community: add Intel GPU support to `ipex-llm` llm integration (#22458 ) Description: [IPEX-LLM](https://github.com/intel-analytics/ipex-llm) is a PyTorch library for running LLM on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max) with very low latency. This PR adds Intel GPU support to `ipex-llm` llm integration. Dependencies: `ipex-llm` Contribution maintainer: @ivy-lv11 @Oscilloscope98 tests and docs: - Add: langchain/docs/docs/integrations/llms/ipex_llm_gpu.ipynb - Update: langchain/docs/docs/integrations/llms/ipex_llm_gpu.ipynb - Update: langchain/libs/community/tests/llms/test_ipex_llm.py --------- Co-authored-by: ivy-lv11 <zhicunlv@gmail.com>	2024-09-02 08:49:08 -04:00

5957 Commits