langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-08-16 16:11:02 +00:00

Author	SHA1	Message	Date
Chen Xiabin	63c60a31f0	[fix] baidu qianfan AiMessage with usage_metadata (#24389) Fixes an error in the AIMessage `usage_metadata`.	2024-07-18 09:28:16 -04:00
ZhangShenao	916b813107	community[patch]: Fix spelling error in ConversationVectorStoreTokenBufferMemory doc-string (#24385 ) Fix word spelling error in `ConversationVectorStoreTokenBufferMemory`	2024-07-18 12:27:36 +00:00
Rajendra Kadam	1c65529fd7	community[minor]: [PebbloSafeLoader] Rename loader type and add SharePointLoader to supported loaders (#24393) Description: Minor fixes in the PebbloSafeLoader: - Renamed the loader type from `remote_db` to `cloud_folder`. - Added `SharePointLoader` to the list of loaders supported by PebbloSafeLoader. - Issue: NA - Dependencies: NA	2024-07-18 08:23:12 -04:00
Eugene Yurtsev	6182a402f1	experimental[patch]: block a few more things from PALValidator (#24379) * Please see security warning already in existing class. * The approach here is fundamentally insecure as it's relying on a block approach rather than an approach based on only running allowed nodes. So users should only use this code if it's running from a properly sandboxed environment.	2024-07-18 08:22:45 -04:00
Paolo Ráez	0dec72cab0	Community[patch]: Missing "stream" parameter in cloudflare_workersai (#23987 ) ### Description Missing "stream" parameter. Without it, you'd never receive a stream of tokens when using stream() or astream() ### Issue No existing issue available	2024-07-18 02:09:39 +00:00
Eugene Yurtsev	570566b858	core[patch]: Update API reference for astream events (#24359 ) Update the API reference for astream events to include information about custom events.	2024-07-17 21:48:53 -04:00
Bagatur	a4c101ae97	core[patch]: Release 0.2.21 (#24372 )	2024-07-17 22:44:35 +00:00
William FH	c5a07e2dd8	core[patch]: add InjectedToolArg annotation (#24279 ) ```python from typing_extensions import Annotated from langchain_core.tools import tool, InjectedToolArg from langchain_anthropic import ChatAnthropic @tool def multiply(x: int, y: int, not_for_model: Annotated[dict, InjectedToolArg]) -> str: """multiply.""" return x * y ChatAnthropic(model='claude-3-sonnet-20240229',).bind_tools([multiply]).invoke('5 times 3').tool_calls ''' -> [{'name': 'multiply', 'args': {'x': 5, 'y': 3}, 'id': 'toolu_01Y1QazYWhu4R8vF4hF4z9no', 'type': 'tool_call'}] ''' ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-07-17 15:28:40 -07:00
Erick Friis	80f3d48195	openai: release 0.1.18 (#24369 )	2024-07-17 22:26:33 +00:00
Bagatur	7d83189b19	openai[patch]: use model_name in AzureOpenAI.ls_model_name (#24366 )	2024-07-17 15:24:05 -07:00
Nithish Raghunandanan	eb26b5535a	couchbase: Add chat message history (#24356) Description: Add support for chat message history using Couchbase --------- Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>	2024-07-17 15:22:42 -07:00
Eugene Yurtsev	96bac8e20d	core[patch]: Fix regression requiring input_variables in few chat prompt templates (#24360 ) * Fix regression that requires users passing input_variables=[]. * Regression introduced by my own changes to this PR: https://github.com/langchain-ai/langchain/pull/22851	2024-07-17 18:14:57 -04:00
Brice Fotzo	034a8c7c1b	community: support advanced text extraction options for pdf documents (#20265 ) Description: - Updated constructors in PyPDFParser and PyPDFLoader to handle `extraction_mode` and additional kwargs, aligning with the capabilities of `PageObject.extract_text()` from pypdf. - Added `test_pypdf_loader_with_layout` along with a corresponding example text file to validate layout extraction from PDFs. Issue: fixes #19735 Dependencies: This change requires updating the pypdf dependency from version 3.4.0 to at least 4.0.0. Additional changes include the addition of a new test test_pypdf_loader_with_layout and an example text file to ensure the functionality of layout extraction from PDFs aligns with the new capabilities. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-17 20:47:09 +00:00
hmasdev	a402de3dae	langchain[patch]: fix wrong `dict` key in `OutputFixingParser`, `RetryOutputParser` and `RetryWithErrorOutputParser` (#23967 ) # Description This PR aims to solve a bug in `OutputFixingParser`, `RetryOutputParser` and `RetryWithErrorOutputParser` The bug is that the wrong keyword argument was given to `retry_chain`. The correct keyword argument is 'completion', but 'input' is used. This pull request makes the following changes: 1. correct a `dict` key given to `retry_chain`; 2. add a test when using the default prompt. - `NAIVE_FIX_PROMPT` for `OutputFixingParser`; - `NAIVE_RETRY_PROMPT` for `RetryOutputParser`; - `NAIVE_RETRY_WITH_ERROR_PROMPT` for `RetryWithErrorOutputParser`; 3. ~~add comments on `retry_chain` input and output types~~ clarify `InputType` and `OutputType` of `retry_chain` # Issue The bug is pointed out in https://github.com/langchain-ai/langchain/pull/19792#issuecomment-2196512928 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-17 20:34:46 +00:00
Casey Clements	a47f69a120	partners/mongodb : Significant MongoDBVectorSearch ID enhancements (#23535) ## Description This pull-request improves the treatment of document IDs in `MongoDBAtlasVectorSearch`. Class method signatures of add_documents, add_texts, delete, and from_texts now include an `ids:Optional[List[str]]` keyword argument permitting the user greater control. Note that, as before, IDs may also be inferred from `Document.metadata['_id']` if present, but this is no longer required. IDs can also optionally be returned from searches. This PR closes the following JIRA issues. * [PYTHON-4446](https://jira.mongodb.org/browse/PYTHON-4446) MongoDBVectorSearch delete / add_texts function rework * [PYTHON-4435](https://jira.mongodb.org/browse/PYTHON-4435) Add support for "Indexing" * [PYTHON-4534](https://jira.mongodb.org/browse/PYTHON-4534) Ensure datetimes are json-serializable --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-17 13:26:20 -07:00
Erick Friis	cc2cbfabfc	milvus: release 0.1.2 (#24365 )	2024-07-17 19:42:44 +00:00
Eugene Yurtsev	9e4a0e76f6	core[patch]: Fix one unit test for chat prompt template (#24362 ) Minor change that fixes a unit test that had missing assertions.	2024-07-17 18:56:48 +00:00
Erick Friis	81639243e2	openai: release 0.1.17 (#24361 )	2024-07-17 18:50:42 +00:00
Erick Friis	61976a4147	pinecone: release 0.1.2 (#24355 )	2024-07-17 17:09:07 +00:00
Bagatur	b5360e2e5f	community[patch]: Release 0.2.8 (#24354 )	2024-07-17 17:07:27 +00:00
ccurme	4cf67084d3	openai[patch]: fix key collision and _astream (#24345 ) Fixes small issues introduced in https://github.com/langchain-ai/langchain/pull/24150 (unreleased).	2024-07-17 12:59:26 -04:00
Luis Moros	bcb5f354ad	community: Fix SQLDatabase.from_databricks issue when run from Job (#24346) - Description: When SQLDatabase.from_databricks is run from a Databricks Workflow job, line 205 (default_host = context.browserHostName) throws an ``AttributeError`` as the ``context`` object has no ``browserHostName`` attribute. The fix handles the exception and sets the ``default_host`` variable to null --------- Co-authored-by: lmorosdb <lmorosdb> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-17 12:40:12 -04:00
Bagatur	24e9b48d15	langchain[patch]: Release 0.2.9 (#24327 )	2024-07-17 09:39:57 -07:00
Rafael Pereira	cf28708e7b	Neo4j: Update with non-deprecated cypher methods, and new method to associate relationship embeddings (#23725 ) Description: At the moment neo4j wrapper is using setVectorProperty, which is deprecated ([link](https://neo4j.com/docs/operations-manual/5/reference/procedures/#procedure_db_create_setVectorProperty)). I replaced with the non-deprecated version. Neo4j recently introduced a new cypher method to associate embeddings into relations using "setRelationshipVectorProperty" method. In this PR I also implemented a new method to perform this association maintaining the same format used in the "add_embeddings" method which is used to associate embeddings into Nodes. I also included a test case for this new method.	2024-07-17 12:37:47 -04:00
maang-h	2a3288b15d	docs: Add ChatBaichuan docstrings (#24348 ) - Description: Add ChatBaichuan rich docstrings. - Issue: the issue #22296	2024-07-17 12:00:16 -04:00
Rafael Pereira	fc41730e28	neo4j: Fix test for order-insensitive comparison and floating-point precision issues (#24338) Description: This PR addresses two main issues in `test_neo4jvector.py`: 1. Order-insensitive Comparison: Modified `test_retrieval_dictionary` to ensure that it passes regardless of the order of returned values by parsing `page_content` into a structured format (dictionary) before comparison. 2. Floating-point Precision: Updated `test_neo4jvector_relevance_score` to handle minor floating-point precision differences by using the `isclose` function for comparing relevance scores with a relative tolerance. Errors addressed: - test_neo4jvector_relevance_score: ``` AssertionError: assert [(Document(page_content='foo', metadata={'page': '0'}), 1.0000014305114746), (Document(page_content='bar', metadata={'page': '1'}), 0.9998371005058289), (Document(page_content='baz', metadata={'page': '2'}), 0.9993508458137512)] == [(Document(page_content='foo', metadata={'page': '0'}), 1.0), (Document(page_content='bar', metadata={'page': '1'}), 0.9998376369476318), (Document(page_content='baz', metadata={'page': '2'}), 0.9993523359298706)] ``` - test_retrieval_dictionary: ``` AssertionError: assert [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nname: John\nage: 30\n')] == [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nage: 30\nname: John\n')] ```	2024-07-17 09:28:25 -04:00
Bagatur	80e7cd6cff	core[patch]: Release 0.2.20 (#24322 )	2024-07-16 15:04:36 -07:00
Eugene Yurtsev	616196c620	Docs: Add how to dispatch custom callback events (#24278 ) * Add how-to guide for dispatching custom callback events. * Add links from index to the how to guide * Add link from streaming from within a tool * Update versionadded to correct release https://github.com/langchain-ai/langchain/releases/tag/langchain-core%3D%3D0.2.15	2024-07-16 17:38:32 -04:00
Leonid Ganeline	5ccf8ebfac	core: docstrings `vectorstores` update (#24281 ) Added missed docstrings. Formatted docstrings to the consistent form. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-16 16:58:11 +00:00
Erick Friis	1e9cc02ed8	openai: raw response headers (#24150 )	2024-07-16 09:54:54 -07:00
Bagatur	dc42279eb5	core[patch]: fix Typing.cast import (#24313 ) Fixes #24287	2024-07-16 16:53:48 +00:00
bovlb	5caa381177	community[minor]: Add ApertureDB as a vectorstore (#24088) - Description: this change provides a new community integration that uses ApertureData's ApertureDB as a vector store. - Issue: none - Dependencies: depends on ApertureDB Python SDK - Twitter handle: ApertureData - Tests and docs: Integration tests rely on a local run of a public docker image. Example notebook additionally relies on a local Ollama server. - Lint and test: All lint tests pass. --------- Co-authored-by: Gautam <gautam@aperturedata.io>	2024-07-16 09:32:59 -07:00
frob	c59e663365	community[patch]: Fix docstring for ollama parameter "keep_alive" (#23973 ) Fix doc-string for ollama integration	2024-07-16 14:48:38 +00:00
Leonid Ganeline	5fcf2ef7ca	core: docstrings `documents` (#23506 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-16 10:43:54 -04:00
Rahul Raghavendra Choudhury	f5a38772a8	community[patch]: Update TavilySearch to use TavilyClient instead of the deprecated Client (#24270) Using TavilySearchAPIRetriever with any conversation chain raises the error `TypeError: Client.__init__() got an unexpected keyword argument 'api_key'`. This is because the retriever class uses the deprecated `Client` class; `TavilyClient` needs to be used instead. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-16 13:35:28 +00:00
Shenhai Ran	5f2dea2b20	core[patch]: Add encoding options when create prompt template from a file (#24054 ) - Uses default utf-8 encoding for loading prompt templates from file --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-16 09:35:09 -04:00
Chen Xiabin	69b1603173	baidu qianfan AiMessage with usage_metadata (#24288 ) add usage_metadata to qianfan AIMessage. Thanks	2024-07-16 09:30:50 -04:00
Leonid Ganeline	198b85334f	core[patch]: docstrings `langchain_core/` files update (#24285 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-16 09:21:51 -04:00
Dobiichi-Origami	7aeaa1974d	community[patch]: change the class of `qianfan_ak` and `qianfan_sk` parameters (#24293 ) - Description: we changed the class of two parameters to fix a bug, which causes validation failure when using QianfanEmbeddingEndpoint	2024-07-16 09:17:48 -04:00
Tibor Reiss	1c753d1e81	core[patch]: Update typing for template format to include jinja2 as a Literal (#24144 ) Fixes #23929 via adjusting the typing	2024-07-16 09:09:42 -04:00
Erick Friis	d2f671271e	langchain: fix extended test (#24282 )	2024-07-15 15:29:48 -07:00
Lage Ragnarsson	a3c10fc6ce	community: Add support for specifying hybrid search for Databricks vector search (#23528 ) Description: Databricks Vector Search recently added support for hybrid keyword-similarity search. See [usage examples](https://docs.databricks.com/en/generative-ai/create-query-vector-search.html#query-a-vector-search-endpoint) from their documentation. This PR updates the Langchain vectorstore interface for Databricks to enable the user to pass the query_type parameter to similarity_search to make use of this functionality. By default, there will not be any changes for existing users of this interface. To use the new hybrid search feature, it is now possible to do ```python # ... dvs = DatabricksVectorSearch(index) dvs.similarity_search("my search query", query_type="HYBRID") ``` Or using the retriever: ```python retriever = dvs.as_retriever( search_kwargs={ "query_type": "HYBRID", } ) retriever.invoke("my search query") ``` --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-15 22:14:08 +00:00
Christopher Tee	5171ffc026	community(you): Integrate You.com conversational APIs (#23046) You.com is releasing two new conversational APIs: Smart and Research. This PR: - integrates those APIs with Langchain, as an LLM - streaming is supported	2024-07-15 17:46:58 -04:00
maang-h	6c7d9f93b9	feat: Add ChatTongyi structured output (#24187 ) - Description: Add `with_structured_output` method to ChatTongyi to support structured output.	2024-07-15 15:57:21 -04:00
Chen Xiabin	8f4620f4b8	baidu qianfan streaming token_usage (#24117) --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-15 19:52:31 +00:00
maang-h	9d97de34ae	community[patch]: Improve ChatBaichuan init args and role (#23878 ) - Description: Improve ChatBaichuan init args and role - ChatBaichuan adds `system` role - alias: `baichuan_api_base` -> `base_url` - `with_search_enhance` is deprecated - Add `max_tokens` argument	2024-07-15 15:17:00 -04:00
Erick Friis	56cca23745	openai: remove some params from default serialization (#24280 )	2024-07-15 18:53:36 +00:00
mrugank-wadekar	66bebeb76a	partners: add similarity search by image functionality to langchain_chroma partner package (#22982) - Description: This pull request introduces two new methods to the Langchain Chroma partner package that enable similarity search based on image embeddings. These methods enhance the package's functionality by allowing users to search for images similar to a given image URI. Also introduces a notebook to demonstrate its use. - Issue: N/A - Dependencies: None - Twitter handle: @mrugank9009 --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-15 18:48:22 +00:00
pm390	b0aa915dea	community[patch]: use asyncio.sleep instead of sleep in OpenAI Assistant async (#24275 ) Description: Implemented async sleep using asyncio instead of synchronous sleep in openAI Assistants Issue: 24194 Dependencies: asyncio Twitter handle: pietromald60939	2024-07-15 18:14:39 +00:00
wenngong	a001037319	retrievers: MultiVectorRetriever similarity_score_threshold search type (#23539 ) Description: support MultiVectorRetriever similarity_score_threshold search type. Issue: #23387 #19404 --------- Co-authored-by: gongwn1 <gongwn1@lenovo.com>	2024-07-15 13:31:34 -04:00
Carlos André Antunes	20151384d7	fix azure_openai.py: some keys do not exist (#24158) Some lines try to read a key that does not exist yet. In these cases I changed the direct access to the dict.get() method.	2024-07-15 17:17:05 +00:00
blueoom	d895614d19	text_splitters: add request parameters for function HTMLHeaderTextSplitter.split_text… (#24178 ) Description: The `split_text_from_url` method of `HTMLHeaderTextSplitter` does not include parameters like `timeout` when using `requests` to send a request. Therefore, I suggest adding a `kwargs` parameter to the function, which can be passed as arguments to `requests.get()` internally, allowing control over the `get` request. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-15 16:43:56 +00:00
Bagatur	9d0c1d2dc9	docs: specify init_chat_model version (#24274 )	2024-07-15 16:29:06 +00:00
Bagatur	c9473367b1	langchain[patch]: Release 0.2.8 (#24273 )	2024-07-15 16:05:51 +00:00
JP-Ellis	f77659463a	core[patch]: allow message utils to work with lcel (#23743) The function `convert_to_messages` has had an expansion of the arguments it can take: 1. Previously, it only could take a `Sequence` in order to iterate over it. This has been broadened slightly to an `Iterable` (which should have no other impact). 2. Support for `PromptValue` and `BaseChatPromptTemplate` has been added. These are generated when combining messages using the overloaded `+` operator. Functions which rely on `convert_to_messages` (namely `filter_messages`, `merge_message_runs` and `trim_messages`) have had the type of their arguments similarly expanded. Resolves #23706. --------- Signed-off-by: JP-Ellis <josh@jpellis.me> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-07-15 08:58:05 -07:00
Harold Martin	ccdaf14eff	docs: Spell check fixes (#24217 ) Description: Spell check fixes for docs, comments, and a couple of strings. No code change e.g. variable names. Issue: none Dependencies: none Twitter handle: hmartin	2024-07-15 15:51:43 +00:00
Leonid Ganeline	cacdf96f9c	core docstrings `tracers` update (#24211 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-15 11:37:09 -04:00
Leonid Ganeline	36ee083753	core: docstrings `utils` update (#24213 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-15 11:36:00 -04:00
thehunmonkgroup	e8a21146d3	community[patch]: upgrade default model for ChatAnyscale (#24232 ) Old default `meta-llama/Llama-2-7b-chat-hf` no longer supported.	2024-07-15 11:34:59 -04:00
Bagatur	620b118c70	core[patch]: Release 0.2.19 (#24272 )	2024-07-15 07:51:30 -07:00
ccurme	888fbc07b5	core[patch]: support passing `args_schema` through `as_tool` (#24269 ) Note: this allows the schema to be passed in positionally. ```python from langchain_core.pydantic_v1 import BaseModel, Field from langchain_core.runnables import RunnableLambda class Add(BaseModel): """Add two integers together.""" a: int = Field(..., description="First integer") b: int = Field(..., description="Second integer") def add(input: dict) -> int: return input["a"] + input["b"] runnable = RunnableLambda(add) as_tool = runnable.as_tool(Add) as_tool.args_schema.schema() ``` ``` {'title': 'Add', 'description': 'Add two integers together.', 'type': 'object', 'properties': {'a': {'title': 'A', 'description': 'First integer', 'type': 'integer'}, 'b': {'title': 'B', 'description': 'Second integer', 'type': 'integer'}}, 'required': ['a', 'b']} ```	2024-07-15 07:51:05 -07:00
ccurme	ab2d7821a7	fireworks[patch]: use firefunction-v2 in standard tests (#24264 )	2024-07-15 13:15:08 +00:00
ccurme	6fc7610b1c	standard-tests[patch]: update test_bind_runnables_as_tools (#24241 ) Reduce number of tool arguments from two to one.	2024-07-15 08:35:07 -04:00
Bagatur	0da5078cad	langchain[minor]: Generic configurable model (#23419 ) alternative to [23244](https://github.com/langchain-ai/langchain/pull/23244). allows you to use chat model declarative methods ![Screenshot 2024-06-25 at 1 07 10 PM](https://github.com/langchain-ai/langchain/assets/22008038/910d1694-9b7b-46bc-bc2e-3792df9321d6)	2024-07-15 01:11:01 +00:00
Bagatur	d0728b0ba0	core[patch]: add tool name to tool message (#24243 ) Copying current ToolNode behavior	2024-07-15 00:42:40 +00:00
Bagatur	5c3e2612da	core[patch]: Release 0.2.18 (#24230 )	2024-07-13 09:14:43 -07:00
Bagatur	65321bf975	core[patch]: fix ToolCall "type" when streaming (#24218 )	2024-07-13 08:59:03 -07:00
Anush	a653b209ba	qdrant: test new QdrantVectorStore (#24165 ) ## Description This PR adds integration tests to follow up on #24164. By default, the tests use an in-memory instance. To run the full suite of tests, with both in-memory and Qdrant server: ``` $ docker run -p 6333:6333 qdrant/qdrant $ make test $ make integration_test ``` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-12 23:59:30 +00:00
Roman Solomatin	f071581aea	openai[patch]: update openai params (#23691) Description: Explicitly add parameters from openai API --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-12 16:53:33 -07:00
Leonid Ganeline	f0a7581b50	milvus: docstring (#23151 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference) --------- Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> Co-authored-by: isaac hershenson <ihershenson@hmc.edu> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-12 23:25:31 +00:00
Christian D. Glissov	474b88326f	langchain_qdrant: Added method "_asimilarity_search_with_relevance_scores" to Qdrant class (#23954) I stumbled upon a bug that led to different similarity scores between the async and sync similarity searches with relevance scores in Qdrant. The reason is that _asimilarity_search_with_relevance_scores is missing, which makes langchain_qdrant use the method of the vectorstore base class, leading to drastically different results. To illustrate the magnitude, here are the results of running an identical search in a test vectorstore. Output of asimilarity_search_with_relevance_scores: [0.9902903374601824, 0.9472135924938804, 0.8535534011299859] Output of similarity_search_with_relevance_scores: [0.9805806749203648, 0.8944271849877607, 0.7071068022599718] Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-12 23:25:20 +00:00
Bagatur	bdc03997c9	standard-tests[patch]: check for ToolCall["type"] (#24209 )	2024-07-12 16:17:34 -07:00
Miroslav	aee55eda39	community: Skip Login to HuggingFaceHub when token is not set (#21561) - HuggingFaceEndpoint: "Skip Login to HuggingFaceHub" - Where: langchain, community, llm, huggingface_endpoint - Description: Skip login to huggingface hub when `huggingfacehub_api_token` is not set. This is needed when using custom `endpoint_url` outside of HuggingFaceHub. - Issue: fixes https://github.com/langchain-ai/langchain/issues/20342 and https://github.com/langchain-ai/langchain/issues/19685 - Dependencies: None - Tests and docs: 1. Tested with locally available TGI endpoint 2. Example Usage ```python from langchain_community.llms import HuggingFaceEndpoint llm = HuggingFaceEndpoint( endpoint_url='http://localhost:8080', server_kwargs={ "headers": {"Content-Type": "application/json"} } ) resp = llm.invoke("Tell me a joke") print(resp) ``` Also tested against HF Endpoints ```python from langchain_community.llms import HuggingFaceEndpoint huggingfacehub_api_token = "hf_xyz" repo_id = "mistralai/Mistral-7B-Instruct-v0.2" llm = HuggingFaceEndpoint( huggingfacehub_api_token=huggingfacehub_api_token, repo_id=repo_id, ) resp = llm.invoke("Tell me a joke") print(resp) ``` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-12 22:10:32 +00:00
Anush	d09dda5a08	qdrant: Bump patch version (#24168 ) # Description To release a new version of `langchain-qdrant` after #24165 and #24166.	2024-07-12 14:48:50 -07:00
Bagatur	12950cc602	standard-tests[patch]: improve runnable tool description (#24210 )	2024-07-12 21:33:56 +00:00
Erick Friis	e8ee781a42	ibm: move to external repo (#24208 )	2024-07-12 21:14:24 +00:00
Bagatur	02e71cebed	together[patch]: Release 0.1.4 (#24205 )	2024-07-12 13:59:58 -07:00
Bagatur	259d4d2029	anthropic[patch]: Release 0.1.20 (#24204 )	2024-07-12 13:59:15 -07:00
Bagatur	3aed74a6fc	fireworks[patch]: Release 0.1.5 (#24203 )	2024-07-12 13:58:58 -07:00
Bagatur	13b0d7ec8f	openai[patch]: Release 0.1.16 (#24202 )	2024-07-12 13:58:39 -07:00
Bagatur	71cd6e6feb	groq[patch]: Release 0.1.7 (#24201 )	2024-07-12 13:58:19 -07:00
Bagatur	99054e19eb	mistralai[patch]: Release 0.1.10 (#24200 )	2024-07-12 13:57:58 -07:00
Bagatur	7a1321e2f9	ibm[patch]: Release 0.1.10 (#24199 )	2024-07-12 13:57:38 -07:00
Bagatur	cb5031f22f	integrations[patch]: require core >=0.2.17 (#24207 )	2024-07-12 20:54:01 +00:00
Nithish Raghunandanan	f1618ec540	couchbase: Add standard and semantic caches (#23607) Description: Add support for caching (standard + semantic) LLM responses using Couchbase --------- Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-12 20:30:03 +00:00
Eugene Yurtsev	8d82a0d483	core[patch]: Mark GraphVectorStore as beta (#24195 ) * This PR marks graph vectorstore as beta	2024-07-12 14:28:06 -04:00
Bagatur	0a1e475a30	core[patch]: Release 0.2.17 (#24189 )	2024-07-12 17:08:29 +00:00
Bagatur	6166ea67a8	core[minor]: rename ToolMessage.raw_output -> artifact (#24185 )	2024-07-12 09:52:44 -07:00
Jean Nshuti	d77d9bfc00	community[patch]: update typo document content returned from semanticscholar (#24175 ) Update "astract" -> abstract	2024-07-12 15:40:47 +00:00
Leonid Ganeline	aa3e3cfa40	core[patch]: docstrings `runnables` update (#24161 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-12 11:27:06 -04:00
Tomaz Bratanic	d3a2b9fae0	Fix neo4j type error on missing constraint information (#24177 ) If you use `refresh_schema=False`, the metadata constraint doesn't exist. The constraint check previously defaulted to `None`, so `any` failed because it cannot iterate over `None`.	2024-07-12 06:39:29 -04:00
Anush	7014d07cab	qdrant: new Qdrant implementation (#24164 )	2024-07-12 04:52:02 +02:00
Xander Dumaine	35784d1c33	langchain[minor]: add document_variable_name to create_stuff_documents_chain (#24083 ) - Description: `StuffDocumentsChain` uses `LLMChain` which is deprecated by langchain runnables. `create_stuff_documents_chain` is the replacement, but needs support for `document_variable_name` to allow multiple uses of the chain within a longer chain. - Issue: none - Dependencies: none	2024-07-12 02:31:46 +00:00
Eugene Yurtsev	8858846607	milvus[patch]: Fix Milvus vectorstore for newer versions of langchain-core (#24152 ) Fix for: https://github.com/langchain-ai/langchain/issues/24116 This keeps the old behavior of add_documents and add_texts	2024-07-11 18:51:18 -07:00
thedavgar	ffe6ca986e	community: Fix Bug in Azure Search Vectorstore search asynchronously (#24081 ) Description: This PR fixes the bug described in issue #24064, which occurs when using the AzureSearch vectorstore's asynchronous search methods; these are also the methods used by the retriever. The change makes access to the embedding optional, because it is not used anywhere to retrieve documents. The synchronous retrieval methods do not use the embedding either. With this PR the code given by the user in the issue works. ```python vectorstore = AzureSearch( azure_search_endpoint=os.getenv("AI_SEARCH_ENDPOINT_SECRET"), azure_search_key=os.getenv("AI_SEARCH_API_KEY"), index_name=os.getenv("AI_SEARCH_INDEX_NAME_SECRET"), fields=fields, embedding_function=encoder, ) retriever = vectorstore.as_retriever(search_type="hybrid", k=2) await vectorstore.avector_search("what is the capital of France") await retriever.ainvoke("what is the capital of France") ``` Issue: The Azure Search Vectorstore does not work when searching for documents with asynchronous methods, as described in issue #24064 Dependencies: There are no extra dependencies required for this change. --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-07-11 18:32:19 -07:00
Anush	7790d67f94	qdrant: New sparse embeddings provider interface - PART 1 (#24015 ) ## Description This PR introduces a new sparse embedding provider interface to work with the new Qdrant implementation that will follow this PR. Additionally, an implementation of this interface is provided with https://github.com/qdrant/fastembed. This PR will be followed by https://github.com/Anush008/langchain/pull/3.	2024-07-11 17:07:25 -07:00
Erick Friis	1132fb801b	core: release 0.2.16 (#24159 )	2024-07-11 23:59:41 +00:00
Nuno Campos	1d37aa8403	core: Remove extra newline (#24157 )	2024-07-11 23:55:36 +00:00
ccurme	cb95198398	standard-tests[patch]: add tests for runnables as tools and streaming usage metadata (#24153 )	2024-07-11 18:30:05 -04:00
Bagatur	8d100c58de	core[patch]: Tool accept RunnableConfig (#24143 ) Relies on #24038 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-11 22:13:17 +00:00
Bagatur	5fd1e67808	core[minor], integrations...[patch]: Support ToolCall as Tool input and ToolMessage as Tool output (#24038 ) Changes: - ToolCall, InvalidToolCall and ToolCallChunk can all accept a "type" parameter now - LLM integration packages add "type" to all the above - Tool supports ToolCall inputs that have "type" specified - Tool outputs ToolMessage when a ToolCall is passed as input - Tools can separately specify ToolMessage.content and ToolMessage.raw_output - Tools emit events for validation errors (using on_tool_error and on_tool_end) Example: ```python @tool("structured_api", response_format="content_and_raw_output") def _mock_structured_tool_with_raw_output( arg1: int, arg2: bool, arg3: Optional[dict] = None ) -> Tuple[str, dict]: """A Structured Tool""" return f"{arg1} {arg2}", {"arg1": arg1, "arg2": arg2, "arg3": arg3} def test_tool_call_input_tool_message_with_raw_output() -> None: tool_call: Dict = { "name": "structured_api", "args": {"arg1": 1, "arg2": True, "arg3": {"img": "base64string..."}}, "id": "123", "type": "tool_call", } expected = ToolMessage("1 True", raw_output=tool_call["args"], tool_call_id="123") tool = _mock_structured_tool_with_raw_output actual = tool.invoke(tool_call) assert actual == expected tool_call.pop("type") with pytest.raises(ValidationError): tool.invoke(tool_call) actual_content = tool.invoke(tool_call["args"]) assert actual_content == expected.content ``` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-11 14:54:02 -07:00
Bagatur	eeb996034b	core[patch]: Release 0.2.15 (#24149 )	2024-07-11 21:34:25 +00:00
Nuno Campos	03fba07d15	core[patch]: Update styles for mermaid graphs (#24147 )	2024-07-11 14:19:36 -07:00
ccurme	8ee8ca7c83	core[patch]: propagate `parse_docstring` to tool decorator (#24123 ) Disabled by default. ```python from langchain_core.tools import tool @tool(parse_docstring=True) def foo(bar: str, baz: int) -> str: """The foo. Args: bar: this is the bar baz: this is the baz """ return bar foo.args_schema.schema() ``` ```json { "title": "fooSchema", "description": "The foo.", "type": "object", "properties": { "bar": { "title": "Bar", "description": "this is the bar", "type": "string" }, "baz": { "title": "Baz", "description": "this is the baz", "type": "integer" } }, "required": [ "bar", "baz" ] } ```	2024-07-11 20:11:45 +00:00
Jacob Lee	f1f1f75782	community[patch]: Make AzureML endpoint return AI messages for type assistant (#24085 )	2024-07-11 21:45:30 +02:00
Eugene Yurtsev	4ba14adec6	core[patch]: Clean up indexing test code (#24139 ) Refactor the code to use the existing InMemoryVectorStore. This change is needed for another PR that moves some of the imports around (and messes up the mock.patch in this file)	2024-07-11 18:54:46 +00:00
Atul R	457677c1b7	community: Fixes use of ImagePromptTemplate with Ollama (#24140 ) Description: ImagePromptTemplate for Multimodal llms like llava when using Ollama Twitter handle: https://x.com/a7ulr Details: When using llava models / any ollama multimodal llms and passing images in the prompt as urls, langchain breaks with this error. ```python image_url_components = image_url.split(",") ^^^^^^^^^^^^^^^^^^^^ AttributeError: 'dict' object has no attribute 'split' ``` From the looks of it, there was a bug where the condition checked for a `url` field in the variable but never actually assigned it. This PR fixes ImagePromptTemplate for Multimodal llms like llava when using Ollama specifically. @hwchase17	2024-07-11 11:31:48 -07:00
Matt	8327925ab7	community:support additional Azure Search Options (#24134 ) - Description: Support additional kwargs options for the Azure Search client (Described here https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/core/azure-core/README.md#configurations) - Issue: N/A - Dependencies: No additional Dependencies	2024-07-11 18:22:36 +00:00
ccurme	122e80e04d	core[patch]: add versionadded to `as_tool` (#24138 )	2024-07-11 18:08:08 +00:00
Erick Friis	c4417ea93c	core: release 0.2.14, remove poetry 1.7 incompatible flag from root (#24137 )	2024-07-11 17:59:51 +00:00
Isaac Francisco	7a62d3dbd6	standard-tests[patch]: test that bind_tools can accept regular python function (#24135 )	2024-07-11 17:42:17 +00:00
Nuno Campos	2428984205	core: Add metadata to graph json repr (#24131 )	2024-07-11 17:23:52 +00:00
Harley Gross	ea3cd1ebba	community[minor]: added support for C in RecursiveCharacterTextSplitter (#24091 ) Description: Added support for C in RecursiveCharacterTextSplitter by reusing the separators for C++	2024-07-11 16:47:48 +00:00
Nuno Campos	3e454d7568	core: fix docstring (#24129 )	2024-07-11 16:38:14 +00:00
Eugene Yurtsev	08638ccc88	community[patch]: QianfanLLMEndpoint fix type information for the keys (#24128 ) Fix for issue: https://github.com/langchain-ai/langchain/issues/24126	2024-07-11 16:24:26 +00:00
Nuno Campos	ee3fe20af4	core: mermaid: Render metadata key-value pairs when drawing mermaid graph (#24103 ) - if node is runnable binding with metadata attached	2024-07-11 16:22:23 +00:00
Eugene Yurtsev	1e7d8ba9a6	ci[patch]: Update community linter to provide a helpful error message (#24127 ) Update community import linter to explain what's wrong	2024-07-11 16:22:08 +00:00
maang-h	16e178a8c2	docs: Add MiniMaxChat docstrings (#24026 ) - Description: Add MiniMaxChat rich docstrings. - Issue: the issue #22296	2024-07-11 10:55:02 -04:00
Christophe Bornet	5fc5ef2b52	community[minor]: Add graph store extractors (#24065 ) This adds an extractor interface and an implementation for HTML pages. Extractors are used to create GraphVectorStore Links on loaded content. Twitter handle: cbornet_	2024-07-11 10:35:31 -04:00
maang-h	9bcf8f867d	docs: Add SQLChatMessageHistory docstring (#23978 ) - Description: Add SQLChatMessageHistory docstring. - Issue: the issue #21983 Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-11 14:24:28 +00:00
Rafael Pereira	092e9ee0e6	community[minor]: Neo4j Fixed similarity docs (#23913 ) Description: Documentation was missing for the `filter` and `params` attributes in the similarity search methods. --------- Co-authored-by: rpereira <rafael.pereira@criticalsoftware.com>	2024-07-11 10:16:48 -04:00
Eugene Yurtsev	dc131ac42a	core[minor]: Add dispatching for custom events (#24080 ) This PR allows dispatching adhoc events for a given run. # Context This PR allows users to send arbitrary data to the callback system and to the astream events API from within a given runnable. This can be extremely useful to surface custom information to end users about progress etc. Integration with langsmith tracer will be done separately since the data cannot be currently visualized. It'll be accommodated using the events attribute of the Run # Examples with astream events ```python from langchain_core.callbacks import adispatch_custom_event from langchain_core.tools import tool @tool async def foo(x: int) -> int: """Foo""" await adispatch_custom_event("event1", {"x": x}) await adispatch_custom_event("event2", {"x": x}) return x + 1 async for event in foo.astream_events({'x': 1}, version='v2'): print(event) ``` ```python {'event': 'on_tool_start', 'data': {'input': {'x': 1}}, 'name': 'foo', 'tags': [], 'run_id': 'fd6fb7a7-dd37-4191-962c-e43e245909f6', 'metadata': {}, 'parent_ids': []} {'event': 'on_custom_event', 'run_id': 'fd6fb7a7-dd37-4191-962c-e43e245909f6', 'name': 'event1', 'tags': [], 'metadata': {}, 'data': {'x': 1}, 'parent_ids': []} {'event': 'on_custom_event', 'run_id': 'fd6fb7a7-dd37-4191-962c-e43e245909f6', 'name': 'event2', 'tags': [], 'metadata': {}, 'data': {'x': 1}, 'parent_ids': []} {'event': 'on_tool_end', 'data': {'output': 2}, 'run_id': 'fd6fb7a7-dd37-4191-962c-e43e245909f6', 'name': 'foo', 'tags': [], 'metadata': {}, 'parent_ids': []} ``` ```python from langchain_core.callbacks import adispatch_custom_event from langchain_core.runnables import RunnableLambda @RunnableLambda async def foo(x: int) -> int: """Foo""" await adispatch_custom_event("event1", {"x": x}) await adispatch_custom_event("event2", {"x": x}) return x + 1 async for event in foo.astream_events(1, version='v2'): print(event) ``` ```python {'event': 'on_chain_start', 'data': {'input': 1}, 
'name': 'foo', 'tags': [], 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'metadata': {}, 'parent_ids': []} {'event': 'on_custom_event', 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'name': 'event1', 'tags': [], 'metadata': {}, 'data': {'x': 1}, 'parent_ids': []} {'event': 'on_custom_event', 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'name': 'event2', 'tags': [], 'metadata': {}, 'data': {'x': 1}, 'parent_ids': []} {'event': 'on_chain_stream', 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'name': 'foo', 'tags': [], 'metadata': {}, 'data': {'chunk': 2}, 'parent_ids': []} {'event': 'on_chain_end', 'data': {'output': 2}, 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'name': 'foo', 'tags': [], 'metadata': {}, 'parent_ids': []} ``` # Examples with handlers This is copy pasted from unit tests ```python class CustomCallbackManager(BaseCallbackHandler): def __init__(self) -> None: self.events: List[Any] = [] def on_custom_event( self, name: str, data: Any, *, run_id: UUID, tags: Optional[List[str]] = None, metadata: Optional[Dict[str, Any]] = None, **kwargs: Any, ) -> None: assert kwargs == {} self.events.append( ( name, data, run_id, tags, metadata, ) ) callback = CustomCallbackManager() run_id = uuid.UUID(int=7) @RunnableLambda def foo(x: int, config: RunnableConfig) -> int: dispatch_custom_event("event1", {"x": x}) dispatch_custom_event("event2", {"x": x}, config=config) return x foo.invoke(1, {"callbacks": [callback], "run_id": run_id}) assert callback.events == [ ("event1", {"x": 1}, UUID("00000000-0000-0000-0000-000000000007"), [], {}), ("event2", {"x": 1}, UUID("00000000-0000-0000-0000-000000000007"), [], {}), ] ```	2024-07-11 02:25:12 +00:00
Erick Friis	71c2221f8c	openai: release 0.1.15 (#24097 )	2024-07-10 16:45:42 -07:00
Erick Friis	6ea6f9f7bc	core: release 0.2.13 (#24096 )	2024-07-10 16:39:15 -07:00
ccurme	975b6129f6	core[patch]: support conversion of runnables to tools (#23992 ) Open to other thoughts on UX. string input: ```python as_tool = retriever.as_tool() as_tool.invoke("cat") # [Document(...), ...] ``` typed dict input: ```python class Args(TypedDict): key: int def f(x: Args) -> str: return str(x["key"] * 2) as_tool = RunnableLambda(f).as_tool( name="my tool", description="description", # name, description are inferred if not supplied ) as_tool.invoke({"key": 3}) # "6" ``` for untyped dict input, allow specification of parameters + types ```python def g(x: Dict[str, Any]) -> str: return str(x["key"] * 2) as_tool = RunnableLambda(g).as_tool(arg_types={"key": int}) result = as_tool.invoke({"key": 3}) # "6" ``` Passing the `arg_types` is slightly awkward but necessary to ensure tool calls populate parameters correctly: ```python from typing import Any, Dict from langchain_core.runnables import RunnableLambda from langchain_openai import ChatOpenAI def f(x: Dict[str, Any]) -> str: return str(x["key"] * 2) runnable = RunnableLambda(f) as_tool = runnable.as_tool(arg_types={"key": int}) llm = ChatOpenAI().bind_tools([as_tool]) result = llm.invoke("Use the tool on 3.") tool_call = result.tool_calls[0] args = tool_call["args"] assert args == {"key": 3} as_tool.run(args) ``` Contrived (?) 
example with langgraph agent as a tool: ```python from typing import List, Literal from typing_extensions import TypedDict from langchain_openai import ChatOpenAI from langgraph.prebuilt import create_react_agent llm = ChatOpenAI(temperature=0) def magic_function(input: int) -> int: """Applies a magic function to an input.""" return input + 2 agent_1 = create_react_agent(llm, [magic_function]) class Message(TypedDict): role: Literal["human"] content: str agent_tool = agent_1.as_tool( arg_types={"messages": List[Message]}, name="Jeeves", description="Ask Jeeves.", ) agent_2 = create_react_agent(llm, [agent_tool]) ``` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-10 19:29:59 -04:00
Bagatur	6928f4c438	core[minor]: Add ToolMessage.raw_output (#23994 ) Decisions to discuss: 1. is a new attr needed or could additional_kwargs be used for this 2. is raw_output a good name for this attr 3. should raw_output default to {} or None 4. should raw_output be included in serialization 5. do we need to update repr/str to exclude raw_output	2024-07-10 20:11:10 +00:00
Eugene Yurtsev	c4e149d4f1	community[patch]: Add linter to catch @root_validator (#24070 ) - Add linter to prevent further usage of vanilla root validator - Update remaining root validators	2024-07-10 14:51:03 +00:00
ccurme	9c6efadec3	community[patch]: propagate cost information to OpenAI callback (#23996 ) This is enabled following https://github.com/langchain-ai/langchain/pull/22716.	2024-07-10 14:50:35 +00:00
William FH	1e1fd30def	[Core] Fix fstring in logger warning (#24043 )	2024-07-09 19:53:18 -07:00
Ethan Yang	13855ef0c3	[HuggingFace Pipeline] add streaming support (#23852 )	2024-07-09 17:02:00 -04:00
Nuno Campos	859e434932	core: Speed up json parse for large strings (#24036 ) for a large string: - old 4.657918874989264 - new 0.023724667000351474	2024-07-09 12:26:50 -07:00
Nuno Campos	160fc7f246	core: Move json parsing in base chat model / output parser to bg thread (#24031 ) - add version of AIMessageChunk.__add__ that can add many chunks, instead of only 2 - In agenerate_from_stream merge and parse chunks in bg thread - In output parse base classes do more work in bg threads where appropriate --------- Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>	2024-07-09 12:26:36 -07:00
Nuno Campos	73966e693c	openai: Create msg chunk in bg thread (#24032 )	2024-07-09 12:01:51 -07:00
Erick Friis	007c5a85d5	multiple: use modern installer in poetry (#23998 )	2024-07-08 18:50:48 -07:00
Erick Friis	e80c150c44	community: release 0.2.7 (prev was langchain) (#23997 )	2024-07-08 23:43:32 +00:00
Erick Friis	9f8fd08955	community: release 0.2.7 (#23993 )	2024-07-08 22:04:58 +00:00
Erick Friis	bedd893cd1	core: release 0.2.12 (#23991 )	2024-07-08 21:29:29 +00:00
Bagatur	1e957c0c23	docs: rm discord (#23985 )	2024-07-08 14:27:58 -07:00
Eugene Yurtsev	f765e8fa9d	core[minor],community[patch],standard-tests[patch]: Move InMemoryImplementation to langchain-core (#23986 ) This PR moves the in memory implementation to langchain-core. * The implementation remains importable from langchain-community. * Supporting utilities are marked as private for now.	2024-07-08 14:11:51 -07:00
Eugene Yurtsev	aa8c9bb4a9	community[patch]: Add constraint for pdfminer.six to unbreak CI (#23988 ) Something changed in pdfminer.six. This PR unbreaks CI without fixing the underlying PDF parser.	2024-07-08 20:55:19 +00:00
Eugene Yurtsev	2c180d645e	core[minor],community[minor]: Upgrade all @root_validator() to @pre_init (#23841 ) This PR introduces a @pre_init decorator that's a @root_validator(pre=True) but with all the defaults populated!	2024-07-08 16:09:29 -04:00
Eugene Yurtsev	9787552b00	core[patch]: Use InMemoryChatMessageHistory in unit tests (#23916 ) Update unit test to use the existing implementation of chat message history	2024-07-05 20:10:54 +00:00
Rajendra Kadam	8b84457b17	community[minor]: Support PGVector in PebbloRetrievalQA (#23874 ) - Description: Support PGVector in PebbloRetrievalQA - Identity and Semantic Enforcement support for PGVector - Refactor Vectorstore validation and name check - Clear the overridden identity and semantic enforcement filters - Issue: NA - Dependencies: NA - Tests: NA(already added) - Docs: Updated - Twitter handle: [@Raj__725](https://twitter.com/Raj__725)	2024-07-05 16:02:25 -04:00
Eugene Yurtsev	e0186df56b	core[patch]: Clarify upsert response semantics (#23921 )	2024-07-05 15:59:47 -04:00
Rajendra Kadam	ee8aa54f53	community[patch]: Fix source path mismatch in PebbloSafeLoader (#23857 ) Description: Fix for source path mismatch in PebbloSafeLoader. The fix involves storing the full path in the doc metadata in VectorDB Issue: NA, caught in internal testing Dependencies: NA Add tests: Updated tests	2024-07-05 15:24:17 -04:00
Eugene Yurtsev	5b7d5f7729	core[patch]: Add comment to clarify aadd_documents (#23920 ) Add comment to clarify how add documents works	2024-07-05 15:20:16 -04:00
Eugene Yurtsev	e0889384d9	standard-tests[minor]: add unit tests for testing get_by_ids, aget_by_ids, upsert, aupsert_by_ids (#23919 ) These standard unit tests provide standard tests for functionality introduced in these PRs: * https://github.com/langchain-ai/langchain/pull/23774 * https://github.com/langchain-ai/langchain/pull/23594	2024-07-05 19:11:54 +00:00
ccurme	74c7198906	core, anthropic[patch]: support streaming tool calls when function has no arguments (#23915 ) resolves https://github.com/langchain-ai/langchain/issues/23911 When an AIMessageChunk is instantiated, we attempt to parse tool calls off of the tool_call_chunks. Here we add a special-case to this parsing, where `""` will be parsed as `{}`. This is a reaction to how Anthropic streams tool calls in the case where a function has no arguments: ``` {'id': 'toolu_01J8CgKcuUVrMqfTQWPYh64r', 'input': {}, 'name': 'magic_function', 'type': 'tool_use', 'index': 1} {'partial_json': '', 'type': 'tool_use', 'index': 1} ``` The `partial_json` does not accumulate to a valid json string-- most other providers tend to emit `"{}"` in this case.	2024-07-05 18:57:41 +00:00
Mateusz Szewczyk	902b57d107	IBM: Added WatsonxChat passing params to invoke method (#23758 ) - [x] PR title: "IBM: Added WatsonxChat to chat models preview, update passing params to invoke method" - [x] PR message: - Description: Added WatsonxChat passing params to invoke method, added integration tests - Dependencies: `ibm_watsonx_ai` - [x] Add tests and docs - [x] Lint and test --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-05 18:07:50 +00:00
ccurme	1f5a163f42	langchain[patch]: deprecate QAGenerationChain (#23730 )	2024-07-05 18:06:19 +00:00
ccurme	25de47878b	langchain[patch]: deprecate AnalyzeDocumentChain (#23769 )	2024-07-05 14:00:23 -04:00
Christophe Bornet	42d049f618	core[minor]: Add Graph Store component (#23092 ) This PR introduces a GraphStore component. GraphStore extends VectorStore with the concept of links between documents based on document metadata. This allows linking documents based on a variety of techniques, including common keywords, explicit links in the content, and other patterns. This works with existing Documents, so it’s easy to extend existing VectorStores to be used as GraphStores. The interface can be implemented for any Vector Store technology that supports metadata, not only graph DBs. When retrieving documents for a given query, the first level of search is done using classical similarity search. Next, links may be followed using various traversal strategies to get additional documents. This allows documents to be retrieved that aren’t directly similar to the query but contain relevant information. Two retrieval methods are added alongside the existing VectorStore ones: * traversal_search, which gets all linked documents up to a certain depth * mmr_traversal_search, which selects linked documents using an MMR algorithm to get more diverse results. If a retrieval depth of 0 is used, GraphStore is effectively a VectorStore. This enables an easy transition from a simple VectorStore to a GraphStore by adding links between documents as a second step. An implementation for Apache Cassandra is also proposed. See https://github.com/datastax/ragstack-ai/blob/main/libs/knowledge-store/notebooks/astra_support.ipynb for a notebook that explains how to use GraphStore and shows that it can correctly answer questions that a simple VectorStore cannot. Twitter handle: _cbornet	2024-07-05 12:24:10 -04:00
Leonid Ganeline	77f5fc3d55	core: docstrings `load` (#23787 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-05 12:23:19 -04:00
Eugene Yurtsev	6f08e11d7c	core[minor]: add upsert, streaming_upsert, aupsert, astreaming_upsert methods to the VectorStore abstraction (#23774 ) This PR rolls out part of the new proposed interface for vectorstores (https://github.com/langchain-ai/langchain/pull/23544) to existing store implementations. The PR makes the following changes: 1. Adds standard upsert, streaming_upsert, aupsert, astreaming_upsert methods to the vectorstore. 2. Updates `add_texts` and `aadd_texts` to be non required with a default implementation that delegates to `upsert` and `aupsert` if those have been implemented. The original `add_texts` and `aadd_texts` methods are problematic as they spread object specific information across document and **kwargs. (e.g., ids are not a part of the document) 3. Adds a default implementation to `add_documents` and `aadd_documents` that delegates to `upsert` and `aupsert` respectively. 4. Adds standard unit tests to verify that a given vectorstore implements a correct read/write API. A downside of this implementation is that it creates `upsert` with a very similar signature to `add_documents`. The reason for introducing `upsert` is to: * Remove any ambiguities about what information is allowed in `kwargs`. Specifically kwargs should only be used for information common to all indexed data. (e.g., indexing timeout). * Allow inheriting from an anticipated generalized interface for indexing that will allow indexing `BaseMedia` (i.e., allow making a vectorstore for images/audio etc.) `add_documents` can be deprecated in the future in favor of `upsert` to make sure that users have a single correct way of indexing content. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-05 12:21:40 -04:00
G Sreejith	3c752238c5	core[patch]: Fix typo in docstring (graphm -> graph) (#23910 ) Changes made as requested: replaced graphm with graph.	2024-07-05 16:20:33 +00:00
Leonid Ganeline	12c92b6c19	core: docstrings `outputs` (#23889 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-05 12:18:17 -04:00
Leonid Ganeline	1eca98ec56	core: docstrings `prompts` (#23890 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-05 12:17:52 -04:00
Philippe PRADOS	289960bc60	community[patch]: Redis.delete should be a regular method not a static method (#23873 ) `langchain_community.vectorstores.Redis.delete()` must not be a `@staticmethod`. With the current implementation, it's not possible to have multiple instances of the Redis vectorstore, because all instances must share the `REDIS_URL`. This does not conform to the base class.	2024-07-05 12:04:58 -04:00
Mohammad Mohtashim	2274d2b966	core[patch]: Accounting for Optional Input Variables in BasePromptTemplate (#22851 ) Description: After reviewing the prompts API, it is clear that the only way a user can explicitly mark an input variable as optional is through the `MessagesPlaceholder.optional` attribute. Otherwise, the user must explicitly pass in the `input_variables` expected to be used in the `BasePromptTemplate`, which will be validated upon execution. Therefore, to semantically handle a `MessagesPlaceholder` `variable_name` as optional, we will treat the `variable_name` of `MessagesPlaceholder` as a `partial_variable` if it has been marked as optional. This approach aligns with how the `variable_name` of `MessagesPlaceholder` is already handled [here](https://github.com/keenborder786/langchain/blob/optional_input_variables/libs/core/langchain_core/prompts/chat.py#L991). Additionally, an attribute `optional_variable` has been added to `BasePromptTemplate`, and the `variable_name` of `MessagesPlaceholder` is also made part of `optional_variable` when marked as optional. Moreover, the `get_input_schema` method has been updated for `BasePromptTemplate` to differentiate between optional and non-optional variables. Issue: #22832, #21425 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-05 15:49:40 +00:00
Klaudia Lemiec	a2082bc1f8	docs: Arxiv docs update (#23871 ) - Description: Update of docstrings and docpages - Issue: [22866](https://github.com/langchain-ai/langchain/issues/22866) - [X] Add tests and docs - [X] Lint and test	2024-07-05 11:43:51 -04:00
André Quintino	99b1467b63	community: add support for 'cloud' parameter in JiraAPIWrapper (#23057 ) - Description: Enhance JiraAPIWrapper to accept the 'cloud' parameter through an environment variable. This update allows more flexibility in configuring the environment for the Jira API. - Twitter handle: Andre_Q_Pereira --------- Co-authored-by: André Quintino <andre.quintino@tui.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-05 15:11:10 +00:00
wenngong	b1e90b3075	community: add model_name param valid for GPT4AllEmbeddings (#23867 ) Description: add model_name param valid for GPT4AllEmbeddings Issue: #23863 #22819 --------- Co-authored-by: gongwn1 <gongwn1@lenovo.com>	2024-07-05 10:46:34 -04:00
volodymyr-memsql	a4eb6d0fb1	community: add SingleStoreDB semantic cache (#23218 ) This PR adds a `SingleStoreDBSemanticCache` class that implements a cache based on the SingleStoreDB vector store, integration tests, and a notebook example. Additionally, this PR contains minor changes to the SingleStoreDB vector store: - change the add texts/documents methods to return a list of inserted ids - implement a delete(ids) method to delete documents by a list of ids - add a drop() method to drop the corresponding database table - update integration tests to use and check the functionality implemented above CC: @baskaryan, @hwchase17 --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>	2024-07-05 09:26:06 -04:00
Igor Drozdov	bb597b1286	feat(community): add bind_tools function for ChatLiteLLM (#23823 ) It's a follow-up to https://github.com/langchain-ai/langchain/pull/23765 Now the tools can be bound by calling `bind_tools`

```python
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.utils.function_calling import convert_to_openai_tool
from langchain_community.chat_models import ChatLiteLLM


class GetWeather(BaseModel):
    '''Get the current weather in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


class GetPopulation(BaseModel):
    '''Get the current population in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


prompt = "Which city is hotter today and which is bigger: LA or NY?"
# tools = [convert_to_openai_tool(GetWeather), convert_to_openai_tool(GetPopulation)]
tools = [GetWeather, GetPopulation]

llm = ChatLiteLLM(model="claude-3-sonnet-20240229").bind_tools(tools)
ai_msg = llm.invoke(prompt)
print(ai_msg.tool_calls)
```

Co-authored-by: Igor Drozdov <idrozdov@gitlab.com>	2024-07-05 09:19:41 -04:00
Jiejun Tan	2be66a38d8	huggingface: Fix huggingface tei support (#22653 ) Update former pull request: https://github.com/langchain-ai/langchain/pull/22595. Modified `libs/partners/huggingface/langchain_huggingface/embeddings/huggingface_endpoint.py`, where the API call function does not match current [Text Embeddings Inference API](https://huggingface.github.io/text-embeddings-inference/#/Text%20Embeddings%20Inference/embed). One example is: ```json { "inputs": "string", "normalize": true, "truncate": false } ``` Parameters in `_model_kwargs` are not passed properly in the latest version. By the way, the issue [why cause 413? #50](https://github.com/huggingface/text-embeddings-inference/issues/50) might be solved.	2024-07-03 13:30:29 -07:00
Eugene Yurtsev	9ccc4b1616	core[patch]: Fix logic in BaseChatModel that processes the llm string that is used as a key for caching chat models responses (#23842 ) This PR should fix the following issue: https://github.com/langchain-ai/langchain/issues/23824 Introduced as part of this PR: https://github.com/langchain-ai/langchain/pull/23416 I am unable to reproduce the issue locally, though it's clear that we're somehow getting a `serialized` object which is not a dictionary. The test below passes for me prior to the PR as well

```python
def test_cache_with_sqllite() -> None:
    from langchain_community.cache import SQLiteCache
    from langchain_core.globals import set_llm_cache

    # Import added here so the snippet is self-contained.
    from langchain_core.language_models.fake_chat_models import FakeListChatModel

    cache = SQLiteCache(database_path=".langchain.db")
    set_llm_cache(cache)
    chat_model = FakeListChatModel(responses=["hello", "goodbye"], cache=True)
    # The second call must be served from the cache, so it also returns "hello".
    assert chat_model.invoke("How are you?").content == "hello"
    assert chat_model.invoke("How are you?").content == "hello"
```
	2024-07-03 16:23:55 -04:00
Vadym Barda	9bb623381b	core[minor]: update conversion utils to handle RemoveMessage (#23840 )	2024-07-03 16:13:31 -04:00
Eugene Yurtsev	4ab78572e7	core[patch]: Speed up unit tests for imports (#23837 ) Speed up unit tests for imports	2024-07-03 15:55:15 -04:00
Nico Puhlmann	4a15fce516	langchain: update declarative_base import (#20056 ) Description: The ``declarative_base()`` function is now available as sqlalchemy.orm.declarative_base(). (deprecated since: 2.0) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9) --------- Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-07-03 15:52:35 -04:00
Théo Deschamps	39b19cf764	core[patch]: extract input variables for `path` and `detail` keys in order to format an `ImagePromptTemplate` (#22613 ) - Description: Add support for `path` and `detail` keys in `ImagePromptTemplate`. Previously, only variables associated with the `url` key were considered. This PR allows for the inclusion of a local image path and a detail parameter as input to the format method. - Issues: - fixes #20820 - related to #22024 - Dependencies: None - Twitter handle: @DeschampsTho5 --------- Co-authored-by: tdeschamps <tdeschamps@kameleoon.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-03 18:58:42 +00:00
Bagatur	a4798802ef	cli[patch]: ruff 0.5 (#23833 )	2024-07-03 18:33:15 +00:00
Leonid Ganeline	55f6f91f17	core[patch]: docstrings `output_parsers` (#23825 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-03 14:27:40 -04:00
Philippe PRADOS	26cee2e878	partners[patch]: MongoDB vectorstore to return and accept string IDs (#23818 ) The MongoDB vectorstore has some errors. - `add_texts() -> List` returns a list of `ObjectId`, not a list of strings - `delete()` with `id` never removes chunks. --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-03 14:14:08 -04:00
Ikko Eltociear Ashimine	75734fbcf1	community: fix typo in unit tests for test_zenguard.py (#23819 ) enviroment -> environment	2024-07-03 14:05:42 -04:00
Bagatur	a0c2281540	infra: update mypy 1.10, ruff 0.5 (#23721 )

```python
"""python scripts/update_mypy_ruff.py"""
import glob
import tomllib
from pathlib import Path

import toml
import subprocess
import re

ROOT_DIR = Path(__file__).parents[1]


def main():
    for path in glob.glob(str(ROOT_DIR / "libs/*/pyproject.toml"), recursive=True):
        print(path)
        with open(path, "rb") as f:
            pyproject = tomllib.load(f)
        # Bump the pinned mypy and ruff versions; skip packages without these groups.
        try:
            pyproject["tool"]["poetry"]["group"]["typing"]["dependencies"]["mypy"] = (
                "^1.10"
            )
            pyproject["tool"]["poetry"]["group"]["lint"]["dependencies"]["ruff"] = (
                "^0.5"
            )
        except KeyError:
            continue
        with open(path, "w") as f:
            toml.dump(pyproject, f)
        cwd = "/".join(path.split("/")[:-1])
        completed = subprocess.run(
            "poetry lock --no-update; poetry install --with typing; poetry run mypy . --no-color",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )
        logs = completed.stdout.split("\n")

        # Collect mypy errors keyed by (file, line number).
        to_ignore = {}
        for l in logs:
            if re.match(r"^(.*)\:(\d+)\: error:.*\[(.*)\]", l):
                path, line_no, error_type = re.match(
                    r"^(.*)\:(\d+)\: error:.*\[(.*)\]", l
                ).groups()
                if (path, line_no) in to_ignore:
                    to_ignore[(path, line_no)].append(error_type)
                else:
                    to_ignore[(path, line_no)] = [error_type]
        print(len(to_ignore))
        # Append a "type: ignore" comment to every offending line.
        for (error_path, line_no), error_types in to_ignore.items():
            all_errors = ", ".join(error_types)
            full_path = f"{cwd}/{error_path}"
            try:
                with open(full_path, "r") as f:
                    file_lines = f.readlines()
            except FileNotFoundError:
                continue
            file_lines[int(line_no) - 1] = (
                file_lines[int(line_no) - 1][:-1] + f"  # type: ignore[{all_errors}]\n"
            )
            with open(full_path, "w") as f:
                f.write("".join(file_lines))

        subprocess.run(
            "poetry run ruff format .; poetry run ruff --select I --fix .",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )


if __name__ == "__main__":
    main()
```
	2024-07-03 10:33:27 -07:00
William FH	6cd56821dc	[Core] Unify function schema parsing (#23370 ) Use pydantic to infer nested schemas and all that fun. Include bagatur's convenient docstring parser Include annotation support Previously we didn't adequately support many typehints in the bind_tools() method on raw functions (like optionals/unions, nested types, etc.)	2024-07-03 09:55:38 -07:00
Oguz Vuruskaner	2a2c0d1a94	community[deepinfra]: fix tool call parsing. (#23162 ) This PR includes fix for DeepInfra tool call parsing.	2024-07-03 12:11:37 -04:00
maang-h	525109e506	feat: Implement ChatBaichuan asynchronous interface (#23589 ) - Description: Add interface to `ChatBaichuan` to support asynchronous requests - `_agenerate` method - `_astream` method --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-03 12:10:04 -04:00
Leonid Ganeline	716a316654	core: docstrings `indexing` (#23785 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-03 11:27:34 -04:00
Leonid Ganeline	30fdc2dbe7	core: docstrings `messages` (#23788 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-03 11:25:00 -04:00
ccurme	54e730f6e4	fireworks[patch]: read from tool calls attribute (#23820 )	2024-07-03 11:11:17 -04:00
Bagatur	ebb404527f	anthropic[patch]: Release 0.1.19 (#23783 )	2024-07-02 18:17:25 -04:00
Bagatur	6168c846b2	openai[patch]: Release 0.1.14 (#23782 )	2024-07-02 18:17:15 -04:00
Bagatur	cb9812593f	openai[patch]: expose model request payload (#23287 ) ![Screenshot 2024-06-21 at 3 12 12 PM](https://github.com/langchain-ai/langchain/assets/22008038/6243a01f-1ef6-4085-9160-2844d9f2b683)	2024-07-02 17:43:55 -04:00
Bagatur	ed200bf2c4	anthropic[patch]: expose payload (#23291 ) ![Screenshot 2024-06-21 at 4 56 02 PM](https://github.com/langchain-ai/langchain/assets/22008038/a2c6224f-3741-4502-9607-1a726a0551c9)	2024-07-02 17:43:47 -04:00
Bagatur	7a3d8e5a99	core[patch]: Release 0.2.11 (#23780 )	2024-07-02 17:35:57 -04:00
Bagatur	d677dadf5f	core[patch]: mark RemoveMessage beta (#23656 )	2024-07-02 21:27:21 +00:00
ccurme	1d54ac93bb	ai21[patch]: release 0.1.7 (#23781 )	2024-07-02 21:24:13 +00:00
Asaf Joseph Gardin	320dc31822	partners: AI21 Labs Jamba Streaming Support (#23538 ) - Description: Added support for streaming in AI21 Jamba Model - Twitter handle: https://github.com/AI21Labs --------- Co-authored-by: Asaf Gardin <asafg@ai21.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-02 17:15:46 -04:00
Qingchuan Hao	5cd4083457	community: make bing web search as the only option (#23523 ) This PR makes Bing web search the only option for BingSearchAPIWrapper, to simplify the user interface in LangChain. This is follow-up work to https://github.com/langchain-ai/langchain/pull/23306.	2024-07-02 17:13:54 -04:00
ccurme	7c1cddf1b7	anthropic[patch]: release 0.1.18 (#23778 )	2024-07-02 16:46:47 -04:00
ccurme	c9dac59008	anthropic[patch]: fix model name in some integration tests (#23779 )	2024-07-02 20:45:52 +00:00
Bagatur	7a6c06cadd	anthropic[patch]: tool output parser fix (#23647 )	2024-07-02 16:33:22 -04:00
ccurme	46cbf0e4aa	anthropic[patch]: use core output parsers for structured output (#23776 ) Also add to standard tests for structured output.	2024-07-02 16:15:26 -04:00
kiarina	dc396835ed	langchain_anthropic: add stop_reason in ChatAnthropic stream result (#23689 ) `ChatAnthropic` can get `stop_reason` from the resulting `AIMessage` in `invoke` and `ainvoke`, but not in `stream` and `astream`. This is a different behavior from `ChatOpenAI`. It is possible to get `stop_reason` from `stream` as well, since it is needed to determine the next action after the LLM call. This would be easier to handle in situations where only `stop_reason` is needed. - Issue: NA - Dependencies: NA - Twitter handle: https://x.com/kiarina37	2024-07-02 15:16:20 -04:00
maang-h	e4e28a6ff5	community[patch]: Fix MiniMaxChat validate_environment error (#23770 ) - Description: Fix some issues in MiniMaxChat - Fix `minimax_api_host` not in `values` error - Remove `minimax_group_id` from reading environment variables, since `minimax_group_id` is no longer used in MiniMaxChat - Invoke callback prior to yielding token, see issue #16913	2024-07-02 13:23:32 -04:00
SN	acc457f645	core[patch]: fix nested sections for mustache templating (#23747 ) The prompt template variable detection only worked for singly-nested sections because we just kept track of whether we were in a section and then set that to false as soon as we encountered an end block. i.e. the following: ``` {{#outerSection}} {{variableThatShouldntShowUp}} {{#nestedSection}} {{nestedVal}} {{/nestedSection}} {{anotherVariableThatShouldntShowUp}} {{/outerSection}} ``` Would yield `['outerSection', 'anotherVariableThatShouldntShowUp']` as input_variables (whereas it should just yield `['outerSection']`). This fixes that by keeping track of the current depth and using a stack.	2024-07-02 10:20:45 -07:00
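The depth-tracking idea described in acc457f645 can be illustrated with a small standalone sketch. This is not the langchain_core implementation: `TAG` and `top_level_variables` are hypothetical names, the regex only recognises simple `{{name}}`, `{{#name}}`, `{{^name}}` and `{{/name}}` tags, and the template is assumed to be well formed.

```python
import re

# Hypothetical tag matcher: captures an optional section marker and the tag name.
TAG = re.compile(r"\{\{\s*([#/^]?)\s*([\w.]+)\s*\}\}")


def top_level_variables(template: str) -> list[str]:
    """Return the names a caller must supply, ignoring names inside sections."""
    names: list[str] = []
    open_sections: list[str] = []  # stack of sections we are currently inside
    for kind, name in TAG.findall(template):
        if kind in ("#", "^"):
            # A section opened at depth 0 is itself an input variable.
            if not open_sections and name not in names:
                names.append(name)
            open_sections.append(name)
        elif kind == "/":
            # Closing a section only pops one level, so outer sections stay open.
            if open_sections:
                open_sections.pop()
        elif not open_sections and name not in names:
            names.append(name)
    return names


template = """
{{#outerSection}}
    {{variableThatShouldntShowUp}}
    {{#nestedSection}}
        {{nestedVal}}
    {{/nestedSection}}
    {{anotherVariableThatShouldntShowUp}}
{{/outerSection}}
"""
print(top_level_variables(template))  # ['outerSection']
```

A single boolean flag would be reset by `{{/nestedSection}}`, which is how `anotherVariableThatShouldntShowUp` leaked into the result before the fix; the stack keeps the outer section open until its own end tag.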
Eugene Yurtsev	46ff0f7a3c	community[patch]: Update @root_validators to use explicit pre=True or pre=False (#23737 )	2024-07-02 10:47:21 -04:00
Igor Drozdov	b664dbcc36	feat(community): add support for tool_calls response (#23765 ) When `model_kwargs={"tools": tools}` are passed to `ChatLiteLLM`, they are executed, but the response is not recognized correctly. Let's add `tool_calls` to the `additional_kwargs`.

## ChatAnthropic

I used the following example to verify the output of llm with tools:

```python
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_anthropic import ChatAnthropic


class GetWeather(BaseModel):
    '''Get the current weather in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


class GetPopulation(BaseModel):
    '''Get the current population in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


llm = ChatAnthropic(model="claude-3-sonnet-20240229")
llm_with_tools = llm.bind_tools([GetWeather, GetPopulation])
ai_msg = llm_with_tools.invoke("Which city is hotter today and which is bigger: LA or NY?")
print(ai_msg.tool_calls)
```

I get the following response:

```json
[{'name': 'GetWeather', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_01UfDA89knrhw3vFV9X47neT'}, {'name': 'GetWeather', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01NrYVRYae7m7z7tBgyPb3Gd'}, {'name': 'GetPopulation', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_01EPFEpDgzL6vV2dTpD9SVP5'}, {'name': 'GetPopulation', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01B5J6tPJXgwwfhQX9BHP2dt'}]
```

## LiteLLM

Based on https://litellm.vercel.app/docs/completion/function_call

```python
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.utils.function_calling import convert_to_openai_tool
import litellm


class GetWeather(BaseModel):
    '''Get the current weather in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


class GetPopulation(BaseModel):
    '''Get the current population in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


prompt = "Which city is hotter today and which is bigger: LA or NY?"
tools = [convert_to_openai_tool(GetWeather), convert_to_openai_tool(GetPopulation)]

response = litellm.completion(model="claude-3-sonnet-20240229", messages=[{'role': 'user', 'content': prompt}], tools=tools)
print(response.choices[0].message.tool_calls)
```

```python
[ChatCompletionMessageToolCall(function=Function(arguments='{"location": "Los Angeles, CA"}', name='GetWeather'), id='toolu_01HeDWV5vP7BDFfytH5FJsja', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"location": "New York, NY"}', name='GetWeather'), id='toolu_01EiLesUSEr3YK1DaE2jxsQv', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"location": "Los Angeles, CA"}', name='GetPopulation'), id='toolu_01Xz26zvkBDRxEUEWm9pX6xa', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"location": "New York, NY"}', name='GetPopulation'), id='toolu_01SDqKnsLjvUXuBsgAZdEEpp', type='function')]
```

## ChatLiteLLM

When I try the following

```python
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.utils.function_calling import convert_to_openai_tool
from langchain_community.chat_models import ChatLiteLLM


class GetWeather(BaseModel):
    '''Get the current weather in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


class GetPopulation(BaseModel):
    '''Get the current population in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


prompt = "Which city is hotter today and which is bigger: LA or NY?"
tools = [convert_to_openai_tool(GetWeather), convert_to_openai_tool(GetPopulation)]

llm = ChatLiteLLM(model="claude-3-sonnet-20240229", model_kwargs={"tools": tools})
ai_msg = llm.invoke(prompt)
print(ai_msg)
print(ai_msg.tool_calls)
```

```python
content="Okay, let's find out the current weather and populations for Los Angeles and New York City:" response_metadata={'token_usage': Usage(prompt_tokens=329, completion_tokens=193, total_tokens=522), 'model': 'claude-3-sonnet-20240229', 'finish_reason': 'tool_calls'} id='run-748b7a84-84f4-497e-bba1-320bd4823937-0'
[]
```

---

When I apply the changes of this PR, the output is

```json
[{'name': 'GetWeather', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_017D2tGjiaiakB1HadsEFZ4e'}, {'name': 'GetWeather', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01WrDpJfVqLkPejWzonPCbLW'}, {'name': 'GetPopulation', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_016UKyYrVAV9Pz99iZGgGU7V'}, {'name': 'GetPopulation', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01Sgv1imExFX1oiR1Cw88zKy'}]
```

Co-authored-by: Igor Drozdov <idrozdov@gitlab.com>	2024-07-02 10:42:08 -04:00
Eugene Yurtsev	338cef35b4	community[patch]: update @root_validator in utilities namespace (#23768 ) Update all utilities to use `pre=True` or `pre=False` https://github.com/langchain-ai/langchain/issues/22819	2024-07-02 14:33:01 +00:00
wenngong	ee5eedfa04	partners: support reading HuggingFace params from env (#23309 ) Description: 1. The partners/HuggingFace module supports reading params from env. The langchain_community/.../huggingfaceXX modules are not adjusted since they are deprecated. 2. pydantic 2 @root_validator migration. Issue: #22448 #22819 --------- Co-authored-by: gongwn1 <gongwn1@lenovo.com>	2024-07-02 10:12:45 -04:00
antonpibm	ffde8a6a09	Milvus vectorstore: fix pass ids as argument after upsert (#23761 ) Description: The Milvus vectorstore supports both `add_documents` (via the base class) and an `upsert` method that deletes and re-adds documents based on their ids. Issue: Due to a mismatch in the interfaces, the ids used by `upsert` are ignored in `add_documents`: `ids` is passed as an argument in `upsert` but via `kwargs` in `add_documents`. This caused exceptions and inconsistency in the DB (tested with `auto_id=False`). Fix: pass `ids` via `kwargs` to `add_documents`.	2024-07-02 13:45:30 +00:00
Eugene Yurtsev	d084172b63	community[patch]: root validator set explicit pre=False or pre=True (#23764 ) See issue: https://github.com/langchain-ai/langchain/issues/22819	2024-07-02 09:42:05 -04:00
mattthomps1	cc55823486	docs: updated PPLX model (#23723 ) Description: updated pplx docs to reference a currently [supported model](https://docs.perplexity.ai/docs/model-cards). pplx-70b-online ->llama-3-sonar-small-32k-online --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-02 08:48:49 -04:00
Jacob Lee	7791d92711	community[patch]: Fix requests alias for load_tools (#23734 ) CC @baskaryan	2024-07-01 15:02:14 -07:00
Eugene Yurtsev	f24e38876a	community[patch]: Update root_validators to use explicit pre=True or pre=False (#23736 )	2024-07-01 17:13:23 -04:00
Yannick Stephan	5b1de2ae93	mistralai: Fixed streaming in MistralAI with ainvoke and callbacks (#22000 ) # Fix streaming in mistral with ainvoke - [x] PR title - [x] PR message - [x] Add tests and docs: 1. [x] Added a test for the fixed integration. 2. [x] An example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Ran `make format`, `make lint` and `make test` from the root of the package(s) I've modified. Hello * I identified an issue in the mistral package where callback streaming (see `on_llm_new_token`) was not functioning correctly when the `streaming` parameter was set to True and the model was called with `ainvoke`. * The root cause of the problem was that the `streaming` setting was not taken into account (I think it's an oversight). * To resolve the issue, I added the `streaming` attribute. * Now, the callback with streaming works as expected when the `streaming` parameter is set to True. ## How to reproduce

```
from langchain_mistralai.chat_models import ChatMistralAI
chain = ChatMistralAI(streaming=True)
# Add a callback
chain.ainvoke(..)
# Observe on_llm_new_token
# Now, the callback is given as streaming tokens, before it was in grouped format.
```

Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-01 20:53:09 +00:00
Eugene Yurtsev	5d2262af34	community[patch]: Update root_validators to use pre=True or pre=False (#23731 ) Update root_validators in preparation for pydantic 2 migration.	2024-07-01 20:10:15 +00:00
Eugene Yurtsev	ebcee4f610	core[patch]: Add versionadded to get_by_ids (#23728 )	2024-07-01 15:16:00 -04:00
Eugene Yurtsev	e800f6bb57	core[minor]: Create BaseMedia object (#23639 ) This PR implements a BaseContent object from which Document and Blob objects will inherit proposed here: https://github.com/langchain-ai/langchain/pull/23544 Alternative: Create a base object that only has an identifier and no metadata. For now decided against it, since that refactor can be done at a later time. It also feels a bit odd since our IDs are optional at the moment. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-01 15:07:30 -04:00
Chip Davis	04bc5f1a95	partners[azure]: fix having openai_api_base set for other packages (#22068 ) This fix is for #21726. When having other packages installed that require the `openai_api_base` environment variable, users are not able to instantiate the AzureChatModels or AzureEmbeddings. This PR adds a new value `ignore_openai_api_base` which is a bool. When set to True, it sets `openai_api_base` to `None` Two new tests were added for the `test_azure` and a new file `test_azure_embeddings` A different approach may be better for this. If you can think of better logic, let me know and I can adjust it. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-01 18:35:20 +00:00
Nuno Campos	b36e95caa9	core[patch]: use async messages where possible (#23718 ) Fix #23716 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-01 18:33:05 +00:00
Spyros Avlonitis	8cfb2fa1b7	core[minor]: Add maxsize for InMemoryCache (#23405 ) This PR introduces a maxsize parameter for the InMemoryCache class, allowing users to specify the maximum number of items to store in the cache. If the cache exceeds the specified maximum size, the oldest items are removed. Additionally, comprehensive unit tests have been added to ensure all functionalities are thoroughly tested. The tests are written using pytest and cover both synchronous and asynchronous methods. Twitter: @spyrosavl --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-01 14:21:21 -04:00
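The eviction rule described in 8cfb2fa1b7 (drop the oldest items once `maxsize` is exceeded) can be sketched without LangChain. `BoundedCache` below is a hypothetical stand-in, not the `InMemoryCache` class itself; it assumes that "oldest" means "inserted first" and that overwriting an existing key does not count as a new item.

```python
from typing import Any, Optional


class BoundedCache:
    """Hypothetical size-limited cache keyed by (prompt, llm_string)."""

    def __init__(self, maxsize: Optional[int] = None) -> None:
        if maxsize is not None and maxsize <= 0:
            raise ValueError("maxsize must be greater than 0")
        self._maxsize = maxsize
        self._data: dict = {}

    def lookup(self, prompt: str, llm_string: str) -> Optional[Any]:
        """Return the cached value, or None on a miss."""
        return self._data.get((prompt, llm_string))

    def update(self, prompt: str, llm_string: str, value: Any) -> None:
        """Store a value, evicting the oldest entry if the cache is full."""
        key = (prompt, llm_string)
        is_full = self._maxsize is not None and len(self._data) >= self._maxsize
        if is_full and key not in self._data:
            # dicts preserve insertion order, so the first key is the oldest.
            oldest = next(iter(self._data))
            del self._data[oldest]
        self._data[key] = value


cache = BoundedCache(maxsize=2)
cache.update("a", "llm", 1)
cache.update("b", "llm", 2)
cache.update("c", "llm", 3)  # evicts the entry for "a"
print(cache.lookup("a", "llm"), cache.lookup("c", "llm"))  # None 3
```

This is first-in, first-out eviction; a least-recently-used policy would also need to reorder entries on `lookup`.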
maang-h	96af8f31ae	community[patch]: Invoke callback prior to yielding token (#23638 ) - Description: Invoke callback prior to yielding token in stream and astream methods for ChatZhipuAI. - Issue: the issue #16913	2024-07-01 18:12:24 +00:00
Eugene Yurtsev	b5aef4cf97	core[patch]: Fix llm string representation for serializable models (#23416 ) Fix LLM string representation for serializable objects. Fix for issue: https://github.com/langchain-ai/langchain/issues/23257 The llm string of serializable chat models is the serialized representation of the object. LangChain serialization dumps some basic information about non-serializable objects, including their repr(), which includes an object id. This means that if a chat model has any non-serializable fields (e.g., a cache), then any new instantiation of those fields will change the llm representation of the chat model and cause cache misses, i.e., re-instantiating a postgres cache would result in cache misses!	2024-07-01 14:06:33 -04:00
nobbbbby	3904f2cd40	core: fix NameError (#23658 ) Description: In the language models `chat_models` module, the import statement for `BaseModel` has been moved from the conditionally imported section to the main import area, fixing a `NameError`. Issue: fixes `NameError`	2024-07-01 17:51:23 +00:00
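The failure mode described in b5aef4cf97 can be reproduced in isolation. The sketch below does not use LangChain: `FakeCache`, `naive_llm_string` and `stable_llm_string` are hypothetical names, and describing the field by its type name is only one possible way to get a stable key; the commit message does not spell out the exact approach taken.

```python
class FakeCache:
    """Hypothetical non-serializable field; it inherits the default object repr."""


def naive_llm_string(model_name: str, cache: FakeCache) -> str:
    # The default repr embeds the object's memory address,
    # e.g. "<__main__.FakeCache object at 0x7f...>", so it differs per instance.
    return f"model={model_name};cache={cache!r}"


def stable_llm_string(model_name: str, cache: FakeCache) -> str:
    # Describe the field only by its type, which is identical for every instance.
    return f"model={model_name};cache={type(cache).__name__}"


first, second = FakeCache(), FakeCache()
print(naive_llm_string("m", first) == naive_llm_string("m", second))    # False
print(stable_llm_string("m", first) == stable_llm_string("m", second))  # True
```

Because the llm string is part of the cache key, the first variant turns every re-instantiated field into a fresh key, and therefore into a cache miss.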
Jordy Jackson Antunes da Rocha	a50eabbd48	experimental: LLMGraphTransformer add missing conditional adding restrictions to prompts for LLM that do not support function calling (#22793 ) - Description: Modified the prompt created by the function `create_unstructured_prompt` (which is called for LLMs that do not support function calling) by adding conditional checks that verify if restrictions on entity types and rel_types should be added to the prompt. If the user provides a sufficiently large text, the current prompt may fail to produce results in some LLMs. I first saw this issue when I implemented a custom LLM class that did not support function calling and used Gemini 1.5 Pro, but I was able to replicate this issue using OpenAI models. By loading a sufficiently large text

```python
from langchain_community.llms import Ollama
from langchain_openai import ChatOpenAI, OpenAI
from langchain_core.prompts import PromptTemplate
import re
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_core.documents import Document

with open("texto-longo.txt", "r") as file:
    full_text = file.read()
    partial_text = full_text[:4000]

documents = [Document(page_content=partial_text)]  # cropped to fit GPT 3.5 context window
```

And using the chat class (that has function calling)

```python
chat_openai = ChatOpenAI(model="gpt-3.5-turbo", model_kwargs={"seed": 42})
chat_gpt35_transformer = LLMGraphTransformer(llm=chat_openai)
graph_from_chat_gpt35 = chat_gpt35_transformer.convert_to_graph_documents(documents)
```

It works:

```
>>> print(graph_from_chat_gpt35[0].nodes)
[Node(id="Jesu, Joy of Man's Desiring", type='Music'), Node(id='Godel', type='Person'), Node(id='Johann Sebastian Bach', type='Person'), Node(id='clever way of encoding the complicated expressions as numbers', type='Concept')]
```

But if you try to use the non-chat LLM class (that does not support function calling)

```python
openai = OpenAI(
    model="gpt-3.5-turbo-instruct",
    max_tokens=1000,
)
gpt35_transformer = LLMGraphTransformer(llm=openai)
graph_from_gpt35 = gpt35_transformer.convert_to_graph_documents(documents)
```

It uses the prompt that has issues and sometimes does not produce any result

```
>>> print(graph_from_gpt35[0].nodes)
[]
```

After implementing the changes, I was able to use both classes more consistently:

```shell
>>> chat_gpt35_transformer = LLMGraphTransformer(llm=chat_openai)
>>> graph_from_chat_gpt35 = chat_gpt35_transformer.convert_to_graph_documents(documents)
>>> print(graph_from_chat_gpt35[0].nodes)
[Node(id="Jesu, Joy Of Man'S Desiring", type='Music'), Node(id='Johann Sebastian Bach', type='Person'), Node(id='Godel', type='Person')]
>>> gpt35_transformer = LLMGraphTransformer(llm=openai)
>>> graph_from_gpt35 = gpt35_transformer.convert_to_graph_documents(documents)
>>> print(graph_from_gpt35[0].nodes)
[Node(id='I', type='Pronoun'), Node(id="JESU, JOY OF MAN'S DESIRING", type='Song'), Node(id='larger memory', type='Memory'), Node(id='this nice tree structure', type='Structure'), Node(id='how you can do it all with the numbers', type='Process'), Node(id='JOHANN SEBASTIAN BACH', type='Composer'), Node(id='type of structure', type='Characteristic'), Node(id='that', type='Pronoun'), Node(id='we', type='Pronoun'), Node(id='worry', type='Verb')]
```

The results are a little inconsistent because the GPT 3.5 model may produce incomplete JSON due to the token limit, but that could be solved (or mitigated) by checking for complete JSON when parsing it.	2024-07-01 17:33:51 +00:00
Eugene Yurtsev	4f1821db3e	core[minor]: Add get_by_ids to vectorstore interface (#23594 ) This PR adds a part of the indexing API proposed in this RFC https://github.com/langchain-ai/langchain/pull/23544/files. It allows rolling out `get_by_ids`, which should be uncontroversial, to existing vectorstores without introducing new abstractions. The semantics for this method depend on the ability to identify returned documents using the new optional ID field on documents: https://github.com/langchain-ai/langchain/pull/23411 Alternatives are: 1. Relax the sequence requirement ```python def get_by_ids(self, ids: Iterable[str], /) -> Iterable[Document]: ``` Rejected: - implementations are more likely to start batching with bad defaults - users would need to call list() or we'd need to introduce another convenience method 2. Support more kwargs ```python def get_by_ids(self, ids: Sequence[str], /, **kwargs) -> List[Document]: ... ``` Rejected: - No need for `batch` parameter since IDs is a sequence - Output cannot be customized since `Document` is fixed. (e.g., parameters could be useful to grab extra metadata like the vector that was indexed with the Document or to project a part of the document)	2024-07-01 13:04:33 -04:00
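The shape the PR settles on (a positional-only `Sequence[str]` in, a list of documents out) can be sketched with a toy store. `Doc` and `TinyStore` are hypothetical stand-ins for `Document` and a vectorstore, and skipping unknown IDs instead of raising is an assumption of this sketch; the commit message above does not say how missing IDs are handled.

```python
from dataclasses import dataclass, field
from typing import Optional, Sequence


@dataclass
class Doc:
    """Hypothetical stand-in for a document with an optional ID."""

    page_content: str
    id: Optional[str] = None
    metadata: dict = field(default_factory=dict)


class TinyStore:
    """Hypothetical in-memory store exposing a get_by_ids-style lookup."""

    def __init__(self) -> None:
        self._docs: dict = {}

    def add(self, docs: Sequence[Doc]) -> None:
        for doc in docs:
            if doc.id is None:
                raise ValueError("this sketch requires explicit IDs")
            self._docs[doc.id] = doc

    def get_by_ids(self, ids: Sequence[str], /) -> list:
        # Assumption: unknown IDs are skipped rather than raising.
        return [self._docs[i] for i in ids if i in self._docs]


store = TinyStore()
store.add([Doc("first", id="1"), Doc("second", id="2")])
found = store.get_by_ids(["2", "missing", "1"])
print([doc.page_content for doc in found])  # ['second', 'first']
```

Callers identify each returned document through its `id` field, which is why the method depends on the optional ID on documents.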
Valentin	bf402f902e	community: Fix LanceDB similarity search bug (#23591 ) Description: LanceDB didn't allow querying the database using similarity score thresholds because the metrics value was missing. This PR simply fixes that bug. Issue: not applicable Dependencies: none Twitter handle: not available --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-01 16:33:45 +00:00
Bagatur	389a568f9a	standard-tests[patch]: add anthropic format integration test (#23717 )	2024-07-01 11:06:04 -04:00
Rafael Pereira	4b9517db85	Jira: Allow Jira access using only the token (#23708 ) - Description: At the moment the Jira wrapper only accepts the Username and Password/Token used together. However, Jira allows connecting using only the token, which is useful in an enterprise context. Co-authored-by: rpereira <rafael.pereira@criticalsoftware.com>	2024-07-01 13:13:51 +00:00
Tim Van Wassenhove	24916c6703	community: Register pandas df in duckdb when creating vector_store (#23690 ) - Description: Register pandas df in duckdb when creating vector_store - Issue: Resolves #23308 - Dependencies: None - Twitter handle: @timvw Co-authored-by: Tim Van Wassenhove <tim.van.wassenhove@telenetgroup.be>	2024-07-01 09:12:06 -04:00
Bagatur	29aa9d6750	groq[patch]: Release 0.1.6 (#23655 )	2024-06-29 07:35:23 -04:00
Bagatur	f2d0c13a15	fireworks[patch]: Release 0.1.4 (#23654 )	2024-06-29 07:35:16 -04:00
Bagatur	9a5e35d1ba	mistralai[patch]: Release 0.1.9 (#23653 )	2024-06-29 07:35:09 -04:00
Mateusz Szewczyk	a78ccb993c	ibm: Add support for Chat Models (#22979 )	2024-06-29 01:59:25 -07:00
Bagatur	af2c05e5f3	openai[patch]: Release 0.1.13 (#23651 )	2024-06-28 17:10:30 -07:00
Bagatur	b63c7f10bc	anthropic[patch]: Release 0.1.17 (#23650 )	2024-06-28 17:07:08 -07:00
Bagatur	fc8fd49328	openai, anthropic, ...: with_structured_output to pass in explicit tool choice (#23645 ) ...community, mistralai, groq, fireworks part of #23644	2024-06-28 16:39:53 -07:00
Bagatur	81064017a9	docs: azure openai docstring (#23643 ) part of #22296	2024-06-28 15:15:58 -07:00
Bagatur	381aedcc61	docs: standardize azure openai page (#23642 ) part of #22296	2024-06-28 15:15:41 -07:00
Vadym Barda	e8d77002ea	core: add RemoveMessage (#23636 ) This change adds a new message type `RemoveMessage`. This will enable `langgraph` users to manually modify graph state (or have the graph nodes modify the state) to remove messages by `id` Examples: * allow users to delete messages from state by calling ```python graph.update_state(config, values=[RemoveMessage(id=state.values[-1].id)]) ``` * allow nodes to delete messages ```python graph.add_node("delete_messages", lambda state: [RemoveMessage(id=state[-1].id)]) ```	2024-06-28 14:40:02 -07:00
ccurme	8fce8c6771	community: fix extended tests (#23640 )	2024-06-28 16:35:38 -04:00
ccurme	5d93916665	openai[patch]: release 0.1.12 (#23641 )	2024-06-28 19:51:16 +00:00
Jacob Lee	a032583b17	docs[patch]: Update diagrams (#23613 )	2024-06-28 12:36:00 -07:00
ccurme	390ee8d971	standard-tests: add test for structured output (#23631 ) - add test for structured output - fix bug with structured output for Azure - better testing on Groq (break out Mixtral + Llama3 and add xfails where needed)	2024-06-28 15:01:40 -04:00
j pradhan	5f21eab491	community:perplexity[patch]: standardize init args (#21794 ) updated request_timeout default alias value per related docstring. Related to [20085](https://github.com/langchain-ai/langchain/issues/20085) Thank you for contributing to LangChain! --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-28 13:26:12 +00:00
mackong	11483b0fb8	community[patch]: set tool name for tongyi&qianfan llm (#22889 ) - Description: The name of a ToolMessage defaults to None, so the tool message sent to the LLM looks like ```json {"role": "tool", "tool_call_id": "", "content": "{\"time\": \"12:12\"}", "name": null} ``` But the name seems essential for some LLMs like TongYi Qwen, so we need to set the name using agent_action's tool value. - Issue: N/A - Dependencies: N/A	2024-06-28 09:17:05 -04:00
Leonid Ganeline	e4caa41aa9	community: docstrings `toolkits` (#23616 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-06-28 08:40:52 -04:00
ccurme	adf2dc13de	community: fix lint (#23611 )	2024-06-27 22:12:16 +00:00
Leonid Ganeline	75a44fe951	core: `chat_*` docstrings (#23412 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-06-27 17:29:38 -04:00
Bagatur	3b1fcb2a65	chroma[patch]: Release 0.1.2 (#23604 )	2024-06-27 13:58:24 -07:00
Eugene Yurtsev	68f348357e	community[patch]: Test InMemoryVectorStore with RWAPI test suite (#23603 ) Add standard test suite to InMemoryVectorStore implementation.	2024-06-27 16:43:43 -04:00
Eugene Yurtsev	da7beb1c38	core[patch]: Add unit test when catching generator exit (#23402 ) This pr adds a unit test for: https://github.com/langchain-ai/langchain/pull/22662 And narrows the scope where the exception is caught.	2024-06-27 20:36:07 +00:00
NG Sai Prasanth	5e6d23f27d	community: Standardise tool import for arxiv & semantic scholar (#23578 ) - Description: Fixing the way users have to import Arxiv and Semantic Scholar - Issue: Changed to use `from langchain_community.tools.arxiv import ArxivQueryRun` instead of `from langchain_community.tools.arxiv.tool import ArxivQueryRun` - Dependencies: None - Twitter handle: Nope	2024-06-27 16:35:50 -04:00
ccurme	d04f657424	langchain[patch]: deprecate ConversationChain (#23504 ) Would like some feedback on how to best incorporate legacy memory objects into `RunnableWithMessageHistory`.	2024-06-27 16:32:44 -04:00
Ayo Ayibiowu	c6f700b7cb	fix(community): allow support for disabling max_tokens args (#21534 ) This PR fixes an issue where unlimited/infinite tokens from the respective provider could not be used with the LiteLLM provider. This is a problem in an agent environment, where token usage can increase drastically beyond the initial value set, causing unexpected behavior.	2024-06-27 16:28:59 -04:00
Leonid Ganeline	c0fdbaac85	langchain: docstrings in `agents` root (#23561 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-06-27 15:52:18 -04:00
Leonid Ganeline	b64c4b4750	langchain: docstrings `agents` nested (#23598 ) Added missed docstrings. Formatted docstrings to the consistent form. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-27 19:49:41 +00:00
mackong	70834cd741	community[patch]: support convert FunctionMessage for Tongyi (#23569 ) Description: For a function-call agent with Tongyi, the AgentAction will be converted to a FunctionMessage by `47f69fe0d8/libs/core/langchain_core/agents.py (L188)` But Tongyi's convert_message_to_dict currently doesn't support FunctionMessage `47f69fe0d8/libs/community/langchain_community/chat_models/tongyi.py (L184-L207)` so the next round of conversation fails with a TypeError exception. This patch adds support for converting FunctionMessage for Tongyi. Issue: N/A Dependencies: N/A	2024-06-27 15:49:26 -04:00
Bagatur	d45ece0e58	chroma[patch]: loosen py req (#23599 ) currently causes issues if you try adding to a project that supports py<4	2024-06-27 12:40:59 -07:00
Mohammad Mohtashim	4796b7eb15	[Community [HuggingFace]]: Small Fix for ChatHuggingFace. (#22925 ) - Description: A small fix where I moved the `available_endpoints` in order to avoid the token error in the issue below. I have also added a conftest file and updated the `scipy`, `numpy` versions to support newer Python versions in the poetry files. - Issue: #22804 --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-27 19:37:20 +00:00
ccurme	bffc3c24a0	openai[patch]: release 0.1.11 (#23596 )	2024-06-27 18:48:40 +00:00
ccurme	a1520357c8	openai[patch]: revert addition of "name" to supported properties for tool messages (#23600 )	2024-06-27 18:40:04 +00:00
joshc-ai21	16a293cc3a	Small bug fixes (#23353 ) Small bug fixes according to your comments --------- Signed-off-by: Joffref <mariusjoffre@gmail.com> Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Baskar Gopinath <73015364+baskargopinath@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com> Co-authored-by: Mathis Joffre <51022808+Joffref@users.noreply.github.com> Co-authored-by: Baur <baur.krykpayev@gmail.com> Co-authored-by: Nuradil <nuradil.maksut@icloud.com> Co-authored-by: Nuradil <133880216+yaksh0nti@users.noreply.github.com> Co-authored-by: Jacob Lee <jacoblee93@gmail.com> Co-authored-by: Rave Harpaz <rave.harpaz@oracle.com> Co-authored-by: RHARPAZ <RHARPAZ@RHARPAZ-5750.us.oracle.com> Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com> Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com> Co-authored-by: RUO <61719257+comsa33@users.noreply.github.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Luis Rueda <userlerueda@gmail.com> Co-authored-by: Jib <Jibzade@gmail.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev> Co-authored-by: S M Zia Ur Rashid <smziaurrashid@gmail.com> Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com> Co-authored-by: yuncliu <lyc1990@qq.com> Co-authored-by: wenngong <76683249+wenngong@users.noreply.github.com> Co-authored-by: gongwn1 <gongwn1@lenovo.com> Co-authored-by: Mirna Wong <89008547+mirnawong1@users.noreply.github.com> Co-authored-by: Rahul Triptahi <rahul.psit.ec@gmail.com> Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: maang-h <55082429+maang-h@users.noreply.github.com> Co-authored-by: asafg <asafg@ai21.com> Co-authored-by: Asaf Joseph Gardin <39553475+Josephasafg@users.noreply.github.com>	2024-06-27 17:58:22 +00:00
ccurme	5536420bee	openai[patch]: add comment (#23595 ) Forgot to push this to https://github.com/langchain-ai/langchain/pull/23551	2024-06-27 16:47:14 +00:00
andrewmjc	9f0f3c7e29	partners[openai]: Add name field to tool message to match OpenAI spec (#23551 ) Discovered alongside @t968914 - Description: According to OpenAI docs, tool messages (response from calling tools) must have a 'name' field. https://cookbook.openai.com/examples/how_to_call_functions_with_chat_models - Issue: N/A (as of right now) - Dependencies: N/A - Twitter handle: N/A	2024-06-27 12:42:36 -04:00
Krista Pratico	85e36b0f50	partners[openai]: only add stream_options to kwargs if requested (#23552 ) - Description: This PR https://github.com/langchain-ai/langchain/pull/22854 added the ability to pass `stream_options` through to the openai service to get token usage information in the response. Currently OpenAI supports this parameter, but Azure OpenAI does not yet. For users who proxy their calls to both services through ChatOpenAI, this breaks when targeting Azure OpenAI (see related discussion opened in openai-python: https://github.com/openai/openai-python/issues/1469#issuecomment-2192658630). > Error code: 400 - {'error': {'code': None, 'message': 'Unrecognized request argument supplied: stream_options', 'param': None, 'type': 'invalid_request_error'}} This PR fixes the issue by only adding `stream_options` to the request if it's actually requested by the user (i.e. set to True). If I'm not mistaken, we have a test case that already covers this scenario: https://github.com/langchain-ai/langchain/blob/master/libs/partners/openai/tests/integration_tests/chat_models/test_base.py#L398-L399 - Issue: Issue opened in openai-python: https://github.com/openai/openai-python/issues/1469 - Dependencies: N/A - Twitter handle: N/A --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-06-27 12:23:05 -04:00
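The fix described in the commit above (only send `stream_options` when the user asked for it) can be sketched roughly as follows. The function name `build_request_kwargs` and the `stream_usage` flag are hypothetical names for this illustration and are not taken from the PR; only the `stream_options` request argument itself comes from the commit message.

```python
from typing import Any, Dict, Optional


def build_request_kwargs(
    base: Dict[str, Any], stream_usage: Optional[bool] = None
) -> Dict[str, Any]:
    """Return request kwargs, adding stream_options only on request.

    Hypothetical helper: backends that reject unknown arguments (the
    commit cites Azure OpenAI returning a 400) never see the key unless
    the caller explicitly opted in.
    """
    kwargs = dict(base)  # copy so the caller's dict is not mutated
    if stream_usage:
        kwargs["stream_options"] = {"include_usage": True}
    return kwargs


default_kwargs = build_request_kwargs({"model": "some-model", "stream": True})
opted_in_kwargs = build_request_kwargs(
    {"model": "some-model", "stream": True}, stream_usage=True
)
```

Leaving the key out entirely, rather than sending it with a false value, is what keeps the request valid for services that do not recognize the argument.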
Eugene Yurtsev	96b72edac8	core[minor]: Add optional ID field to Document schema (#23411 ) This PR adds an optional ID field to the document schema. # 1. Optional or Required - An optional field will require additional checking for the type in user code (annoying). - However, vectorstores currently don't respect this field. So if we make it required and start returning random UUIDs that might be even more confusing to users. Proposal: Start with Optional and convert to Required (with default set to uuid4()) in 1-2 major releases. # 2. Override __str__ or generic solution in prompts Overriding __str__ as a simple way to avoid changing user code that relies on default str(document) in prompts. I considered rolling out a more general solution in prompts (https://github.com/langchain-ai/langchain/pull/8685), but to do that we need to: 1. Make things serializable 2. The more general solution would likely need to be backwards compatible as well 3. It's unclear that one wants to format a List[int] in the same way as List[Document]. The former should be `,` separated (likely), the latter should be `---` separated (likely). Proposal: Start with __str__ override and focus on the vectorstore APIs, we generalize prompts later	2024-06-27 12:15:58 -04:00
ccurme	5bfcb898ad	openai[patch]: bump sdk version (#23592 ) Tests failing with `TypeError: Completions.create() got an unexpected keyword argument 'parallel_tool_calls'`	2024-06-27 11:57:24 -04:00
Jacob Lee	60fc15a56b	docs[patch]: Update docs introduction and README (#23558 ) CC @hwchase17 @baskaryan	2024-06-27 08:51:43 -07:00
mackong	daf733b52e	langchain[minor]: fix comment typo (#23564 ) Description: fix typo of comment Issue: N/A Dependencies: N/A	2024-06-27 10:09:18 -04:00
Leonid Ganeline	2c9b84c3a8	core[patch]: docstrings `agents` (#23502 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-06-26 17:50:48 -04:00
Leonid Ganeline	2a5d59b3d7	core[patch]: `callbacks` docstrings (#23375 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-06-26 17:11:06 -04:00
Leonid Ganeline	1141b08eb8	core: docstrings `example_selectors` (#23542 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-06-26 17:10:40 -04:00
wenngong	3bf1d98dbf	langchain[patch]: update agent and chains modules root_validators (#23256 ) Description: update agent and chains modules Pydantic root_validators. Issue: the issue #22819 --------- Co-authored-by: gongwn1 <gongwn1@lenovo.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-06-26 17:09:50 -04:00
Bagatur	a7ab93479b	anthropic[patch]: Release 0.1.16 (#23549 )	2024-06-26 20:49:13 +00:00
Jib	c0fcf76e93	LangChain-MongoDB: [Experimental] Driver-side index creation helper (#19359 ) ## Description Created a helper method to make vector search indexes via client-side pymongo. Recent Update -- Removed error suppressing/overwriting layer in favor of letting the original exception provide information. ## ToDo's - [x] Make _wait_untils for integration test delete index functionalities. - [x] Add documentation for its use. Highlight it's experimental - [x] Post Integration Test Results in a screenshot - [x] Get review from MongoDB internal team (@shaneharvey, @blink1073 , @NoahStapp , @caseyclements) - [x] Add tests and docs: If you're adding a new integration, please include 1. Added new integration tests. Not eligible for unit testing since the operation is Atlas Cloud specific. 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. ![image](https://github.com/langchain-ai/langchain/assets/2887713/a3fc8ee1-e04c-4976-accc-fea0eeae028a) - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	2024-06-26 15:07:28 -04:00
maang-h	5070004e8a	docs: Update Tongyi ChatModel docstring (#23540 ) - Description: Update Tongyi ChatModel rich docstring - Issue: the issue #22296	2024-06-26 13:07:13 -04:00
yonarw	6d0ebbca1e	community: SAP HANA Vector Engine fix for latest HANA release (#23516 ) - Description: This PR fixes an issue with SAP HANA Cloud QRC03 version. In that version the number to indicate no length being set for a vector column changed from -1 to 0. The change in this PR support both behaviours (old/new). - Dependencies: No dependencies have been introduced. - Tests: The change is covered by previous unit tests.	2024-06-26 13:15:51 +00:00
Roman Solomatin	1e3e05b0c3	openai[patch]: add support for extra_body (#23404 ) Description: Add support for passing the extra_body parameter. Some OpenAI-compatible APIs have additional parameters (for example [vLLM](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#extra-parameters)) that can be passed through `extra_body`. Same question in https://github.com/openai/openai-python/issues/767	2024-06-26 13:11:59 +00:00
Alireza Kashani	c39521b70d	Update grobid.py (#23399 ) fixed potential `IndexError: list index out of range` in case there is no title	2024-06-26 09:11:02 -04:00
Qingchuan Hao	ee282a1d2e	community: add missing link (#23526 )	2024-06-26 09:06:28 -04:00
Lincoln Stein	c314222796	Add a conversation memory that combines a (optionally persistent) vectorstore history with a token buffer (#22155 ) langchain: ConversationVectorStoreTokenBufferMemory -Description: This PR adds ConversationVectorStoreTokenBufferMemory. It is similar in concept to ConversationSummaryBufferMemory. It maintains an in-memory buffer of messages up to a preset token limit. After the limit is hit timestamped messages are written into a vectorstore retriever rather than into a summary. The user's prompt is then used to retrieve relevant fragments of the previous conversation. By persisting the vectorstore, one can maintain memory from session to session. -Issue: n/a -Dependencies: none -Twitter handle: Please no!!! - [X] Add tests and docs: I looked to see how the unit tests were written for the other ConversationMemory modules, but couldn't find anything other than a test for successful import. I need to know whether you are using pytest.mock or another fixture to simulate the LLM and vectorstore. In addition, I would like guidance on where to place the documentation. Should it be a notebook file in docs/docs? - [X] Lint and test: I am seeing some linting errors from a couple of modules unrelated to this PR. --------- Co-authored-by: Lincoln Stein <lstein@gmail.com> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-06-25 20:17:10 -07:00
Bagatur	32f8f39974	core[patch]: use args_schema doc for tool description (#23503 )	2024-06-25 15:26:35 -07:00
ccurme	6f7fe82830	text-splitters: release 0.2.2 (#23508 )	2024-06-25 18:26:05 -04:00
ccurme	62b16fcc6b	experimental: release 0.0.62 (#23507 )	2024-06-25 22:01:35 +00:00
ccurme	99ce84ef23	community: release 0.2.6 (#23501 )	2024-06-25 21:29:52 +00:00
ccurme	03c41e725e	langchain: release 0.2.6 (#23426 )	2024-06-25 21:03:41 +00:00
ccurme	86ca44d451	core: release 0.2.10 (#23420 )	2024-06-25 16:26:31 -04:00
Isaac Francisco	85f5d14cef	[docs]: split up tool docs (#22919 )	2024-06-25 13:15:08 -07:00
Nuradil	c93d9e66e4	Community: Update and fix ZenGuardTool docs and add ZenguardTool to init files (#23415 ) Thank you for contributing to LangChain! - [x] PR title: "community: update docs and add tool to init.py" - [x] PR message: - Description: Fixed some errors and comments in the docs and added our ZenGuardTool and additional classes to init.py for easy access when importing - Question: when will you update the langchain-community package in pypi to make our tool available? - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Thank you for review! --------- Co-authored-by: Baur <baur.krykpayev@gmail.com>	2024-06-25 19:26:32 +00:00
William FH	8955bc1866	[Core] Logging: Suppress missing parent warning (#23363 )	2024-06-25 14:57:23 -04:00
ccurme	730c551819	core[patch]: export tool output parsers from langchain_core.output_parsers (#23305 ) These currently read off AIMessage.tool_calls, and only fall back to OpenAI parsing if tool calls aren't populated. Importing these from `openai_tools` (e.g., in our [tool calling docs](https://python.langchain.com/v0.2/docs/how_to/tool_calling/#tool-calls)) can lead to confusion. After landing, would need to release core and update docs.	2024-06-25 14:40:42 -04:00
Eugene Yurtsev	7e9e69c758	core[patch]: Add unit test for str and repr for Document (#23414 )	2024-06-25 18:28:21 +00:00
Bagatur	92ac0fc9bd	openai[patch]: Release 0.1.10 (#23410 )	2024-06-25 17:40:02 +00:00
Bagatur	9d145b9630	openai[patch]: fix tool calling token counting (#23408 ) Resolves https://github.com/langchain-ai/langchain/issues/23388	2024-06-25 10:34:25 -07:00
Tomaz Bratanic	22fa32e164	LLM Graph transformer dealing with empty strings (#23368 ) Pydantic allows empty strings: ``` from langchain.pydantic_v1 import Field, BaseModel class Property(BaseModel): """A single property consisting of key and value""" key: str = Field(..., description="key") value: str = Field(..., description="value") x = Property(key="", value="") ``` Which can produce errors downstream. We simply ignore those records	2024-06-25 13:01:53 -04:00
Riccardo Schirone	4530d851e4	Merge pull request #22662 * core: runnables: special handling GeneratorExit because no error	2024-06-25 08:42:03 -04:00
Qingchuan Hao	ad50702934	community: add default value to bing_search_url (#23306 ) bing_search_url is the endpoint used to request the Bing search resource and normally does not vary between users, so we can give it a default value to simplify usage of this utility/tool	2024-06-25 08:08:41 -04:00
ccurme	68e0ae3286	langchain[patch]: update removal target for LLMChain (#23373 ) to 1.0 Also improve replacement example in docstring.	2024-06-24 21:51:29 +00:00
wenngong	b33d2346db	community: FlashrankRerank support loading customer client (#23350 ) Description: FlashrankRerank Document compressor support loading customer client Issue: #23338 Co-authored-by: gongwn1 <gongwn1@lenovo.com>	2024-06-24 17:50:08 -04:00
maang-h	f58c40b4e3	docs: Update QianfanChatEndpoint ChatModel docstring (#23337 ) - Description: Update QianfanChatEndpoint ChatModel rich docstring - Issue: the issue #22296	2024-06-24 17:42:46 -04:00
Rahul Triptahi	9ef93ecd7c	community[minor]: Added classification_location parameter in PebbloSafeLoader. (#22565 ) Description: Add classifier_location feature flag. This flag enables Pebblo to decide the classifier location, local or pebblo-cloud. Unit Tests: N/A Documentation: N/A --------- Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>	2024-06-24 17:30:38 -04:00
wenngong	af620db9c7	partners: add lint docstrings for azure-dynamic-sessions/together modules (#23303 ) Description: add lint docstrings for azure-dynamic-sessions/together modules Issue: #23188 @baskaryan test: ruff check passed. <img width="782" alt="image" src="https://github.com/langchain-ai/langchain/assets/76683249/bf11783d-65b3-4e56-a563-255eae89a3e4"> --------- Co-authored-by: gongwn1 <gongwn1@lenovo.com>	2024-06-24 16:26:54 -04:00
yuncliu	398b2b9c51	community[minor]: Add Ascend NPU optimized Embeddings (#20260 ) - Description: Add NPU support for embeddings --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-24 20:15:11 +00:00
Luis Rueda	168e9ed3a5	partners: add custom options to MongoDBChatMessageHistory (#22944 ) Description: Adds options for configuring MongoDBChatMessageHistory (no breaking changes): - session_id_key: name of the field that stores the session id - history_key: name of the field that stores the chat history - create_index: whether to create an index on the session id field - index_kwargs: additional keyword arguments to pass to the index creation Discussion: https://github.com/langchain-ai/langchain/discussions/22918 Twitter handle: @userlerueda --------- Co-authored-by: Jib <Jibzade@gmail.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-06-24 19:42:56 +00:00
Eugene Yurtsev	1e750f12f6	standard-tests[minor]: Add standard read write test suite for vectorstores (#23355 ) Add standard read write test suite for vectorstores	2024-06-24 19:40:56 +00:00
Eugene Yurtsev	3b3ed72d35	standard-tests[minor]: Add standard tests for BaseStore (#23360 ) Add standard tests to base store abstraction. These only work on [str, str] right now. We'll need to check if it's possible to add encoder/decoders to generalize	2024-06-24 19:38:50 +00:00
ccurme	e1190c8f3c	mongodb[patch]: fix CI for python 3.12 (#23369 )	2024-06-24 19:31:20 +00:00
RUO	2b87e330b0	community: fix issue with nested field extraction in MongodbLoader (#22801 ) Description: This PR addresses an issue in the `MongodbLoader` where nested fields were not being correctly extracted. The loader now correctly handles nested fields specified in the `field_names` parameter. Issue: Fixes an issue where attempting to extract nested fields from MongoDB documents resulted in `KeyError`. Dependencies: No new dependencies are required for this change. Twitter handle: (Optional, your Twitter handle if you'd like a mention when the PR is announced) ### Changes 1. Field Name Parsing: - Added logic to parse nested field names and safely extract their values from the MongoDB documents. 2. Projection Construction: - Updated the projection dictionary to include nested fields correctly. 3. Field Extraction: - Updated the `aload` method to handle nested field extraction using a recursive approach to traverse the nested dictionaries. ### Example Usage Updated usage example to demonstrate how to specify nested fields in the `field_names` parameter: ```python loader = MongodbLoader( connection_string=MONGO_URI, db_name=MONGO_DB, collection_name=MONGO_COLLECTION, filter_criteria={"data.job.company.industry_name": "IT", "data.job.detail": { "$exists": True }}, field_names=[ "data.job.detail.id", "data.job.detail.position", "data.job.detail.intro", "data.job.detail.main_tasks", "data.job.detail.requirements", "data.job.detail.preferred_points", "data.job.detail.benefits", ], ) docs = loader.load() print(len(docs)) for doc in docs: print(doc.page_content) ``` ### Testing Tested with a MongoDB collection containing nested documents to ensure that the nested fields are correctly extracted and concatenated into a single page_content string. ### Note This change ensures backward compatibility for non-nested fields and improves functionality for nested field extraction. 
### Output Sample ```python print(docs[:3]) ``` ```shell # output sample: [ Document( # Here in this example, page_content is the combined text from the fields below # "position", "intro", "main_tasks", "requirements", "preferred_points", "benefits" page_content='all combined contents from the requested fields in the document', metadata={'database': 'Your Database name', 'collection': 'Your Collection name'} ), ... ] ``` --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-24 19:29:11 +00:00
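The nested field extraction described in the MongodbLoader commit above can be sketched as a small recursive lookup over dotted field names. This is an illustrative sketch only: `extract_nested` is a hypothetical helper, not the loader's actual code, and returning a default for missing keys (instead of raising `KeyError`) is an assumption about the intended behavior.

```python
from typing import Any, Dict, List


def extract_nested(doc: Any, keys: List[str], default: Any = None) -> Any:
    """Recursively walk nested dicts following the given list of keys."""
    if not keys:
        return doc  # every key consumed: this is the requested value
    if not isinstance(doc, dict) or keys[0] not in doc:
        return default  # missing path: fall back instead of KeyError
    return extract_nested(doc[keys[0]], keys[1:], default)


# Sample document shaped like the field names in the commit's example.
sample: Dict[str, Any] = {
    "data": {"job": {"detail": {"id": 7, "position": "Backend Engineer"}}}
}
field_names = ["data.job.detail.id", "data.job.detail.position", "data.job.detail.intro"]
values = [extract_nested(sample, name.split("."), default="") for name in field_names]
page_content = " ".join(str(v) for v in values if v != "")
```

Splitting on `.` once per field name keeps the traversal independent of how deeply the requested fields are nested.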
Tomaz Bratanic	aeeda370aa	Sanitize backticks from neo4j labels and types for import (#23367 )	2024-06-24 19:05:31 +00:00
Rave Harpaz	f5ff7f178b	Add OCI Generative AI new model support (#22880 ) - [x] PR title: community: Add OCI Generative AI new model support - [x] PR message: - Description: adding support for new models offered by OCI Generative AI services. This is a moderate update of our initial integration PR 16548 and includes a new integration for our chat models under /langchain_community/chat_models/oci_generative_ai.py - Issue: NA - Dependencies: No new Dependencies, just latest version of our OCI sdk - Twitter handle: NA - [x] Add tests and docs: 1. we have updated our unit tests 2. we have updated our documentation including a new ipynb for our new chat integration - [x] Lint and test: `make format`, `make lint`, and `make test` run successfully --------- Co-authored-by: RHARPAZ <RHARPAZ@RHARPAZ-5750.us.oracle.com> Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com>	2024-06-24 14:48:23 -04:00
Baur	aa358f2be4	community: Add ZenGuard tool (#22959 ) Description This is the community integration of ZenGuard AI - the fastest guardrails for GenAI applications. ZenGuard AI protects against: - Prompts Attacks - Veering of the pre-defined topics - PII, sensitive info, and keywords leakage. - Toxicity - Etc. Twitter Handle : @zenguardai - [x] Add tests and docs: If you're adding a new integration, please include 1. Added an integration test 2. Added colab - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. --------- Co-authored-by: Nuradil <nuradil.maksut@icloud.com> Co-authored-by: Nuradil <133880216+yaksh0nti@users.noreply.github.com>	2024-06-24 17:40:56 +00:00
Mathis Joffre	60103fc4a5	community: Fix OVHcloud 401 Unauthorized on embedding. (#23260 ) They are now rejecting with code 401 calls from users with expired or invalid tokens (while before they were being considered anonymous). Thus, the authorization header has to be removed when there is no token. Related to: #23178 --------- Signed-off-by: Joffref <mariusjoffre@gmail.com>	2024-06-24 12:58:32 -04:00
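The header handling described in the OVHcloud commit above amounts to omitting the authorization header when no token is configured. A minimal sketch, assuming a bearer-token scheme; `build_headers` and its parameter name are hypothetical and not taken from the integration's code.

```python
from typing import Dict, Optional


def build_headers(access_token: Optional[str] = None) -> Dict[str, str]:
    """Build request headers, adding Authorization only when a token exists.

    Assumption: sending an empty or invalid Authorization header is what
    triggers the 401, so anonymous calls must leave the header out.
    """
    headers = {"content-type": "application/json"}
    if access_token:  # None and "" both count as "no token"
        headers["Authorization"] = f"Bearer {access_token}"
    return headers


anonymous_headers = build_headers()
authed_headers = build_headers("example-token")
```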
Eugene Yurtsev	d90379210a	standard-tests[minor]: Add standard tests for cache (#23357 ) Add standard tests for cache abstraction	2024-06-24 15:15:03 +00:00
Leonid Ganeline	987099cfcd	community: `toolkits` docstrings (#23286 ) Added missed docstrings. Formatted docstrings to the consistent form. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-22 14:37:52 +00:00
Rahul Triptahi	0cd3f93361	Enhance metadata of SharePointLoader. (#22248 ) Description: 2 feature flags added to SharePointLoader in this PR: 1. load_auth: if set to True, adds authorised identities to metadata 2. load_extended_metadata: adds source, owner and full_path to metadata Unit tests: N/A Documentation: To be done. --------- Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>	2024-06-21 17:03:38 -07:00
Bagatur	bcac6c3aff	openai[patch]: temp fix ignore lint (#23290 )	2024-06-21 16:52:52 -07:00
William FH	efb4c12abe	[Core] Add support for inferring Annotated types (#23284 ) in bind_tools() / convert_to_openai_function	2024-06-21 15:16:30 -07:00
Vadym Barda	9ac302cb97	core[minor]: update draw_mermaid node label processing (#23285 ) This fixes processing issue for nodes with numbers in their labels (e.g. `"node_1"`, which would previously be relabeled as `"node__"`, and now are correctly processed as `"node_1"`)	2024-06-21 21:35:32 +00:00
Rajendra Kadam	7ee2822ec2	community: Fix TypeError in PebbloRetrievalQA (#23170 ) Description: Fix "`TypeError: 'NoneType' object is not iterable`" when the auth_context is absent in PebbloRetrievalQA. The auth_context is optional; hence, PebbloRetrievalQA should work without it, but it throws an error at the moment. This PR fixes that issue. Issue: NA Dependencies: None Unit tests: NA --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-06-21 17:04:00 -04:00
Iurii Umnov	3b7b933aa2	community[minor]: OpenAPI agent. Add support for PUT, DELETE and PATCH (#22962 ) Description: Add PUT, DELETE and PATCH tools to tool list for OpenAPI agent if dangerous requests are allowed. Issue: https://github.com/langchain-ai/langchain/issues/20469	2024-06-21 20:44:23 +00:00
Guangdong Liu	3c42bf8d97	community(patch):Fix PineconeHybridSearchRetriever not having search_kwargs (#21577 ) - close #21521	2024-06-21 16:27:52 -04:00
Rahul Triptahi	4bb3d5c488	[community][quick-fix]: changed from blob.path to blob.path.name in O365BaseLoader. (#22287 ) Description: file_metadata_ was not getting propagated to returned documents. Changed the metadata_dict lookup key from blob.path to blob.path.name, the name of the blob's path. Documentation: N/A Unit tests: N/A Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-21 15:51:03 -04:00
Bagatur	f824f6d925	docs: fix merge message runs docstring (#23279 )	2024-06-21 19:50:50 +00:00
wenngong	f9aea3db07	partners: add lint docstrings for chroma module (#23249 ) Description: add lint docstrings for chroma module Issue: the issue #23188 @baskaryan test: ruff check passed. ![image](https://github.com/langchain-ai/langchain/assets/76683249/5e168a0c-32d0-464f-8ddb-110233918019) --------- Co-authored-by: gongwn1 <gongwn1@lenovo.com>	2024-06-21 19:49:24 +00:00
Bagatur	9eda8f2fe8	docs: fix trim_messages code blocks (#23271 )	2024-06-21 17:15:31 +00:00
Bagatur	4c97a9ee53	docs: fix message transformer docstrings (#23264 )	2024-06-21 16:10:03 +00:00
Vwake04	0deb98ac0c	pinecone: Fix multiprocessing issue in PineconeVectorStore (#22571 ) Description: Currently, the `langchain_pinecone` library forces the `async_req` (asynchronous required) argument to Pinecone to `True`. This design choice causes problems when deploying to environments that do not support multiprocessing, such as AWS Lambda. In such environments, this restriction can prevent users from successfully using `langchain_pinecone`. This PR introduces a change that allows users to specify whether they want to use asynchronous requests by passing the `async_req` parameter through `kwargs`. By doing so, users can set `async_req=False` to utilize synchronous processing, making the library compatible with AWS Lambda and other environments that do not support multiprocessing. Issue: This PR does not address a specific issue number but aims to resolve compatibility issues with AWS Lambda by allowing synchronous processing. Dependencies: None, that I'm aware of. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-06-21 15:46:01 +00:00
ccurme	75c7c3a1a7	openai: release 0.1.9 (#23263 )	2024-06-21 11:15:29 -04:00
Brace Sproul	abe7566d7d	core[minor]: BaseChatModel with_structured_output implementation (#22859 )	2024-06-21 08:14:03 -07:00
mackong	360a70c8a8	core[patch]: fix no current event loop for sql history in async mode (#22933 ) - Description: When using RunnableWithMessageHistory/SQLChatMessageHistory in async mode, we get the following error: ``` Error in RootListenersTracer.on_chain_end callback: RuntimeError("There is no current event loop in thread 'asyncio_3'.") ``` which is thrown by `ddfbca38df/libs/community/langchain_community/chat_message_histories/sql.py (L259)`, and no message history is added to the database. This patch adds a new _aexit_history function, which is called in async mode and in turn calls aadd_messages. We use the `afunc` attribute of a Runnable to check whether the end listener should be run in async mode. - Issue: #22021, #22022 - Dependencies: N/A	2024-06-21 10:39:47 -04:00
Philippe PRADOS	1c2b9cc9ab	core[minor]: Update pgvector translator for langchain_postgres (#23217 ) The SelfQuery PGVectorTranslator is not correct. The operator is "eq" and not "$eq". This patch uses a new version of PGVectorTranslator from langchain_postgres. It's necessary to release a new version of langchain_postgres (see [here](https://github.com/langchain-ai/langchain-postgres/pull/75)) before accepting this PR in langchain.	2024-06-21 10:37:09 -04:00
Mu Yang	401d469a92	langchain: fix syntax warning in create_json_chat_agent (#23253 ) fix syntax warning in `create_json_chat_agent` ``` .../langchain/agents/json_chat/base.py:22: SyntaxWarning: invalid escape sequence '\ ' """Create an agent that uses JSON to format its logic, build for Chat Models. ```	2024-06-21 10:05:38 -04:00
mackong	b108b4d010	core[patch]: set schema format for AsyncRootListenersTracer (#23214 ) - Description: AsyncRootListenersTracer supports on_chat_model_start, so its schema_format should be "original+chat". - Issue: N/A - Dependencies:	2024-06-21 09:30:27 -04:00
Bagatur	976b456619	docs: BaseChatModel key methods table (#23238 ) If we're moving documenting inherited params think these kinds of tables become more important ![Screenshot 2024-06-20 at 3 59 12 PM](https://github.com/langchain-ai/langchain/assets/22008038/722266eb-2353-4e85-8fae-76b19bd333e0)	2024-06-20 21:00:22 -07:00
ccurme	a7b4175091	standard tests: add test for tool calling (#23234 ) Including streaming	2024-06-20 17:20:11 -04:00
Bagatur	12e0c28a6e	docs: fix chat model methods table (#23233 ) rst table not md ![Screenshot 2024-06-20 at 12 37 46 PM](https://github.com/langchain-ai/langchain/assets/22008038/7a03b869-c1f4-45d0-8d27-3e16f4c6eb19)	2024-06-20 19:51:10 +00:00
Zheng Robert Jia	a349fce880	docs[minor],community[patch]: Minor tutorial docs improvement, minor import error quick fix. (#22725 ) minor changes to module import error handling and minor issues in tutorial documents. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-06-20 15:36:49 -04:00
Eugene Yurtsev	7545b1d29b	core[patch]: Fix doc-strings for code blocks (#23232 ) Code blocks need extra space around them to be rendered properly by sphinx	2024-06-20 19:34:52 +00:00
Luis Moros	d5be160af0	community[patch]: Fix sql_database.from_databricks issue when run from Job (#23224 ) Description: When ``sql_database.from_databricks`` is executed from a Workflow Job, the ``context`` object does not have a "browserHostName" property, resulting in an error. This change handles the error so that the "DATABRICKS_HOST" env variable value is used instead of stopping the flow Co-authored-by: lmorosdb <lmorosdb>	2024-06-20 19:34:15 +00:00
Cory Waddingham	cd6812342e	pinecone[patch]: Update Poetry requirements for pinecone-client >=3.2.2 (#22094 ) This change updates the requirements in `libs/partners/pinecone/pyproject.toml` to allow all versions of `pinecone-client` greater than or equal to 3.2.2. This change resolves issue [21955](https://github.com/langchain-ai/langchain/issues/21955). --------- Co-authored-by: Erick Friis <erickfriis@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-20 18:59:36 +00:00
Eugene Yurtsev	59d7adff8f	core[patch]: Add clarification about streaming to RunnableLambda (#23227 ) Add streaming clarification to runnable lambda docstring.	2024-06-20 16:47:16 +00:00
maang-h	bc4cd9c5cc	community[patch]: Update root_validators ChatModels: ChatBaichuan, QianfanChatEndpoint, MiniMaxChat, ChatSparkLLM, ChatZhipuAI (#22853 ) This PR updates root validators for: - ChatModels: ChatBaichuan, QianfanChatEndpoint, MiniMaxChat, ChatSparkLLM, ChatZhipuAI Issues #22819 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-20 16:36:41 +00:00
ChrisDEV	cb6cf4b631	Fix return value type of dumpd (#20123 ) The return type of `json.loads` is `Any`. In fact, the return type of `dumpd` must be based on `json.loads`, so the correction here is understandable. Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-20 16:31:41 +00:00
Guangdong Liu	0bce28cd30	core(patch): Fix encoding problem of load_prompt method (#21559 ) - description: Add encoding parameters. - @baskaryan, @efriis, @eyurtsev, @hwchase17. ![54d25ac7b1d5c2e47741a56fe8ed8ba](https://github.com/langchain-ai/langchain/assets/48236177/ffea9596-2001-4e19-b245-f8a6e231b9f9)	2024-06-20 09:25:54 -07:00
Philippe PRADOS	8711c61298	core[minor]: Adds an in-memory implementation of RecordManager (#13200 ) Description: langchain offers three technologies to save data: - [vectorstore](https://python.langchain.com/docs/modules/data_connection/vectorstores/) - [docstore](https://js.langchain.com/docs/api/schema/classes/Docstore) - [record manager](https://python.langchain.com/docs/modules/data_connection/indexing) If you want to combine these technologies in a sample persistence strategy you need a common implementation for each. `DocStore` provides `InMemoryDocstore`. We propose the class `MemoryRecordManager` to complete the system. This is the prelude to another full-request, which needs a consistent combination of persistence components. Tag maintainer: @baskaryan Twitter handle: @pprados --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-20 12:19:10 -04:00
Leonid Ganeline	51e75cf59d	community: docstrings (#23202 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference)	2024-06-20 11:08:13 -04:00
Julian Weng	6a1a0d977a	partners[minor]: Fix value error message for with_structured_output (#22877 ) Currently, calling `with_structured_output()` with an invalid method argument raises `Unrecognized method argument. Expected one of 'function_calling' or 'json_format'`, but the JSON mode option [is now referred to](https://python.langchain.com/v0.2/docs/how_to/structured_output/#the-with_structured_output-method) by `'json_mode'`. This fixes that. Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-20 15:03:21 +00:00
Leonid Ganeline	41f7620989	huggingface: docstrings (#23148 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference) Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-20 13:22:40 +00:00
ccurme	066a5a209f	huggingface[patch]: fix CI for python 3.12 (#23197 )	2024-06-20 09:17:26 -04:00
xyd	9b3a025f9c	fix https://github.com/langchain-ai/langchain/issues/23215 (#23216 ) fix bug The ZhipuAIEmbeddings class is not working. Co-authored-by: xu yandong <shaonian@acsx1.onexmail.com>	2024-06-20 13:04:50 +00:00
Bagatur	ad7f2ec67d	standard-tests[patch]: test stop not stop_sequences (#23200 )	2024-06-19 18:07:33 -07:00
David DeCaprio	a4bcb45f65	core:Add optional max_messages to MessagePlaceholder (#16098 ) - Description: Add optional max_messages to MessagePlaceholder - Issue: [16096](https://github.com/langchain-ai/langchain/issues/16096) - Dependencies: None - Twitter handle: @davedecaprio Sometimes it's better to limit the history in the prompt itself rather than the memory. This is needed if you want different prompts in the chain to have different history lengths. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-19 23:39:51 +00:00
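To illustrate the idea in a4bcb45f65 above of capping history at the prompt rather than in memory, here is a minimal sketch. The function `limit_history`, the tuple message type, and the choice to keep the most recent messages are all assumptions for the example; this is not the `MessagePlaceholder` implementation.

```python
from typing import List, Optional, Sequence, Tuple

# (role, content); a simplified stand-in for real message objects.
Message = Tuple[str, str]


def limit_history(
    messages: Sequence[Message], max_messages: Optional[int] = None
) -> List[Message]:
    """Return at most `max_messages` of the most recent messages.

    Hypothetical helper: assumes the newest messages are the ones to keep.
    """
    if max_messages is None:
        return list(messages)
    # Guard against zero or negative limits: messages[-0:] would
    # return the whole sequence rather than an empty one.
    if max_messages <= 0:
        return []
    return list(messages[-max_messages:])
```

Because the limit is applied where the prompt is built, two prompts in the same chain could call this with different `max_messages` values over the same stored history.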
shaunakgodbole	7193634ae6	fireworks[patch]: fix api_key alias in Fireworks LLM (#23118 ) Thank you for contributing to LangChain! Description The current code snippet for `Fireworks` had incorrect parameters. This PR fixes those parameters. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-19 21:14:42 +00:00
Eugene Yurtsev	1fcf875fe3	core[patch]: Document agent schema (#23194 ) * Document agent schema * Refer folks to langgraph for more information on how to create agents.	2024-06-19 20:16:57 +00:00
Eugene Yurtsev	c2d43544cc	core[patch]: Document messages namespace (#23154 ) - Moved doc-strings below attributes in TypedDicts -- seems to render better on APIReference pages. * Provided more description and some simple code examples	2024-06-19 15:00:00 -04:00
Eugene Yurtsev	3c917204dc	core[patch]: Add doc-strings to outputs, fix @root_validator (#23190 ) - Document outputs namespace - Update a vanilla @root_validator that was missed	2024-06-19 14:59:06 -04:00
Bagatur	8698cb9b28	infra: add more formatter rules to openai (#23189 ) Turns on https://docs.astral.sh/ruff/settings/#format_docstring-code-format and https://docs.astral.sh/ruff/settings/#format_skip-magic-trailing-comma ```toml [tool.ruff.format] docstring-code-format = true skip-magic-trailing-comma = true ```	2024-06-19 11:39:58 -07:00
Michał Krassowski	710197e18c	community[patch]: restore compatibility with SQLAlchemy 1.x (#22546 ) - Description: Restores compatibility with SQLAlchemy 1.4.x that was broken since #18992 and adds a test run for this version on CI (only for Python 3.11) - Issue: fixes #19681 - Dependencies: None - Twitter handle: `@krassowski_m` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-06-19 17:58:57 +00:00
Erick Friis	48d6ea427f	upstage: move to external repo (#22506 )	2024-06-19 17:56:07 +00:00
Bagatur	0a4ee864e9	openai[patch]: image token counting (#23147 ) Resolves #23000 --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-19 10:41:47 -07:00
Jorge Piedrahita Ortiz	b3e53ffca0	community[patch]: sambanova llm integration improvement (#23137 ) - Description: sambanova sambaverse integration improvement: removed the input parsing that altered raw user input and made setting the process prompt parameter to true mandatory	2024-06-19 10:30:14 -07:00
Jorge Piedrahita Ortiz	e162893d7f	community[patch]: update sambastudio embeddings (#23133 ) Description: update sambastudio embeddings integration, now compatible with generic endpoints and CoE endpoints	2024-06-19 10:26:56 -07:00
Philippe PRADOS	db6f46c1a6	langchain[small]: Change type to BasePromptTemplate (#23083 ) ```python Change from_llm( prompt: PromptTemplate ... ) ``` to ```python Change from_llm( prompt: BasePromptTemplate ... ) ```	2024-06-19 13:19:36 -04:00
Sergey Kozlov	94452a94b1	core[patch]: add exceptions propagation test for astream_events v2 (#23159 ) Description: `astream_events(version="v2")` didn't propagate exceptions in `langchain-core<=0.2.6`; this was fixed in #22916. This PR adds a unit test to check that exceptions are propagated upwards. Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>	2024-06-19 13:00:25 -04:00
Leonid Ganeline	50484be330	prompty: docstring (#23152 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference) --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-19 12:50:58 -04:00
chenxi	505a2e8743	fix: MoonshotChat fails when setting the moonshot_api_key through the OS environment. (#23176 ) Close #23174 Co-authored-by: tianming <tianming@bytenew.com>	2024-06-19 16:28:24 +00:00
Bagatur	677408bfc9	core[patch]: fix chat history circular import (#23182 )	2024-06-19 09:08:36 -07:00
Eugene Yurtsev	883e90d06e	core[patch]: Add an example to the Document schema doc-string (#23131 ) Add an example to the document schema	2024-06-19 11:35:30 -04:00
ccurme	2b08e9e265	core[patch]: update test to catch circular imports (#23172 ) This raises ImportError due to a circular import: ```python from langchain_core import chat_history ``` This does not: ```python from langchain_core import runnables from langchain_core import chat_history ``` Here we update `test_imports` to run each import in a separate subprocess. Open to other ways of doing this!	2024-06-19 15:24:38 +00:00
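The approach described in 2b08e9e265 above, running each import in its own interpreter so that earlier imports cannot mask a circular import, can be sketched like this. The helper names are hypothetical and this is not the repository's `test_imports` code.

```python
import subprocess
import sys
from typing import List


def import_in_subprocess(module_name: str) -> bool:
    """Return True if `module_name` imports cleanly in a fresh interpreter."""
    # A new process starts with an empty module cache, so the result
    # does not depend on what the current process already imported.
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
    )
    return result.returncode == 0


def failing_imports(module_names: List[str]) -> List[str]:
    """Return the subset of modules that fail to import in isolation."""
    return [name for name in module_names if not import_in_subprocess(name)]
```

The trade-off is speed: one interpreter start-up per module is slower than importing everything in a single process.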
Eugene Yurtsev	ae4c0ed25a	core[patch]: Add documentation to load namespace (#23143 ) Document some of the modules within the load namespace	2024-06-19 15:21:41 +00:00
Eugene Yurtsev	a34e650f8b	core[patch]: Add doc-string to document compressor (#23085 )	2024-06-19 11:03:49 -04:00
Eugene Yurtsev	1007a715a5	community[patch]: Prevent unit tests from making network requests (#23180 ) * Prevent unit tests from making network requests	2024-06-19 14:56:30 +00:00
ccurme	ca798bc6ea	community: move test to integration tests (#23178 ) Tests failing on master with > FAILED tests/unit_tests/embeddings/test_ovhcloud.py::test_ovhcloud_embed_documents - ValueError: Request failed with status code: 401, {"message":"Bad token; invalid JSON"}	2024-06-19 14:39:48 +00:00
Eugene Yurtsev	4fe8403bfb	core[patch]: Expand documentation in the indexing namespace (#23134 )	2024-06-19 10:11:44 -04:00
Eugene Yurtsev	fe4f10047b	core[patch]: Document embeddings namespace (#23132 ) Document embeddings namespace	2024-06-19 10:11:16 -04:00
Eugene Yurtsev	a3bae56a48	core[patch]: Update documentation in LLM namespace (#23138 ) Update documentation in LLM namespace.	2024-06-19 10:10:50 -04:00
Leonid Ganeline	a70b7a688e	ai21: docstrings (#23142 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference)	2024-06-19 08:51:15 -04:00
bilk0h	3d54784e6d	text-splitters: Fix/recursive json splitter data persistence issue (#21529 ) Thank you for contributing to LangChain! Description: I noticed that calling `RecursiveJsonSplitter().split_json()` multiple times gave weird results. The cause is the `chunks` list in the `_json_split` method: if `chunks` is not provided when `_json_split` is called (which is the case when `split_json` calls `_json_split`), the same list is reused for subsequent calls to `_json_split`. You can see this in the test case I also added to this commit. Output should be: ``` [{'a': 1, 'b': 2}] [{'c': 3, 'd': 4}] ``` Instead you get: ``` [{'a': 1, 'b': 2}] [{'a': 1, 'b': 2, 'c': 3, 'd': 4}] ``` --------- Co-authored-by: Nuno Campos <nuno@langchain.dev> Co-authored-by: isaac hershenson <ihershenson@hmc.edu> Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>	2024-06-18 20:21:55 -07:00
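The behaviour reported in 3d54784e6d above matches Python's shared mutable default argument pitfall. The sketch below reproduces it with two simplified, hypothetical functions; they stand in for `_json_split` and are not the splitter's real code.

```python
from typing import Any, Dict, List, Optional


def split_buggy(
    data: Dict[str, Any], chunks: List[Dict[str, Any]] = [{}]
) -> List[Dict[str, Any]]:
    # The default list is created once, when the function is defined,
    # and is shared by every call that omits `chunks`.
    chunks[-1].update(data)
    return chunks


def split_fixed(
    data: Dict[str, Any], chunks: Optional[List[Dict[str, Any]]] = None
) -> List[Dict[str, Any]]:
    # Build a fresh list on every call that omits `chunks`.
    if chunks is None:
        chunks = [{}]
    chunks[-1].update(data)
    return chunks
```

Calling `split_buggy` twice without `chunks` accumulates keys from both calls in the same dict, while `split_fixed` starts from an empty chunk each time.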
鹿鹿鹿鲨	6b46b5e9ce	community: add request_kwargs and expect TimeError AsyncHtmlLoader (#23068 ) - Description: add `request_kwargs` and expect `TimeError` in `_fetch` function for AsyncHtmlLoader. This allows you to fill in the kwargs parameter when using the `load()` method of the `AsyncHtmlLoader` class. Co-authored-by: Yucolu <yucolu@tencent.com>	2024-06-18 20:02:46 -07:00
Leonid Ganeline	109a70fc64	ibm: docstrings (#23149 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference)	2024-06-18 20:00:27 -07:00
Ryan Elston	86ee4f0daa	text-splitters: Introduce Experimental Markdown Syntax Splitter (#22257 ) #### Description This MR defines a `ExperimentalMarkdownSyntaxTextSplitter` class. The main goal is to replicate the functionality of the original `MarkdownHeaderTextSplitter` which extracts the header stack as metadata but with one critical difference: it keeps the whitespace of the original text intact. This draft reimplements the `MarkdownHeaderTextSplitter` with a very different algorithmic approach. Instead of marking up each line of the text individually and aggregating them back together into chunks, this method builds each chunk sequentially and applies the metadata to each chunk. This makes the implementation simpler. However, since it's designed to keep whitespace intact, it's not a full drop-in replacement for the original. Since it is a radical implementation change to the original code, I would like to get feedback on whether this is a worthwhile replacement, should be its own class, or is not a good idea at all. Note: I implemented the `return_each_line` parameter but I don't think it's a necessary feature. I'd prefer to remove it. This implementation also adds the following additional features: - Splits out code blocks and includes the language in the `"Code"` metadata key - Splits text on the horizontal rule `---` as well - The `headers_to_split_on` parameter is now optional - with sensible defaults that can be overridden. #### Issue Keeping the whitespace keeps the paragraph structure and the formatting of the code blocks intact, which allows the caller much more flexibility in how they want to further split the individual sections of the resulting documents. 
This addresses the issues brought up by the community in the following issues: - https://github.com/langchain-ai/langchain/issues/20823 - https://github.com/langchain-ai/langchain/issues/19436 - https://github.com/langchain-ai/langchain/issues/22256 #### Dependencies N/A #### Twitter handle @RyanElston --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-06-18 19:44:00 -07:00
Bagatur	93d0ad97fe	anthropic[patch]: test image input (#23155 )	2024-06-19 02:32:15 +00:00
Leonid Ganeline	3dfd055411	anthropic: docstrings (#23145 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference)	2024-06-18 22:26:45 -04:00
Bagatur	90559fde70	openai[patch], standard-tests[patch]: don't pass in falsey stop vals (#23153 ) adds an image input test to standard-tests as well	2024-06-18 18:13:13 -07:00
Bagatur	e8a8286012	core[patch]: runnablewithchathistory from core.runnables (#23136 )	2024-06-19 00:15:18 +00:00
Vadym Barda	b483bf5095	core[minor]: handle boolean data in draw_mermaid (#23135 ) This change should address graph rendering issues for edges with boolean data Example from langgraph: ```python from typing import Annotated, TypedDict from langchain_core.messages import AnyMessage from langgraph.graph import END, START, StateGraph from langgraph.graph.message import add_messages class State(TypedDict): messages: Annotated[list[AnyMessage], add_messages] def branch(state: State) -> bool: return 1 + 1 == 3 graph_builder = StateGraph(State) graph_builder.add_node("foo", lambda state: {"messages": [("ai", "foo")]}) graph_builder.add_node("bar", lambda state: {"messages": [("ai", "bar")]}) graph_builder.add_conditional_edges( START, branch, path_map={True: "foo", False: "bar"}, then=END, ) app = graph_builder.compile() print(app.get_graph().draw_mermaid()) ``` Previous behavior: ```python AttributeError: 'bool' object has no attribute 'split' ``` Current behavior: ```python %%{init: {'flowchart': {'curve': 'linear'}}}%% graph TD; __start__[__start__]:::startclass; __end__[__end__]:::endclass; foo([foo]):::otherclass; bar([bar]):::otherclass; __start__ -. ('a',) .-> foo; foo --> __end__; __start__ -. ('b',) .-> bar; bar --> __end__; classDef startclass fill:#ffdfba; classDef endclass fill:#baffc9; classDef otherclass fill:#fad7de; ```	2024-06-18 20:15:42 +00:00
Bagatur	093ae04d58	core[patch]: Pin pydantic in py3.12.4 (#23130 )	2024-06-18 12:00:02 -07:00
hmasdev	ff0c06b1e5	langchain[patch]: fix `OutputType` of OutputParsers and fix legacy API in OutputParsers (#19792 ) # Description This pull request aims to address specific issues related to the ambiguity and error-proneness of the output types of certain output parsers, as well as the absence of unit tests for some parsers. These issues could potentially lead to runtime errors or unexpected behaviors due to type mismatches when used, causing confusion for developers and users. Through clarifying output types, this PR seeks to improve the stability and reliability. Therefore, this pull request - fixes the `OutputType` of OutputParsers to be the expected type; - e.g. `OutputType` property of `EnumOutputParser` raises `TypeError`. This PR introduce a logic to extract `OutputType` from its attribute. - and fixes the legacy API in OutputParsers like `LLMChain.run` to the modern API like `LLMChain.invoke`; - Note: For `OutputFixingParser`, `RetryOutputParser` and `RetryWithErrorOutputParser`, this PR introduces `legacy` attribute with False as default value in order to keep the backward compatibility - and adds the tests for the `OutputFixingParser` and `RetryOutputParser`. The following table shows my expected output and the actual output of the `OutputType` of OutputParsers. I have used this table to fix `OutputType` of OutputParsers. 
\| Class Name of OutputParser \| My Expected `OutputType` (after this PR)\| Actual `OutputType` [evidence](#evidence) (before this PR)\| Fix Required \| \|---------\|--------------\|---------\|--------\| \| BooleanOutputParser \| `<class 'bool'>` \| `<class 'bool'>` \| NO \| \| CombiningOutputParser \| `typing.Dict[str, Any]` \| `TypeError` is raised \| YES \| \| DatetimeOutputParser \| `<class 'datetime.datetime'>` \| `<class 'datetime.datetime'>` \| NO \| \| EnumOutputParser(enum=MyEnum) \| `MyEnum` \| `TypeError` is raised \| YES \| \| OutputFixingParser \| The same type as `self.parser.OutputType` \| `~T` \| YES \| \| CommaSeparatedListOutputParser \| `typing.List[str]` \| `typing.List[str]` \| NO \| \| MarkdownListOutputParser \| `typing.List[str]` \| `typing.List[str]` \| NO \| \| NumberedListOutputParser \| `typing.List[str]` \| `typing.List[str]` \| NO \| \| JsonOutputKeyToolsParser \| `typing.Any` \| `typing.Any` \| NO \| \| JsonOutputToolsParser \| `typing.Any` \| `typing.Any` \| NO \| \| PydanticToolsParser \| `typing.Any` \| `typing.Any` \| NO \| \| PandasDataFrameOutputParser \| `typing.Dict[str, Any]` \| `TypeError` is raised \| YES \| \| PydanticOutputParser(pydantic_object=MyModel) \| `<class '__main__.MyModel'>` \| `<class '__main__.MyModel'>` \| NO \| \| RegexParser \| `typing.Dict[str, str]` \| `TypeError` is raised \| YES \| \| RegexDictParser \| `typing.Dict[str, str]` \| `TypeError` is raised \| YES \| \| RetryOutputParser \| The same type as `self.parser.OutputType` \| `~T` \| YES \| \| RetryWithErrorOutputParser \| The same type as `self.parser.OutputType` \| `~T` \| YES \| \| StructuredOutputParser \| `typing.Dict[str, Any]` \| `TypeError` is raised \| YES \| \| YamlOutputParser(pydantic_object=MyModel) \| `MyModel` \| `~T` \| YES \| NOTE: In "Fix Required", "YES" means that it is required to fix in this PR while "NO" means that it is not required. # Issue No issues for this PR. 
# Twitter handle - [hmdev3](https://twitter.com/hmdev3) # Questions: 1. Is it required to create tests for legacy APIs `LLMChain.run` in the following scripts? - libs/langchain/tests/unit_tests/output_parsers/test_fix.py; - libs/langchain/tests/unit_tests/output_parsers/test_retry.py. 2. Is there a more appropriate expected output type than I expect in the above table? - e.g. the `OutputType` of `CombiningOutputParser` should be SOMETHING... # Actual outputs (before this PR) <div id='evidence'></div> <details><summary>Actual outputs</summary> ## Requirements - Python==3.9.13 - langchain==0.1.13 ```python Python 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import langchain >>> langchain.__version__ '0.1.13' >>> from langchain import output_parsers ``` ### `BooleanOutputParser` ```python >>> output_parsers.BooleanOutputParser().OutputType <class 'bool'> ``` ### `CombiningOutputParser` ```python >>> output_parsers.CombiningOutputParser(parsers=[output_parsers.DatetimeOutputParser(), output_parsers.CommaSeparatedListOutputParser()]).OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable CombiningOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type. ``` ### `DatetimeOutputParser` ```python >>> output_parsers.DatetimeOutputParser().OutputType <class 'datetime.datetime'> ``` ### `EnumOutputParser` ```python >>> from enum import Enum >>> class MyEnum(Enum): ... a = 'a' ... b = 'b' ... 
>>> output_parsers.EnumOutputParser(enum=MyEnum).OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable EnumOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type. ``` ### `OutputFixingParser` ```python >>> output_parsers.OutputFixingParser(parser=output_parsers.DatetimeOutputParser()).OutputType ~T ``` ### `CommaSeparatedListOutputParser` ```python >>> output_parsers.CommaSeparatedListOutputParser().OutputType typing.List[str] ``` ### `MarkdownListOutputParser` ```python >>> output_parsers.MarkdownListOutputParser().OutputType typing.List[str] ``` ### `NumberedListOutputParser` ```python >>> output_parsers.NumberedListOutputParser().OutputType typing.List[str] ``` ### `JsonOutputKeyToolsParser` ```python >>> output_parsers.JsonOutputKeyToolsParser(key_name='tool').OutputType typing.Any ``` ### `JsonOutputToolsParser` ```python >>> output_parsers.JsonOutputToolsParser().OutputType typing.Any ``` ### `PydanticToolsParser` ```python >>> from langchain.pydantic_v1 import BaseModel >>> class MyModel(BaseModel): ... a: int ... >>> output_parsers.PydanticToolsParser(tools=[MyModel, MyModel]).OutputType typing.Any ``` ### `PandasDataFrameOutputParser` ```python >>> output_parsers.PandasDataFrameOutputParser().OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable PandasDataFrameOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type. 
``` ### `PydanticOutputParser` ```python >>> output_parsers.PydanticOutputParser(pydantic_object=MyModel).OutputType <class '__main__.MyModel'> ``` ### `RegexParser` ```python >>> output_parsers.RegexParser(regex='$', output_keys=['a']).OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable RegexParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type. ``` ### `RegexDictParser` ```python >>> output_parsers.RegexDictParser(output_key_to_format={'a':'a'}).OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable RegexDictParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type. ``` ### `RetryOutputParser` ```python >>> output_parsers.RetryOutputParser(parser=output_parsers.DatetimeOutputParser()).OutputType ~T ``` ### `RetryWithErrorOutputParser` ```python >>> output_parsers.RetryWithErrorOutputParser(parser=output_parsers.DatetimeOutputParser()).OutputType ~T ``` ### `StructuredOutputParser` ```python >>> from langchain.output_parsers.structured import ResponseSchema >>> response_schemas = [ResponseSchema(name="foo",description="a list of strings",type="List[string]"),ResponseSchema(name="bar",description="a string",type="string"), ] >>> output_parsers.StructuredOutputParser.from_response_schemas(response_schemas).OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable StructuredOutputParser doesn't have an inferable OutputType. 
Override the OutputType property to specify the output type. ``` ### `YamlOutputParser` ```python >>> output_parsers.YamlOutputParser(pydantic_object=MyModel).OutputType ~T ``` <div> --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-18 18:59:42 +00:00
Artem Mukhin	e271f75bee	docs: Fix URL formatting in deprecation warnings (#23075 ) Description Updated the URLs in deprecation warning messages. The URLs were previously written as raw strings and are now formatted to be clickable HTML links. Example of a broken link in the current API Reference: https://api.python.langchain.com/en/latest/chains/langchain.chains.openai_functions.extraction.create_extraction_chain_pydantic.html <img width="942" alt="Screenshot 2024-06-18 at 13 21 07" src="https://github.com/langchain-ai/langchain/assets/4854600/a1b1863c-cd03-4af2-a9bc-70375407fb00">	2024-06-18 14:49:58 -04:00
Gabriel Petracca	c6660df58e	community[minor]: Implement Doctran async execution (#22372 ) Description The DoctranTextTranslator has an async transform function that was not implemented because [the doctran library](https://github.com/psychic-api/doctran) uses a sync version of the `execute` method. - I implemented the `DoctranTextTranslator.atransform_documents()` method using `asyncio.to_thread` to run the function in a separate thread. - I updated the example in the Notebook with the new async version. - The performance improvements can be appreciated when a big document is divided into multiple chunks. Relates to: - Issue #14645: https://github.com/langchain-ai/langchain/issues/14645 - Issue #14437: https://github.com/langchain-ai/langchain/issues/14437 - https://github.com/langchain-ai/langchain/pull/15264 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-18 18:17:37 +00:00
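The thread-offloading approach described in this commit can be sketched in isolation. This is a minimal illustration, not the `DoctranTextTranslator` code: `translate_sync` and `atranslate_all` are hypothetical names, and the `time.sleep` call merely stands in for a blocking library call such as a sync `execute`.

```python
import asyncio
import time
from typing import List


def translate_sync(text: str) -> str:
    """Hypothetical stand-in for a blocking, synchronous library call."""
    time.sleep(0.1)  # simulate blocking work
    return text.upper()


async def atranslate_all(chunks: List[str]) -> List[str]:
    # Each blocking call runs in a worker thread, so the chunks are
    # processed concurrently; gather returns results in input order.
    tasks = [asyncio.to_thread(translate_sync, chunk) for chunk in chunks]
    return list(await asyncio.gather(*tasks))


results = asyncio.run(atranslate_all(["hello", "world"]))
```

Because the calls overlap, the total wait is close to that of a single call rather than the sum of all calls, which is where the gain for documents split into many chunks would come from.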
Eugene Yurtsev	aa6415aa7d	core[minor]: Support multiple keys in get_from_dict_or_env (#23086 ) Support passing multiple keys for get_from_dict_or_env	2024-06-18 14:13:28 -04:00
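A lookup that accepts several candidate keys might behave roughly as follows. This is a sketch under assumptions, not the library's implementation: the helper name `lookup_dict_or_env`, its parameter names, the lookup order, and the error message are all hypothetical.

```python
import os
from typing import Any, Dict, List, Optional, Union


def lookup_dict_or_env(
    data: Dict[str, Any],
    key: Union[str, List[str]],
    env_key: str,
    default: Optional[str] = None,
) -> str:
    """Hypothetical helper: try each key in `data`, then the environment."""
    keys = [key] if isinstance(key, str) else key
    for k in keys:
        # The first key with a truthy value wins.
        if data.get(k):
            return data[k]
    if os.environ.get(env_key):
        return os.environ[env_key]
    if default is not None:
        return default
    raise ValueError(f"None of {keys} found in data, and {env_key} is not set.")


value = lookup_dict_or_env({"api_key": "abc"}, ["apikey", "api_key"], "SKETCH_ENV_KEY")
```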
nold	226802f0c4	community: add args_schema to SearxSearch (#22954 ) This change adds args_schema (pydantic BaseModel) to SearxSearchRun for correct schema formatting on LLM function calls Issue: currently using SearxSearchRun with OpenAI function calling returns the following error "TypeError: SearxSearchRun._run() got an unexpected keyword argument '__arg1' ". This happens because the schema sent to the LLM is "input: '{"__arg1":"foobar"}'" while the method should be called with the "query" parameter. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-18 17:27:39 +00:00
Bagatur	01783d67fc	core[patch]: Release 0.2.9 (#23091 )	2024-06-18 17:15:04 +00:00
Finlay Macklon	616d06d7fe	community: glob multiple patterns when using DirectoryLoader (#22852 ) - Description: Updated community.langchain_community.document_loaders.directory.py to enable the use of multiple glob patterns in the `DirectoryLoader` class. Now, the glob parameter is of type `list[str] \| str` and still defaults to the same value as before. I updated the docstring of the class to reflect this, and added a unit test to community.tests.unit_tests.document_loaders.test_directory.py named `test_directory_loader_glob_multiple`. This test also shows an example of how to use the new functionality. - ~~Issue:~~Discussion Thread: https://github.com/langchain-ai/langchain/discussions/18559 - Dependencies: None - Twitter handle: N/a - [x] Add tests and docs - Added test (described above) - Updated class docstring - [x] Lint and test --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>	2024-06-18 09:24:50 -07:00
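The idea of accepting either one glob pattern or a list of patterns can be sketched with `pathlib` alone. This is an illustration rather than the `DirectoryLoader` code: `iter_matching_files` is a hypothetical helper, and deduplicating files matched by more than one pattern is an assumption.

```python
import tempfile
from pathlib import Path
from typing import Iterator, List, Union


def iter_matching_files(root: Path, glob: Union[List[str], str]) -> Iterator[Path]:
    """Yield files under `root` that match any of the given glob patterns."""
    patterns = [glob] if isinstance(glob, str) else list(glob)
    seen = set()
    for pattern in patterns:
        for path in sorted(root.glob(pattern)):
            # Skip directories and files already matched by an earlier pattern.
            if path.is_file() and path not in seen:
                seen.add(path)
                yield path


# Small demo in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("a.txt", "b.md", "c.py"):
        (root / name).write_text("demo")
    names = [p.name for p in iter_matching_files(root, ["*.txt", "*.md"])]
```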
Eugene Yurtsev	5564d9e404	core[patch]: Document BaseStore (#23082 ) Add doc-string to BaseStore	2024-06-18 11:47:47 -04:00
Takuya Igei	9f791b6ad5	core[patch],community[patch],langchain[patch]: `tenacity` dependency to version `>=8.1.0,<8.4.0` (#22973 ) Fix https://github.com/langchain-ai/langchain/issues/22972.	2024-06-18 10:34:28 -04:00
Raviraj	858ce264ef	SemanticChunker : Feature Addition ("Semantic Splitting with gradient") (#22895 ) `SemanticChunker` currently provides three methods to split text semantically: - percentile - standard_deviation - interquartile I propose a new method, `gradient`. In this method, the gradient of the distances is used to split chunks, together with the percentile method. This method is useful when chunks are highly correlated with each other or specific to a domain, e.g. legal or medical. The idea is to apply anomaly detection to the gradient array so that the distribution becomes wider and boundaries are easier to identify in highly semantic data. I have tested this change on a set of 10 domain-specific documents (mostly legal). Details: - Issue: Improvement - Dependencies: NA - Twitter handle: [x.com/prajapat_ravi](https://x.com/prajapat_ravi) @hwchase17 --------- Co-authored-by: Raviraj Prajapat <raviraj.prajapat@sirionlabs.com> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-06-17 21:01:08 -07:00
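One way to read the `gradient` method described in this commit is shown below, as a plain-Python sketch rather than the chunker's actual code. The central-difference gradient, the linearly interpolated percentile, the threshold of 90, and the sample distances are all assumptions chosen for illustration.

```python
from typing import List


def gradient(values: List[float]) -> List[float]:
    """Central differences inside, one-sided differences at both ends."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    grads = [values[1] - values[0]]
    for i in range(1, len(values) - 1):
        grads.append((values[i + 1] - values[i - 1]) / 2)
    grads.append(values[-1] - values[-2])
    return grads


def percentile(values: List[float], q: float) -> float:
    """Percentile with linear interpolation between neighbouring ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


# Hypothetical distances between consecutive sentence embeddings.
distances = [0.1, 0.1, 0.1, 0.9, 0.1, 0.1]
grads = gradient(distances)
threshold = percentile(grads, 90)
# Positions where the distance rises unusually fast are breakpoint candidates.
breakpoints = [i for i, g in enumerate(grads) if g > threshold]
```

Thresholding the gradient instead of the raw distances reacts to sudden changes in distance, which may help when all distances are small and close together, as in single-domain text.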
Raghav Dixit	55705c0f5e	LanceDB integration update (#22869 ) Added : - [x] relevance search (w/wo scores) - [x] maximal marginal search - [x] image ingestion - [x] filtering support - [x] hybrid search w reranking make test, lint_diff and format checked.	2024-06-17 20:54:26 -07:00
Chang Liu	62c8a67f56	community: add KafkaChatMessageHistory (#22216 ) Add chat history store based on Kafka. Files added: `libs/community/langchain_community/chat_message_histories/kafka.py` `docs/docs/integrations/memory/kafka_chat_message_history.ipynb` New issue to be created for future improvement: 1. Async method implementation. 2. Message retrieval based on timestamp. 3. Support for other configs when connecting to cloud hosted Kafka (e.g. add `api_key` field) 4. Improve unit testing & integration testing.	2024-06-17 20:34:01 -07:00
shimajiroxyz	3e835a1aa1	langchain: add id_key option to EnsembleRetriever for metadata-based document merging (#22950 ) Description: - What I changed - By specifying the `id_key` during the initialization of `EnsembleRetriever`, it is now possible to determine which documents to merge scores for based on the value corresponding to the `id_key` element in the metadata, instead of `page_content`. Below is an example of how to use the modified `EnsembleRetriever`: ```python retriever = EnsembleRetriever(retrievers=[ret1, ret2], id_key="id") # The Document returned by each retriever must keep the "id" key in its metadata. ``` - Additionally, I added a script to easily test the behavior of the `invoke` method of the modified `EnsembleRetriever`. - Why I changed - There are cases where you may want to calculate scores by treating Documents with different `page_content` as the same when using `EnsembleRetriever`. For example, when you want to ensemble the search results of the same document described in two different languages. - The previous `EnsembleRetriever` used `page_content` as the basis for score aggregation, making the above usage difficult. Therefore, the score is now calculated based on the specified key value in the Document's metadata. Twitter handle: @shimajiroxyz	2024-06-18 03:29:17 +00:00
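The effect of `id_key` on score merging can be sketched without the library. This is an illustration under assumptions: documents are plain dicts instead of `Document` objects, `fuse_rankings` is a hypothetical helper, and weighted reciprocal rank fusion with a constant of 60 is assumed as the scoring rule.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional

Doc = Dict[str, Any]  # hypothetical stand-in: {"page_content": ..., "metadata": {...}}


def fuse_rankings(
    result_lists: List[List[Doc]],
    weights: List[float],
    id_key: Optional[str] = None,
    c: int = 60,
) -> List[Doc]:
    """Merge ranked lists; documents sharing a merge key pool their scores."""
    scores: Dict[Any, float] = defaultdict(float)
    first_seen: Dict[Any, Doc] = {}
    for docs, weight in zip(result_lists, weights):
        for rank, doc in enumerate(docs, start=1):
            # Merge on the metadata value when id_key is given, else on the text.
            key = doc["metadata"][id_key] if id_key else doc["page_content"]
            scores[key] += weight / (rank + c)
            first_seen.setdefault(key, doc)
    return sorted(first_seen.values(), key=lambda d: scores[
        d["metadata"][id_key] if id_key else d["page_content"]
    ], reverse=True)


ret1 = [{"page_content": "Hello", "metadata": {"id": "a"}},
        {"page_content": "Goodbye", "metadata": {"id": "b"}}]
ret2 = [{"page_content": "Bonjour", "metadata": {"id": "a"}},
        {"page_content": "Merci", "metadata": {"id": "c"}}]

merged = fuse_rankings([ret1, ret2], [0.5, 0.5], id_key="id")
unmerged = fuse_rankings([ret1, ret2], [0.5, 0.5])
```

With `id_key="id"`, the English and French versions of document `a` are treated as one result and their scores add up; without it, the two texts stay separate entries.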
mackong	39f6c4169d	langchain[patch]: add tool messages formatter for tool calling agent (#22849 ) - Description: add tool_messages_formatter to the tool calling agent so that tool messages can be formatted in different ways for your LLM. - Issue: N/A - Dependencies: N/A	2024-06-17 20:29:00 -07:00
Lucas Tucker	e25a5966b5	docs: Standardize DocumentLoader docstrings (#22932 ) Standardizing DocumentLoader docstrings (of which there are many) This PR addresses issue #22866 and adds docstrings according to the issue's specified format (in the appendix) for files csv_loader.py and json_loader.py in langchain_community.document_loaders. In particular, the following sections have been added to both CSVLoader and JSONLoader: Setup, Instantiate, Load, Async load, and Lazy load. It may be worth adding a 'Metadata' section to the JSONLoader docstring to clarify how we want to extract the JSON metadata (using the `metadata_func` argument). The files I used to walkthrough the various sections were `example_2.json` from [HERE](https://support.oneskyapp.com/hc/en-us/articles/208047697-JSON-sample-files) and `hw_200.csv` from [HERE](https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html). --------- Co-authored-by: lucast2021 <lucast2021@headroyce.org> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-06-18 03:26:36 +00:00
Mohammad Mohtashim	60ba02f5db	[Community]: Fixed DDG DuckDuckGoSearchResults Docstring (#22968 ) - Description: A very small fix in the Docstring of `DuckDuckGoSearchResults` identified in the following issue. - Issue: #22961 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-18 03:16:24 +00:00
Eun Hye Kim	70761af8cf	community: Fix #22975 (Add SSL Verification Option to Requests Class in langchain_community) (#22977 ) - PR title: "community: Fix #22975 (Add SSL Verification Option to Requests Class in langchain_community)" - PR message: - Description: - Added an optional verify parameter to the Requests class with a default value of True. - Modified the get, post, patch, put, and delete methods to include the verify parameter. - Updated the _arequest async context manager to include the verify parameter. - Added the verify parameter to the GenericRequestsWrapper class and passed it to the Requests class. - Issue: This PR fixes issue #22975. - Dependencies: No additional dependencies are required for this change. - Twitter handle: @lunara_x You can check this change with the code below.
```python
import yaml  # needed for yaml.load below (missing from the original snippet)

from langchain_openai.chat_models import ChatOpenAI
from langchain.requests import RequestsWrapper
from langchain_community.agent_toolkits.openapi import planner
from langchain_community.agent_toolkits.openapi.spec import reduce_openapi_spec

# Load the OpenAPI (Swagger) spec and reduce it for the planner.
with open("swagger.yaml") as f:
    data = yaml.load(f, Loader=yaml.FullLoader)
swagger_api_spec = reduce_openapi_spec(data)

llm = ChatOpenAI(model='gpt-4o')
swagger_requests_wrapper = RequestsWrapper(verify=False)  # modified point

superset_agent = planner.create_openapi_agent(
    swagger_api_spec,
    swagger_requests_wrapper,
    llm,
    allow_dangerous_requests=True,
    handle_parsing_errors=True,
)
superset_agent.run(
    "Tell me the number and types of charts and dashboards available."
)
```
--------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-18 03:12:40 +00:00
Mohammad Mohtashim	bf839676c7	[Community]: Fixed the DocumentDBVectorSearch `_similarity_search_without_score` (#22970 ) - Description: PR #22777 introduced a bug in `_similarity_search_without_score` that raised an `OperationFailure` error. The cause was a syntax error in the MongoDB pipeline, which has now been corrected. - Issue: #22770	2024-06-17 20:08:42 -07:00
Nuno Campos	f01f12ce1e	Include "no escape" and "inverted section" mustache vars in Prompt.input_variables and Prompt.input_schema (#22981 )	2024-06-17 19:24:13 -07:00
Bagatur	c2b2e3266c	core[minor]: message transformer utils (#22752 )	2024-06-17 15:30:07 -07:00

5308 Commits