langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-06-01 04:29:09 +00:00

Author	SHA1	Message	Date
ccurme	122e80e04d	core[patch]: add versionadded to `as_tool` (#24138 )	2024-07-11 18:08:08 +00:00
Nuno Campos	2428984205	core: Add metadata to graph json repr (#24131 )	2024-07-11 17:23:52 +00:00
Nuno Campos	3e454d7568	core: fix docstring (#24129 )	2024-07-11 16:38:14 +00:00
Nuno Campos	ee3fe20af4	core: mermaid: Render metadata key-value pairs when drawing mermaid graph (#24103 ) - if node is runnable binding with metadata attached	2024-07-11 16:22:23 +00:00
Eugene Yurtsev	dc131ac42a	core[minor]: Add dispatching for custom events (#24080 )	2024-07-11 02:25:12 +00:00

This PR allows dispatching adhoc events for a given run.

# Context

This PR allows users to send arbitrary data to the callback system and to the astream events API from within a given runnable. This can be extremely useful to surface custom information to end users about progress etc.

Integration with langsmith tracer will be done separately since the data cannot be currently visualized. It'll be accommodated using the events attribute of the Run

# Examples with astream events

```python
from langchain_core.callbacks import adispatch_custom_event
from langchain_core.tools import tool

@tool
async def foo(x: int) -> int:
    """Foo"""
    await adispatch_custom_event("event1", {"x": x})
    await adispatch_custom_event("event2", {"x": x})
    return x + 1

async for event in foo.astream_events({'x': 1}, version='v2'):
    print(event)
```

```python
{'event': 'on_tool_start', 'data': {'input': {'x': 1}}, 'name': 'foo', 'tags': [], 'run_id': 'fd6fb7a7-dd37-4191-962c-e43e245909f6', 'metadata': {}, 'parent_ids': []}
{'event': 'on_custom_event', 'run_id': 'fd6fb7a7-dd37-4191-962c-e43e245909f6', 'name': 'event1', 'tags': [], 'metadata': {}, 'data': {'x': 1}, 'parent_ids': []}
{'event': 'on_custom_event', 'run_id': 'fd6fb7a7-dd37-4191-962c-e43e245909f6', 'name': 'event2', 'tags': [], 'metadata': {}, 'data': {'x': 1}, 'parent_ids': []}
{'event': 'on_tool_end', 'data': {'output': 2}, 'run_id': 'fd6fb7a7-dd37-4191-962c-e43e245909f6', 'name': 'foo', 'tags': [], 'metadata': {}, 'parent_ids': []}
```

```python
from langchain_core.callbacks import adispatch_custom_event
from langchain_core.runnables import RunnableLambda

@RunnableLambda
async def foo(x: int) -> int:
    """Foo"""
    await adispatch_custom_event("event1", {"x": x})
    await adispatch_custom_event("event2", {"x": x})
    return x + 1

async for event in foo.astream_events(1, version='v2'):
    print(event)
```

```python
{'event': 'on_chain_start', 'data': {'input': 1}, 'name': 'foo', 'tags': [], 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'metadata': {}, 'parent_ids': []}
{'event': 'on_custom_event', 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'name': 'event1', 'tags': [], 'metadata': {}, 'data': {'x': 1}, 'parent_ids': []}
{'event': 'on_custom_event', 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'name': 'event2', 'tags': [], 'metadata': {}, 'data': {'x': 1}, 'parent_ids': []}
{'event': 'on_chain_stream', 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'name': 'foo', 'tags': [], 'metadata': {}, 'data': {'chunk': 2}, 'parent_ids': []}
{'event': 'on_chain_end', 'data': {'output': 2}, 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'name': 'foo', 'tags': [], 'metadata': {}, 'parent_ids': []}
```

# Examples with handlers

This is copy pasted from unit tests

```python
import uuid
from typing import Any, Dict, List, Optional
from uuid import UUID

from langchain_core.callbacks import BaseCallbackHandler, dispatch_custom_event
from langchain_core.runnables import RunnableConfig, RunnableLambda


class CustomCallbackManager(BaseCallbackHandler):
    def __init__(self) -> None:
        self.events: List[Any] = []

    def on_custom_event(
        self,
        name: str,
        data: Any,
        *,
        run_id: UUID,
        tags: Optional[List[str]] = None,
        metadata: Optional[Dict[str, Any]] = None,
        **kwargs: Any,
    ) -> None:
        assert kwargs == {}
        # Record every custom event together with the run it belongs to.
        self.events.append(
            (
                name,
                data,
                run_id,
                tags,
                metadata,
            )
        )


callback = CustomCallbackManager()

run_id = uuid.UUID(int=7)


@RunnableLambda
def foo(x: int, config: RunnableConfig) -> int:
    dispatch_custom_event("event1", {"x": x})
    dispatch_custom_event("event2", {"x": x}, config=config)
    return x


foo.invoke(1, {"callbacks": [callback], "run_id": run_id})

assert callback.events == [
    ("event1", {"x": 1}, UUID("00000000-0000-0000-0000-000000000007"), [], {}),
    ("event2", {"x": 1}, UUID("00000000-0000-0000-0000-000000000007"), [], {}),
]
```
ccurme	975b6129f6	core[patch]: support conversion of runnables to tools (#23992 )	2024-07-10 19:29:59 -04:00

Open to other thoughts on UX.

string input:

```python
as_tool = retriever.as_tool()
as_tool.invoke("cat")  # [Document(...), ...]
```

typed dict input:

```python
class Args(TypedDict):
    key: int

def f(x: Args) -> str:
    return str(x["key"] * 2)

as_tool = RunnableLambda(f).as_tool(
    name="my tool",
    description="description",  # name, description are inferred if not supplied
)
as_tool.invoke({"key": 3})  # "6"
```

for untyped dict input, allow specification of parameters + types

```python
def g(x: Dict[str, Any]) -> str:
    return str(x["key"] * 2)

as_tool = RunnableLambda(g).as_tool(arg_types={"key": int})
result = as_tool.invoke({"key": 3})  # "6"
```

Passing the `arg_types` is slightly awkward but necessary to ensure tool calls populate parameters correctly:

```python
from typing import Any, Dict

from langchain_core.runnables import RunnableLambda
from langchain_openai import ChatOpenAI


def f(x: Dict[str, Any]) -> str:
    return str(x["key"] * 2)

runnable = RunnableLambda(f)
as_tool = runnable.as_tool(arg_types={"key": int})

llm = ChatOpenAI().bind_tools([as_tool])

result = llm.invoke("Use the tool on 3.")
tool_call = result.tool_calls[0]
args = tool_call["args"]
assert args == {"key": 3}

as_tool.run(args)
```

Contrived (?) example with langgraph agent as a tool:

```python
from typing import List, Literal
from typing_extensions import TypedDict

from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent


llm = ChatOpenAI(temperature=0)


def magic_function(input: int) -> int:
    """Applies a magic function to an input."""
    return input + 2


agent_1 = create_react_agent(llm, [magic_function])


class Message(TypedDict):
    role: Literal["human"]
    content: str

agent_tool = agent_1.as_tool(
    arg_types={"messages": List[Message]},
    name="Jeeves",
    description="Ask Jeeves.",
)

agent_2 = create_react_agent(llm, [agent_tool])
```

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Bagatur	6928f4c438	core[minor]: Add ToolMessage.raw_output (#23994 ) Decisions to discuss: 1. is a new attr needed or could additional_kwargs be used for this 2. is raw_output a good name for this attr 3. should raw_output default to {} or None 4. should raw_output be included in serialization 5. do we need to update repr/str to exclude raw_output	2024-07-10 20:11:10 +00:00
William FH	1e1fd30def	[Core] Fix fstring in logger warning (#24043 )	2024-07-09 19:53:18 -07:00
Nuno Campos	859e434932	core: Speed up json parse for large strings (#24036 ) for a large string: - old 4.657918874989264 - new 0.023724667000351474	2024-07-09 12:26:50 -07:00
Nuno Campos	160fc7f246	core: Move json parsing in base chat model / output parser to bg thread (#24031 ) - add version of AIMessageChunk.__add__ that can add many chunks, instead of only 2 - In agenerate_from_stream merge and parse chunks in bg thread - In output parse base classes do more work in bg threads where appropriate --------- Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>	2024-07-09 12:26:36 -07:00
Eugene Yurtsev	f765e8fa9d	core[minor],community[patch],standard-tests[patch]: Move InMemoryImplementation to langchain-core (#23986 ) This PR moves the in memory implementation to langchain-core. * The implementation remains importable from langchain-community. * Supporting utilities are marked as private for now.	2024-07-08 14:11:51 -07:00
Eugene Yurtsev	2c180d645e	core[minor],community[minor]: Upgrade all @root_validator() to @pre_init (#23841 ) This PR introduces a @pre_init decorator that's a @root_validator(pre=True) but with all the defaults populated!	2024-07-08 16:09:29 -04:00
Eugene Yurtsev	e0186df56b	core[patch]: Clarify upsert response semantics (#23921 )	2024-07-05 15:59:47 -04:00
Eugene Yurtsev	5b7d5f7729	core[patch]: Add comment to clarify aadd_documents (#23920 ) Add comment to clarify how add documents works	2024-07-05 15:20:16 -04:00
ccurme	74c7198906	core, anthropic[patch]: support streaming tool calls when function has no arguments (#23915 )	2024-07-05 18:57:41 +00:00

resolves https://github.com/langchain-ai/langchain/issues/23911

When an AIMessageChunk is instantiated, we attempt to parse tool calls off of the tool_call_chunks.

Here we add a special-case to this parsing, where `""` will be parsed as `{}`.

This is a reaction to how Anthropic streams tool calls in the case where a function has no arguments:

```
{'id': 'toolu_01J8CgKcuUVrMqfTQWPYh64r', 'input': {}, 'name': 'magic_function', 'type': 'tool_use', 'index': 1}
{'partial_json': '', 'type': 'tool_use', 'index': 1}
```

The `partial_json` does not accumulate to a valid json string-- most other providers tend to emit `"{}"` in this case.
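
A minimal sketch of the special case described in this commit, using only the standard library. `parse_tool_args` is a hypothetical helper written for illustration; it is not the function in langchain-core and only shows the idea of treating an accumulated empty string as an empty argument object.

```python
import json
from typing import Any, Dict, List, Optional


def parse_tool_args(accumulated: str) -> Optional[Dict[str, Any]]:
    """Parse accumulated tool-call argument text (hypothetical helper).

    An empty string is treated as "no arguments" and parsed as {}.
    Returns None when the text is not (yet) a valid JSON object.
    """
    if accumulated == "":
        return {}
    try:
        parsed = json.loads(accumulated)
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


# Streamed fragments for a function without arguments: only an empty string arrives.
fragments: List[str] = [""]
print(parse_tool_args("".join(fragments)))  # {}
```

Without the empty-string branch, `json.loads("")` raises and the tool call would be treated as unparseable.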
Christophe Bornet	42d049f618	core[minor]: Add Graph Store component (#23092 ) This PR introduces a GraphStore component. GraphStore extends VectorStore with the concept of links between documents based on document metadata. This allows linking documents based on a variety of techniques, including common keywords, explicit links in the content, and other patterns. This works with existing Documents, so it’s easy to extend existing VectorStores to be used as GraphStores. The interface can be implemented for any Vector Store technology that supports metadata, not only graph DBs. When retrieving documents for a given query, the first level of search is done using classical similarity search. Next, links may be followed using various traversal strategies to get additional documents. This allows documents to be retrieved that aren’t directly similar to the query but contain relevant information. 2 retrieving methods are added to the VectorStore ones : * traversal_search which gets all linked documents up to a certain depth * mmr_traversal_search which selects linked documents using an MMR algorithm to have more diverse results. If a depth of retrieval of 0 is used, GraphStore is effectively a VectorStore. It enables an easy transition from a simple VectorStore to GraphStore by adding links between documents as a second step. An implementation for Apache Cassandra is also proposed. See https://github.com/datastax/ragstack-ai/blob/main/libs/knowledge-store/notebooks/astra_support.ipynb for a notebook explaining how to use GraphStore and that shows that it can answer correctly to questions that a simple VectorStore cannot. Twitter handle: _cbornet	2024-07-05 12:24:10 -04:00
Leonid Ganeline	77f5fc3d55	core: docstrings `load` (#23787 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-05 12:23:19 -04:00
Eugene Yurtsev	6f08e11d7c	core[minor]: add upsert, streaming_upsert, aupsert, astreaming_upsert methods to the VectorStore abstraction (#23774 )	2024-07-05 12:21:40 -04:00

This PR rolls out part of the new proposed interface for vectorstores (https://github.com/langchain-ai/langchain/pull/23544) to existing store implementations.

The PR makes the following changes:

1. Adds standard upsert, streaming_upsert, aupsert, astreaming_upsert methods to the vectorstore.
2. Updates `add_texts` and `aadd_texts` to be non required with a default implementation that delegates to `upsert` and `aupsert` if those have been implemented. The original `add_texts` and `aadd_texts` methods are problematic as they spread object specific information across document and `**kwargs`. (e.g., ids are not a part of the document)
3. Adds a default implementation to `add_documents` and `aadd_documents` that delegates to `upsert` and `aupsert` respectively.
4. Adds standard unit tests to verify that a given vectorstore implements a correct read/write API.

A downside of this implementation is that it creates `upsert` with a very similar signature to `add_documents`. The reason for introducing `upsert` is to:

* Remove any ambiguities about what information is allowed in `kwargs`. Specifically kwargs should only be used for information common to all indexed data. (e.g., indexing timeout).
* Allow inheriting from an anticipated generalized interface for indexing that will allow indexing `BaseMedia` (i.e., allow making a vectorstore for images/audio etc.)

`add_documents` can be deprecated in the future in favor of `upsert` to make sure that users have a single correct way of indexing content.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
G Sreejith	3c752238c5	core[patch]: Fix typo in docstring (graphm -> graph) (#23910 ) Changes have been made as requested: replaced graphm with graph.	2024-07-05 16:20:33 +00:00
Leonid Ganeline	12c92b6c19	core: docstrings `outputs` (#23889 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-05 12:18:17 -04:00
Leonid Ganeline	1eca98ec56	core: docstrings `prompts` (#23890 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-05 12:17:52 -04:00
Mohammad Mohtashim	2274d2b966	core[patch]: Accounting for Optional Input Variables in BasePromptTemplate (#22851 ) Description: After reviewing the prompts API, it is clear that the only way a user can explicitly mark an input variable as optional is through the `MessagePlaceholder.optional` attribute. Otherwise, the user must explicitly pass in the `input_variables` expected to be used in the `BasePromptTemplate`, which will be validated upon execution. Therefore, to semantically handle a `MessagePlaceholder` `variable_name` as optional, we will treat the `variable_name` of `MessagePlaceholder` as a `partial_variable` if it has been marked as optional. This approach aligns with how the `variable_name` of `MessagePlaceholder` is already handled [here](https://github.com/keenborder786/langchain/blob/optional_input_variables/libs/core/langchain_core/prompts/chat.py#L991). Additionally, an attribute `optional_variable` has been added to `BasePromptTemplate`, and the `variable_name` of `MessagePlaceholder` is also made part of `optional_variable` when marked as optional. Moreover, the `get_input_schema` method has been updated for `BasePromptTemplate` to differentiate between optional and non-optional variables. Issue: #22832, #21425 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-05 15:49:40 +00:00
Eugene Yurtsev	9ccc4b1616	core[patch]: Fix logic in BaseChatModel that processes the llm string that is used as a key for caching chat models responses (#23842 )	2024-07-03 16:23:55 -04:00

This PR should fix the following issue: https://github.com/langchain-ai/langchain/issues/23824

Introduced as part of this PR: https://github.com/langchain-ai/langchain/pull/23416

I am unable to reproduce the issue locally though it's clear that we're getting a `serialized` object which is not a dictionary somehow.

The test below passes for me prior to the PR as well

```python
def test_cache_with_sqllite() -> None:
    from langchain_community.cache import SQLiteCache

    from langchain_core.globals import set_llm_cache

    cache = SQLiteCache(database_path=".langchain.db")
    set_llm_cache(cache)
    chat_model = FakeListChatModel(responses=["hello", "goodbye"], cache=True)
    assert chat_model.invoke("How are you?").content == "hello"
    assert chat_model.invoke("How are you?").content == "hello"
```
Vadym Barda	9bb623381b	core[minor]: update conversion utils to handle RemoveMessage (#23840 )	2024-07-03 16:13:31 -04:00
Théo Deschamps	39b19cf764	core[patch]: extract input variables for `path` and `detail` keys in order to format an `ImagePromptTemplate` (#22613 ) - Description: Add support for `path` and `detail` keys in `ImagePromptTemplate`. Previously, only variables associated with the `url` key were considered. This PR allows for the inclusion of a local image path and a detail parameter as input to the format method. - Issues: - fixes #20820 - related to #22024 - Dependencies: None - Twitter handle: @DeschampsTho5 --------- Co-authored-by: tdeschamps <tdeschamps@kameleoon.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-03 18:58:42 +00:00
Leonid Ganeline	55f6f91f17	core[patch]: docstrings `output_parsers` (#23825 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-03 14:27:40 -04:00
Bagatur	a0c2281540	infra: update mypy 1.10, ruff 0.5 (#23721 )	2024-07-03 10:33:27 -07:00

```python
"""python scripts/update_mypy_ruff.py"""
import glob
import tomllib
from pathlib import Path

import toml
import subprocess
import re

ROOT_DIR = Path(__file__).parents[1]


def main():
    for path in glob.glob(str(ROOT_DIR / "libs/*/pyproject.toml"), recursive=True):
        print(path)
        with open(path, "rb") as f:
            pyproject = tomllib.load(f)
        try:
            pyproject["tool"]["poetry"]["group"]["typing"]["dependencies"]["mypy"] = (
                "^1.10"
            )
            pyproject["tool"]["poetry"]["group"]["lint"]["dependencies"]["ruff"] = (
                "^0.5"
            )
        except KeyError:
            continue
        with open(path, "w") as f:
            toml.dump(pyproject, f)
        cwd = "/".join(path.split("/")[:-1])
        completed = subprocess.run(
            "poetry lock --no-update; poetry install --with typing; poetry run mypy . --no-color",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )
        logs = completed.stdout.split("\n")

        # Collect mypy error codes keyed by (file path, line number).
        to_ignore = {}
        for l in logs:
            if re.match(r"^(.*)\:(\d+)\: error:.*\[(.*)\]", l):
                path, line_no, error_type = re.match(
                    r"^(.*)\:(\d+)\: error:.*\[(.*)\]", l
                ).groups()
                if (path, line_no) in to_ignore:
                    to_ignore[(path, line_no)].append(error_type)
                else:
                    to_ignore[(path, line_no)] = [error_type]
        print(len(to_ignore))
        # Append a matching "type: ignore" comment to each offending line.
        for (error_path, line_no), error_types in to_ignore.items():
            all_errors = ", ".join(error_types)
            full_path = f"{cwd}/{error_path}"
            try:
                with open(full_path, "r") as f:
                    file_lines = f.readlines()
            except FileNotFoundError:
                continue
            file_lines[int(line_no) - 1] = (
                file_lines[int(line_no) - 1][:-1] + f" # type: ignore[{all_errors}]\n"
            )
            with open(full_path, "w") as f:
                f.write("".join(file_lines))

        subprocess.run(
            "poetry run ruff format .; poetry run ruff --select I --fix .",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )


if __name__ == "__main__":
    main()
```
William FH	6cd56821dc	[Core] Unify function schema parsing (#23370 ) Use pydantic to infer nested schemas and all that fun. Include bagatur's convenient docstring parser Include annotation support Previously we didn't adequately support many typehints in the bind_tools() method on raw functions (like optionals/unions, nested types, etc.)	2024-07-03 09:55:38 -07:00
Leonid Ganeline	716a316654	core: docstrings `indexing` (#23785 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-03 11:27:34 -04:00
Leonid Ganeline	30fdc2dbe7	core: docstrings `messages` (#23788 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-03 11:25:00 -04:00
Bagatur	d677dadf5f	core[patch]: mark RemoveMessage beta (#23656 )	2024-07-02 21:27:21 +00:00
SN	acc457f645	core[patch]: fix nested sections for mustache templating (#23747 )	2024-07-02 10:20:45 -07:00

The prompt template variable detection only worked for singly-nested sections because we just kept track of whether we were in a section and then set that to false as soon as we encountered an end block. i.e. the following:

```
{{#outerSection}}
    {{variableThatShouldntShowUp}}
    {{#nestedSection}}
        {{nestedVal}}
    {{/nestedSection}}
    {{anotherVariableThatShouldntShowUp}}
{{/outerSection}}
```

Would yield `['outerSection', 'anotherVariableThatShouldntShowUp']` as input_variables (whereas it should just yield `['outerSection']`).

This fixes that by keeping track of the current depth and using a stack.
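
The depth-tracking idea in this commit can be sketched with a small standalone scanner. This is an illustration under stated assumptions, not the langchain-core implementation: `top_level_variables` and the `TAG` pattern are hypothetical, and only plain variables, `{{#section}}` openers and `{{/section}}` closers are recognized.

```python
import re
from typing import List

# Matches {{name}}, {{#name}} and {{/name}}; other mustache tag types are ignored.
TAG = re.compile(r"\{\{\s*([#/]?)\s*([\w.]+)\s*\}\}")


def top_level_variables(template: str) -> List[str]:
    """Return the names that must be supplied as input (hypothetical helper).

    A name counts as an input only when it appears at depth 0; anything
    inside a section is resolved against that section's value instead.
    """
    names: List[str] = []
    open_sections: List[str] = []  # stack of currently open section names
    for kind, name in TAG.findall(template):
        if kind == "#":
            if not open_sections and name not in names:
                names.append(name)
            open_sections.append(name)
        elif kind == "/":
            if open_sections:
                open_sections.pop()
        elif not open_sections and name not in names:
            names.append(name)
    return names


template = (
    "{{#outerSection}} {{variableThatShouldntShowUp}} "
    "{{#nestedSection}} {{nestedVal}} {{/nestedSection}} "
    "{{anotherVariableThatShouldntShowUp}} {{/outerSection}}"
)
print(top_level_variables(template))  # ['outerSection']
```

With a single boolean instead of the stack, the `{{/nestedSection}}` closer would reset the "inside a section" state too early, which is the bug the commit message describes.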
Eugene Yurtsev	ebcee4f610	core[patch]: Add versionadded to get_by_ids (#23728 )	2024-07-01 15:16:00 -04:00
Eugene Yurtsev	e800f6bb57	core[minor]: Create BaseMedia object (#23639 ) This PR implements a BaseContent object from which Document and Blob objects will inherit proposed here: https://github.com/langchain-ai/langchain/pull/23544 Alternative: Create a base object that only has an identifier and no metadata. For now decided against it, since that refactor can be done at a later time. It also feels a bit odd since our IDs are optional at the moment. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-01 15:07:30 -04:00
Nuno Campos	b36e95caa9	core[patch]: use async messages where possible (#23718 ) Fix #23716 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-01 18:33:05 +00:00
Spyros Avlonitis	8cfb2fa1b7	core[minor]: Add maxsize for InMemoryCache (#23405 ) This PR introduces a maxsize parameter for the InMemoryCache class, allowing users to specify the maximum number of items to store in the cache. If the cache exceeds the specified maximum size, the oldest items are removed. Additionally, comprehensive unit tests have been added to ensure all functionalities are thoroughly tested. The tests are written using pytest and cover both synchronous and asynchronous methods. Twitter: @spyrosavl --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-01 14:21:21 -04:00
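
A minimal sketch of the eviction behaviour described in this commit, assuming "oldest" means the entry inserted first. `BoundedCache` is a hypothetical stand-in, not the `InMemoryCache` class itself; the `lookup`/`update` method names and the `(prompt, llm_string)` key are assumptions chosen for illustration.

```python
from typing import Any, Dict, Optional, Tuple


class BoundedCache:
    """Toy cache that evicts the oldest entry once maxsize is reached."""

    def __init__(self, maxsize: Optional[int] = None) -> None:
        if maxsize is not None and maxsize <= 0:
            raise ValueError("maxsize must be greater than 0")
        self._maxsize = maxsize
        # dicts preserve insertion order, so the first key is the oldest one.
        self._store: Dict[Tuple[str, str], Any] = {}

    def lookup(self, prompt: str, llm_string: str) -> Optional[Any]:
        return self._store.get((prompt, llm_string))

    def update(self, prompt: str, llm_string: str, value: Any) -> None:
        key = (prompt, llm_string)
        is_new = key not in self._store
        if self._maxsize is not None and is_new and len(self._store) >= self._maxsize:
            oldest = next(iter(self._store))
            del self._store[oldest]
        self._store[key] = value


cache = BoundedCache(maxsize=2)
cache.update("a", "llm", 1)
cache.update("b", "llm", 2)
cache.update("c", "llm", 3)  # evicts the entry for "a"
print(cache.lookup("a", "llm"), cache.lookup("c", "llm"))  # None 3
```

Leaving `maxsize` as `None` keeps the cache unbounded in this sketch.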
Eugene Yurtsev	b5aef4cf97	core[patch]: Fix llm string representation for serializable models (#23416 ) Fix LLM string representation for serializable objects. Fix for issue: https://github.com/langchain-ai/langchain/issues/23257 The llm string of serializable chat models is the serialized representation of the object. LangChain serialization dumps some basic information about non serializable objects including their repr() which includes an object id. This means that if a chat model has any non serializable fields (e.g., a cache), then any new instantiation of those fields will change the llm representation of the chat model and cause cache misses. i.e., re-instantiating a postgres cache would result in cache misses!	2024-07-01 14:06:33 -04:00
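
The problem described in this commit can be reproduced in isolation. The sketch below is illustrative only: `FakeCache`, `naive_llm_string` and `stable_llm_string` are hypothetical names, and the "stable" variant is an assumption about one possible remedy, not necessarily what the PR does.

```python
class FakeCache:
    """Stand-in for a non-serializable field such as a cache (hypothetical)."""


def naive_llm_string(model_name: str, cache: object) -> str:
    # repr(cache) falls back to the default object repr, which embeds the
    # object's memory address, so it differs between instances.
    return f"{model_name}|{cache!r}"


def stable_llm_string(model_name: str, cache: object) -> str:
    # Assumption for illustration: describe the field by its class name only.
    return f"{model_name}|{type(cache).__name__}"


first, second = FakeCache(), FakeCache()
print(naive_llm_string("chat", first) == naive_llm_string("chat", second))    # False
print(stable_llm_string("chat", first) == stable_llm_string("chat", second))  # True
```

Because the llm string is part of the cache key, the first comparison shows why a freshly created cache object would never hit entries written under a previous instance.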
nobbbbby	3904f2cd40	core: fix NameError (#23658 ) Description: In the chat_models module of the language model, the import statement for BaseModel has been moved from the conditionally imported section to the main import area, fixing a `NameError`. Issue: fix `NameError`	2024-07-01 17:51:23 +00:00
Eugene Yurtsev	4f1821db3e	core[minor]: Add get_by_ids to vectorstore interface (#23594 )	2024-07-01 13:04:33 -04:00

This PR adds a part of the indexing API proposed in this RFC https://github.com/langchain-ai/langchain/pull/23544/files.

It allows rolling out `get_by_ids` which should be uncontroversial to existing vectorstores without introducing new abstractions.

The semantics for this method depend on the ability of identifying returned documents using the new optional ID field on documents: https://github.com/langchain-ai/langchain/pull/23411

Alternatives are:

1. Relax the sequence requirement

```python
def get_by_ids(self, ids: Iterable[str], /) -> Iterable[Document]:
    ...
```

Rejected:
- implementations are more likely to start batching with bad defaults
- users would need to call list() or we'd need to introduce another convenience method

2. Support more kwargs

```python
def get_by_ids(self, ids: Sequence[str], /, **kwargs) -> List[Document]:
    ...
```

Rejected:
- No need for `batch` parameter since IDs is a sequence
- Output cannot be customized since `Document` is fixed. (e.g., parameters could be useful to grab extra metadata like the vector that was indexed with the Document or to project a part of the document)
Vadym Barda	e8d77002ea	core: add RemoveMessage (#23636 )	2024-06-28 14:40:02 -07:00

This change adds a new message type `RemoveMessage`. This will enable `langgraph` users to manually modify graph state (or have the graph nodes modify the state) to remove messages by `id`

Examples:

* allow users to delete messages from state by calling

```python
graph.update_state(config, values=[RemoveMessage(id=state.values[-1].id)])
```

* allow nodes to delete messages

```python
graph.add_node("delete_messages", lambda state: [RemoveMessage(id=state[-1].id)])
```
Leonid Ganeline	75a44fe951	core: `chat_*` docstrings (#23412 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-06-27 17:29:38 -04:00
Eugene Yurtsev	da7beb1c38	core[patch]: Add unit test when catching generator exit (#23402 ) This pr adds a unit test for: https://github.com/langchain-ai/langchain/pull/22662 And narrows the scope where the exception is caught.	2024-06-27 20:36:07 +00:00
Eugene Yurtsev	96b72edac8	core[minor]: Add optional ID field to Document schema (#23411 )	2024-06-27 12:15:58 -04:00

This PR adds an optional ID field to the document schema.

# 1. Optional or Required

- An optional field will require additional checking for the type in user code (annoying).
- However, vectorstores currently don't respect this field. So if we make it required and start returning random UUIDs that might be even more confusing to users.

Proposal: Start with Optional and convert to Required (with default set to uuid4()) in 1-2 major releases.

# 2. Override __str__ or generic solution in prompts

Overriding __str__ as a simple way to avoid changing user code that relies on default str(document) in prompts.

I considered rolling out a more general solution in prompts (https://github.com/langchain-ai/langchain/pull/8685), but to do that we need to:

1. Make things serializable
2. The more general solution would likely need to be backwards compatible as well
3. It's unclear that one wants to format a List[int] in the same way as List[Document]. The former should be `,` separated (likely), the latter should be `---` separated (likely).

Proposal: Start with __str__ override and focus on the vectorstore APIs, we generalize prompts later
Leonid Ganeline	2c9b84c3a8	core[patch]: docstrings `agents` (#23502 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-06-26 17:50:48 -04:00
Leonid Ganeline	2a5d59b3d7	core[patch]: `callbacks` docstrings (#23375 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-06-26 17:11:06 -04:00
Leonid Ganeline	1141b08eb8	core: docstrings `example_selectors` (#23542 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-06-26 17:10:40 -04:00
Bagatur	32f8f39974	core[patch]: use args_schema doc for tool description (#23503 )	2024-06-25 15:26:35 -07:00
Isaac Francisco	85f5d14cef	[docs]: split up tool docs (#22919 )	2024-06-25 13:15:08 -07:00
William FH	8955bc1866	[Core] Logging: Suppress missing parent warning (#23363 )	2024-06-25 14:57:23 -04:00
ccurme	730c551819	core[patch]: export tool output parsers from langchain_core.output_parsers (#23305 ) These currently read off AIMessage.tool_calls, and only fall back to OpenAI parsing if tool calls aren't populated. Importing these from `openai_tools` (e.g., in our [tool calling docs](https://python.langchain.com/v0.2/docs/how_to/tool_calling/#tool-calls)) can lead to confusion. After landing, would need to release core and update docs.	2024-06-25 14:40:42 -04:00

496 Commits