langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-09-10 23:41:28 +00:00

Author	SHA1	Message	Date
Théo Deschamps	39b19cf764	core[patch]: extract input variables for `path` and `detail` keys in order to format an `ImagePromptTemplate` (#22613 ) - Description: Add support for `path` and `detail` keys in `ImagePromptTemplate`. Previously, only variables associated with the `url` key were considered. This PR allows for the inclusion of a local image path and a detail parameter as input to the format method. - Issues: - fixes #20820 - related to #22024 - Dependencies: None - Twitter handle: @DeschampsTho5 --------- Co-authored-by: tdeschamps <tdeschamps@kameleoon.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-03 18:58:42 +00:00
Leonid Ganeline	55f6f91f17	core[patch]: docstrings `output_parsers` (#23825 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-03 14:27:40 -04:00
Bagatur	a0c2281540	infra: update mypy 1.10, ruff 0.5 (#23721 )	2024-07-03 10:33:27 -07:00
```python
"""python scripts/update_mypy_ruff.py"""
import glob
import re
import subprocess
import tomllib
from pathlib import Path

import toml

ROOT_DIR = Path(__file__).parents[1]


def main():
    for path in glob.glob(str(ROOT_DIR / "libs/*/pyproject.toml"), recursive=True):
        print(path)
        with open(path, "rb") as f:
            pyproject = tomllib.load(f)
        # Bump the pinned versions; skip packages without these dependency groups.
        try:
            pyproject["tool"]["poetry"]["group"]["typing"]["dependencies"]["mypy"] = (
                "^1.10"
            )
            pyproject["tool"]["poetry"]["group"]["lint"]["dependencies"]["ruff"] = (
                "^0.5"
            )
        except KeyError:
            continue
        with open(path, "w") as f:
            toml.dump(pyproject, f)
        cwd = "/".join(path.split("/")[:-1])
        completed = subprocess.run(
            "poetry lock --no-update; poetry install --with typing; poetry run mypy . --no-color",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )
        logs = completed.stdout.split("\n")

        # Collect mypy error codes per (file, line) so they can be ignored in place.
        to_ignore = {}
        for l in logs:
            if re.match(r"^(.*)\:(\d+)\: error:.*\[(.*)\]", l):
                path, line_no, error_type = re.match(
                    r"^(.*)\:(\d+)\: error:.*\[(.*)\]", l
                ).groups()
                if (path, line_no) in to_ignore:
                    to_ignore[(path, line_no)].append(error_type)
                else:
                    to_ignore[(path, line_no)] = [error_type]
        print(len(to_ignore))
        # Append a "# type: ignore[...]" comment to every offending line.
        for (error_path, line_no), error_types in to_ignore.items():
            all_errors = ", ".join(error_types)
            full_path = f"{cwd}/{error_path}"
            try:
                with open(full_path, "r") as f:
                    file_lines = f.readlines()
            except FileNotFoundError:
                continue
            file_lines[int(line_no) - 1] = (
                file_lines[int(line_no) - 1][:-1] + f"  # type: ignore[{all_errors}]\n"
            )
            with open(full_path, "w") as f:
                f.write("".join(file_lines))

        subprocess.run(
            "poetry run ruff format .; poetry run ruff --select I --fix .",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )


if __name__ == "__main__":
    main()
```
William FH	6cd56821dc	[Core] Unify function schema parsing (#23370 ) Use pydantic to infer nested schemas and all that fun. Include bagatur's convenient docstring parser Include annotation support Previously we didn't adequately support many typehints in the bind_tools() method on raw functions (like optionals/unions, nested types, etc.)	2024-07-03 09:55:38 -07:00
Leonid Ganeline	716a316654	core: docstrings `indexing` (#23785 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-03 11:27:34 -04:00
Leonid Ganeline	30fdc2dbe7	core: docstrings `messages` (#23788 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-03 11:25:00 -04:00
Bagatur	7a3d8e5a99	core[patch]: Release 0.2.11 (#23780 )	2024-07-02 17:35:57 -04:00
Bagatur	d677dadf5f	core[patch]: mark RemoveMessage beta (#23656 )	2024-07-02 21:27:21 +00:00
SN	acc457f645	core[patch]: fix nested sections for mustache templating (#23747 ) The prompt template variable detection only worked for singly-nested sections because we just kept track of whether we were in a section and then set that to false as soon as we encountered an end block. i.e. the following: ``` {{#outerSection}} {{variableThatShouldntShowUp}} {{#nestedSection}} {{nestedVal}} {{/nestedSection}} {{anotherVariableThatShouldntShowUp}} {{/outerSection}} ``` Would yield `['outerSection', 'anotherVariableThatShouldntShowUp']` as input_variables (whereas it should just yield `['outerSection']`). This fixes that by keeping track of the current depth and using a stack.	2024-07-02 10:20:45 -07:00
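The depth-tracking idea in the commit above can be illustrated with a small standalone sketch. This is not LangChain's implementation: the regex tokenizer and the function name `top_level_variables` are assumptions made for illustration, and only plain `{{name}}`, `{{#name}}`, `{{^name}}` and `{{/name}}` tags are handled.

```python
import re
from typing import List

# Hypothetical minimal tokenizer: matches {{name}}, {{#name}}, {{^name}}, {{/name}}.
TAG = re.compile(r"\{\{\s*([#/^]?)\s*([\w.]+)\s*\}\}")


def top_level_variables(template: str) -> List[str]:
    """Return the names a caller must supply: top-level variables and sections."""
    names: List[str] = []
    open_sections: List[str] = []  # stack of currently open section names
    for kind, name in TAG.findall(template):
        if kind in ("#", "^"):
            # A section counts as an input only when it is not nested in another one.
            if not open_sections and name not in names:
                names.append(name)
            open_sections.append(name)
        elif kind == "/":
            # Closing a section only pops one level, so an outer section stays open.
            if open_sections:
                open_sections.pop()
        elif not open_sections and name not in names:
            names.append(name)
    return names


template = (
    "{{#outerSection}} {{variableThatShouldntShowUp}} "
    "{{#nestedSection}} {{nestedVal}} {{/nestedSection}} "
    "{{anotherVariableThatShouldntShowUp}} {{/outerSection}}"
)
result = top_level_variables(template)
```

With the template from the commit message, `result` contains only `outerSection`, because the stack is still non-empty when `anotherVariableThatShouldntShowUp` is seen.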
Eugene Yurtsev	ebcee4f610	core[patch]: Add versionadded to get_by_ids (#23728 )	2024-07-01 15:16:00 -04:00
Eugene Yurtsev	e800f6bb57	core[minor]: Create BaseMedia object (#23639 ) This PR implements a BaseContent object from which Document and Blob objects will inherit proposed here: https://github.com/langchain-ai/langchain/pull/23544 Alternative: Create a base object that only has an identifier and no metadata. For now decided against it, since that refactor can be done at a later time. It also feels a bit odd since our IDs are optional at the moment. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-01 15:07:30 -04:00
Nuno Campos	b36e95caa9	core[patch]: use async messages where possible (#23718 ) Fix #23716 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-01 18:33:05 +00:00
Spyros Avlonitis	8cfb2fa1b7	core[minor]: Add maxsize for InMemoryCache (#23405 ) This PR introduces a maxsize parameter for the InMemoryCache class, allowing users to specify the maximum number of items to store in the cache. If the cache exceeds the specified maximum size, the oldest items are removed. Additionally, comprehensive unit tests have been added to ensure all functionalities are thoroughly tested. The tests are written using pytest and cover both synchronous and asynchronous methods. Twitter: @spyrosavl --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-01 14:21:21 -04:00
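The eviction behaviour described in the commit above (drop the oldest items once the cache is full) could look roughly like the following. This is a sketch only: `BoundedCache` is a hypothetical class, not LangChain's `InMemoryCache`, and the `(prompt, llm_string)` key shape is an assumption.

```python
from typing import Any, Dict, Optional, Tuple


class BoundedCache:
    """Hypothetical in-memory cache that evicts the oldest entry when full."""

    def __init__(self, maxsize: Optional[int] = None) -> None:
        if maxsize is not None and maxsize <= 0:
            raise ValueError("maxsize must be greater than 0")
        # dicts preserve insertion order, so the first key is the oldest entry.
        self._cache: Dict[Tuple[str, str], Any] = {}
        self._maxsize = maxsize

    def lookup(self, prompt: str, llm_string: str) -> Optional[Any]:
        return self._cache.get((prompt, llm_string))

    def update(self, prompt: str, llm_string: str, value: Any) -> None:
        key = (prompt, llm_string)
        is_full = self._maxsize is not None and len(self._cache) >= self._maxsize
        if key not in self._cache and is_full:
            # Evict the oldest inserted entry to make room for the new one.
            del self._cache[next(iter(self._cache))]
        self._cache[key] = value


cache = BoundedCache(maxsize=2)
cache.update("a", "model", 1)
cache.update("b", "model", 2)
cache.update("c", "model", 3)  # evicts the entry for "a"
```

Note that this is first-in-first-out eviction; overwriting an existing key does not refresh its position.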
Eugene Yurtsev	b5aef4cf97	core[patch]: Fix llm string representation for serializable models (#23416 ) Fix LLM string representation for serializable objects. Fix for issue: https://github.com/langchain-ai/langchain/issues/23257 The llm string of serializable chat models is the serialized representation of the object. LangChain serialization dumps some basic information about non serializable objects including their repr() which includes an object id. This means that if a chat model has any non serializable fields (e.g., a cache), then any new instantiation of those fields will change the llm representation of the chat model and cause cache misses. i.e., re-instantiating a postgres cache would result in cache misses!	2024-07-01 14:06:33 -04:00
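The failure mode described in the commit above can be reproduced without LangChain. In this sketch `FakeCache`, `naive_llm_string` and `stable_llm_string` are hypothetical names; the "stable" variant is just one possible way to avoid the problem and is not claimed to be what the fix does.

```python
class FakeCache:
    """Stand-in for a non-serializable field, e.g. a database-backed cache."""


def naive_llm_string(model_name: str, cache: object) -> str:
    # The default repr() embeds the object's memory address, so the string
    # differs for every new instance of the field.
    return f"{model_name}|{cache!r}"


def stable_llm_string(model_name: str, cache: object) -> str:
    # Assumption for illustration: describe the field by its type name only.
    return f"{model_name}|{type(cache).__name__}"


first, second = FakeCache(), FakeCache()
naive_keys_match = naive_llm_string("chat", first) == naive_llm_string("chat", second)
stable_keys_match = stable_llm_string("chat", first) == stable_llm_string("chat", second)
```

Here `naive_keys_match` is false even though both models are configured identically, which is exactly what turns a re-instantiated cache into a stream of cache misses.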
nobbbbby	3904f2cd40	core: fix NameError (#23658 ) Description: In the chat_models module of the language model, the import statement for BaseModel has been moved from the conditionally imported section to the main import area, fixing `NameError`. Issue: fix `NameError`	2024-07-01 17:51:23 +00:00
Eugene Yurtsev	4f1821db3e	core[minor]: Add get_by_ids to vectorstore interface (#23594 ) This PR adds a part of the indexing API proposed in this RFC https://github.com/langchain-ai/langchain/pull/23544/files. It allows rolling out `get_by_ids` which should be uncontroversial to existing vectorstores without introducing new abstractions. The semantics for this method depend on the ability of identifying returned documents using the new optional ID field on documents: https://github.com/langchain-ai/langchain/pull/23411 Alternatives are: 1. Relax the sequence requirement ```python def get_by_ids(self, ids: Iterable[str], /) -> Iterable[Document]: ``` Rejected: - implementations are more likely to start batching with bad defaults - users would need to call list() or we'd need to introduce another convenience method 2. Support more kwargs ```python def get_by_ids(self, ids: Sequence[str], /, **kwargs) -> List[Document]: ... ``` Rejected: - No need for `batch` parameter since IDs is a sequence - Output cannot be customized since `Document` is fixed. (e.g., parameters could be useful to grab extra metadata like the vector that was indexed with the Document or to project a part of the document)	2024-07-01 13:04:33 -04:00
Vadym Barda	e8d77002ea	core: add RemoveMessage (#23636 ) This change adds a new message type `RemoveMessage`. This will enable `langgraph` users to manually modify graph state (or have the graph nodes modify the state) to remove messages by `id` Examples: * allow users to delete messages from state by calling ```python graph.update_state(config, values=[RemoveMessage(id=state.values[-1].id)]) ``` * allow nodes to delete messages ```python graph.add_node("delete_messages", lambda state: [RemoveMessage(id=state[-1].id)]) ```	2024-06-28 14:40:02 -07:00
Jacob Lee	a032583b17	docs[patch]: Update diagrams (#23613 )	2024-06-28 12:36:00 -07:00
Leonid Ganeline	75a44fe951	core: `chat_*` docstrings (#23412 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-06-27 17:29:38 -04:00
Eugene Yurtsev	da7beb1c38	core[patch]: Add unit test when catching generator exit (#23402 ) This pr adds a unit test for: https://github.com/langchain-ai/langchain/pull/22662 And narrows the scope where the exception is caught.	2024-06-27 20:36:07 +00:00
Eugene Yurtsev	96b72edac8	core[minor]: Add optional ID field to Document schema (#23411 ) This PR adds an optional ID field to the document schema. # 1. Optional or Required - An optional field will require additional checking for the type in user code (annoying). - However, vectorstores currently don't respect this field. So if we make it required and start returning random UUIDs that might be even more confusing to users. Proposal: Start with Optional and convert to Required (with default set to uuid4()) in 1-2 major releases. # 2. Override __str__ or generic solution in prompts Overriding __str__ as a simple way to avoid changing user code that relies on default str(document) in prompts. I considered rolling out a more general solution in prompts (https://github.com/langchain-ai/langchain/pull/8685), but to do that we need to: 1. Make things serializable 2. The more general solution would likely need to be backwards compatible as well 3. It's unclear that one wants to format a List[int] in the same way as List[Document]. The former should be `,` separated (likely), the latter should be `---` separated (likely). Proposal Start with __str__ override and focus on the vectorstore APIs, we generalize prompts later	2024-06-27 12:15:58 -04:00
Jacob Lee	60fc15a56b	docs[patch]: Update docs introduction and README (#23558 ) CC @hwchase17 @baskaryan	2024-06-27 08:51:43 -07:00
Leonid Ganeline	2c9b84c3a8	core[patch]: docstrings `agents` (#23502 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-06-26 17:50:48 -04:00
Leonid Ganeline	2a5d59b3d7	core[patch]: `callbacks` docstrings (#23375 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-06-26 17:11:06 -04:00
Leonid Ganeline	1141b08eb8	core: docstrings `example_selectors` (#23542 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-06-26 17:10:40 -04:00
Bagatur	32f8f39974	core[patch]: use args_schema doc for tool description (#23503 )	2024-06-25 15:26:35 -07:00
ccurme	86ca44d451	core: release 0.2.10 (#23420 )	2024-06-25 16:26:31 -04:00
Isaac Francisco	85f5d14cef	[docs]: split up tool docs (#22919 )	2024-06-25 13:15:08 -07:00
William FH	8955bc1866	[Core] Logging: Suppress missing parent warning (#23363 )	2024-06-25 14:57:23 -04:00
ccurme	730c551819	core[patch]: export tool output parsers from langchain_core.output_parsers (#23305 ) These currently read off AIMessage.tool_calls, and only fall back to OpenAI parsing if tool calls aren't populated. Importing these from `openai_tools` (e.g., in our [tool calling docs](https://python.langchain.com/v0.2/docs/how_to/tool_calling/#tool-calls)) can lead to confusion. After landing, would need to release core and update docs.	2024-06-25 14:40:42 -04:00
Eugene Yurtsev	7e9e69c758	core[patch]: Add unit test for str and repr for Document (#23414 )	2024-06-25 18:28:21 +00:00
Riccardo Schirone	4530d851e4	Merge pull request #22662 * core: runnables: special handling GeneratorExit because no error	2024-06-25 08:42:03 -04:00
William FH	efb4c12abe	[Core] Add support for inferring Annotated types (#23284 ) in bind_tools() / convert_to_openai_function	2024-06-21 15:16:30 -07:00
Vadym Barda	9ac302cb97	core[minor]: update draw_mermaid node label processing (#23285 ) This fixes processing issue for nodes with numbers in their labels (e.g. `"node_1"`, which would previously be relabeled as `"node__"`, and now are correctly processed as `"node_1"`)	2024-06-21 21:35:32 +00:00
Bagatur	f824f6d925	docs: fix merge message runs docstring (#23279 )	2024-06-21 19:50:50 +00:00
Bagatur	9eda8f2fe8	docs: fix trim_messages code blocks (#23271 )	2024-06-21 17:15:31 +00:00
Bagatur	4c97a9ee53	docs: fix message transformer docstrings (#23264 )	2024-06-21 16:10:03 +00:00
Brace Sproul	abe7566d7d	core[minor]: BaseChatModel with_structured_output implementation (#22859 )	2024-06-21 08:14:03 -07:00
mackong	360a70c8a8	core[patch]: fix no current event loop for sql history in async mode (#22933 ) - Description: When using RunnableWithMessageHistory/SQLChatMessageHistory in async mode, we get the following error: ``` Error in RootListenersTracer.on_chain_end callback: RuntimeError("There is no current event loop in thread 'asyncio_3'.") ``` which is thrown by `ddfbca38df/libs/community/langchain_community/chat_message_histories/sql.py (L259)`, and no message history is added to the database. This patch adds a new _aexit_history function, called in async mode, which in turn calls aadd_messages. It uses the `afunc` attribute of a Runnable to check whether the end listener should be run in async mode. - Issue: #22021, #22022 - Dependencies: N/A	2024-06-21 10:39:47 -04:00
mackong	b108b4d010	core[patch]: set schema format for AsyncRootListenersTracer (#23214 ) - Description: AsyncRootListenersTracer supports on_chat_model_start, so its schema_format should be "original+chat". - Issue: N/A - Dependencies:	2024-06-21 09:30:27 -04:00
Bagatur	976b456619	docs: BaseChatModel key methods table (#23238 ) If we're moving documenting inherited params think these kinds of tables become more important ![Screenshot 2024-06-20 at 3 59 12 PM](https://github.com/langchain-ai/langchain/assets/22008038/722266eb-2353-4e85-8fae-76b19bd333e0)	2024-06-20 21:00:22 -07:00
Bagatur	12e0c28a6e	docs: fix chat model methods table (#23233 ) rst table not md ![Screenshot 2024-06-20 at 12 37 46 PM](https://github.com/langchain-ai/langchain/assets/22008038/7a03b869-c1f4-45d0-8d27-3e16f4c6eb19)	2024-06-20 19:51:10 +00:00
Eugene Yurtsev	7545b1d29b	core[patch]: Fix doc-strings for code blocks (#23232 ) Code blocks need extra space around them to be rendered properly by sphinx	2024-06-20 19:34:52 +00:00
Eugene Yurtsev	59d7adff8f	core[patch]: Add clarification about streaming to RunnableLambda (#23227 ) Add streaming clarification to runnable lambda docstring.	2024-06-20 16:47:16 +00:00
ChrisDEV	cb6cf4b631	Fix return value type of dumpd (#20123 ) The return type of `json.loads` is `Any`. In fact, the return type of `dumpd` must be based on `json.loads`, so the correction here is understandable. Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-20 16:31:41 +00:00
Guangdong Liu	0bce28cd30	core(patch): Fix encoding problem of load_prompt method (#21559 ) - description: Add encoding parameters. - @baskaryan, @efriis, @eyurtsev, @hwchase17. ![54d25ac7b1d5c2e47741a56fe8ed8ba](https://github.com/langchain-ai/langchain/assets/48236177/ffea9596-2001-4e19-b245-f8a6e231b9f9)	2024-06-20 09:25:54 -07:00
Philippe PRADOS	8711c61298	core[minor]: Adds an in-memory implementation of RecordManager (#13200 ) Description: langchain offers three technologies to save data: - [vectorstore](https://python.langchain.com/docs/modules/data_connection/vectorstores/) - [docstore](https://js.langchain.com/docs/api/schema/classes/Docstore) - [record manager](https://python.langchain.com/docs/modules/data_connection/indexing) If you want to combine these technologies in a sample persistence strategy you need a common implementation for each. `DocStore` proposes `InMemoryDocstore`. We propose the class `MemoryRecordManager` to complete the system. This is the prelude to another pull request, which needs a consistent combination of persistence components. Tag maintainer: @baskaryan Twitter handle: @pprados --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-20 12:19:10 -04:00
David DeCaprio	a4bcb45f65	core:Add optional max_messages to MessagePlaceholder (#16098 ) - Description: Add optional max_messages to MessagePlaceholder - Issue: [16096](https://github.com/langchain-ai/langchain/issues/16096) - Dependencies: None - Twitter handle: @davedecaprio Sometimes it's better to limit the history in the prompt itself rather than the memory. This is needed if you want different prompts in the chain to have different history lengths. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-19 23:39:51 +00:00
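Limiting history "in the prompt itself", as the commit above describes, amounts to truncating the message list at formatting time. The helper below is a hypothetical sketch of that step, not the `MessagePlaceholder` code; the name `limit_history` and the choice to keep the most recent messages are assumptions.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def limit_history(messages: Sequence[T], max_messages: Optional[int] = None) -> List[T]:
    """Keep at most `max_messages` of the most recent messages (None keeps all)."""
    if max_messages is None:
        return list(messages)
    if max_messages <= 0:
        # Guard explicitly: messages[-0:] would return the whole list.
        return []
    return list(messages[-max_messages:])


history = ["hi", "hello", "how are you?", "fine, thanks"]
short_prompt_history = limit_history(history, max_messages=2)
full_prompt_history = limit_history(history)
```

Because the limit is applied per call, two prompts in the same chain can read the same stored history with different lengths.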
Eugene Yurtsev	1fcf875fe3	core[patch]: Document agent schema (#23194 ) * Document agent schema * Refer folks to langgraph for more information on how to create agents.	2024-06-19 20:16:57 +00:00
Eugene Yurtsev	c2d43544cc	core[patch]: Document messages namespace (#23154 ) - Moved doc-strings below attributes in TypedDicts -- seems to render better on APIReference pages. * Provided more description and some simple code examples	2024-06-19 15:00:00 -04:00


899 Commits