langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-09-22 02:50:31 +00:00

Author	SHA1	Message	Date
Chester Curme	844b8b87d7	Merge branch 'standard_outputs' into cc/openai_v1 # Conflicts: # libs/core/langchain_core/language_models/v1/chat_models.py # libs/core/langchain_core/messages/utils.py # libs/core/langchain_core/messages/v1.py # libs/partners/openai/langchain_openai/chat_models/_compat.py # libs/partners/openai/langchain_openai/chat_models/base.py	2025-07-28 12:38:32 -04:00
Chester Curme	61e329637b	lint	2025-07-28 11:02:37 -04:00
Chester Curme	b8fed06409	move get_num_tokens_from_messages to BaseChatModel and BaseChatModelV1	2025-07-28 10:58:57 -04:00
Mason Daugherty	ef9b5a9e18	add back standard_outputs	2025-07-28 10:47:26 -04:00
Mason Daugherty	5e9eb19a83	chore: update branch with changes from master (#32277 ) Co-authored-by: Maxime Grenu <69890511+cluster2600@users.noreply.github.com> Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: jmaillefaud <jonathan.maillefaud@evooq.ch> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: tanwirahmad <tanwirahmad@users.noreply.github.com> Co-authored-by: Christophe Bornet <cbornet@hotmail.com> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com> Co-authored-by: niceg <79145285+growmuye@users.noreply.github.com> Co-authored-by: Chaitanya varma <varmac301@gmail.com> Co-authored-by: dishaprakash <57954147+dishaprakash@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Kanav Bansal <13186335+bansalkanav@users.noreply.github.com> Co-authored-by: Aleksandr Filippov <71711753+alex-feel@users.noreply.github.com> Co-authored-by: Alex Feel <afilippov@spotware.com>	2025-07-28 10:39:41 -04:00
Chester Curme	c409f723a2	Merge branch 'standard_outputs' into cc/openai_v1 # Conflicts: # libs/core/langchain_core/messages/utils.py	2025-07-28 10:19:50 -04:00
ccurme	3d9e694f73	feat(core): start on v1 chat model (#32276 ) Co-authored-by: Nuno Campos <nuno@langchain.dev>	2025-07-28 10:17:06 -04:00
Mason Daugherty	c921d08b18	feat(docs): add docstring to `_convert_from_v1_message()`	2025-07-25 11:01:48 -04:00
Mason Daugherty	3f653011e6	nit: use `block` instead of `content_block` for consistency in `convert_to_openai_image_block()`	2025-07-25 10:57:22 -04:00
Mason Daugherty	ee13a3b6fa	nit: rearrange `index` to be grouped with other always-present fields	2025-07-25 10:16:35 -04:00
Chester Curme	4899857042	start on openai	2025-07-24 17:12:22 -04:00
Chester Curme	041b196145	Revert "copy BaseChatModel to language_models.v1" This reverts commit `2d031031e3`.	2025-07-24 13:33:41 -04:00
Chester Curme	dd8057a034	remove type ignores for eugene	2025-07-24 13:31:50 -04:00
Chester Curme	b94f23883f	move best-effort v1 conversion	2025-07-24 13:31:27 -04:00
Chester Curme	2d031031e3	copy BaseChatModel to language_models.v1	2025-07-24 09:56:45 -04:00
ccurme	e9b0b84675	feat: new message formats (v0.4) (#32208 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-07-23 13:30:21 -04:00
Chester Curme	eb8d32aff2	output_version -> str	2025-07-23 09:38:01 -04:00
Chester Curme	78d036a093	Merge branch 'wip-v0.4' into standard_outputs	2025-07-23 09:34:20 -04:00
Chester Curme	6572656cd2	core: support both old and new data content blocks	2025-07-22 18:19:09 -04:00
Chester Curme	b1a02f971b	fix tests	2025-07-22 16:45:19 -04:00
Mason Daugherty	a02ad3d192	docs: formatting cleanup (#32188 ) * formatting cleaning * make `init_chat_model` more prominent in list of guides	2025-07-22 15:46:15 -04:00
ccurme	0c4054a7fc	release(core): 0.3.71 (#32186 )	2025-07-22 15:44:36 -04:00
ccurme	ebf2e11bcb	fix(core): exclude api_key from tracing metadata (#32184 ) (standard param)	2025-07-22 15:32:12 -04:00
ccurme	8acfd677bc	fix(core): add type key when tracing in some cases (#31825 )	2025-07-22 18:08:16 +00:00
Mason Daugherty	b24f90dabe	refactor(core): standard content blocks (#32085 )	2025-07-22 09:17:55 -04:00
Copilot	18c64aed6d	feat(core): add `sanitize_for_postgres` utility to fix PostgreSQL NUL byte DataError (#32157 ) This PR fixes the PostgreSQL NUL byte issue that causes `psycopg.DataError` when inserting documents containing `\x00` bytes into PostgreSQL-based vector stores. ## Problem PostgreSQL text fields cannot contain NUL (0x00) bytes. When documents with such characters are processed by PGVector or langchain-postgres implementations, they fail with: ``` (psycopg.DataError) PostgreSQL text fields cannot contain NUL (0x00) bytes ``` This commonly occurs when processing PDFs, documents from various loaders, or text extracted by libraries like unstructured that may contain embedded NUL bytes. ## Solution Added `sanitize_for_postgres()` utility function to `langchain_core.utils.strings` that removes or replaces NUL bytes from text content. ### Key Features - Simple API: `sanitize_for_postgres(text, replacement="")` - Configurable: Replace NUL bytes with empty string (default) or space for readability - Comprehensive: Handles all problematic examples from the original issue - Well-tested: Complete unit tests with real-world examples - Backward compatible: No breaking changes, purely additive ### Usage Example ```python from langchain_core.utils import sanitize_for_postgres from langchain_core.documents import Document # Before: This would fail with DataError problematic_content = "Getting\x00Started with embeddings" # After: Clean the content before database insertion clean_content = sanitize_for_postgres(problematic_content) # Result: "GettingStarted with embeddings" # Or preserve readability with spaces readable_content = sanitize_for_postgres(problematic_content, " ") # Result: "Getting Started with embeddings" # Use in Document processing doc = Document(page_content=clean_content, metadata={...}) ``` ### Integration Pattern PostgreSQL vector store implementations should sanitize content before insertion: ```python def add_documents(self, documents: 
List[Document]) -> List[str]: # Sanitize documents before insertion sanitized_docs = [] for doc in documents: sanitized_content = sanitize_for_postgres(doc.page_content, " ") sanitized_doc = Document( page_content=sanitized_content, metadata=doc.metadata, id=doc.id ) sanitized_docs.append(sanitized_doc) return self._insert_documents_to_db(sanitized_docs) ``` ## Changes Made - Added `sanitize_for_postgres()` function in `langchain_core/utils/strings.py` - Updated `langchain_core/utils/__init__.py` to export the new function - Added comprehensive unit tests in `tests/unit_tests/utils/test_strings.py` - Validated against all examples from the original issue report ## Testing All tests pass, including: - Basic NUL byte removal and replacement - Multiple consecutive NUL bytes - Empty string handling - Real examples from the GitHub issue - Backward compatibility with existing string utilities This utility enables PostgreSQL integrations in both langchain-community and langchain-postgres packages to handle documents with NUL bytes reliably. Fixes #26033. --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com> Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-07-21 20:33:20 -04:00
Mohammad Mohtashim	095f4a7c28	fix(core): fix `parse_result` in case of self.first_tool_only with multiple keys matching for JsonOutputKeyToolsParser (#32106 ) * Description: Updated `parse_result` logic to handle cases where `self.first_tool_only` is `True` and multiple matching keys share the same function name. Instead of returning the first match prematurely, the method now prioritizes filtering results by the specified key to ensure correct selection. * Issue: #32100 --------- Co-authored-by: Mason Daugherty <github@mdrxy.com> Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-07-21 12:50:22 -04:00
ccurme	0355da3159	release(core): 0.3.70 (#32144 )	2025-07-21 10:49:32 -04:00
astraszab	668c084520	docs(core): move incorrect arg limitation in rate limiter's docstring (#32118 )	2025-07-20 14:28:35 -04:00
Yoshi	6d71bb83de	fix(core): fix docstrings and add sleep to FakeListChatModel._call (#32108 )	2025-07-19 17:30:15 -04:00
Isaac Francisco	98bfd57a76	fix(core): better error message for empty var names (#32073 ) Previously, we hit an index out of range error with empty variable names (accessing tag[0]); now we throw a slightly nicer error --------- Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-07-18 17:00:02 -04:00
Gurram Siddarth Reddy	427d2d6397	fix(core): implement sleep delay in FakeMessagesListChatModel `_generate` (#32014 ) implement sleep delay in FakeMessagesListChatModel._generate so the sleep parameter is respected, matching the documented behavior. This adds artificial latency between responses for testing purposes. Issue: closes [#31974](https://github.com/langchain-ai/langchain/issues/31974) following [docs](https://python.langchain.com/api_reference/core/language_models/langchain_core.language_models.fake_chat_models.FakeMessagesListChatModel.html#langchain_core.language_models.fake_chat_models.FakeMessagesListChatModel.sleep) Dependencies: none Twitter handle: [@siddarthreddyg2](https://x.com/siddarthreddyg2) --------- Signed-off-by: Siddarthreddygsr <siddarthreddygsr@gmail.com>	2025-07-18 15:54:28 -04:00
open-swe[bot]	5da986c3f6	fix(core): JSON Schema reference resolution for list indices (#32088 ) Fixes #32042 ## Summary Fixes a critical bug in JSON Schema reference resolution that prevented correctly dereferencing numeric components in JSON pointer paths, specifically for list indices in `anyOf`, `oneOf`, and `allOf` arrays. ## Changes - Fixed `_retrieve_ref` function in `libs/core/langchain_core/utils/json_schema.py` to properly handle numeric components - Added comprehensive test function `test_dereference_refs_list_index()` in `libs/core/tests/unit_tests/utils/test_json_schema.py` - Resolved line length formatting issues - Improved type checking and index validation for list and dictionary references ## Key Improvements - Correctly handles list index references in JSON pointer paths - Maintains backward compatibility with existing dictionary numeric key functionality - Adds robust error handling for out-of-bounds and invalid indices - Passes all test cases covering various reference scenarios ## Test Coverage - Verified fix for `#/properties/payload/anyOf/1/properties/startDate` reference - Tested edge cases including out-of-bounds and negative indices - Ensured no regression in existing reference resolution functionality Resolves the reported issue with JSON Schema reference dereferencing for list indices. --------- Co-authored-by: open-swe-dev[bot] <open-swe-dev@users.noreply.github.com> Co-authored-by: Mason Daugherty <github@mdrxy.com> Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-07-17 15:54:38 -04:00
efj-amzn	d3072e2d2e	feat(core): update `_import_utils.py` to not mask the thrown exception (#32071 )	2025-07-16 17:11:56 -04:00
Mason Daugherty	3c19cafab0	docs: improve `output_version` description (#31977 )	2025-07-16 12:29:07 -04:00
Mohammad Mohtashim	96bf8262e2	fix: fixing missing Docstring Bug if no Docstring is provided in BaseModel class (#31608 ) - Description: Ensure that the tool description is an empty string when creating a Structured Tool from a Pydantic class in case no description is provided - Issue: Fixes #31606 --------- Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-07-16 11:56:05 -04:00
Casi	686a6b754c	fix: issue a warning if `np.nan` or `np.inf` are in `_cosine_similarity` argument Matrices (#31532 ) - Description: issues a warning if inf and nan are passed as inputs to langchain_core.vectorstores.utils._cosine_similarity - Issue: Fixes #31496 - Dependencies: no external dependencies added, only warnings module imported --------- Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-07-16 11:50:09 -04:00
Mason Daugherty	ad44f0688b	release(core): release 0.3.69 (#32056 )	2025-07-15 17:13:46 -04:00
Jacob Lee	535ba43b0d	feat(core): add an option to make deserialization more permissive (#32054 ) ## Description Currently when deserializing objects that contain non-deserializable values, we throw an error. However, there are cases (e.g. proxies that return response fields containing extra fields like Python datetimes), where these values are not important and we just want to drop them. Twitter handle: @hacubu --------- Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-07-15 17:00:01 -04:00
Eugene Yurtsev	02d0a9af6c	chore(core): unpin packaging dependency (#32032 ) Unpin packaging dependency --------- Co-authored-by: ntjohnson1 <24689722+ntjohnson1@users.noreply.github.com>	2025-07-14 21:42:32 +00:00
董哥的黑板报	553ac1863b	docs: add deprecation notice for PipelinePromptTemplate (#31999 ) PR title: add deprecation notice for PipelinePromptTemplate PR message: In the API documentation, PipelinePromptTemplate is marked as deprecated, but this is not mentioned in the docs. I'm submitting this PR to add a deprecation notice to the docs. Tests: N/A (documentation only) --------- Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-07-14 15:27:29 +00:00
Andreas V. Jonsterhaug	6dcca35a34	fix(core): correct return type hints in BaseChatPromptTemplate (#32009 ) This PR changes the return type hints of the `format_prompt` and `aformat_prompt` methods in `BaseChatPromptTemplate` from `PromptValue` to `ChatPromptValue`, since both methods always return a `ChatPromptValue`.	2025-07-14 11:00:01 -04:00
Christophe Bornet	d57216c295	feat(core): add ruff rules D to tests except D1 (#32000 ) Docs are not required for tests but when there are docstrings, they shall be correctly formatted. See https://docs.astral.sh/ruff/rules/#pydocstyle-d	2025-07-14 10:42:03 -04:00
Chester Curme	7c1b59d26a	add test for beta content	2025-07-11 21:03:18 -04:00
Chester Curme	3460c48af6	cr	2025-07-11 15:25:07 -04:00
Chester Curme	7e740e5e1f	cr	2025-07-11 15:16:37 -04:00
Chester Curme	679a9e7c8f	implement beta_content	2025-07-11 14:05:45 -04:00
Chester Curme	67fc58011a	remove total	2025-07-10 17:53:21 -04:00
Chester Curme	a3a95805eb	revert	2025-07-10 17:53:08 -04:00
Chester Curme	354f5d1c7a	NotRequired -> Required	2025-07-10 17:53:00 -04:00
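
Commit 5da986c3f6 in the log above describes resolving numeric JSON pointer components as list indices (for example inside `anyOf`) while still allowing numeric dictionary keys. The following is a minimal sketch of that idea only, not the code from `libs/core/langchain_core/utils/json_schema.py`: the function name `resolve_pointer`, its error types, and the sample schema are assumptions made for illustration.

```python
from typing import Any


def resolve_pointer(schema: dict, ref: str) -> Any:
    """Resolve a local JSON pointer such as '#/properties/x/anyOf/1'.

    Hypothetical helper for illustration; not langchain's `_retrieve_ref`.
    """
    if not ref.startswith("#"):
        raise ValueError(f"only local references are supported, got {ref!r}")
    path = ref[1:]
    if path and not path.startswith("/"):
        raise ValueError(f"malformed JSON pointer: {ref!r}")
    # "/a/b".split("/") yields ["", "a", "b"]; drop the leading empty token.
    parts = path.split("/")[1:] if path else []

    node: Any = schema
    for raw in parts:
        # RFC 6901 escapes: "~1" means "/" and "~0" means "~" (in that order).
        part = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(node, list):
            # Lists accept only non-negative decimal indices.
            if not (part.isascii() and part.isdigit()):
                raise ValueError(f"invalid list index {part!r} in {ref!r}")
            index = int(part)
            if index >= len(node):
                raise IndexError(f"index {index} out of range in {ref!r}")
            node = node[index]
        elif isinstance(node, dict):
            if part in node:
                node = node[part]
            elif part.isascii() and part.isdigit() and int(part) in node:
                # Dictionaries may use integer keys; try that as a fallback.
                node = node[int(part)]
            else:
                raise KeyError(f"key {part!r} not found while resolving {ref!r}")
        else:
            raise KeyError(f"cannot descend into {type(node).__name__} at {part!r}")
    return node


# Illustrative schema shaped like the reference quoted in the commit message.
schema = {
    "properties": {
        "payload": {
            "anyOf": [
                {"type": "null"},
                {"properties": {"startDate": {"type": "string"}}},
            ]
        }
    }
}
resolved = resolve_pointer(
    schema, "#/properties/payload/anyOf/1/properties/startDate"
)
```

In this sketch an out-of-bounds index raises `IndexError` and a negative or non-numeric index into a list raises `ValueError`; the real implementation may signal these cases differently.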


1255 Commits