Merge branch 'master' into wip-v1.0

Mason Daugherty
2025-09-11 23:21:37 -04:00
committed by GitHub
8 changed files with 481 additions and 17 deletions

@@ -1,6 +1,7 @@
name: "\U0001F41B Bug Report"
description: Report a bug in LangChain. To report a security issue, please instead use the security option below. For questions, please use the LangChain forum.
labels: ["bug"]
type: bug
body:
- type: markdown
attributes:
@@ -13,9 +14,7 @@ body:
if there's another way to solve your problem:
* [LangChain Forum](https://forum.langchain.com/),
* [LangChain GitHub Issues](https://github.com/langchain-ai/langchain/issues?q=is%3Aissue),
* [LangChain documentation with the integrated search](https://python.langchain.com/docs/get_started/introduction),
* [LangChain how-to guides](https://python.langchain.com/docs/how_to/),
* [LangChain documentation with the integrated search](https://docs.langchain.com/oss/python/langchain/overview),
* [API Reference](https://python.langchain.com/api_reference/),
* [LangChain ChatBot](https://chat.langchain.com/)
* [GitHub search](https://github.com/langchain-ai/langchain),
@@ -25,7 +24,7 @@ body:
label: Checked other resources
description: Please confirm and check all the following options.
options:
- label: This is a bug, not a usage question. For questions, please use the LangChain Forum (https://forum.langchain.com/).
- label: This is a bug, not a usage question.
required: true
- label: I added a clear and descriptive title that summarizes this issue.
required: true
@@ -35,6 +34,8 @@ body:
required: true
- label: The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
required: true
- label: This is not related to the langchain-community package.
required: true
- label: I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
required: true
- label: I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.
@@ -118,3 +119,7 @@ body:
python -m langchain_core.sys_info
validations:
required: true

@@ -1,9 +1,9 @@
blank_issues_enabled: false
version: 2.1
contact_links:
- name: Documentation
- name: 📚 Documentation
url: https://github.com/langchain-ai/docs/issues/new?template=langchain.yml
about: Report an issue related to the LangChain documentation
- name: LangChain Forum
- name: 💬 LangChain Forum
url: https://forum.langchain.com/
about: General community discussions and support

@@ -0,0 +1,118 @@
name: "✨ Feature Request"
description: Request a new feature or enhancement for LangChain. For questions, please use the LangChain forum.
labels: ["feature request"]
type: feature
body:
- type: markdown
attributes:
value: |
Thank you for taking the time to request a new feature.
Use this to request NEW FEATURES or ENHANCEMENTS in LangChain. For bug reports, please use the bug report template. For usage questions and general design questions, please use the [LangChain Forum](https://forum.langchain.com/).
Relevant links to check before filing a feature request to see if your request has already been made or
if there's another way to achieve what you want:
* [LangChain Forum](https://forum.langchain.com/),
* [LangChain documentation with the integrated search](https://docs.langchain.com/oss/python/langchain/overview),
* [API Reference](https://python.langchain.com/api_reference/),
* [LangChain ChatBot](https://chat.langchain.com/)
* [GitHub search](https://github.com/langchain-ai/langchain),
- type: checkboxes
id: checks
attributes:
label: Checked other resources
description: Please confirm and check all the following options.
options:
- label: This is a feature request, not a bug report or usage question.
required: true
- label: I added a clear and descriptive title that summarizes the feature request.
required: true
- label: I used the GitHub search to find a similar feature request and didn't find it.
required: true
- label: I checked the LangChain documentation and API reference to see if this feature already exists.
required: true
- label: This is not related to the langchain-community package.
required: true
- type: textarea
id: feature-description
validations:
required: true
attributes:
label: Feature Description
description: |
Please provide a clear and concise description of the feature you would like to see added to LangChain.
What specific functionality are you requesting? Be as detailed as possible.
placeholder: |
I would like LangChain to support...
This feature would allow users to...
- type: textarea
id: use-case
validations:
required: true
attributes:
label: Use Case
description: |
Describe the specific use case or problem this feature would solve.
Why do you need this feature? What problem does it solve for you or other users?
placeholder: |
I'm trying to build an application that...
Currently, I have to work around this by...
This feature would help me/users to...
- type: textarea
id: proposed-solution
validations:
required: false
attributes:
label: Proposed Solution
description: |
If you have ideas about how this feature could be implemented, please describe them here.
This is optional but can be helpful for maintainers to understand your vision.
placeholder: |
I think this could be implemented by...
The API could look like...
```python
# Example of how the feature might work
```
- type: textarea
id: alternatives
validations:
required: false
attributes:
label: Alternatives Considered
description: |
Have you considered any alternative solutions or workarounds?
What other approaches have you tried or considered?
placeholder: |
I've tried using...
Alternative approaches I considered:
1. ...
2. ...
But these don't work because...
- type: textarea
id: additional-context
validations:
required: false
attributes:
label: Additional Context
description: |
Add any other context, screenshots, examples, or references that would help explain your feature request.
placeholder: |
Related issues: #...
Similar features in other libraries:
- ...
Additional context or examples:
- ...

@@ -4,12 +4,7 @@ body:
- type: markdown
attributes:
value: |
Thanks for your interest in LangChain! 🚀
If you are not a LangChain maintainer or were not asked directly by a maintainer to create an issue, then please start the conversation on the [LangChain Forum](https://forum.langchain.com/) instead.
You are a LangChain maintainer if you maintain any of the packages inside of the LangChain repository
or are a regular contributor to LangChain with previous merged pull requests.
If you are not a LangChain maintainer, employee, or were not asked directly by a maintainer to create an issue, then please start the conversation on the [LangChain Forum](https://forum.langchain.com/) instead.
- type: checkboxes
id: privileged
attributes:

.github/ISSUE_TEMPLATE/task.yml vendored Normal file
@@ -0,0 +1,91 @@
name: "📋 Task"
description: Create a task for project management and tracking by LangChain maintainers. If you are not a maintainer, please use other templates or the forum.
labels: ["task"]
type: task
body:
- type: markdown
attributes:
value: |
Thanks for creating a task to help organize LangChain development.
This template is for **maintainer tasks** such as project management, development planning, refactoring, documentation updates, and other organizational work.
If you are not a LangChain maintainer or were not asked directly by a maintainer to create a task, then please start the conversation on the [LangChain Forum](https://forum.langchain.com/) instead or use the appropriate bug report or feature request templates on the previous page.
- type: checkboxes
id: maintainer
attributes:
label: Maintainer task
description: Confirm that you are allowed to create a task here.
options:
- label: I am a LangChain maintainer, or was asked directly by a LangChain maintainer to create a task here.
required: true
- type: textarea
id: task-description
attributes:
label: Task Description
description: |
Provide a clear and detailed description of the task.
What needs to be done? Be specific about the scope and requirements.
placeholder: |
This task involves...
The goal is to...
Specific requirements:
- ...
- ...
validations:
required: true
- type: textarea
id: acceptance-criteria
attributes:
label: Acceptance Criteria
description: |
Define the criteria that must be met for this task to be considered complete.
What are the specific deliverables or outcomes expected?
placeholder: |
This task will be complete when:
- [ ] ...
- [ ] ...
- [ ] ...
validations:
required: true
- type: textarea
id: context
attributes:
label: Context and Background
description: |
Provide any relevant context, background information, or links to related issues/PRs.
Why is this task needed? What problem does it solve?
placeholder: |
Background:
- ...
Related issues/PRs:
- #...
Additional context:
- ...
validations:
required: false
- type: textarea
id: dependencies
attributes:
label: Dependencies
description: |
List any dependencies or blockers for this task.
Are there other tasks, issues, or external factors that need to be completed first?
placeholder: |
This task depends on:
- [ ] Issue #...
- [ ] PR #...
- [ ] External dependency: ...
Blocked by:
- ...
validations:
required: false

@@ -496,7 +496,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create a custom Vector Store\n",
"## Create a custom Vector Store\n",
"\n",
"Customize the vectorstore with special column names or with custom metadata columns.\n",
"\n",
@@ -617,7 +617,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create a Vector Store using existing table\n",
"## Create a Vector Store using existing table\n",
"\n",
"A Vector Store can be built up on an existing table.\n",
"\n",
@@ -713,6 +713,260 @@
"1. For new records added via `VectorStore`, embeddings are automatically generated."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Hybrid Search with PGVectorStore\n",
"\n",
"Hybrid search combines multiple lookup strategies to return more comprehensive and relevant results. Specifically, it uses both dense embedding vector search (for semantic similarity) and TSV (Text Search Vector) based keyword search (for lexical matching). This approach is particularly useful for applications that need to search efficiently through customized text and metadata, especially when a specialized embedding model isn't feasible or necessary.\n",
"\n",
"By integrating both semantic and lexical capabilities, hybrid search helps overcome the limitations of each individual method:\n",
"* **Semantic Search**: Excellent for understanding the meaning of a query, even if the exact keywords aren't present. However, it can sometimes miss highly relevant documents that contain the precise keywords but have a slightly different semantic context.\n",
"* **Keyword Search**: Highly effective for finding documents with exact keyword matches and is generally fast. Its weakness lies in its inability to understand synonyms, misspellings, or conceptual relationships."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Hybrid Search Config\n",
"\n",
"You can take advantage of hybrid search with PGVectorStore using the `HybridSearchConfig`.\n",
"\n",
"With a `HybridSearchConfig` provided, the `PGVectorStore` class can efficiently manage a hybrid search vector store using PostgreSQL as the backend, automatically handling the creation and population of the necessary TSV columns when possible."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Building the config\n",
"\n",
"Here are the parameters to the hybrid search config:\n",
"* **tsv_column:** The name of the TSV column. Default: `<content_column>_tsv`\n",
"* **tsv_lang:** Value representing a supported language. Default: `pg_catalog.english`\n",
"* **fts_query:** If provided, this is used for secondary retrieval instead of the user-provided query.\n",
"* **fusion_function:** Determines how the results are merged. Default: equal-weighted sum ranking.\n",
"* **fusion_function_parameters:** Parameters for the fusion function.\n",
"* **primary_top_k:** Max results fetched for primary retrieval. Default: `4`\n",
"* **secondary_top_k:** Max results fetched for secondary retrieval. Default: `4`\n",
"* **index_name:** Name of the index built on the `tsv_column`\n",
"* **index_type:** GIN or GIST. Default: `GIN`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here is an example `HybridSearchConfig`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain_postgres.v2.hybrid_search_config import (\n",
" HybridSearchConfig,\n",
" reciprocal_rank_fusion,\n",
")\n",
"\n",
"hybrid_search_config = HybridSearchConfig(\n",
" tsv_column=\"hybrid_description\",\n",
" tsv_lang=\"pg_catalog.english\",\n",
" fusion_function=reciprocal_rank_fusion,\n",
" fusion_function_parameters={\n",
" \"rrf_k\": 60,\n",
" \"fetch_top_k\": 10,\n",
" },\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note:** This example uses `reciprocal_rank_fusion` as the fusion function, but you can also use `weighted_sum_ranking`.\n",
"\n",
"Make sure to pass the parameters that match the chosen fusion function:\n",
"\n",
"`reciprocal_rank_fusion`:\n",
"* rrf_k: The RRF parameter k. Defaults to 60\n",
"* fetch_top_k: The number of documents to fetch after merging the results. Defaults to 4\n",
"\n",
"`weighted_sum_ranking`:\n",
"* primary_results_weight: The weight for the primary source's scores. Defaults to 0.5\n",
"* secondary_results_weight: The weight for the secondary source's scores. Defaults to 0.5\n",
"* fetch_top_k: The number of documents to fetch after merging the results. Defaults to 4\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Usage\n",
"\n",
"Let's assume we are using the previously mentioned table [`products`](#create-a-vector-store-using-existing-table), which stores product details for an e-commerce business.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### With a new hybrid search table\n",
"To create a new Postgres table with a TSV column, specify the hybrid search config when initializing the vector store.\n",
"\n",
"In this case, all the similarity searches will make use of hybrid search."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain_postgres import PGVectorStore\n",
"\n",
"TABLE_NAME = \"hybrid_search_products\"\n",
"\n",
"await pg_engine.ainit_vectorstore_table(\n",
" table_name=TABLE_NAME,\n",
" # schema_name=SCHEMA_NAME,\n",
" vector_size=VECTOR_SIZE,\n",
" id_column=\"product_id\",\n",
" content_column=\"description\",\n",
" embedding_column=\"embed\",\n",
" metadata_columns=[\"name\", \"category\", \"price_usd\", \"quantity\", \"sku\", \"image_url\"],\n",
" metadata_json_column=\"metadata\",\n",
" hybrid_search_config=hybrid_search_config,\n",
" store_metadata=True,\n",
")\n",
"\n",
"vs_hybrid = await PGVectorStore.create(\n",
" pg_engine,\n",
" table_name=TABLE_NAME,\n",
" # schema_name=SCHEMA_NAME,\n",
" embedding_service=embedding,\n",
" # Connect to existing VectorStore by customizing below column names\n",
" id_column=\"product_id\",\n",
" content_column=\"description\",\n",
" embedding_column=\"embed\",\n",
" metadata_columns=[\"name\", \"category\", \"price_usd\", \"quantity\", \"sku\", \"image_url\"],\n",
" metadata_json_column=\"metadata\",\n",
" hybrid_search_config=hybrid_search_config,\n",
")\n",
"\n",
"# Fetch product documents from the previously created store\n",
"docs = await custom_store.asimilarity_search(\"products\", k=5)\n",
"# Add documents as usual; the hybrid search vector store also populates the tsv values in tsv_column\n",
"await vs_hybrid.aadd_documents(docs)\n",
"\n",
"# Use hybrid search\n",
"hybrid_docs = await vs_hybrid.asimilarity_search(\"products\", k=5)\n",
"print(hybrid_docs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### With a pre-existing table\n",
"\n",
"If a hybrid search config is **NOT** provided to `init_vectorstore_table` when the table is created, the table will not contain a `tsv_column`. In this case you can still take advantage of hybrid search using the `HybridSearchConfig`.\n",
"\n",
"Because the TSV column is not present, the TSV vectors are computed dynamically, on the fly, during hybrid search."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain_postgres import PGVectorStore\n",
"\n",
"# Set the existing table name\n",
"TABLE_NAME = \"products\"\n",
"# SCHEMA_NAME = \"my_schema\"\n",
"\n",
"hybrid_search_config = HybridSearchConfig(\n",
" tsv_lang=\"pg_catalog.english\",\n",
" fusion_function=reciprocal_rank_fusion,\n",
" fusion_function_parameters={\n",
" \"rrf_k\": 60,\n",
" \"fetch_top_k\": 10,\n",
" },\n",
")\n",
"\n",
"# Initialize PGVectorStore with the hybrid search config\n",
"custom_hybrid_store = await PGVectorStore.create(\n",
" pg_engine,\n",
" table_name=TABLE_NAME,\n",
" # schema_name=SCHEMA_NAME,\n",
" embedding_service=embedding,\n",
" # Connect to existing VectorStore by customizing below column names\n",
" id_column=\"product_id\",\n",
" content_column=\"description\",\n",
" embedding_column=\"embed\",\n",
" metadata_columns=[\"name\", \"category\", \"price_usd\", \"quantity\", \"sku\", \"image_url\"],\n",
" metadata_json_column=\"metadata\",\n",
" hybrid_search_config=hybrid_search_config,\n",
")\n",
"\n",
"# Use hybrid search\n",
"hybrid_docs = await custom_hybrid_store.asimilarity_search(\"products\", k=5)\n",
"print(hybrid_docs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this case, all the similarity searches will make use of hybrid search."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Applying Hybrid Search to Specific Queries\n",
"\n",
"To use hybrid search only for certain queries, omit the configuration during initialization and pass it directly to the search method when needed.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Use hybrid search\n",
"hybrid_docs = await custom_store.asimilarity_search(\n",
" \"products\", k=5, hybrid_search_config=hybrid_search_config\n",
")\n",
"print(hybrid_docs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Hybrid Search Index\n",
"\n",
"Optionally, if you have created a Postgres table with a `tsv_column`, you can create an index on it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"await vs_hybrid.aapply_hybrid_search_index()"
]
},
{
"cell_type": "markdown",
"metadata": {},
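
Note on the fusion step described in the notebook cells above: the following is a minimal, standalone sketch of reciprocal rank fusion over two ranked lists of document ids. It is not taken from `langchain_postgres`; the function name `rrf_merge`, the 0-based rank convention, and the sample ids are assumptions made for illustration. Only the parameter names `rrf_k` and `fetch_top_k` mirror the ones listed in the notebook.

```python
from collections import defaultdict


def rrf_merge(primary, secondary, rrf_k=60, fetch_top_k=4):
    """Merge two ranked lists of document ids with reciprocal rank fusion.

    Hypothetical helper for illustration; not the library's implementation.
    """
    scores = defaultdict(float)
    for ranked in (primary, secondary):
        for rank, doc_id in enumerate(ranked):
            # rank is 0-based, so the top item of a list contributes 1 / (rrf_k + 1)
            scores[doc_id] += 1.0 / (rrf_k + rank + 1)
    # Highest fused score first, then keep only the requested number of ids
    ordered = sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
    return ordered[:fetch_top_k]


# Made-up result lists: dense (semantic) hits and keyword (TSV) hits
dense_hits = ["a", "b", "c"]
keyword_hits = ["b", "d", "a"]
merged = rrf_merge(dense_hits, keyword_hits, fetch_top_k=3)
print(merged)
```

Documents that appear in both lists accumulate two contributions, which is why ids found by both the semantic and the keyword search tend to rise to the top.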

@@ -1120,8 +1120,7 @@ def test_web_search(output_version: Literal["v0", "v1"]) -> None:
)
# @pytest.mark.vcr
@pytest.mark.xfail(reason="Citations broken in Anthropic API; all other features work")
@pytest.mark.vcr
def test_web_fetch() -> None:
"""Note: this is a beta feature.
@@ -1179,7 +1178,9 @@ def test_web_fetch() -> None:
citation_response = llm_with_citations.invoke([citation_message])
citation_results = [
block for block in citation_response.content if isinstance(block, dict)
block
for block in citation_response.content
if isinstance(block, dict) and block.get("type") == "web_fetch_tool_result"
]
assert len(citation_results) == 1 # Since max_uses=1
citation_result = citation_results[0]
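
Note on the filter change in this hunk: the sketch below shows why checking only `isinstance(block, dict)` can match more blocks than intended, while also checking `block.get("type")` narrows the result to the tool result. The `content` list is invented sample data, not a real API response, and the block shapes other than the `web_fetch_tool_result` type name are assumptions.

```python
# Hypothetical response content, invented for illustration only
content = [
    "plain text chunk",
    {"type": "text", "text": "Here is what I found."},
    {"type": "server_tool_use", "name": "web_fetch"},
    {"type": "web_fetch_tool_result", "content": {}},
]

# Old filter: every dict block passes, regardless of its type
loose = [block for block in content if isinstance(block, dict)]

# New filter: only the web fetch tool result passes
strict = [
    block
    for block in content
    if isinstance(block, dict) and block.get("type") == "web_fetch_tool_result"
]

print(len(loose), len(strict))
```

With the stricter filter, an assertion such as `len(citation_results) == 1` depends only on the number of tool results, not on how many other dict blocks the response happens to contain.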