Merge branch 'master' into wip-v1.0

2026-02-21 06:33:41 +00:00 · 2025-09-11 23:21:37 -04:00
parent 9f14714367 649d8a8223
commit 67aa37b144
8 changed files with 481 additions and 17 deletions
--- a/.github/ISSUE_TEMPLATE/bug-report.yml
+++ b/.github/ISSUE_TEMPLATE/bug-report.yml
@@ -1,6 +1,7 @@
 name: "\U0001F41B Bug Report"
 description: Report a bug in LangChain. To report a security issue, please instead use the security option below. For questions, please use the LangChain forum.
 labels: ["bug"]
+type: bug
 body:
  - type: markdown
    attributes:
@@ -13,9 +14,7 @@ body:
        if there's another way to solve your problem:

        * [LangChain Forum](https://forum.langchain.com/),
-        * [LangChain Github Issues](https://github.com/langchain-ai/langchain/issues?q=is%3Aissue),
-        * [LangChain documentation with the integrated search](https://python.langchain.com/docs/get_started/introduction),
-        * [LangChain how-to guides](https://python.langchain.com/docs/how_to/),
+        * [LangChain documentation with the integrated search](https://docs.langchain.com/oss/python/langchain/overview),
        * [API Reference](https://python.langchain.com/api_reference/),
        * [LangChain ChatBot](https://chat.langchain.com/)
        * [GitHub search](https://github.com/langchain-ai/langchain),
@@ -25,7 +24,7 @@ body:
      label: Checked other resources
      description: Please confirm and check all the following options.
      options:
-        - label: This is a bug, not a usage question. For questions, please use the LangChain Forum (https://forum.langchain.com/).
+        - label: This is a bug, not a usage question.
          required: true
        - label: I added a clear and descriptive title that summarizes this issue.
          required: true
@@ -35,6 +34,8 @@ body:
          required: true
        - label: The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
          required: true
+        - label: This is not related to the langchain-community package.
+          required: true
        - label: I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
          required: true
        - label: I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.
@@ -118,3 +119,7 @@ body:
        python -m langchain_core.sys_info
    validations:
      required: true
+
+
+
+
--- a/.github/ISSUE_TEMPLATE/config.yml
+++ b/.github/ISSUE_TEMPLATE/config.yml
@@ -1,9 +1,9 @@
 blank_issues_enabled: false
 version: 2.1
 contact_links:
-  - name: Documentation
+  - name: 📚 Documentation
    url: https://github.com/langchain-ai/docs/issues/new?template=langchain.yml
    about: Report an issue related to the LangChain documentation
-  - name: LangChain Forum
+  - name: 💬 LangChain Forum
    url:  https://forum.langchain.com/
    about: General community discussions and support
--- a/.github/ISSUE_TEMPLATE/feature-request.yml
+++ b/.github/ISSUE_TEMPLATE/feature-request.yml
@@ -0,0 +1,118 @@
+name: "✨ Feature Request"
+description: Request a new feature or enhancement for LangChain. For questions, please use the LangChain forum.
+labels: ["feature request"]
+type: feature
+body:
+  - type: markdown
+    attributes:
+      value: |
+        Thank you for taking the time to request a new feature.
+
+        Use this to request NEW FEATURES or ENHANCEMENTS in LangChain. For bug reports, please use the bug report template. For usage questions and general design questions, please use the [LangChain Forum](https://forum.langchain.com/).
+
+        Relevant links to check before filing a feature request to see if your request has already been made or
+        if there's another way to achieve what you want:
+
+        * [LangChain Forum](https://forum.langchain.com/),
+        * [LangChain documentation with the integrated search](https://docs.langchain.com/oss/python/langchain/overview),
+        * [API Reference](https://python.langchain.com/api_reference/),
+        * [LangChain ChatBot](https://chat.langchain.com/)
+        * [GitHub search](https://github.com/langchain-ai/langchain),
+  - type: checkboxes
+    id: checks
+    attributes:
+      label: Checked other resources
+      description: Please confirm and check all the following options.
+      options:
+        - label: This is a feature request, not a bug report or usage question.
+          required: true
+        - label: I added a clear and descriptive title that summarizes the feature request.
+          required: true
+        - label: I used the GitHub search to find a similar feature request and didn't find it.
+          required: true
+        - label: I checked the LangChain documentation and API reference to see if this feature already exists.
+          required: true
+        - label: This is not related to the langchain-community package.
+          required: true
+  - type: textarea
+    id: feature-description
+    validations:
+      required: true
+    attributes:
+      label: Feature Description
+      description: |
+        Please provide a clear and concise description of the feature you would like to see added to LangChain.
+        
+        What specific functionality are you requesting? Be as detailed as possible.
+      placeholder: |
+        I would like LangChain to support...
+        
+        This feature would allow users to...
+  - type: textarea
+    id: use-case
+    validations:
+      required: true
+    attributes:
+      label: Use Case
+      description: |
+        Describe the specific use case or problem this feature would solve.
+        
+        Why do you need this feature? What problem does it solve for you or other users?
+      placeholder: |
+        I'm trying to build an application that...
+        
+        Currently, I have to work around this by...
+        
+        This feature would help me/users to...
+  - type: textarea
+    id: proposed-solution
+    validations:
+      required: false
+    attributes:
+      label: Proposed Solution
+      description: |
+        If you have ideas about how this feature could be implemented, please describe them here.
+        
+        This is optional but can be helpful for maintainers to understand your vision.
+      placeholder: |
+        I think this could be implemented by...
+        
+        The API could look like...
+        
+        ```python
+        # Example of how the feature might work
+        ```
+  - type: textarea
+    id: alternatives
+    validations:
+      required: false
+    attributes:
+      label: Alternatives Considered
+      description: |
+        Have you considered any alternative solutions or workarounds?
+        
+        What other approaches have you tried or considered?
+      placeholder: |
+        I've tried using...
+        
+        Alternative approaches I considered:
+        1. ...
+        2. ...
+        
+        But these don't work because...
+  - type: textarea
+    id: additional-context
+    validations:
+      required: false
+    attributes:
+      label: Additional Context
+      description: |
+        Add any other context, screenshots, examples, or references that would help explain your feature request.
+      placeholder: |
+        Related issues: #...
+        
+        Similar features in other libraries:
+        - ...
+        
+        Additional context or examples:
+        - ...
--- a/.github/ISSUE_TEMPLATE/privileged.yml
+++ b/.github/ISSUE_TEMPLATE/privileged.yml
@@ -4,12 +4,7 @@ body:
  - type: markdown
    attributes:
      value: |
-        Thanks for your interest in LangChain! 🚀
-
-        If you are not a LangChain maintainer or were not asked directly by a maintainer to create an issue, then please start the conversation on the [LangChain Forum](https://forum.langchain.com/) instead.
-
-        You are a LangChain maintainer if you maintain any of the packages inside of the LangChain repository
-        or are a regular contributor to LangChain with previous merged pull requests.
+        If you are not a LangChain maintainer or employee, and were not asked directly by a maintainer to create an issue, please start the conversation on the [LangChain Forum](https://forum.langchain.com/) instead.
  - type: checkboxes
    id: privileged
    attributes:
--- a/.github/ISSUE_TEMPLATE/task.yml
+++ b/.github/ISSUE_TEMPLATE/task.yml
@@ -0,0 +1,91 @@
+name: "📋 Task"
+description: Create a task for project management and tracking by LangChain maintainers. If you are not a maintainer, please use other templates or the forum.
+labels: ["task"]
+type: task
+body:
+  - type: markdown
+    attributes:
+      value: |
+        Thanks for creating a task to help organize LangChain development.
+
+        This template is for **maintainer tasks** such as project management, development planning, refactoring, documentation updates, and other organizational work.
+
+        If you are not a LangChain maintainer or were not asked directly by a maintainer to create a task, then please start the conversation on the [LangChain Forum](https://forum.langchain.com/) instead or use the appropriate bug report or feature request templates on the previous page.
+  - type: checkboxes
+    id: maintainer
+    attributes:
+      label: Maintainer task
+      description: Confirm that you are allowed to create a task here.
+      options:
+        - label: I am a LangChain maintainer, or was asked directly by a LangChain maintainer to create a task here.
+          required: true
+  - type: textarea
+    id: task-description
+    attributes:
+      label: Task Description
+      description: |
+        Provide a clear and detailed description of the task.
+        
+        What needs to be done? Be specific about the scope and requirements.
+      placeholder: |
+        This task involves...
+        
+        The goal is to...
+        
+        Specific requirements:
+        - ...
+        - ...
+    validations:
+      required: true
+  - type: textarea
+    id: acceptance-criteria
+    attributes:
+      label: Acceptance Criteria
+      description: |
+        Define the criteria that must be met for this task to be considered complete.
+        
+        What are the specific deliverables or outcomes expected?
+      placeholder: |
+        This task will be complete when:
+        - [ ] ...
+        - [ ] ...
+        - [ ] ...
+    validations:
+      required: true
+  - type: textarea
+    id: context
+    attributes:
+      label: Context and Background
+      description: |
+        Provide any relevant context, background information, or links to related issues/PRs.
+        
+        Why is this task needed? What problem does it solve?
+      placeholder: |
+        Background:
+        - ...
+        
+        Related issues/PRs:
+        - #...
+        
+        Additional context:
+        - ...
+    validations:
+      required: false
+  - type: textarea
+    id: dependencies
+    attributes:
+      label: Dependencies
+      description: |
+        List any dependencies or blockers for this task.
+        
+        Are there other tasks, issues, or external factors that need to be completed first?
+      placeholder: |
+        This task depends on:
+        - [ ] Issue #...
+        - [ ] PR #...
+        - [ ] External dependency: ...
+        
+        Blocked by:
+        - ...
+    validations:
+      required: false
--- a/docs/docs/integrations/vectorstores/pgvectorstore.ipynb
+++ b/docs/docs/integrations/vectorstores/pgvectorstore.ipynb
@@ -496,7 +496,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "### Create a custom Vector Store\n",
+    "## Create a custom Vector Store\n",
    "\n",
    "Customize the vectorstore with special column names or with custom metadata columns.\n",
    "\n",
@@ -617,7 +617,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "### Create a Vector Store using existing table\n",
+    "## Create a Vector Store using existing table\n",
    "\n",
    "A Vector Store can be built up on an existing table.\n",
    "\n",
@@ -713,6 +713,260 @@
    "1. For new records, added via `VectorStore` embeddings are automatically generated."
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Hybrid Search with PGVectorStore\n",
+    "\n",
+    "A Hybrid Search combines multiple lookup strategies to provide more comprehensive and relevant search results. Specifically, it leverages both dense embedding vector search (for semantic similarity) and TSV (Text Search Vector) based keyword search (for lexical matching). This approach is particularly powerful for applications requiring efficient searching through customized text and metadata, especially when a specialized embedding model isn't feasible or necessary.\n",
+    "\n",
+    "By integrating both semantic and lexical capabilities, hybrid search helps overcome the limitations of each individual method:\n",
+    "* **Semantic Search**: Excellent for understanding the meaning of a query, even if the exact keywords aren't present. However, it can sometimes miss highly relevant documents that contain the precise keywords but have a slightly different semantic context.\n",
+    "* **Keyword Search**: Highly effective for finding documents with exact keyword matches and is generally fast. Its weakness lies in its inability to understand synonyms, misspellings, or conceptual relationships."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Hybrid Search Config\n",
+    "\n",
+    "You can take advantage of hybrid search with PGVectorStore using the `HybridSearchConfig`.\n",
+    "\n",
+    "With a `HybridSearchConfig` provided, the `PGVectorStore` class can efficiently manage a hybrid search vector store using PostgreSQL as the backend, automatically handling the creation and population of the necessary TSV columns when possible."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Building the config\n",
+    "\n",
+    "Here are the parameters to the hybrid search config:\n",
+    "* **tsv_column:** The name of the TSV column. Default: `<content_column>_tsv`\n",
+    "* **tsv_lang:** Value representing a supported language. Default: `pg_catalog.english`\n",
+    "* **fts_query:** If provided, this is used for secondary retrieval instead of the user-provided query.\n",
+    "* **fusion_function:** Determines how the results are merged. Default: equal weighted sum ranking.\n",
+    "* **fusion_function_parameters:** Parameters for the fusion function\n",
+    "* **primary_top_k:** Max results fetched for primary retrieval. Default: `4`\n",
+    "* **secondary_top_k:** Max results fetched for secondary retrieval. Default: `4`\n",
+    "* **index_name:** Name of the index built on the `tsv_column`\n",
+    "* **index_type:** GIN or GIST. Default: `GIN`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Here is an example `HybridSearchConfig`:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_postgres.v2.hybrid_search_config import (\n",
+    "    HybridSearchConfig,\n",
+    "    reciprocal_rank_fusion,\n",
+    ")\n",
+    "\n",
+    "hybrid_search_config = HybridSearchConfig(\n",
+    "    tsv_column=\"hybrid_description\",\n",
+    "    tsv_lang=\"pg_catalog.english\",\n",
+    "    fusion_function=reciprocal_rank_fusion,\n",
+    "    fusion_function_parameters={\n",
+    "        \"rrf_k\": 60,\n",
+    "        \"fetch_top_k\": 10,\n",
+    "    },\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Note:** In this case, the fusion function is set to `reciprocal_rank_fusion`, but you can also use `weighted_sum_ranking`.\n",
+    "\n",
+    "Make sure to pass the parameters that match the chosen fusion function:\n",
+    "\n",
+    "`reciprocal_rank_fusion`:\n",
+    "* rrf_k: The RRF parameter k. Defaults to 60\n",
+    "* fetch_top_k: The number of documents to fetch after merging the results. Defaults to 4\n",
+    "\n",
+    "`weighted_sum_ranking`:\n",
+    "* primary_results_weight: The weight for the primary source's scores. Defaults to 0.5\n",
+    "* secondary_results_weight: The weight for the secondary source's scores. Defaults to 0.5\n",
+    "* fetch_top_k: The number of documents to fetch after merging the results. Defaults to 4\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Usage\n",
+    "\n",
+    "Let's assume we are using the previously mentioned table [`products`](#create-a-vector-store-using-existing-table), which stores product details for an e-commerce venture.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### With a new hybrid search table\n",
+    "To create a new Postgres table with the TSV column, specify the hybrid search config when initializing the vector store.\n",
+    "\n",
+    "In this case, all the similarity searches will make use of hybrid search."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_postgres import PGVectorStore\n",
+    "\n",
+    "TABLE_NAME = \"hybrid_search_products\"\n",
+    "\n",
+    "await pg_engine.ainit_vectorstore_table(\n",
+    "    table_name=TABLE_NAME,\n",
+    "    # schema_name=SCHEMA_NAME,\n",
+    "    vector_size=VECTOR_SIZE,\n",
+    "    id_column=\"product_id\",\n",
+    "    content_column=\"description\",\n",
+    "    embedding_column=\"embed\",\n",
+    "    metadata_columns=[\"name\", \"category\", \"price_usd\", \"quantity\", \"sku\", \"image_url\"],\n",
+    "    metadata_json_column=\"metadata\",\n",
+    "    hybrid_search_config=hybrid_search_config,\n",
+    "    store_metadata=True,\n",
+    ")\n",
+    "\n",
+    "vs_hybrid = await PGVectorStore.create(\n",
+    "    pg_engine,\n",
+    "    table_name=TABLE_NAME,\n",
+    "    # schema_name=SCHEMA_NAME,\n",
+    "    embedding_service=embedding,\n",
+    "    # Connect to existing VectorStore by customizing below column names\n",
+    "    id_column=\"product_id\",\n",
+    "    content_column=\"description\",\n",
+    "    embedding_column=\"embed\",\n",
+    "    metadata_columns=[\"name\", \"category\", \"price_usd\", \"quantity\", \"sku\", \"image_url\"],\n",
+    "    metadata_json_column=\"metadata\",\n",
+    "    hybrid_search_config=hybrid_search_config,\n",
+    ")\n",
+    "\n",
+    "# Fetch product documents from the previously created store\n",
+    "docs = await custom_store.asimilarity_search(\"products\", k=5)\n",
+    "# Add data normally to the hybrid search vector store, which will also add the tsv values in tsv_column\n",
+    "await vs_hybrid.aadd_documents(docs)\n",
+    "\n",
+    "# Use hybrid search\n",
+    "hybrid_docs = await vs_hybrid.asimilarity_search(\"products\", k=5)\n",
+    "print(hybrid_docs)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### With a pre-existing table\n",
+    "\n",
+    "If a hybrid search config is **NOT** provided during `init_vectorstore_table` while creating a table, the table will not contain a tsv_column. In this case you can still take advantage of hybrid search using the `HybridSearchConfig`.\n",
+    "\n",
+    "Although the specified TSV column is not present in the table, the TSV vectors are computed on the fly for each hybrid search."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_postgres import PGVectorStore\n",
+    "\n",
+    "# Set the existing table name\n",
+    "TABLE_NAME = \"products\"\n",
+    "# SCHEMA_NAME = \"my_schema\"\n",
+    "\n",
+    "hybrid_search_config = HybridSearchConfig(\n",
+    "    tsv_lang=\"pg_catalog.english\",\n",
+    "    fusion_function=reciprocal_rank_fusion,\n",
+    "    fusion_function_parameters={\n",
+    "        \"rrf_k\": 60,\n",
+    "        \"fetch_top_k\": 10,\n",
+    "    },\n",
+    ")\n",
+    "\n",
+    "# Initialize PGVectorStore with the hybrid search config\n",
+    "custom_hybrid_store = await PGVectorStore.create(\n",
+    "    pg_engine,\n",
+    "    table_name=TABLE_NAME,\n",
+    "    # schema_name=SCHEMA_NAME,\n",
+    "    embedding_service=embedding,\n",
+    "    # Connect to existing VectorStore by customizing below column names\n",
+    "    id_column=\"product_id\",\n",
+    "    content_column=\"description\",\n",
+    "    embedding_column=\"embed\",\n",
+    "    metadata_columns=[\"name\", \"category\", \"price_usd\", \"quantity\", \"sku\", \"image_url\"],\n",
+    "    metadata_json_column=\"metadata\",\n",
+    "    hybrid_search_config=hybrid_search_config,\n",
+    ")\n",
+    "\n",
+    "# Use hybrid search\n",
+    "hybrid_docs = await custom_hybrid_store.asimilarity_search(\"products\", k=5)\n",
+    "print(hybrid_docs)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In this case, all the similarity searches will make use of hybrid search."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Applying Hybrid Search to Specific Queries\n",
+    "\n",
+    "To use hybrid search only for certain queries, omit the configuration during initialization and pass it directly to the search method when needed.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Use hybrid search\n",
+    "hybrid_docs = await custom_store.asimilarity_search(\n",
+    "    \"products\", k=5, hybrid_search_config=hybrid_search_config\n",
+    ")\n",
+    "print(hybrid_docs)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Hybrid Search Index\n",
+    "\n",
+    "Optionally, if you have created a Postgres table with a tsv_column, you can create an index."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "await vs_hybrid.aapply_hybrid_search_index()"
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
--- a/libs/partners/anthropic/tests/cassettes/test_web_fetch.yaml.gz
+++ b/libs/partners/anthropic/tests/cassettes/test_web_fetch.yaml.gz
--- a/libs/partners/anthropic/tests/integration_tests/test_chat_models.py
+++ b/libs/partners/anthropic/tests/integration_tests/test_chat_models.py
@@ -1120,8 +1120,7 @@ def test_web_search(output_version: Literal["v0", "v1"]) -> None:
    )


-# @pytest.mark.vcr
-@pytest.mark.xfail(reason="Citations broken in Anthropic API; all other features work")
+@pytest.mark.vcr
 def test_web_fetch() -> None:
    """Note: this is a beta feature.

@@ -1179,7 +1178,9 @@ def test_web_fetch() -> None:
    citation_response = llm_with_citations.invoke([citation_message])

    citation_results = [
-        block for block in citation_response.content if isinstance(block, dict)
+        block
+        for block in citation_response.content
+        if isinstance(block, dict) and block.get("type") == "web_fetch_tool_result"
    ]
    assert len(citation_results) == 1  # Since max_uses=1
    citation_result = citation_results[0]