bump 244 (#8314)

Clean queries prior to search (#8309)
With some search tools, no results are returned when the query is formatted as a numbered-list item.

E.g., if we pass:
```
'1. "LangChain vs LangSmith: How do they differ?"'
```
We see:
```
No good Google Search Result was found
```
Local testing w/ Streamlit:

![image](https://github.com/langchain-ai/langchain/assets/122662504/0a7e3dca-59e8-415e-8df6-bd9e4ea962ee)
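The cleanup described above can be sketched as a standalone function. This follows the logic of the `clean_search_query` method added in `web_research.py` in this commit; the empty-string guard is an assumption added here for safety and is not part of the committed method.

```python
def clean_search_query(query: str) -> str:
    """Strip a leading list number and the surrounding quotes from a query."""
    # Assumption: guard against empty input (the committed method
    # indexes query[0] directly).
    if query and query[0].isdigit():
        # Find the first double quote, e.g. in: 1. "LangCh..."
        first_quote_pos = query.find('"')
        if first_quote_pos != -1:
            # Keep only the text after the opening quote
            query = query[first_quote_pos + 1 :]
            # Drop the closing quote if present
            if query.endswith('"'):
                query = query[:-1]
    return query.strip()


print(clean_search_query('1. "LangChain vs LangSmith: How do they differ?"'))
# prints: LangChain vs LangSmith: How do they differ?
```

Note that a numbered query without quotes (e.g. `2. some question?`) passes through unchanged, since the number is only removed when a double quote is found.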
2026-02-05 08:40:36 +00:00 · 2023-07-26 11:58:26 -07:00 · 2023-07-26 11:48:28 -07:00 · 2023-07-26 11:45:50 -07:00 · 2023-07-26 11:31:08 -07:00 · 2023-07-26 11:30:17 -07:00
18 changed files with 274 additions and 60 deletions
--- a/docs/extras/integrations/document_loaders/geopandas.ipynb
+++ b/docs/extras/integrations/document_loaders/geopandas.ipynb
@@ -46,7 +46,7 @@
   "id": "04981332",
   "metadata": {},
   "source": [
-    "Create a GeoPandas dataframe from [`Open City Data`](https://python.langchain.com/docs/modules/data_connection/document_loaders/integrations/open_city_data) as an example input."
+    "Create a GeoPandas dataframe from [`Open City Data`](https://python.langchain.com/docs/integrations/document_loaders/open_city_data) as an example input."
   ]
  },
  {
--- a/docs/extras/integrations/providers/rockset.mdx
+++ b/docs/extras/integrations/providers/rockset.mdx
@@ -20,7 +20,7 @@ from langchain.vectorstores import RocksetDB

 ## Document Loader

-See a [usage example](docs/modules/data_connection/document_loaders/integrations/rockset).
+See a [usage example](/docs/integrations/document_loaders/rockset).
 ```python
 from langchain.document_loaders import RocksetLoader
 ```
--- a/docs/extras/integrations/toolkits/python.ipynb
+++ b/docs/extras/integrations/toolkits/python.ipynb
@@ -34,7 +34,7 @@
   "source": [
    "## Using ZERO_SHOT_REACT_DESCRIPTION\n",
    "\n",
-    "This shows how to initialize the agent using the ZERO_SHOT_REACT_DESCRIPTION agent type. Note that this is an alternative to the above."
+    "This shows how to initialize the agent using the ZERO_SHOT_REACT_DESCRIPTION agent type."
   ]
  },
  {
@@ -271,7 +271,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.9.1"
+   "version": "3.11.3"
  }
 },
 "nbformat": 4,
--- a/docs/extras/integrations/tools/ddg.ipynb
+++ b/docs/extras/integrations/tools/ddg.ipynb
@@ -12,7 +12,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 19,
   "id": "21e46d4d",
   "metadata": {},
   "outputs": [],
@@ -22,7 +22,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": 20,
   "id": "ac4910f8",
   "metadata": {},
   "outputs": [],
@@ -32,7 +32,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 21,
   "id": "84b8f773",
   "metadata": {},
   "outputs": [],
@@ -42,17 +42,17 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 22,
   "id": "068991a6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "'Barack Obama, in full Barack Hussein Obama II, (born August 4, 1961, Honolulu, Hawaii, U.S.), 44th president of the United States (2009-17) and the first African American to hold the office. Before winning the presidency, Obama represented Illinois in the U.S. Senate (2005-08). Barack Hussein Obama II (/ b ə ˈ r ɑː k h uː ˈ s eɪ n oʊ ˈ b ɑː m ə / bə-RAHK hoo-SAYN oh-BAH-mə; born August 4, 1961) is an American former politician who served as the 44th president of the United States from 2009 to 2017. A member of the Democratic Party, he was the first African-American president of the United States. Obama previously served as a U.S. senator representing ... Barack Obama was the first African American president of the United States (2009-17). He oversaw the recovery of the U.S. economy (from the Great Recession of 2008-09) and the enactment of landmark health care reform (the Patient Protection and Affordable Care Act ). In 2009 he was awarded the Nobel Peace Prize. His birth certificate lists his first name as Barack: That\\'s how Obama has spelled his name throughout his life. His name derives from a Hebrew name which means \"lightning.\". The Hebrew word has been transliterated into English in various spellings, including Barak, Buraq, Burack, and Barack. Most common names of U.S. presidents 1789-2021. Published by. Aaron O\\'Neill , Jun 21, 2022. The most common first name for a U.S. president is James, followed by John and then William. Six U.S ...'"
+       "'August 4, 1961 (age 61) Honolulu Hawaii Title / Office: presidency of the United States of America (2009-2017), United States United States Senate (2005-2008), United States ... (Show more) Political Affiliation: Democratic Party Awards And Honors: Barack Hussein Obama II (/ b ə ˈ r ɑː k h uː ˈ s eɪ n oʊ ˈ b ɑː m ə / bə-RAHK hoo-SAYN oh-BAH-mə; born August 4, 1961) is an American politician who served as the 44th president of the United States from 2009 to 2017. A member of the Democratic Party, he was the first African-American president of the United States. Obama previously served as a U.S. senator representing Illinois ... Answer (1 of 12): I see others have answered President Obama\\'s name which is \"Barack Hussein Obama\". President Obama has received many comments about his name from the racists across US. It is worth noting that he never changed his name. Also, it is worth noting that a simple search would have re... What is Barack Obama\\'s full name? Updated: 11/11/2022 Wiki User ∙ 6y ago Study now See answer (1) Best Answer Copy His full, birth name is Barack Hussein Obama, II. He was named after his... Alex Oliveira July 24, 2023 4:57pm Updated 0 seconds of 43 secondsVolume 0% 00:00 00:43 The man who drowned while paddleboarding on a pond outside the Obamas\\' Martha\\'s Vineyard estate has been...'"
      ]
     },
-     "execution_count": 5,
+     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -60,6 +60,145 @@
   "source": [
    "search.run(\"Obama's first name?\")"
   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "889027d4",
+   "metadata": {},
+   "source": [
+    "To get more additional information (e.g. link, source) use `DuckDuckGoSearchResults()`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 23,
+   "id": "95635444",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain.tools import DuckDuckGoSearchResults"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 24,
+   "id": "0133d103",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "search = DuckDuckGoSearchResults()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 25,
+   "id": "439efc06",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "\"[snippet: Barack Hussein Obama II (/ b ə ˈ r ɑː k h uː ˈ s eɪ n oʊ ˈ b ɑː m ə / bə-RAHK hoo-SAYN oh-BAH-mə; born August 4, 1961) is an American politician who served as the 44th president of the United States from 2009 to 2017. A member of the Democratic Party, he was the first African-American president of the United States. Obama previously served as a U.S. senator representing Illinois ..., title: Barack Obama - Wikipedia, link: https://en.wikipedia.org/wiki/Barack_Obama], [snippet: Barack Obama, in full Barack Hussein Obama II, (born August 4, 1961, Honolulu, Hawaii, U.S.), 44th president of the United States (2009-17) and the first African American to hold the office. Before winning the presidency, Obama represented Illinois in the U.S. Senate (2005-08). He was the third African American to be elected to that body ..., title: Barack Obama | Biography, Parents, Education, Presidency, Books ..., link: https://www.britannica.com/biography/Barack-Obama], [snippet: Barack Obama 's tenure as the 44th president of the United States began with his first inauguration on January 20, 2009, and ended on January 20, 2017. A Democrat from Illinois, Obama took office following a decisive victory over Republican nominee John McCain in the 2008 presidential election. Four years later, in the 2012 presidential ..., title: Presidency of Barack Obama - Wikipedia, link: https://en.wikipedia.org/wiki/Presidency_of_Barack_Obama], [snippet: First published on Mon 24 Jul 2023 20.03 EDT. Barack Obama's personal chef died while paddleboarding near the ex-president's home on Martha's Vineyard over the weekend, Massachusetts state ..., title: Obama's personal chef dies while paddleboarding off Martha's Vineyard ..., link: https://www.theguardian.com/us-news/2023/jul/24/tafari-campbell-barack-obama-chef-drowns-marthas-vineyard]\""
+      ]
+     },
+     "execution_count": 25,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "search.run(\"Obama\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e17ccfe7",
+   "metadata": {},
+   "source": [
+    "You can also just search for news articles. Use the keyword ``backend=\"news\"``"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 26,
+   "id": "21afe28d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "search = DuckDuckGoSearchResults(backend=\"news\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 27,
+   "id": "2a4beeb9",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "\"[date: 2023-07-26T12:01:22, title: 'My heart is broken': Former Obama White House chef mourned following apparent drowning death in Edgartown, snippet: Tafari Campbell of Dumfries, Va., had been paddle boarding in Edgartown Great Pond when he appeared to briefly struggle, submerged, and did not return to the surface, authorities have said. Crews ultimately found the 45-year-old's body Monday morning., source: The Boston Globe on MSN.com, link: https://www.msn.com/en-us/news/us/my-heart-is-broken-former-obama-white-house-chef-mourned-following-apparent-drowning-death-in-edgartown/ar-AA1elNB8], [date: 2023-07-25T18:44:00, title: Obama's chef drowns paddleboarding near former president's Edgartown vacation home, snippet: Campbell was visiting Martha's Vineyard, where the Obamas own a vacation home. He was not wearing a lifejacket when he fell off his paddleboard., source: YAHOO!News, link: https://news.yahoo.com/obama-chef-drowns-paddleboarding-near-184437491.html], [date: 2023-07-26T00:30:00, title: Obama's personal chef dies while paddleboarding off Martha's Vineyard, snippet: Tafari Campbell, who worked at the White House during Obama's presidency, was visiting the island while the family was away, source: The Guardian, link: https://www.theguardian.com/us-news/2023/jul/24/tafari-campbell-barack-obama-chef-drowns-marthas-vineyard], [date: 2023-07-24T21:54:00, title: Obama's chef ID'd as paddleboarder who drowned near former president's Martha's Vineyard estate, snippet: Former President Barack Obama's personal chef, Tafari Campbell, has been identified as the paddle boarder who drowned near the Obamas' Martha's Vineyard estate., source: Fox News, link: https://www.foxnews.com/politics/obamas-chef-idd-paddleboarder-who-drowned-near-former-presidents-marthas-vineyard-estate]\""
+      ]
+     },
+     "execution_count": 27,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "search.run(\"Obama\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5f7c0129",
+   "metadata": {},
+   "source": [
+    "You can also directly pass a custom ``DuckDuckGoSearchAPIWrapper`` to ``DuckDuckGoSearchResults``. Therefore, you have much more control over the search results."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 28,
+   "id": "c7ab3b55",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain.utilities import DuckDuckGoSearchAPIWrapper\n",
+    "\n",
+    "wrapper = DuckDuckGoSearchAPIWrapper(region=\"de-de\", time=\"d\", max_results=2)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 29,
+   "id": "adce16e1",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "search = DuckDuckGoSearchResults(api_wrapper=wrapper, backend=\"news\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 30,
+   "id": "b7e77c54",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'[date: 2023-07-25T12:15:00, title: Barack + Michelle Obama: Sie trauern um Angestellten, snippet: Barack und Michelle Obama trauern um ihren ehemaligen Küchenchef Tafari Campbell. Der Familienvater verunglückte am vergangenen Sonntag und wurde in einem Teich geborgen., source: Gala, link: https://www.gala.de/stars/news/barack---michelle-obama--sie-trauern-um-angestellten-23871228.html], [date: 2023-07-25T10:30:00, title: Barack Obama: Sein Koch (†45) ist tot - diese Details sind bekannt, snippet: Tafari Campbell war früher im Weißen Haus eingestellt, arbeitete anschließend weiter für Ex-Präsident Barack Obama. Nun ist er gestorben. Diese Details sind bekannt., source: T-Online, link: https://www.t-online.de/unterhaltung/stars/id_100213226/barack-obama-sein-koch-45-ist-tot-diese-details-sind-bekannt.html], [date: 2023-07-25T05:33:23, title: Barack Obama: Sein Privatkoch ist bei einem tragischen Unfall gestorben, snippet: Barack Obama (61) und Michelle Obama (59) sind in tiefer Trauer. Ihr Privatkoch Tafari Campbell ist am Montag (24. Juli) ums Leben gekommen, er wurde nur 45 Jahre alt. Laut US-Polizei starb er bei ein, source: BUNTE.de, link: https://www.msn.com/de-de/unterhaltung/other/barack-obama-sein-privatkoch-ist-bei-einem-tragischen-unfall-gestorben/ar-AA1ejrAd], [date: 2023-07-25T02:25:00, title: Barack Obama: Privatkoch tot in See gefunden, snippet: Tafari Campbell kochte für Barack Obama im Weißen Haus - und auch privat nach dessen Abschied aus dem Präsidentenamt. Nun machte die Polizei in einem Gewässer eine traurige Entdeckung., source: SPIEGEL, link: https://www.spiegel.de/panorama/justiz/barack-obama-leibkoch-tot-in-see-gefunden-a-3cdf6377-bee0-43f1-a200-a285742f9ffc]'"
+      ]
+     },
+     "execution_count": 30,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "search.run(\"Obama\")"
+   ]
  }
 ],
 "metadata": {
@@ -78,7 +217,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.9.1"
+   "version": "3.10.9"
  },
  "vscode": {
   "interpreter": {
--- a/docs/extras/use_cases/question_answering/index.mdx
+++ b/docs/extras/use_cases/question_answering/index.mdx
@@ -189,7 +189,7 @@ All retrievers implement some common methods, such as `get_relevant_documents()`
 from langchain.retrievers import SVMRetriever
 svm_retriever = SVMRetriever.from_documents(all_splits,OpenAIEmbeddings())
 docs_svm=svm_retriever.get_relevant_documents(question)
-len(docs)
+len(docs_svm)
 ```


--- a/libs/langchain/dev.Dockerfile
+++ b/libs/langchain/dev.Dockerfile
@@ -35,7 +35,7 @@ FROM langchain-dev-base AS langchain-dev-dependencies
 ARG PYTHON_VIRTUALENV_HOME

 # Copy only the dependency files for installation
-COPY pyproject.toml poetry.toml ./
+COPY libs/langchain/pyproject.toml libs/langchain/poetry.toml ./

 # Copy the langchain library for installation
 COPY libs/langchain/ libs/langchain/
--- a/libs/langchain/langchain/chat_models/openai.py
+++ b/libs/langchain/langchain/chat_models/openai.py
@@ -340,8 +340,10 @@ class ChatOpenAI(BaseChatModel):
                    if _function_call:
                        if function_call is None:
                            function_call = _function_call
-                        else:
+                        elif "arguments" in function_call:
                            function_call["arguments"] += _function_call["arguments"]
+                        else:
+                            function_call["arguments"] = _function_call["arguments"]
                    if run_manager:
                        run_manager.on_llm_new_token(token)
            message = _convert_dict_to_message(
@@ -406,8 +408,10 @@ class ChatOpenAI(BaseChatModel):
                    if _function_call:
                        if function_call is None:
                            function_call = _function_call
-                        else:
+                        elif "arguments" in function_call:
                            function_call["arguments"] += _function_call["arguments"]
+                        else:
+                            function_call["arguments"] = _function_call["arguments"]
                    if run_manager:
                        await run_manager.on_llm_new_token(token)
            message = _convert_dict_to_message(
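Note on the `openai.py` change: the new `elif`/`else` branches handle the case where the first streamed `function_call` chunk has no `"arguments"` key yet. A minimal sketch of the same merge logic, outside `ChatOpenAI`, is shown below. The helper name `merge_function_call_chunks` and the chunk shapes are hypothetical illustrations, not captured API output.

```python
from typing import Dict, Iterable, Optional


def merge_function_call_chunks(
    chunks: Iterable[Optional[Dict[str, str]]],
) -> Optional[Dict[str, str]]:
    """Accumulate streamed function_call fragments into one dict."""
    function_call: Optional[Dict[str, str]] = None
    for chunk in chunks:
        if not chunk:
            continue
        if function_call is None:
            # First fragment: copy it so the input is not mutated
            function_call = dict(chunk)
        elif "arguments" in function_call:
            # Later fragments extend the argument string
            function_call["arguments"] += chunk["arguments"]
        else:
            # First fragment had no "arguments" key: start it here
            # instead of raising KeyError on +=
            function_call["arguments"] = chunk["arguments"]
    return function_call


# Assumed chunk shapes: the first fragment carries only the name.
chunks = [
    {"name": "get_weather"},
    {"arguments": '{"city": '},
    {"arguments": '"Paris"}'},
]
print(merge_function_call_chunks(chunks))
```

Without the `else` branch, the second fragment in this example would fail with a `KeyError`, because `+=` on a missing dictionary key raises.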
--- a/libs/langchain/langchain/load/load.py
+++ b/libs/langchain/langchain/load/load.py
@@ -1,7 +1,7 @@
 import importlib
 import json
 import os
-from typing import Any, Dict, Optional
+from typing import Any, Dict, List, Optional

 from langchain.load.serializable import Serializable

@@ -9,8 +9,16 @@ from langchain.load.serializable import Serializable
 class Reviver:
    """Reviver for JSON objects."""

-    def __init__(self, secrets_map: Optional[Dict[str, str]] = None) -> None:
+    def __init__(
+        self,
+        secrets_map: Optional[Dict[str, str]] = None,
+        valid_namespaces: Optional[List[str]] = None,
+    ) -> None:
        self.secrets_map = secrets_map or dict()
+        # By default only support langchain, but user can pass in additional namespaces
+        self.valid_namespaces = (
+            ["langchain", *valid_namespaces] if valid_namespaces else ["langchain"]
+        )

    def __call__(self, value: Dict[str, Any]) -> Any:
        if (
@@ -43,8 +51,7 @@ class Reviver:
        ):
            [*namespace, name] = value["id"]

-            # Currently, we only support langchain imports.
-            if namespace[0] != "langchain":
+            if namespace[0] not in self.valid_namespaces:
                raise ValueError(f"Invalid namespace: {value}")

            # The root namespace "langchain" is not a valid identifier.
@@ -66,14 +73,21 @@ class Reviver:
        return value


-def loads(text: str, *, secrets_map: Optional[Dict[str, str]] = None) -> Any:
+def loads(
+    text: str,
+    *,
+    secrets_map: Optional[Dict[str, str]] = None,
+    valid_namespaces: Optional[List[str]] = None,
+) -> Any:
    """Load a JSON object from a string.

    Args:
        text: The string to load.
        secrets_map: A map of secrets to load.
+        valid_namespaces: A list of additional namespaces (modules)
+            to allow to be deserialized.

    Returns:

    """
-    return json.loads(text, object_hook=Reviver(secrets_map))
+    return json.loads(text, object_hook=Reviver(secrets_map, valid_namespaces))
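Note on the `load.py` change: the namespace check is now an allowlist that always contains `"langchain"` and optionally extends to caller-supplied namespaces. The sketch below reproduces only that check in isolation; the helper names `build_valid_namespaces` and `check_namespace` and the `my_pkg` namespace are hypothetical and do not exist in the library.

```python
from typing import List, Optional


def build_valid_namespaces(extra: Optional[List[str]] = None) -> List[str]:
    """Mirror how Reviver.__init__ builds its allowlist in this commit."""
    return ["langchain", *extra] if extra else ["langchain"]


def check_namespace(id_path: List[str], valid_namespaces: List[str]) -> str:
    """Return the class name if the root namespace is allowed."""
    # Same unpacking as in Reviver.__call__: last element is the name
    [*namespace, name] = id_path
    if namespace[0] not in valid_namespaces:
        raise ValueError(f"Invalid namespace: {id_path}")
    return name


default_allowed = build_valid_namespaces()
extended = build_valid_namespaces(["my_pkg"])

print(check_namespace(["langchain", "llms", "OpenAI"], default_allowed))
print(check_namespace(["my_pkg", "chains", "MyChain"], extended))
```

With the default allowlist, the second call would raise `ValueError`, which matches the behavior before this commit.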
--- a/libs/langchain/langchain/retrievers/web_research.py
+++ b/libs/langchain/langchain/retrievers/web_research.py
@@ -35,7 +35,7 @@ class SearchQueries(BaseModel):
 DEFAULT_LLAMA_SEARCH_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""<<SYS>> \n You are an assistant tasked with improving Google search 
-    results. \n <</SYS>> \n\n [INST] Generate FIVE Google search queries that 
+    results. \n <</SYS>> \n\n [INST] Generate THREE Google search queries that 
    are similar to this question. The output should be a numbered list of questions 
    and each should have a question mark at the end: \n\n {question} [/INST]""",
 )
@@ -43,7 +43,7 @@ DEFAULT_LLAMA_SEARCH_PROMPT = PromptTemplate(
 DEFAULT_SEARCH_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are an assistant tasked with improving Google search 
-    results. Generate FIVE Google search queries that are similar to
+    results. Generate THREE Google search queries that are similar to
    this question. The output should be a numbered list of questions and each
    should have a question mark at the end: {question}""",
 )
@@ -73,7 +73,6 @@ class WebResearchRetriever(BaseRetriever):
    )
    llm_chain: LLMChain
    search: GoogleSearchAPIWrapper = Field(..., description="Google Search API Wrapper")
-    max_splits_per_doc: int = Field(100, description="Maximum splits per document")
    num_search_results: int = Field(1, description="Number of pages per Google search")
    text_splitter: RecursiveCharacterTextSplitter = Field(
        RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=50),
@@ -90,10 +89,9 @@ class WebResearchRetriever(BaseRetriever):
        llm: BaseLLM,
        search: GoogleSearchAPIWrapper,
        prompt: Optional[BasePromptTemplate] = None,
-        max_splits_per_doc: int = 100,
        num_search_results: int = 1,
        text_splitter: RecursiveCharacterTextSplitter = RecursiveCharacterTextSplitter(
-            chunk_size=1500, chunk_overlap=50
+            chunk_size=1500, chunk_overlap=150
        ),
    ) -> "WebResearchRetriever":
        """Initialize from llm using default template.
@@ -103,7 +101,6 @@ class WebResearchRetriever(BaseRetriever):
            llm: llm for search question generation
            search: GoogleSearchAPIWrapper
            prompt: prompt to generating search questions
-            max_splits_per_doc: Maximum splits per document to keep
            num_search_results: Number of pages per Google search
            text_splitter: Text splitter for splitting web pages into chunks

@@ -131,14 +128,30 @@ class WebResearchRetriever(BaseRetriever):
            vectorstore=vectorstore,
            llm_chain=llm_chain,
            search=search,
-            max_splits_per_doc=max_splits_per_doc,
            num_search_results=num_search_results,
            text_splitter=text_splitter,
        )

+    def clean_search_query(self, query: str) -> str:
+        # Some search tools (e.g., Google) will
+        # fail to return results if query has a
+        # leading digit: 1. "LangCh..."
+        # Check if the first character is a digit
+        if query[0].isdigit():
+            # Find the position of the first quote
+            first_quote_pos = query.find('"')
+            if first_quote_pos != -1:
+                # Extract the part of the string after the quote
+                query = query[first_quote_pos + 1 :]
+                # Remove the trailing quote if present
+                if query.endswith('"'):
+                    query = query[:-1]
+        return query.strip()
+
    def search_tool(self, query: str, num_search_results: int = 1) -> List[dict]:
        """Returns num_serch_results pages per Google search."""
-        result = self.search.results(query, num_search_results)
+        query_clean = self.clean_search_query(query)
+        result = self.search.results(query_clean, num_search_results)
        return result

    def _get_relevant_documents(
--- a/libs/langchain/langchain/tools/ddg_search/tool.py
+++ b/libs/langchain/langchain/tools/ddg_search/tool.py
@@ -56,6 +56,7 @@ class DuckDuckGoSearchResults(BaseTool):
    api_wrapper: DuckDuckGoSearchAPIWrapper = Field(
        default_factory=DuckDuckGoSearchAPIWrapper
    )
+    backend: str = "api"

    def _run(
        self,
@@ -63,7 +64,9 @@ class DuckDuckGoSearchResults(BaseTool):
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> str:
        """Use the tool."""
-        return str(self.api_wrapper.results(query, self.num_results))
+        res = self.api_wrapper.results(query, self.num_results, backend=self.backend)
+        res_strs = [", ".join([f"{k}: {v}" for k, v in d.items()]) for d in res]
+        return ", ".join([f"[{rs}]" for rs in res_strs])

    async def _arun(
        self,
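Note on the `tool.py` change: `_run` now returns a bracketed, comma-joined string instead of `str()` of the result list. The formatting step can be sketched on its own as follows; the function name `format_results` and the sample data are hypothetical.

```python
from typing import Dict, List


def format_results(res: List[Dict[str, str]]) -> str:
    """Render each result dict as "[k: v, k: v]" and join them with commas."""
    res_strs = [", ".join(f"{k}: {v}" for k, v in d.items()) for d in res]
    return ", ".join(f"[{rs}]" for rs in res_strs)


# Hypothetical sample data, shaped like the default (non-news) metadata
sample = [
    {"snippet": "First snippet", "title": "First", "link": "https://example.com/1"},
    {"snippet": "Second snippet", "title": "Second", "link": "https://example.com/2"},
]
print(format_results(sample))
```

The output is meant to be read by an LLM or a person. It is not reliably machine-parseable, because snippet and title values may themselves contain commas, colons, or brackets.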
--- a/libs/langchain/langchain/utilities/duckduckgo_search.py
+++ b/libs/langchain/langchain/utilities/duckduckgo_search.py
@@ -62,7 +62,9 @@ class DuckDuckGoSearchAPIWrapper(BaseModel):
        snippets = self.get_snippets(query)
        return " ".join(snippets)

-    def results(self, query: str, num_results: int) -> List[Dict[str, str]]:
+    def results(
+        self, query: str, num_results: int, backend: str = "api"
+    ) -> List[Dict[str, str]]:
        """Run query through DuckDuckGo and return metadata.

        Args:
@@ -83,11 +85,20 @@ class DuckDuckGoSearchAPIWrapper(BaseModel):
                region=self.region,
                safesearch=self.safesearch,
                timelimit=self.time,
+                backend=backend,
            )
            if results is None:
                return [{"Result": "No good DuckDuckGo Search Result was found"}]

            def to_metadata(result: Dict) -> Dict[str, str]:
+                if backend == "news":
+                    return {
+                        "date": result["date"],
+                        "title": result["title"],
+                        "snippet": result["body"],
+                        "source": result["source"],
+                        "link": result["url"],
+                    }
                return {
                    "snippet": result["body"],
                    "title": result["title"],
--- a/libs/langchain/langchain/vectorstores/base.py
+++ b/libs/langchain/langchain/vectorstores/base.py
@@ -446,7 +446,7 @@ class VectorStore(ABC):
        """Return VectorStore initialized from texts and embeddings."""
        raise NotImplementedError

-    def __get_retriever_tags(self) -> List[str]:
+    def _get_retriever_tags(self) -> List[str]:
        """Get tags for retriever."""
        tags = [self.__class__.__name__]
        if self.embeddings:
@@ -455,7 +455,7 @@ class VectorStore(ABC):

    def as_retriever(self, **kwargs: Any) -> VectorStoreRetriever:
        tags = kwargs.pop("tags", None) or []
-        tags.extend(self.__get_retriever_tags())
+        tags.extend(self._get_retriever_tags())
        return VectorStoreRetriever(vectorstore=self, **kwargs, tags=tags)


--- a/libs/langchain/langchain/vectorstores/elastic_vector_search.py
+++ b/libs/langchain/langchain/vectorstores/elastic_vector_search.py
@@ -294,6 +294,8 @@ class ElasticVectorSearch(VectorStore, ABC):
        elasticsearch_url = get_from_dict_or_env(
            kwargs, "elasticsearch_url", "ELASTICSEARCH_URL"
        )
+        if "elasticsearch_url" in kwargs:
+            del kwargs["elasticsearch_url"]
        index_name = index_name or uuid.uuid4().hex
        vectorsearch = cls(elasticsearch_url, index_name, embedding, **kwargs)
        vectorsearch.add_texts(
--- a/libs/langchain/langchain/vectorstores/redis.py
+++ b/libs/langchain/langchain/vectorstores/redis.py
@@ -607,7 +607,7 @@ class Redis(VectorStore):

    def as_retriever(self, **kwargs: Any) -> RedisVectorStoreRetriever:
        tags = kwargs.pop("tags", None) or []
-        tags.extend(self.__get_retriever_tags())
+        tags.extend(self._get_retriever_tags())
        return RedisVectorStoreRetriever(vectorstore=self, **kwargs, tags=tags)


--- a/libs/langchain/langchain/vectorstores/singlestoredb.py
+++ b/libs/langchain/langchain/vectorstores/singlestoredb.py
@@ -446,7 +446,7 @@ class SingleStoreDB(VectorStore):

    def as_retriever(self, **kwargs: Any) -> SingleStoreDBRetriever:
        tags = kwargs.pop("tags", None) or []
-        tags.extend(self.__get_retriever_tags())
+        tags.extend(self._get_retriever_tags())
        return SingleStoreDBRetriever(vectorstore=self, **kwargs, tags=tags)


--- a/libs/langchain/langchain/vectorstores/vectara.py
+++ b/libs/langchain/langchain/vectorstores/vectara.py
@@ -409,7 +409,7 @@ class Vectara(VectorStore):

    def as_retriever(self, **kwargs: Any) -> VectaraRetriever:
        tags = kwargs.pop("tags", None) or []
-        tags.extend(self.__get_retriever_tags())
+        tags.extend(self._get_retriever_tags())
        return VectaraRetriever(vectorstore=self, **kwargs, tags=tags)


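Note on the `__get_retriever_tags` to `_get_retriever_tags` rename in `base.py`, `redis.py`, `singlestoredb.py`, and `vectara.py`: a double-underscore method name is subject to Python name mangling, so a subclass calling `self.__get_retriever_tags()` looks up a name prefixed with the subclass's own class name and fails. The sketch below reproduces the effect with hypothetical classes `Base` and `Child`.

```python
from typing import List


class Base:
    def __tags(self) -> List[str]:
        # Stored on the class as _Base__tags because of name mangling
        return [self.__class__.__name__]

    def _tags(self) -> List[str]:
        # Single underscore: no mangling, visible to subclasses
        return [self.__class__.__name__]


class Child(Base):
    def broken(self) -> List[str]:
        # Compiled as self._Child__tags(), which does not exist
        return self.__tags()

    def fixed(self) -> List[str]:
        return self._tags()


child = Child()
print(child.fixed())

try:
    child.broken()
except AttributeError as err:
    print(f"AttributeError: {err}")
```

A single leading underscore keeps the method conventionally private while leaving it callable from overriding `as_retriever` implementations in subclasses.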
--- a/libs/langchain/pyproject.toml
+++ b/libs/langchain/pyproject.toml
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "langchain"
-version = "0.0.243"
+version = "0.0.244"
 description = "Building applications with LLMs through composability"
 authors = []
 license = "MIT"
--- a/libs/langchain/tests/integration_tests/evaluation/embedding_distance/test_embedding.py
+++ b/libs/langchain/tests/integration_tests/evaluation/embedding_distance/test_embedding.py
@@ -5,6 +5,7 @@ import pytest

 from langchain.evaluation.embedding_distance import (
    EmbeddingDistance,
+    EmbeddingDistanceEvalChain,
    PairwiseEmbeddingDistanceEvalChain,
 )

@@ -44,18 +45,25 @@ def vectors() -> Tuple[np.ndarray, np.ndarray]:


@pytest.fixture
-def chain() -> PairwiseEmbeddingDistanceEvalChain:
+def pairwise_embedding_distance_eval_chain() -> PairwiseEmbeddingDistanceEvalChain:
    """Create a PairwiseEmbeddingDistanceEvalChain."""
    return PairwiseEmbeddingDistanceEvalChain()


+@pytest.fixture
+def embedding_distance_eval_chain() -> EmbeddingDistanceEvalChain:
+    """Create a EmbeddingDistanceEvalChain."""
+    return EmbeddingDistanceEvalChain()
+
+
@pytest.mark.requires("scipy")
-def test_cosine_similarity(
-    chain: PairwiseEmbeddingDistanceEvalChain, vectors: Tuple[np.ndarray, np.ndarray]
+def test_pairwise_embedding_distance_eval_chain_cosine_similarity(
+    pairwise_embedding_distance_eval_chain: PairwiseEmbeddingDistanceEvalChain,
+    vectors: Tuple[np.ndarray, np.ndarray],
 ) -> None:
    """Test the cosine similarity."""
-    chain.distance_metric = EmbeddingDistance.COSINE
-    result = chain._compute_score(np.array(vectors))
+    pairwise_embedding_distance_eval_chain.distance_metric = EmbeddingDistance.COSINE
+    result = pairwise_embedding_distance_eval_chain._compute_score(np.array(vectors))
    expected = 1.0 - np.dot(vectors[0], vectors[1]) / (
        np.linalg.norm(vectors[0]) * np.linalg.norm(vectors[1])
    )
@@ -63,61 +71,81 @@ def test_cosine_similarity(


@pytest.mark.requires("scipy")
-def test_euclidean_distance(
-    chain: PairwiseEmbeddingDistanceEvalChain, vectors: Tuple[np.ndarray, np.ndarray]
+def test_pairwise_embedding_distance_eval_chain_euclidean_distance(
+    pairwise_embedding_distance_eval_chain: PairwiseEmbeddingDistanceEvalChain,
+    vectors: Tuple[np.ndarray, np.ndarray],
 ) -> None:
    """Test the euclidean distance."""
    from scipy.spatial.distance import euclidean

-    chain.distance_metric = EmbeddingDistance.EUCLIDEAN
-    result = chain._compute_score(np.array(vectors))
+    pairwise_embedding_distance_eval_chain.distance_metric = EmbeddingDistance.EUCLIDEAN
+    result = pairwise_embedding_distance_eval_chain._compute_score(np.array(vectors))
    expected = euclidean(*vectors)
    assert np.isclose(result, expected)


@pytest.mark.requires("scipy")
-def test_manhattan_distance(
-    chain: PairwiseEmbeddingDistanceEvalChain, vectors: Tuple[np.ndarray, np.ndarray]
+def test_pairwise_embedding_distance_eval_chain_manhattan_distance(
+    pairwise_embedding_distance_eval_chain: PairwiseEmbeddingDistanceEvalChain,
+    vectors: Tuple[np.ndarray, np.ndarray],
 ) -> None:
    """Test the manhattan distance."""
    from scipy.spatial.distance import cityblock

-    chain.distance_metric = EmbeddingDistance.MANHATTAN
-    result = chain._compute_score(np.array(vectors))
+    pairwise_embedding_distance_eval_chain.distance_metric = EmbeddingDistance.MANHATTAN
+    result = pairwise_embedding_distance_eval_chain._compute_score(np.array(vectors))
    expected = cityblock(*vectors)
    assert np.isclose(result, expected)


@pytest.mark.requires("scipy")
-def test_chebyshev_distance(
-    chain: PairwiseEmbeddingDistanceEvalChain, vectors: Tuple[np.ndarray, np.ndarray]
+def test_pairwise_embedding_distance_eval_chain_chebyshev_distance(
+    pairwise_embedding_distance_eval_chain: PairwiseEmbeddingDistanceEvalChain,
+    vectors: Tuple[np.ndarray, np.ndarray],
 ) -> None:
    """Test the chebyshev distance."""
    from scipy.spatial.distance import chebyshev

-    chain.distance_metric = EmbeddingDistance.CHEBYSHEV
-    result = chain._compute_score(np.array(vectors))
+    pairwise_embedding_distance_eval_chain.distance_metric = EmbeddingDistance.CHEBYSHEV
+    result = pairwise_embedding_distance_eval_chain._compute_score(np.array(vectors))
    expected = chebyshev(*vectors)
    assert np.isclose(result, expected)


@pytest.mark.requires("scipy")
-def test_hamming_distance(
-    chain: PairwiseEmbeddingDistanceEvalChain, vectors: Tuple[np.ndarray, np.ndarray]
+def test_pairwise_embedding_distance_eval_chain_hamming_distance(
+    pairwise_embedding_distance_eval_chain: PairwiseEmbeddingDistanceEvalChain,
+    vectors: Tuple[np.ndarray, np.ndarray],
 ) -> None:
    """Test the hamming distance."""
    from scipy.spatial.distance import hamming

-    chain.distance_metric = EmbeddingDistance.HAMMING
-    result = chain._compute_score(np.array(vectors))
+    pairwise_embedding_distance_eval_chain.distance_metric = EmbeddingDistance.HAMMING
+    result = pairwise_embedding_distance_eval_chain._compute_score(np.array(vectors))
    expected = hamming(*vectors)
    assert np.isclose(result, expected)


@pytest.mark.requires("openai", "tiktoken")
-def test_embedding_distance(chain: PairwiseEmbeddingDistanceEvalChain) -> None:
+def test_pairwise_embedding_distance_eval_chain_embedding_distance(
+    pairwise_embedding_distance_eval_chain: PairwiseEmbeddingDistanceEvalChain,
+) -> None:
    """Test the embedding distance."""
-    result = chain.evaluate_string_pairs(
+    result = pairwise_embedding_distance_eval_chain.evaluate_string_pairs(
        prediction="A single cat", prediction_b="A single cat"
    )
    assert np.isclose(result["score"], 0.0)
+
+
+@pytest.mark.requires("scipy")
+def test_embedding_distance_eval_chain(
+    embedding_distance_eval_chain: EmbeddingDistanceEvalChain,
+) -> None:
+    embedding_distance_eval_chain.distance_metric = EmbeddingDistance.COSINE
+    prediction = "Hi"
+    reference = "Hello"
+    result = embedding_distance_eval_chain.evaluate_strings(
+        prediction=prediction,
+        reference=reference,
+    )
+    assert result["score"] < 1.0
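
The four SciPy-backed tests above compare `_compute_score` against `scipy.spatial.distance`. As a reading aid, here is a minimal plain-Python sketch of the textbook definitions those SciPy functions implement. The helper names are hypothetical and this is not the chain's own code; it only illustrates what each expected value represents.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Square root of the summed squared component differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute component differences (SciPy calls this cityblock)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest absolute component difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def hamming(a: Sequence[float], b: Sequence[float]) -> float:
    """Fraction of positions where the two vectors disagree."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Illustrative vectors only; they are not the fixture used by the tests.
a = [1.0, 2.0, 3.0]
b = [4.0, 6.0, 3.0]
distances = {
    "euclidean": euclidean(a, b),
    "manhattan": manhattan(a, b),
    "chebyshev": chebyshev(a, b),
    "hamming": hamming(a, b),
}
```

Hamming distance is a fraction of mismatched positions, so on real-valued embeddings it is usually close to 1 unless components match exactly.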
Author	SHA1	Message	Date
Bagatur	2c2fd9ff13	bump 244 (#8314 )	2023-07-26 11:58:26 -07:00
Lance Martin	77c0582243	Clean queries prior to search (#8309 ) With some search tools, we see no results returned if the query is a numeric list. E.g., if we pass: ``` '1. "LangChain vs LangSmith: How do they differ?"' ``` We see: ``` No good Google Search Result was found ``` Local testing w/ Streamlit: ![image](https://github.com/langchain-ai/langchain/assets/122662504/0a7e3dca-59e8-415e-8df6-bd9e4ea962ee)	2023-07-26 11:48:28 -07:00
shibuiwilliam	6b88fbd9bb	add test for embedding distance evaluation (#8285 ) Add tests for embedding distance evaluation - Description: Add tests for embedding distance evaluation - Issue: None - Dependencies: None - Tag maintainer: @baskaryan - Twitter handle: @MlopsJ	2023-07-26 11:45:50 -07:00
Riche Akparuorji	f3d2fdd54c	Fix for code snippet in documentation (#8290 ) - Description: I fixed an issue in the code snippet related to the variable name and the evaluation of its length. The original code used the variable "docs," but the correct variable name is "docs_svm" after using the SVMRetriever. - maintainer: @baskaryan - Twitter handle: @iamreechi_ Co-authored-by: iamreechi <richieakparuorji>	2023-07-26 11:31:08 -07:00
Bagatur	f27176930a	fix geopandas link (#8305 )	2023-07-26 11:30:17 -07:00
Timon Palm	70604e590f	DuckDuckGoSearch News Tool (#8292 ) Description: I wanted to use the DuckDuckGoSearch tool in an agent to let him get the latest news for a topic. DuckDuckGoSearch has already an implemented function for retrieving news articles. But there wasn't a tool to use it. I simply adapted the SearchResult class with an extra argument "backend". You can set it to "news" to only get news articles. Furthermore, I added an example to the DuckDuckGo Notebook on how to further customize the results by using the DuckDuckGoSearchAPIWrapper. Dependencies: no new dependencies --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-26 11:30:01 -07:00
Aarav Borthakur	8ce661d5a1	Docs: Fix Rockset links (#8214 ) Fix broken Rockset links. Right now links at https://python.langchain.com/docs/integrations/providers/rockset are broken.	2023-07-26 10:38:37 -07:00
Byron Saltysiak	61347bd322	giving path to the copy command for *.toml files (#8294 ) Description: in the .devcontainer, docker-compose build is currently failing due to the src paths in the COPY command. This change adds the full path to the pyproject.toml and poetry.toml to allow the build to run. Issue: You can see the issue if you try to build the dev docker image with: ``` cd .devcontainer docker-compose build ``` Dependencies: none Twitter handle: byronsalty	2023-07-26 10:37:03 -07:00
happyxhw	6384c1ec8f	fix: ElasticVectorSearch.from_documents failed #8293 (#8296 ) - Description: fix ElasticVectorSearch.from_documents with elasticsearch_url param, - Issue: ElasticVectorSearch.from_documents failed #8293 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-26 10:33:52 -07:00
Jon Bennion	ad38eb2d50	correction to reference to code (#8301 ) - Description: fixes typo referencing code --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-26 10:33:18 -07:00
jacobswe	83a53e2126	Bug Fix: AzureChatOpenAI streaming with function calls (#8300 ) - Description: During streaming, the first chunk may only contain the name of an OpenAI function and not any arguments. In this case, the current code presumes there is a streaming response and tries to append to it, but gets a KeyError. This fixes that case by checking if the arguments key exists, and if not, creates a new entry instead of appending. - Issue: Related to #6462 Sample Code: ```python llm = AzureChatOpenAI( deployment_name=deployment_name, model_name=model_name, streaming=True ) tools = [PythonREPLTool()] callbacks = [StreamingStdOutCallbackHandler()] agent = initialize_agent( tools=tools, llm=llm, agent=AgentType.OPENAI_FUNCTIONS, callbacks=callbacks ) agent('Run some python code to test your interpreter') ``` Previous Result: ``` File ...langchain/chat_models/openai.py:344, in ChatOpenAI._generate(self, messages, stop, run_manager, **kwargs) 342 function_call = _function_call 343 else: --> 344 function_call["arguments"] += _function_call["arguments"] 345 if run_manager: 346 run_manager.on_llm_new_token(token) KeyError: 'arguments' ``` New Result: ```python {'input': 'Run some python code to test your interpreter', 'output': "The Python code `print('Hello, World!')` has been executed successfully, and the output `Hello, World!` has been printed."} ``` Co-authored-by: jswe <jswe@polencapital.com>	2023-07-26 10:11:50 -07:00
German Martin	457a4730b2	Fix the mangling issue on several VectorStores child classes. (#8274 ) - Description: Fix mangling issue affecting a couple of VectorStore classes including Redis. - Issue: https://github.com/langchain-ai/langchain/issues/8185 - @rlancemartin This is a simple issue, but I lack some context on the original implementation. My changes are perhaps not the definitive fix, but they can start a quick discussion. @hinthornw Tagging you since one of your changes introduced this [here.](`c38965fcba`)	2023-07-26 09:48:55 -07:00
Alec Flett	4da43f77e5	Add ability to load (deserialize) objects from other namespaces (#7726 ) I have some Prompt subclasses in my project that I'd like to be able to deserialize in callbacks. Right now `loads()`/`load()` will bail when it encounters my object, but I know I can trust the objects because they're in my own projects.	2023-07-26 16:59:28 +01:00
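
The AzureChatOpenAI streaming fix (#8300) in the log above describes checking whether the `arguments` key exists before appending to it. A minimal sketch of that idea follows, assuming each streamed chunk is a plain dict of string fragments. The helper `merge_function_call_chunks` is hypothetical and is not the code in `langchain/chat_models/openai.py`.

```python
from typing import Dict, Iterable, Optional


def merge_function_call_chunks(
    chunks: Iterable[Dict[str, str]],
) -> Optional[Dict[str, str]]:
    """Accumulate streamed function-call fragments into a single dict.

    Hypothetical helper: the first chunk may carry only the function name,
    so "arguments" is created on demand instead of assumed to exist.
    """
    merged: Optional[Dict[str, str]] = None
    for chunk in chunks:
        if merged is None:
            # First fragment: copy it as the starting point.
            merged = dict(chunk)
        elif "arguments" in merged:
            # Later fragments extend the argument string seen so far.
            merged["arguments"] += chunk.get("arguments", "")
        else:
            # No arguments yet: create the entry rather than appending,
            # which is where the KeyError in the commit message came from.
            merged["arguments"] = chunk.get("arguments", "")
    return merged


# Illustrative stream: a name-only first chunk, then two argument fragments.
stream = [
    {"name": "python"},
    {"arguments": '{"code":'},
    {"arguments": ' "1+1"}'},
]
result = merge_function_call_chunks(stream)
```

With an unguarded `+=` on the second chunk, this stream would raise the `KeyError: 'arguments'` shown in the commit message.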