Compare commits

13 Commits

Author SHA1 Message Date
Bagatur
2c2fd9ff13 bump 244 (#8314) 2023-07-26 11:58:26 -07:00
Lance Martin
77c0582243 Clean queries prior to search (#8309)
With some search tools, no results are returned if the query is
formatted as a numbered-list item.

E.g., if we pass:
```
'1. "LangChain vs LangSmith: How do they differ?"'
```

We see:
```
No good Google Search Result was found
```

Local testing w/ Streamlit:

![image](https://github.com/langchain-ai/langchain/assets/122662504/0a7e3dca-59e8-415e-8df6-bd9e4ea962ee)
2023-07-26 11:48:28 -07:00
shibuiwilliam
6b88fbd9bb add test for embedding distance evaluation (#8285)
Add tests for embedding distance evaluation

  - Description: Add tests for embedding distance evaluation
  - Issue: None
  - Dependencies: None
  - Tag maintainer: @baskaryan
  - Twitter handle: @MlopsJ
2023-07-26 11:45:50 -07:00
Riche Akparuorji
f3d2fdd54c Fix for code snippet in documentation (#8290)
- Description: I fixed an issue in the code snippet related to the
variable name and the evaluation of its length. The original code used
the variable "docs," but the correct variable name is "docs_svm" after
using the SVMRetriever.
- maintainer: @baskaryan
- Twitter handle: @iamreechi_

Co-authored-by: iamreechi <richieakparuorji>
2023-07-26 11:31:08 -07:00
Bagatur
f27176930a fix geopandas link (#8305) 2023-07-26 11:30:17 -07:00
Timon Palm
70604e590f DuckDuckGoSearch News Tool (#8292)
Description: 
I wanted to use the DuckDuckGoSearch tool in an agent so that it could
fetch the latest news on a topic. DuckDuckGoSearch already implements a
function for retrieving news articles, but there was no tool that exposed
it. I adapted the SearchResult class with an extra argument, "backend",
which you can set to "news" to get only news articles.

Furthermore, I added an example to the DuckDuckGo Notebook on how to
further customize the results by using the DuckDuckGoSearchAPIWrapper.
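
As a rough, network-free sketch of how a news result might be mapped and rendered: the raw keys (`date`, `title`, `body`, `source`, `url`) follow the diff for this change, while the helper names `news_to_metadata` and `format_results` and the sample record are hypothetical.

```python
from typing import Dict, List


def news_to_metadata(result: Dict[str, str]) -> Dict[str, str]:
    """Rename raw news fields to the keys shown in the tool output."""
    return {
        "date": result["date"],
        "title": result["title"],
        "snippet": result["body"],
        "source": result["source"],
        "link": result["url"],
    }


def format_results(results: List[Dict[str, str]]) -> str:
    """Render each result as "[key: value, ...]" and join them with commas."""
    res_strs = [", ".join(f"{k}: {v}" for k, v in d.items()) for d in results]
    return ", ".join(f"[{rs}]" for rs in res_strs)


# Hypothetical raw record, standing in for what the search backend returns.
raw = {
    "date": "2023-07-26T12:00:00",
    "title": "Example headline",
    "body": "Example snippet.",
    "source": "Example News",
    "url": "https://example.com/story",
}
rendered = format_results([news_to_metadata(raw)])
```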

Dependencies: no new dependencies
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 11:30:01 -07:00
Aarav Borthakur
8ce661d5a1 Docs: Fix Rockset links (#8214)
Fix broken Rockset links.

Right now links at
https://python.langchain.com/docs/integrations/providers/rockset are
broken.
2023-07-26 10:38:37 -07:00
Byron Saltysiak
61347bd322 giving path to the copy command for *.toml files (#8294)
Description: in the .devcontainer, docker-compose build is currently
failing due to the src paths in the COPY command. This change adds the
full path to the pyproject.toml and poetry.toml to allow the build to
run.
Issue: 

You can see the issue if you try to build the dev docker image with:
```
cd .devcontainer
docker-compose build
```

Dependencies: none
Twitter handle: byronsalty
2023-07-26 10:37:03 -07:00
happyxhw
6384c1ec8f fix: ElasticVectorSearch.from_documents failed #8293 (#8296)
- Description: fix ElasticVectorSearch.from_documents when it is called
with the elasticsearch_url param
- Issue: fixes #8293 (ElasticVectorSearch.from_documents failed)
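
The commit message does not spell out the failure. One plausible reading of the diff is that `elasticsearch_url` was passed both positionally and again inside `**kwargs`. The sketch below reproduces that situation with a hypothetical stand-in class (`FakeVectorSearch`), not the real `ElasticVectorSearch`.

```python
from typing import Any


class FakeVectorSearch:
    """Hypothetical stand-in for a vector store constructor."""

    def __init__(self, elasticsearch_url: str, index_name: str, **kwargs: Any) -> None:
        self.elasticsearch_url = elasticsearch_url
        self.index_name = index_name
        self.extra = kwargs


def build_without_fix(**kwargs: Any) -> FakeVectorSearch:
    url = kwargs["elasticsearch_url"]
    # The URL is passed positionally and is still present in kwargs,
    # so Python raises TypeError (multiple values for the argument).
    return FakeVectorSearch(url, "idx", **kwargs)


def build_with_fix(**kwargs: Any) -> FakeVectorSearch:
    url = kwargs["elasticsearch_url"]
    # Remove the key before forwarding the remaining kwargs.
    if "elasticsearch_url" in kwargs:
        del kwargs["elasticsearch_url"]
    return FakeVectorSearch(url, "idx", **kwargs)


try:
    build_without_fix(elasticsearch_url="http://localhost:9200")
    failed = False
except TypeError:
    failed = True

store = build_with_fix(elasticsearch_url="http://localhost:9200")
```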


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 10:33:52 -07:00
Jon Bennion
ad38eb2d50 correction to reference to code (#8301)
- Description: fixes typo referencing code

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 10:33:18 -07:00
jacobswe
83a53e2126 Bug Fix: AzureChatOpenAI streaming with function calls (#8300)
- Description: During streaming, the first chunk may only contain the
name of an OpenAI function and not any arguments. In this case, the
current code presumes there is a streaming response and tries to append
to it, but gets a KeyError. This fixes that case by checking if the
arguments key exists, and if not, creates a new entry instead of
appending.
  - Issue: Related to #6462
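
The check described above can be sketched outside the chat model as a small accumulator. The function name `merge_function_call_chunks` and the sample chunks are hypothetical; only the three-way branch follows this change.

```python
from typing import Dict, Iterable, Optional


def merge_function_call_chunks(
    chunks: Iterable[Dict[str, str]]
) -> Optional[Dict[str, str]]:
    """Accumulate streamed function-call fragments into one dict."""
    function_call: Optional[Dict[str, str]] = None
    for chunk in chunks:
        if not chunk:
            continue
        if function_call is None:
            # First fragment: may hold only the function name.
            function_call = dict(chunk)
        elif "arguments" in function_call:
            # Later fragments extend the argument string.
            function_call["arguments"] += chunk["arguments"]
        else:
            # No arguments seen yet: create the entry instead of appending.
            function_call["arguments"] = chunk["arguments"]
    return function_call


# Hypothetical stream: the first chunk carries the name only.
merged = merge_function_call_chunks(
    [{"name": "python"}, {"arguments": '{"co'}, {"arguments": 'de": 1}'}]
)
```

Without the middle branch, the second chunk in this stream would hit the `KeyError` shown in the traceback below.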

Sample Code:
```python
llm = AzureChatOpenAI(
    deployment_name=deployment_name,
    model_name=model_name,
    streaming=True
)

tools = [PythonREPLTool()]
callbacks = [StreamingStdOutCallbackHandler()]

agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    callbacks=callbacks
)

agent('Run some python code to test your interpreter')
```

Previous Result:
```
File ...langchain/chat_models/openai.py:344, in ChatOpenAI._generate(self, messages, stop, run_manager, **kwargs)
    342         function_call = _function_call
    343     else:
--> 344         function_call["arguments"] += _function_call["arguments"]
    345 if run_manager:
    346     run_manager.on_llm_new_token(token)

KeyError: 'arguments'
```

New Result:
```python
{'input': 'Run some python code to test your interpreter',
 'output': "The Python code `print('Hello, World!')` has been executed successfully, and the output `Hello, World!` has been printed."}
```

Co-authored-by: jswe <jswe@polencapital.com>
2023-07-26 10:11:50 -07:00
German Martin
457a4730b2 Fix the mangling issue on several VectorStores child classes. (#8274)
- Description: Fix mangling issue affecting a couple of VectorStore
classes including Redis.
  - Issue: https://github.com/langchain-ai/langchain/issues/8185
  - @rlancemartin 
  
This is a simple issue, but I lack some context on the original
implementation.
My changes may not be the definitive fix, but they should start a quick
discussion.

@hinthornw Tagging you since one of your changes introduced this
[here.](c38965fcba)
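
For context, a small sketch of the Python name mangling behind this issue. The class names `VectorStoreSketch` and `RedisSketch` are hypothetical stand-ins, not the library classes.

```python
from typing import List


class VectorStoreSketch:
    def __get_tags(self) -> List[str]:
        # Stored on the class as _VectorStoreSketch__get_tags.
        return [self.__class__.__name__]

    def _get_tags(self) -> List[str]:
        # A single underscore is not mangled, so subclasses can call it.
        return [self.__class__.__name__]


class RedisSketch(VectorStoreSketch):
    def broken_as_retriever(self) -> List[str]:
        # Inside this class the call is compiled as
        # self._RedisSketch__get_tags, which does not exist.
        return self.__get_tags()

    def fixed_as_retriever(self) -> List[str]:
        return self._get_tags()


store = RedisSketch()
try:
    store.broken_as_retriever()
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__

tags = store.fixed_as_retriever()
```

Because mangling uses the name of the class in which the call is written, a double-underscore method defined in a base class is not reachable under the same spelling from a subclass that overrides `as_retriever`.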
2023-07-26 09:48:55 -07:00
Alec Flett
4da43f77e5 Add ability to load (deserialize) objects from other namespaces (#7726)
I have some Prompt subclasses in my project that I'd like to be able to
deserialize in callbacks. Right now `loads()`/`load()` will bail when it
encounters my object, but I know I can trust the objects because they're
in my own projects.
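
A simplified sketch of the allowlist check, assuming a serialized object carries an `id` path such as `["my_project", "prompts", "MyPrompt"]`. The helper names and the `my_project` namespace are hypothetical; the real check lives in the `Reviver` class shown in the diff.

```python
from typing import List, Optional


def allowed_namespaces(valid_namespaces: Optional[List[str]] = None) -> List[str]:
    """Always allow "langchain"; append any caller-supplied namespaces."""
    return ["langchain", *valid_namespaces] if valid_namespaces else ["langchain"]


def check_namespace(
    object_id: List[str], valid_namespaces: Optional[List[str]] = None
) -> str:
    """Return the class name if its root namespace is allowed, else raise."""
    *namespace, name = object_id
    if namespace[0] not in allowed_namespaces(valid_namespaces):
        raise ValueError(f"Invalid namespace: {object_id}")
    return name


# Hypothetical id path for a project-local Prompt subclass.
my_id = ["my_project", "prompts", "MyPrompt"]

try:
    check_namespace(my_id)
    rejected = False
except ValueError:
    rejected = True

accepted = check_namespace(my_id, valid_namespaces=["my_project"])
```

Keeping the default as an allowlist means untrusted namespaces are still rejected unless the caller opts in explicitly.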

2023-07-26 16:59:28 +01:00
18 changed files with 274 additions and 60 deletions

View File

@@ -46,7 +46,7 @@
"id": "04981332",
"metadata": {},
"source": [
"Create a GeoPandas dataframe from [`Open City Data`](https://python.langchain.com/docs/modules/data_connection/document_loaders/integrations/open_city_data) as an example input."
"Create a GeoPandas dataframe from [`Open City Data`](https://python.langchain.com/docs/integrations/document_loaders/open_city_data) as an example input."
]
},
{

View File

@@ -20,7 +20,7 @@ from langchain.vectorstores import RocksetDB
## Document Loader
See a [usage example](docs/modules/data_connection/document_loaders/integrations/rockset).
See a [usage example](/docs/integrations/document_loaders/rockset).
```python
from langchain.document_loaders import RocksetLoader
```

View File

@@ -34,7 +34,7 @@
"source": [
"## Using ZERO_SHOT_REACT_DESCRIPTION\n",
"\n",
"This shows how to initialize the agent using the ZERO_SHOT_REACT_DESCRIPTION agent type. Note that this is an alternative to the above."
"This shows how to initialize the agent using the ZERO_SHOT_REACT_DESCRIPTION agent type."
]
},
{
@@ -271,7 +271,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.11.3"
}
},
"nbformat": 4,

View File

@@ -12,7 +12,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 19,
"id": "21e46d4d",
"metadata": {},
"outputs": [],
@@ -22,7 +22,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 20,
"id": "ac4910f8",
"metadata": {},
"outputs": [],
@@ -32,7 +32,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 21,
"id": "84b8f773",
"metadata": {},
"outputs": [],
@@ -42,17 +42,17 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 22,
"id": "068991a6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Barack Obama, in full Barack Hussein Obama II, (born August 4, 1961, Honolulu, Hawaii, U.S.), 44th president of the United States (2009-17) and the first African American to hold the office. Before winning the presidency, Obama represented Illinois in the U.S. Senate (2005-08). Barack Hussein Obama II (/ b ə ˈ r ɑː k h uː ˈ s eɪ n oʊ ˈ b ɑː m ə / bə-RAHK hoo-SAYN oh-BAH-mə; born August 4, 1961) is an American former politician who served as the 44th president of the United States from 2009 to 2017. A member of the Democratic Party, he was the first African-American president of the United States. Obama previously served as a U.S. senator representing ... Barack Obama was the first African American president of the United States (2009-17). He oversaw the recovery of the U.S. economy (from the Great Recession of 2008-09) and the enactment of landmark health care reform (the Patient Protection and Affordable Care Act ). In 2009 he was awarded the Nobel Peace Prize. His birth certificate lists his first name as Barack: That\\'s how Obama has spelled his name throughout his life. His name derives from a Hebrew name which means \"lightning.\". The Hebrew word has been transliterated into English in various spellings, including Barak, Buraq, Burack, and Barack. Most common names of U.S. presidents 1789-2021. Published by. Aaron O\\'Neill , Jun 21, 2022. The most common first name for a U.S. president is James, followed by John and then William. Six U.S ...'"
"'August 4, 1961 (age 61) Honolulu Hawaii Title / Office: presidency of the United States of America (2009-2017), United States United States Senate (2005-2008), United States ... (Show more) Political Affiliation: Democratic Party Awards And Honors: Barack Hussein Obama II (/ b ə ˈ r ɑː k h uː ˈ s eɪ n oʊ ˈ b ɑː m ə / bə-RAHK hoo-SAYN oh-BAH-mə; born August 4, 1961) is an American politician who served as the 44th president of the United States from 2009 to 2017. A member of the Democratic Party, he was the first African-American president of the United States. Obama previously served as a U.S. senator representing Illinois ... Answer (1 of 12): I see others have answered President Obama\\'s name which is \"Barack Hussein Obama\". President Obama has received many comments about his name from the racists across US. It is worth noting that he never changed his name. Also, it is worth noting that a simple search would have re... What is Barack Obama\\'s full name? Updated: 11/11/2022 Wiki User ∙ 6y ago Study now See answer (1) Best Answer Copy His full, birth name is Barack Hussein Obama, II. He was named after his... Alex Oliveira July 24, 2023 4:57pm Updated 0 seconds of 43 secondsVolume 0% 00:00 00:43 The man who drowned while paddleboarding on a pond outside the Obamas\\' Martha\\'s Vineyard estate has been...'"
]
},
"execution_count": 5,
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
@@ -60,6 +60,145 @@
"source": [
"search.run(\"Obama's first name?\")"
]
},
{
"cell_type": "markdown",
"id": "889027d4",
"metadata": {},
"source": [
"To get more additional information (e.g. link, source) use `DuckDuckGoSearchResults()`"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "95635444",
"metadata": {},
"outputs": [],
"source": [
"from langchain.tools import DuckDuckGoSearchResults"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "0133d103",
"metadata": {},
"outputs": [],
"source": [
"search = DuckDuckGoSearchResults()"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "439efc06",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"[snippet: Barack Hussein Obama II (/ b ə ˈ r ɑː k h uː ˈ s eɪ n oʊ ˈ b ɑː m ə / bə-RAHK hoo-SAYN oh-BAH-mə; born August 4, 1961) is an American politician who served as the 44th president of the United States from 2009 to 2017. A member of the Democratic Party, he was the first African-American president of the United States. Obama previously served as a U.S. senator representing Illinois ..., title: Barack Obama - Wikipedia, link: https://en.wikipedia.org/wiki/Barack_Obama], [snippet: Barack Obama, in full Barack Hussein Obama II, (born August 4, 1961, Honolulu, Hawaii, U.S.), 44th president of the United States (2009-17) and the first African American to hold the office. Before winning the presidency, Obama represented Illinois in the U.S. Senate (2005-08). He was the third African American to be elected to that body ..., title: Barack Obama | Biography, Parents, Education, Presidency, Books ..., link: https://www.britannica.com/biography/Barack-Obama], [snippet: Barack Obama 's tenure as the 44th president of the United States began with his first inauguration on January 20, 2009, and ended on January 20, 2017. A Democrat from Illinois, Obama took office following a decisive victory over Republican nominee John McCain in the 2008 presidential election. Four years later, in the 2012 presidential ..., title: Presidency of Barack Obama - Wikipedia, link: https://en.wikipedia.org/wiki/Presidency_of_Barack_Obama], [snippet: First published on Mon 24 Jul 2023 20.03 EDT. Barack Obama's personal chef died while paddleboarding near the ex-president's home on Martha's Vineyard over the weekend, Massachusetts state ..., title: Obama's personal chef dies while paddleboarding off Martha's Vineyard ..., link: https://www.theguardian.com/us-news/2023/jul/24/tafari-campbell-barack-obama-chef-drowns-marthas-vineyard]\""
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"search.run(\"Obama\")"
]
},
{
"cell_type": "markdown",
"id": "e17ccfe7",
"metadata": {},
"source": [
"You can also just search for news articles. Use the keyword ``backend=\"news\"``"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "21afe28d",
"metadata": {},
"outputs": [],
"source": [
"search = DuckDuckGoSearchResults(backend=\"news\")"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "2a4beeb9",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"[date: 2023-07-26T12:01:22, title: 'My heart is broken': Former Obama White House chef mourned following apparent drowning death in Edgartown, snippet: Tafari Campbell of Dumfries, Va., had been paddle boarding in Edgartown Great Pond when he appeared to briefly struggle, submerged, and did not return to the surface, authorities have said. Crews ultimately found the 45-year-old's body Monday morning., source: The Boston Globe on MSN.com, link: https://www.msn.com/en-us/news/us/my-heart-is-broken-former-obama-white-house-chef-mourned-following-apparent-drowning-death-in-edgartown/ar-AA1elNB8], [date: 2023-07-25T18:44:00, title: Obama's chef drowns paddleboarding near former president's Edgartown vacation home, snippet: Campbell was visiting Martha's Vineyard, where the Obamas own a vacation home. He was not wearing a lifejacket when he fell off his paddleboard., source: YAHOO!News, link: https://news.yahoo.com/obama-chef-drowns-paddleboarding-near-184437491.html], [date: 2023-07-26T00:30:00, title: Obama's personal chef dies while paddleboarding off Martha's Vineyard, snippet: Tafari Campbell, who worked at the White House during Obama's presidency, was visiting the island while the family was away, source: The Guardian, link: https://www.theguardian.com/us-news/2023/jul/24/tafari-campbell-barack-obama-chef-drowns-marthas-vineyard], [date: 2023-07-24T21:54:00, title: Obama's chef ID'd as paddleboarder who drowned near former president's Martha's Vineyard estate, snippet: Former President Barack Obama's personal chef, Tafari Campbell, has been identified as the paddle boarder who drowned near the Obamas' Martha's Vineyard estate., source: Fox News, link: https://www.foxnews.com/politics/obamas-chef-idd-paddleboarder-who-drowned-near-former-presidents-marthas-vineyard-estate]\""
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"search.run(\"Obama\")"
]
},
{
"cell_type": "markdown",
"id": "5f7c0129",
"metadata": {},
"source": [
"You can also directly pass a custom ``DuckDuckGoSearchAPIWrapper`` to ``DuckDuckGoSearchResults``. Therefore, you have much more control over the search results."
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "c7ab3b55",
"metadata": {},
"outputs": [],
"source": [
"from langchain.utilities import DuckDuckGoSearchAPIWrapper\n",
"\n",
"wrapper = DuckDuckGoSearchAPIWrapper(region=\"de-de\", time=\"d\", max_results=2)"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "adce16e1",
"metadata": {},
"outputs": [],
"source": [
"search = DuckDuckGoSearchResults(api_wrapper=wrapper, backend=\"news\")"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "b7e77c54",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'[date: 2023-07-25T12:15:00, title: Barack + Michelle Obama: Sie trauern um Angestellten, snippet: Barack und Michelle Obama trauern um ihren ehemaligen Küchenchef Tafari Campbell. Der Familienvater verunglückte am vergangenen Sonntag und wurde in einem Teich geborgen., source: Gala, link: https://www.gala.de/stars/news/barack---michelle-obama--sie-trauern-um-angestellten-23871228.html], [date: 2023-07-25T10:30:00, title: Barack Obama: Sein Koch (†45) ist tot - diese Details sind bekannt, snippet: Tafari Campbell war früher im Weißen Haus eingestellt, arbeitete anschließend weiter für Ex-Präsident Barack Obama. Nun ist er gestorben. Diese Details sind bekannt., source: T-Online, link: https://www.t-online.de/unterhaltung/stars/id_100213226/barack-obama-sein-koch-45-ist-tot-diese-details-sind-bekannt.html], [date: 2023-07-25T05:33:23, title: Barack Obama: Sein Privatkoch ist bei einem tragischen Unfall gestorben, snippet: Barack Obama (61) und Michelle Obama (59) sind in tiefer Trauer. Ihr Privatkoch Tafari Campbell ist am Montag (24. Juli) ums Leben gekommen, er wurde nur 45 Jahre alt. Laut US-Polizei starb er bei ein, source: BUNTE.de, link: https://www.msn.com/de-de/unterhaltung/other/barack-obama-sein-privatkoch-ist-bei-einem-tragischen-unfall-gestorben/ar-AA1ejrAd], [date: 2023-07-25T02:25:00, title: Barack Obama: Privatkoch tot in See gefunden, snippet: Tafari Campbell kochte für Barack Obama im Weißen Haus - und auch privat nach dessen Abschied aus dem Präsidentenamt. Nun machte die Polizei in einem Gewässer eine traurige Entdeckung., source: SPIEGEL, link: https://www.spiegel.de/panorama/justiz/barack-obama-leibkoch-tot-in-see-gefunden-a-3cdf6377-bee0-43f1-a200-a285742f9ffc]'"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"search.run(\"Obama\")"
]
}
],
"metadata": {
@@ -78,7 +217,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.9"
},
"vscode": {
"interpreter": {

View File

@@ -189,7 +189,7 @@ All retrievers implement some common methods, such as `get_relevant_documents()`
from langchain.retrievers import SVMRetriever
svm_retriever = SVMRetriever.from_documents(all_splits,OpenAIEmbeddings())
docs_svm=svm_retriever.get_relevant_documents(question)
len(docs)
len(docs_svm)
```

View File

@@ -35,7 +35,7 @@ FROM langchain-dev-base AS langchain-dev-dependencies
ARG PYTHON_VIRTUALENV_HOME
# Copy only the dependency files for installation
COPY pyproject.toml poetry.toml ./
COPY libs/langchain/pyproject.toml libs/langchain/poetry.toml ./
# Copy the langchain library for installation
COPY libs/langchain/ libs/langchain/

View File

@@ -340,8 +340,10 @@ class ChatOpenAI(BaseChatModel):
if _function_call:
if function_call is None:
function_call = _function_call
else:
elif "arguments" in function_call:
function_call["arguments"] += _function_call["arguments"]
else:
function_call["arguments"] = _function_call["arguments"]
if run_manager:
run_manager.on_llm_new_token(token)
message = _convert_dict_to_message(
@@ -406,8 +408,10 @@ class ChatOpenAI(BaseChatModel):
if _function_call:
if function_call is None:
function_call = _function_call
else:
elif "arguments" in function_call:
function_call["arguments"] += _function_call["arguments"]
else:
function_call["arguments"] = _function_call["arguments"]
if run_manager:
await run_manager.on_llm_new_token(token)
message = _convert_dict_to_message(

View File

@@ -1,7 +1,7 @@
import importlib
import json
import os
from typing import Any, Dict, Optional
from typing import Any, Dict, List, Optional
from langchain.load.serializable import Serializable
@@ -9,8 +9,16 @@ from langchain.load.serializable import Serializable
class Reviver:
"""Reviver for JSON objects."""
def __init__(self, secrets_map: Optional[Dict[str, str]] = None) -> None:
def __init__(
self,
secrets_map: Optional[Dict[str, str]] = None,
valid_namespaces: Optional[List[str]] = None,
) -> None:
self.secrets_map = secrets_map or dict()
# By default only support langchain, but user can pass in additional namespaces
self.valid_namespaces = (
["langchain", *valid_namespaces] if valid_namespaces else ["langchain"]
)
def __call__(self, value: Dict[str, Any]) -> Any:
if (
@@ -43,8 +51,7 @@ class Reviver:
):
[*namespace, name] = value["id"]
# Currently, we only support langchain imports.
if namespace[0] != "langchain":
if namespace[0] not in self.valid_namespaces:
raise ValueError(f"Invalid namespace: {value}")
# The root namespace "langchain" is not a valid identifier.
@@ -66,14 +73,21 @@ class Reviver:
return value
def loads(text: str, *, secrets_map: Optional[Dict[str, str]] = None) -> Any:
def loads(
text: str,
*,
secrets_map: Optional[Dict[str, str]] = None,
valid_namespaces: Optional[List[str]] = None,
) -> Any:
"""Load a JSON object from a string.
Args:
text: The string to load.
secrets_map: A map of secrets to load.
valid_namespaces: A list of additional namespaces (modules)
to allow to be deserialized.
Returns:
"""
return json.loads(text, object_hook=Reviver(secrets_map))
return json.loads(text, object_hook=Reviver(secrets_map, valid_namespaces))

View File

@@ -35,7 +35,7 @@ class SearchQueries(BaseModel):
DEFAULT_LLAMA_SEARCH_PROMPT = PromptTemplate(
input_variables=["question"],
template="""<<SYS>> \n You are an assistant tasked with improving Google search
results. \n <</SYS>> \n\n [INST] Generate FIVE Google search queries that
results. \n <</SYS>> \n\n [INST] Generate THREE Google search queries that
are similar to this question. The output should be a numbered list of questions
and each should have a question mark at the end: \n\n {question} [/INST]""",
)
@@ -43,7 +43,7 @@ DEFAULT_LLAMA_SEARCH_PROMPT = PromptTemplate(
DEFAULT_SEARCH_PROMPT = PromptTemplate(
input_variables=["question"],
template="""You are an assistant tasked with improving Google search
results. Generate FIVE Google search queries that are similar to
results. Generate THREE Google search queries that are similar to
this question. The output should be a numbered list of questions and each
should have a question mark at the end: {question}""",
)
@@ -73,7 +73,6 @@ class WebResearchRetriever(BaseRetriever):
)
llm_chain: LLMChain
search: GoogleSearchAPIWrapper = Field(..., description="Google Search API Wrapper")
max_splits_per_doc: int = Field(100, description="Maximum splits per document")
num_search_results: int = Field(1, description="Number of pages per Google search")
text_splitter: RecursiveCharacterTextSplitter = Field(
RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=50),
@@ -90,10 +89,9 @@ class WebResearchRetriever(BaseRetriever):
llm: BaseLLM,
search: GoogleSearchAPIWrapper,
prompt: Optional[BasePromptTemplate] = None,
max_splits_per_doc: int = 100,
num_search_results: int = 1,
text_splitter: RecursiveCharacterTextSplitter = RecursiveCharacterTextSplitter(
chunk_size=1500, chunk_overlap=50
chunk_size=1500, chunk_overlap=150
),
) -> "WebResearchRetriever":
"""Initialize from llm using default template.
@@ -103,7 +101,6 @@ class WebResearchRetriever(BaseRetriever):
llm: llm for search question generation
search: GoogleSearchAPIWrapper
prompt: prompt to generating search questions
max_splits_per_doc: Maximum splits per document to keep
num_search_results: Number of pages per Google search
text_splitter: Text splitter for splitting web pages into chunks
@@ -131,14 +128,30 @@ class WebResearchRetriever(BaseRetriever):
vectorstore=vectorstore,
llm_chain=llm_chain,
search=search,
max_splits_per_doc=max_splits_per_doc,
num_search_results=num_search_results,
text_splitter=text_splitter,
)
def clean_search_query(self, query: str) -> str:
# Some search tools (e.g., Google) will
# fail to return results if query has a
# leading digit: 1. "LangCh..."
# Check if the first character is a digit
if query[0].isdigit():
# Find the position of the first quote
first_quote_pos = query.find('"')
if first_quote_pos != -1:
# Extract the part of the string after the quote
query = query[first_quote_pos + 1 :]
# Remove the trailing quote if present
if query.endswith('"'):
query = query[:-1]
return query.strip()
def search_tool(self, query: str, num_search_results: int = 1) -> List[dict]:
"""Returns num_serch_results pages per Google search."""
result = self.search.results(query, num_search_results)
query_clean = self.clean_search_query(query)
result = self.search.results(query_clean, num_search_results)
return result
def _get_relevant_documents(

View File

@@ -56,6 +56,7 @@ class DuckDuckGoSearchResults(BaseTool):
api_wrapper: DuckDuckGoSearchAPIWrapper = Field(
default_factory=DuckDuckGoSearchAPIWrapper
)
backend: str = "api"
def _run(
self,
@@ -63,7 +64,9 @@ class DuckDuckGoSearchResults(BaseTool):
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
"""Use the tool."""
return str(self.api_wrapper.results(query, self.num_results))
res = self.api_wrapper.results(query, self.num_results, backend=self.backend)
res_strs = [", ".join([f"{k}: {v}" for k, v in d.items()]) for d in res]
return ", ".join([f"[{rs}]" for rs in res_strs])
async def _arun(
self,

View File

@@ -62,7 +62,9 @@ class DuckDuckGoSearchAPIWrapper(BaseModel):
snippets = self.get_snippets(query)
return " ".join(snippets)
def results(self, query: str, num_results: int) -> List[Dict[str, str]]:
def results(
self, query: str, num_results: int, backend: str = "api"
) -> List[Dict[str, str]]:
"""Run query through DuckDuckGo and return metadata.
Args:
@@ -83,11 +85,20 @@ class DuckDuckGoSearchAPIWrapper(BaseModel):
region=self.region,
safesearch=self.safesearch,
timelimit=self.time,
backend=backend,
)
if results is None:
return [{"Result": "No good DuckDuckGo Search Result was found"}]
def to_metadata(result: Dict) -> Dict[str, str]:
if backend == "news":
return {
"date": result["date"],
"title": result["title"],
"snippet": result["body"],
"source": result["source"],
"link": result["url"],
}
return {
"snippet": result["body"],
"title": result["title"],

View File

@@ -446,7 +446,7 @@ class VectorStore(ABC):
"""Return VectorStore initialized from texts and embeddings."""
raise NotImplementedError
def __get_retriever_tags(self) -> List[str]:
def _get_retriever_tags(self) -> List[str]:
"""Get tags for retriever."""
tags = [self.__class__.__name__]
if self.embeddings:
@@ -455,7 +455,7 @@ class VectorStore(ABC):
def as_retriever(self, **kwargs: Any) -> VectorStoreRetriever:
tags = kwargs.pop("tags", None) or []
tags.extend(self.__get_retriever_tags())
tags.extend(self._get_retriever_tags())
return VectorStoreRetriever(vectorstore=self, **kwargs, tags=tags)

View File

@@ -294,6 +294,8 @@ class ElasticVectorSearch(VectorStore, ABC):
elasticsearch_url = get_from_dict_or_env(
kwargs, "elasticsearch_url", "ELASTICSEARCH_URL"
)
if "elasticsearch_url" in kwargs:
del kwargs["elasticsearch_url"]
index_name = index_name or uuid.uuid4().hex
vectorsearch = cls(elasticsearch_url, index_name, embedding, **kwargs)
vectorsearch.add_texts(

View File

@@ -607,7 +607,7 @@ class Redis(VectorStore):
def as_retriever(self, **kwargs: Any) -> RedisVectorStoreRetriever:
tags = kwargs.pop("tags", None) or []
tags.extend(self.__get_retriever_tags())
tags.extend(self._get_retriever_tags())
return RedisVectorStoreRetriever(vectorstore=self, **kwargs, tags=tags)

View File

@@ -446,7 +446,7 @@ class SingleStoreDB(VectorStore):
def as_retriever(self, **kwargs: Any) -> SingleStoreDBRetriever:
tags = kwargs.pop("tags", None) or []
tags.extend(self.__get_retriever_tags())
tags.extend(self._get_retriever_tags())
return SingleStoreDBRetriever(vectorstore=self, **kwargs, tags=tags)

View File

@@ -409,7 +409,7 @@ class Vectara(VectorStore):
def as_retriever(self, **kwargs: Any) -> VectaraRetriever:
tags = kwargs.pop("tags", None) or []
tags.extend(self.__get_retriever_tags())
tags.extend(self._get_retriever_tags())
return VectaraRetriever(vectorstore=self, **kwargs, tags=tags)

View File

@@ -1,6 +1,6 @@
[tool.poetry]
name = "langchain"
version = "0.0.243"
version = "0.0.244"
description = "Building applications with LLMs through composability"
authors = []
license = "MIT"

View File

@@ -5,6 +5,7 @@ import pytest
from langchain.evaluation.embedding_distance import (
EmbeddingDistance,
EmbeddingDistanceEvalChain,
PairwiseEmbeddingDistanceEvalChain,
)
@@ -44,18 +45,25 @@ def vectors() -> Tuple[np.ndarray, np.ndarray]:
@pytest.fixture
def chain() -> PairwiseEmbeddingDistanceEvalChain:
def pairwise_embedding_distance_eval_chain() -> PairwiseEmbeddingDistanceEvalChain:
"""Create a PairwiseEmbeddingDistanceEvalChain."""
return PairwiseEmbeddingDistanceEvalChain()
@pytest.fixture
def embedding_distance_eval_chain() -> EmbeddingDistanceEvalChain:
"""Create a EmbeddingDistanceEvalChain."""
return EmbeddingDistanceEvalChain()
@pytest.mark.requires("scipy")
def test_cosine_similarity(
chain: PairwiseEmbeddingDistanceEvalChain, vectors: Tuple[np.ndarray, np.ndarray]
def test_pairwise_embedding_distance_eval_chain_cosine_similarity(
pairwise_embedding_distance_eval_chain: PairwiseEmbeddingDistanceEvalChain,
vectors: Tuple[np.ndarray, np.ndarray],
) -> None:
"""Test the cosine similarity."""
chain.distance_metric = EmbeddingDistance.COSINE
result = chain._compute_score(np.array(vectors))
pairwise_embedding_distance_eval_chain.distance_metric = EmbeddingDistance.COSINE
result = pairwise_embedding_distance_eval_chain._compute_score(np.array(vectors))
expected = 1.0 - np.dot(vectors[0], vectors[1]) / (
np.linalg.norm(vectors[0]) * np.linalg.norm(vectors[1])
)
@@ -63,61 +71,81 @@ def test_cosine_similarity(
@pytest.mark.requires("scipy")
def test_euclidean_distance(
chain: PairwiseEmbeddingDistanceEvalChain, vectors: Tuple[np.ndarray, np.ndarray]
def test_pairwise_embedding_distance_eval_chain_euclidean_distance(
pairwise_embedding_distance_eval_chain: PairwiseEmbeddingDistanceEvalChain,
vectors: Tuple[np.ndarray, np.ndarray],
) -> None:
"""Test the euclidean distance."""
from scipy.spatial.distance import euclidean
chain.distance_metric = EmbeddingDistance.EUCLIDEAN
result = chain._compute_score(np.array(vectors))
pairwise_embedding_distance_eval_chain.distance_metric = EmbeddingDistance.EUCLIDEAN
result = pairwise_embedding_distance_eval_chain._compute_score(np.array(vectors))
expected = euclidean(*vectors)
assert np.isclose(result, expected)
@pytest.mark.requires("scipy")
def test_manhattan_distance(
chain: PairwiseEmbeddingDistanceEvalChain, vectors: Tuple[np.ndarray, np.ndarray]
def test_pairwise_embedding_distance_eval_chain_manhattan_distance(
pairwise_embedding_distance_eval_chain: PairwiseEmbeddingDistanceEvalChain,
vectors: Tuple[np.ndarray, np.ndarray],
) -> None:
"""Test the manhattan distance."""
from scipy.spatial.distance import cityblock
chain.distance_metric = EmbeddingDistance.MANHATTAN
result = chain._compute_score(np.array(vectors))
pairwise_embedding_distance_eval_chain.distance_metric = EmbeddingDistance.MANHATTAN
result = pairwise_embedding_distance_eval_chain._compute_score(np.array(vectors))
expected = cityblock(*vectors)
assert np.isclose(result, expected)
@pytest.mark.requires("scipy")
def test_chebyshev_distance(
chain: PairwiseEmbeddingDistanceEvalChain, vectors: Tuple[np.ndarray, np.ndarray]
def test_pairwise_embedding_distance_eval_chain_chebyshev_distance(
pairwise_embedding_distance_eval_chain: PairwiseEmbeddingDistanceEvalChain,
vectors: Tuple[np.ndarray, np.ndarray],
) -> None:
"""Test the chebyshev distance."""
from scipy.spatial.distance import chebyshev
chain.distance_metric = EmbeddingDistance.CHEBYSHEV
result = chain._compute_score(np.array(vectors))
pairwise_embedding_distance_eval_chain.distance_metric = EmbeddingDistance.CHEBYSHEV
result = pairwise_embedding_distance_eval_chain._compute_score(np.array(vectors))
expected = chebyshev(*vectors)
assert np.isclose(result, expected)
@pytest.mark.requires("scipy")
def test_hamming_distance(
chain: PairwiseEmbeddingDistanceEvalChain, vectors: Tuple[np.ndarray, np.ndarray]
def test_pairwise_embedding_distance_eval_chain_hamming_distance(
pairwise_embedding_distance_eval_chain: PairwiseEmbeddingDistanceEvalChain,
vectors: Tuple[np.ndarray, np.ndarray],
) -> None:
"""Test the hamming distance."""
from scipy.spatial.distance import hamming
chain.distance_metric = EmbeddingDistance.HAMMING
result = chain._compute_score(np.array(vectors))
pairwise_embedding_distance_eval_chain.distance_metric = EmbeddingDistance.HAMMING
result = pairwise_embedding_distance_eval_chain._compute_score(np.array(vectors))
expected = hamming(*vectors)
assert np.isclose(result, expected)
@pytest.mark.requires("openai", "tiktoken")
def test_embedding_distance(chain: PairwiseEmbeddingDistanceEvalChain) -> None:
def test_pairwise_embedding_distance_eval_chain_embedding_distance(
pairwise_embedding_distance_eval_chain: PairwiseEmbeddingDistanceEvalChain,
) -> None:
"""Test the embedding distance."""
result = chain.evaluate_string_pairs(
result = pairwise_embedding_distance_eval_chain.evaluate_string_pairs(
prediction="A single cat", prediction_b="A single cat"
)
assert np.isclose(result["score"], 0.0)
@pytest.mark.requires("scipy")
def test_embedding_distance_eval_chain(
embedding_distance_eval_chain: EmbeddingDistanceEvalChain,
) -> None:
embedding_distance_eval_chain.distance_metric = EmbeddingDistance.COSINE
prediction = "Hi"
reference = "Hello"
result = embedding_distance_eval_chain.evaluate_strings(
prediction=prediction,
reference=reference,
)
assert result["score"] < 1.0