Merge branch 'langchain-ai:master' into master

This commit is contained in:
olgavrou 2023-08-30 04:03:20 -04:00 committed by GitHub
commit dfc3295a2c
111 changed files with 3226 additions and 546 deletions

View File

@ -31,13 +31,15 @@ jobs:
working-directory: ${{ inputs.working-directory }}
steps:
- uses: actions/checkout@v3
- name: Install poetry
run: pipx install "poetry==$POETRY_VERSION"
- name: Set up Python 3.10
uses: actions/setup-python@v4
- name: Set up Python + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: "3.10"
cache: "poetry"
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: ${{ inputs.working-directory }}
cache-key: release
- name: Build project for distribution
run: poetry build
- name: Check Version

View File

@ -42,9 +42,9 @@ Log and stream intermediate steps of any chain
## Examples, ecosystem, and resources
### [Use cases](/docs/use_cases/)
Walkthroughs and best practices for common end-to-end use cases, like:
- [Chatbots](/docs/use_cases/chatbots/)
- [Chatbots](/docs/use_cases/chatbots)
- [Answering questions using sources](/docs/use_cases/question_answering/)
- [Analyzing structured data](/docs/use_cases/tabular.html)
- [Analyzing structured data](/docs/use_cases/sql)
- and much more...
### [Guides](/docs/guides/)
@ -56,9 +56,8 @@ LangChain is part of a rich ecosystem of tools that integrate with our framework
### [Additional resources](/docs/additional_resources/)
Our community is full of prolific developers, creative builders, and fantastic teachers. Check out the [YouTube tutorials](/docs/additional_resources/youtube.html) from folks in the community, and the [Gallery](https://github.com/kyrolabs/awesome-langchain) for a list of awesome LangChain projects, compiled by the folks at [KyroLabs](https://kyrolabs.com).
<h3><span style={{color:"#2e8555"}}> Support </span></h3>
Join us on [GitHub](https://github.com/hwchase17/langchain) or [Discord](https://discord.gg/6adMQxSpJS) to ask questions, share feedback, meet other developers building with LangChain, and dream about the future of LLMs.
### [Community](/docs/community)
Head to the [Community navigator](/docs/community) to find places to ask questions, share feedback, meet other developers, and dream about the future of LLMs.
## API reference

View File

@ -1,4 +1,4 @@
# Conversation buffer memory
# Conversation Buffer
This notebook shows how to use `ConversationBufferMemory`. This memory allows for storing messages and then extracting them into a variable.
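As a minimal sketch (assuming the standard `save_context`/`load_memory_variables` interface from `langchain.memory`), the buffer keeps the raw transcript:

```python
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
memory.save_context({"input": "hi"}, {"output": "whats up"})

# The entire raw history is returned under the "history" key
print(memory.load_memory_variables({}))
# {'history': 'Human: hi\nAI: whats up'}
```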

View File

@ -1,4 +1,4 @@
# Conversation buffer window memory
# Conversation Buffer Window
`ConversationBufferWindowMemory` keeps a list of the interactions of the conversation over time. It only uses the last K interactions. This can be useful for keeping a sliding window of the most recent interactions, so the buffer does not get too large.
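A short sketch of the sliding window; with `k=1`, only the most recent exchange is kept:

```python
from langchain.memory import ConversationBufferWindowMemory

# k=1 keeps only the single most recent interaction
memory = ConversationBufferWindowMemory(k=1)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})

# Only the last exchange survives in the returned history
print(memory.load_memory_variables({}))
# {'history': 'Human: not much you\nAI: not much'}
```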

View File

@ -1,4 +1,4 @@
# Entity memory
# Entity
Entity Memory remembers given facts about specific entities in a conversation. It extracts information on entities (using an LLM) and builds up its knowledge of those entities over time (also using an LLM).
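Because both extraction and knowledge building use an LLM, the memory takes one at construction time. A minimal sketch, assuming an OpenAI API key is configured:

```python
from langchain.llms import OpenAI
from langchain.memory import ConversationEntityMemory

llm = OpenAI(temperature=0)
memory = ConversationEntityMemory(llm=llm)

# The LLM extracts "Deven" and "Sam" as entities and stores facts about them
memory.save_context(
    {"input": "Deven & Sam are working on a hackathon project"},
    {"output": "That sounds like a great project!"},
)

# Loading takes the current input so the relevant entities can be looked up
print(memory.load_memory_variables({"input": "who is Sam"}))
```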

View File

@ -4,5 +4,5 @@ sidebar_position: 2
# Memory Types
There are many different types of memory.
Each have their own parameters, their own return types, and are useful in different scenarios.
Each has its own parameters, its own return types, and is useful in different scenarios.
Please see their individual page for more detail on each one.

View File

@ -1,4 +1,4 @@
# Conversation summary memory
# Conversation Summary
Now let's take a look at using a slightly more complex type of memory - `ConversationSummaryMemory`. This type of memory creates a summary of the conversation over time. This can be useful for condensing information from the conversation.
Conversation summary memory summarizes the conversation as it happens and stores the current summary in memory. This memory can then be used to inject the summary of the conversation so far into a prompt/chain. This memory is most useful for longer conversations, where keeping the past message history in the prompt verbatim would take up too many tokens.
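A minimal sketch (assuming an OpenAI API key is configured); the summarizing LLM is passed in at construction:

```python
from langchain.llms import OpenAI
from langchain.memory import ConversationSummaryMemory

memory = ConversationSummaryMemory(llm=OpenAI(temperature=0))
memory.save_context({"input": "hi"}, {"output": "hi there!"})

# "history" now holds an LLM-written summary rather than the raw transcript
print(memory.load_memory_variables({}))
```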

View File

@ -1,4 +1,4 @@
# Vector store-backed memory
# Backed by a Vector Store
`VectorStoreRetrieverMemory` stores memories in a VectorDB and queries the top-K most "salient" docs every time it is called.
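A minimal sketch, assuming `vectorstore` is any already-populated LangChain vector store (the concrete backend is interchangeable):

```python
from langchain.memory import VectorStoreRetrieverMemory

# Retrieve the single most relevant stored memory per query
retriever = vectorstore.as_retriever(search_kwargs={"k": 1})
memory = VectorStoreRetrieverMemory(retriever=retriever)

memory.save_context({"input": "My favorite food is pizza"}, {"output": "that's good to know"})

# Lookup is by semantic relevance to the prompt, not by recency
print(memory.load_memory_variables({"prompt": "what food should I order?"}))
```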

View File

@ -51,7 +51,7 @@ Dependents stats for `langchain-ai/langchain`
|[e2b-dev/e2b](https://github.com/e2b-dev/e2b) | 5365 |
|[mage-ai/mage-ai](https://github.com/mage-ai/mage-ai) | 5352 |
|[wenda-LLM/wenda](https://github.com/wenda-LLM/wenda) | 5192 |
|[LangChain-Chinese-Getting-Started-Guide](https://github.com/liaokongVFX/LangChain-Chinese-Getting-Started-Guide) | 5129 |
|[liaokongVFX/LangChain-Chinese-Getting-Started-Guide](https://github.com/liaokongVFX/LangChain-Chinese-Getting-Started-Guide) | 5129 |
|[zilliztech/GPTCache](https://github.com/zilliztech/GPTCache) | 4993 |
|[GreyDGL/PentestGPT](https://github.com/GreyDGL/PentestGPT) | 4831 |
|[zauberzeug/nicegui](https://github.com/zauberzeug/nicegui) | 4824 |

View File

@ -0,0 +1,63 @@
# LLMonitor
[LLMonitor](https://llmonitor.com) is an open-source observability platform that provides cost tracking, user tracking and powerful agent tracing.
<video controls width='100%' >
<source src='https://llmonitor.com/videos/demo-annotated.mp4'/>
</video>
## Setup
Create an account on [llmonitor.com](https://llmonitor.com), create an `App`, and then copy the associated `tracking id`.
Once you have it, set it as an environment variable by running:
```bash
export LLMONITOR_APP_ID="..."
```
If you'd prefer not to set an environment variable, you can pass the key directly when initializing the callback handler:
```python
from langchain.callbacks import LLMonitorCallbackHandler
handler = LLMonitorCallbackHandler(app_id="...")
```
## Usage with LLM/Chat models
```python
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.callbacks import LLMonitorCallbackHandler
handler = LLMonitorCallbackHandler(app_id="...")
llm = OpenAI(
    callbacks=[handler],
)
chat = ChatOpenAI(
    callbacks=[handler],
    metadata={"userId": "123"},  # you can assign user ids to models in the metadata
)
```
## Usage with agents
```python
from langchain.agents import load_tools, initialize_agent, AgentType
from langchain.llms import OpenAI
from langchain.callbacks import LLMonitorCallbackHandler
handler = LLMonitorCallbackHandler(app_id="...")
llm = OpenAI(temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)
agent.run(
    "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?",
    callbacks=[handler],
    metadata={
        "agentName": "Leo DiCaprio's girlfriend",  # you can assign a custom agent name in the metadata
    },
)
```
## Support
For any questions or issues with the integration, you can reach out to the LLMonitor team on [Discord](http://discord.com/invite/8PafSG58kK) or via [email](mailto:vince@llmonitor.com).

View File

@ -106,15 +106,39 @@
" - `column_data_type`\n",
" - `column_title`\n",
" - `column_description`\n",
" - `column_values`"
" - `column_values`\n",
" - `cube_data_obj_type`"
]
},
{
"attachments": {},
"cell_type": "markdown",
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"> page_content='Users View City, None' metadata={'table_name': 'users_view', 'column_name': 'users_view.city', 'column_data_type': 'string', 'column_title': 'Users View City', 'column_description': 'None', 'column_member_type': 'dimension', 'column_values': ['Austin', 'Chicago', 'Los Angeles', 'Mountain View', 'New York', 'Palo Alto', 'San Francisco', 'Seattle']}"
"# Given string containing page content\n",
"page_content = 'Users View City, None'\n",
"\n",
"# Given dictionary containing metadata\n",
"metadata = {\n",
" 'table_name': 'users_view',\n",
" 'column_name': 'users_view.city',\n",
" 'column_data_type': 'string',\n",
" 'column_title': 'Users View City',\n",
" 'column_description': 'None',\n",
" 'column_member_type': 'dimension',\n",
" 'column_values': [\n",
" 'Austin',\n",
" 'Chicago',\n",
" 'Los Angeles',\n",
" 'Mountain View',\n",
" 'New York',\n",
" 'Palo Alto',\n",
" 'San Francisco',\n",
" 'Seattle'\n",
" ],\n",
" 'cube_data_obj_type': 'view'\n",
"}"
]
}
],

View File

@ -30,7 +30,45 @@
"```python\n",
"import os\n",
"os.environ[\"OPENAI_API_TYPE\"] = \"azure\"\n",
"...\n",
"```\n",
"\n",
"## Azure Active Directory Authentication\n",
"There are two ways you can authenticate to Azure OpenAI:\n",
"- API Key\n",
"- Azure Active Directory (AAD)\n",
"\n",
"Using the API key is the easiest way to get started. You can find your API key in the Azure portal under your Azure OpenAI resource.\n",
"\n",
"However, if you have complex security requirements - you may want to use Azure Active Directory. You can find more information on how to use AAD with Azure OpenAI [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/managed-identity).\n",
"\n",
"If you are developing locally, you will need to have the Azure CLI installed and be logged in. You can install the Azure CLI [here](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli). Then, run `az login` to log in.\n",
"\n",
"Add a role an Azure role assignment `Cognitive Services OpenAI User` scoped to your Azure OpenAI resource. This will allow you to get a token from AAD to use with Azure OpenAI. You can grant this role assignment to a user, group, service principal, or managed identity. For more information about Azure OpenAI RBAC roles see [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/role-based-access-control).\n",
"\n",
"To use AAD in Python with LangChain, install the `azure-identity` package. Then, set `OPENAI_API_TYPE` to `azure_ad`. Next, use the `DefaultAzureCredential` class to get a token from AAD by calling `get_token` as shown below. Finally, set the `OPENAI_API_KEY` environment variable to the token value.\n",
"\n",
"```python\n",
"import os\n",
"from azure.identity import DefaultAzureCredential\n",
"\n",
"# Get the Azure Credential\n",
"credential = DefaultAzureCredential()\n",
"\n",
"# Set the API type to `azure_ad`\n",
"os.environ[\"OPENAI_API_TYPE\"] = \"azure_ad\"\n",
"# Set the API_KEY to the token from the Azure credential\n",
"os.environ[\"OPENAI_API_KEY\"] = credential.get_token(\"https://cognitiveservices.azure.com/.default\").token\n",
"```\n",
"\n",
"The `DefaultAzureCredential` class is an easy way to get started with AAD authentication. You can also customize the credential chain if necessary. In the example shown below, we first try Managed Identity, then fall back to the Azure CLI. This is useful if you are running your code in Azure, but want to develop locally.\n",
"\n",
"```python\n",
"from azure.identity import ChainedTokenCredential, ManagedIdentityCredential, AzureCliCredential\n",
"\n",
"credential = ChainedTokenCredential(\n",
" ManagedIdentityCredential(),\n",
" AzureCliCredential()\n",
")\n",
"```\n",
"\n",
"## Deployments\n",
@ -144,7 +182,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mAzureOpenAI\u001b[0m\n",
"\u001B[1mAzureOpenAI\u001B[0m\n",
"Params: {'deployment_name': 'text-davinci-002', 'model_name': 'text-davinci-002', 'temperature': 0.7, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n"
]
}

View File

@ -1,17 +1,28 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Google Cloud Platform Vertex AI PaLM \n",
"# Google Vertex AI PaLM \n",
"\n",
"Note: This is seperate from the Google PaLM integration, it exposes [Vertex AI PaLM API](https://cloud.google.com/vertex-ai/docs/generative-ai/learn/overview) on Google Cloud. \n",
"**Note:** This is seperate from the `Google PaLM` integration, it exposes [Vertex AI PaLM API](https://cloud.google.com/vertex-ai/docs/generative-ai/learn/overview) on `Google Cloud`. \n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setting up"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"By default, Google Cloud [does not use](https://cloud.google.com/vertex-ai/docs/generative-ai/data-governance#foundation_model_development) customer data to train its foundation models as part of Google Cloud's AI/ML Privacy Commitment. More details about how Google processes data can also be found in [Google's Customer Data Processing Addendum (CDPA)](https://cloud.google.com/terms/data-processing-addendum).\n",
"\n",
"By default, Google Cloud [does not use](https://cloud.google.com/vertex-ai/docs/generative-ai/data-governance#foundation_model_development) Customer Data to train its foundation models as part of Google Cloud`s AI/ML Privacy Commitment. More details about how Google processes data can also be found in [Google's Customer Data Processing Addendum (CDPA)](https://cloud.google.com/terms/data-processing-addendum).\n",
"\n",
"To use Vertex AI PaLM you must have the `google-cloud-aiplatform` Python package installed and either:\n",
"To use `Vertex AI PaLM` you must have the `google-cloud-aiplatform` Python package installed and either:\n",
"- Have credentials configured for your environment (gcloud, workload identity, etc...)\n",
"- Store the path to a service account JSON file as the GOOGLE_APPLICATION_CREDENTIALS environment variable\n",
"\n",
@ -19,8 +30,7 @@
"\n",
"For more information, see: \n",
"- https://cloud.google.com/docs/authentication/application-default-credentials#GAC\n",
"- https://googleapis.dev/python/google-auth/latest/reference/google.auth.html#module-google.auth\n",
"\n"
"- https://googleapis.dev/python/google-auth/latest/reference/google.auth.html#module-google.auth"
]
},
{
@ -40,7 +50,22 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import VertexAI\n",
"from langchain.llms import VertexAI"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Question-answering example"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain import PromptTemplate, LLMChain"
]
},
@ -98,13 +123,21 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"You can now leverage the Codey API for code generation within Vertex AI. The model names are:\n",
"- code-bison: for code suggestion\n",
"- code-gecko: for code completion"
"## Code generation example"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can now leverage the `Codey API` for code generation within `Vertex AI`. \n",
"\n",
"The model names are:\n",
"- `code-bison`: for code suggestion\n",
"- `code-gecko`: for code completion"
]
},
{
@ -191,7 +224,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
},
"vscode": {
"interpreter": {

View File

@ -38,7 +38,7 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install xata==1.0.0rc0 openai langchain"
"!pip install xata openai langchain"
]
},
{

View File

@ -0,0 +1,44 @@
# Neo4j
This page covers how to use the Neo4j ecosystem within LangChain.
What is Neo4j?
**Neo4j in a nutshell:**
- Neo4j is an open-source database management system that specializes in graph database technology.
- Neo4j allows you to represent and store data in nodes and edges, making it ideal for handling connected data and relationships.
- Neo4j provides a Cypher Query Language, making it easy to interact with and query your graph data.
- With Neo4j, you can achieve high-performance graph traversals and queries, suitable for production-level systems.
- Get started quickly with Neo4j by visiting [their website](https://neo4j.com/).
## Installation and Setup
- Install the Python SDK with `pip install neo4j`
## Wrappers
### VectorStore
There exists a wrapper around the Neo4j vector index, allowing you to use it as a vectorstore,
whether for semantic search or example selection.
To import this vectorstore:
```python
from langchain.vectorstores import Neo4jVector
```
For a more detailed walkthrough of the Neo4j vector index wrapper, see [this notebook](/docs/integrations/vectorstores/neo4jvector.html).
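As a quick sketch (the connection details below are placeholders for a local instance, and `docs` is a list of LangChain documents prepared elsewhere), the store can be built straight from documents:

```python
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Neo4jVector

db = Neo4jVector.from_documents(
    docs,
    OpenAIEmbeddings(),
    url="bolt://localhost:7687",  # placeholder connection details
    username="neo4j",
    password="password",
)
docs_with_score = db.similarity_search_with_score("What is a graph database?")
```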
### GraphCypherQAChain
There exists a wrapper around the Neo4j graph database that allows you to generate Cypher statements based on the user input
and use them to retrieve relevant information from the database.
```python
from langchain.graphs import Neo4jGraph
from langchain.chains import GraphCypherQAChain
```
For a more detailed walkthrough of the Cypher generating chain, see [this notebook](/docs/extras/use_cases/more/graph/graph_cypher_qa.html).
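A minimal sketch of wiring the two together (connection details and the question are placeholders):

```python
from langchain.chat_models import ChatOpenAI
from langchain.graphs import Neo4jGraph
from langchain.chains import GraphCypherQAChain

graph = Neo4jGraph(
    url="bolt://localhost:7687", username="neo4j", password="password"  # placeholders
)
chain = GraphCypherQAChain.from_llm(ChatOpenAI(temperature=0), graph=graph, verbose=True)
chain.run("Who played in Top Gun?")
```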

View File

@ -48,10 +48,31 @@
" accepts = \"application/json\"\n",
"\n",
" def transform_input(self, inputs: list[str], model_kwargs: Dict) -> bytes:\n",
" input_str = json.dumps({\"inputs\": inputs, **model_kwargs})\n",
" \"\"\"\n",
" Transforms the input into bytes that can be consumed by SageMaker endpoint.\n",
" Args:\n",
" inputs: List of input strings.\n",
" model_kwargs: Additional keyword arguments to be passed to the endpoint.\n",
" Returns:\n",
" The transformed bytes input.\n",
" \"\"\"\n",
" # Example: inference.py expects a JSON string with a \"inputs\" key:\n",
" input_str = json.dumps({\"inputs\": inputs, **model_kwargs}) \n",
" return input_str.encode(\"utf-8\")\n",
"\n",
" def transform_output(self, output: bytes) -> List[List[float]]:\n",
" \"\"\"\n",
" Transforms the bytes output from the endpoint into a list of embeddings.\n",
" Args:\n",
" output: The bytes output from SageMaker endpoint.\n",
" Returns:\n",
" The transformed output - list of embeddings\n",
" Note:\n",
" The length of the outer list is the number of input strings.\n",
" The length of the inner lists is the embedding dimension.\n",
" \"\"\"\n",
" # Example: inference.py returns a JSON string with the list of\n",
" # embeddings in a \"vectors\" key:\n",
" response_json = json.loads(output.read().decode(\"utf-8\"))\n",
" return response_json[\"vectors\"]\n",
"\n",
@ -60,7 +81,6 @@
"\n",
"\n",
"embeddings = SagemakerEndpointEmbeddings(\n",
" # endpoint_name=\"endpoint-name\",\n",
" # credentials_profile_name=\"credentials-profile-name\",\n",
" endpoint_name=\"huggingface-pytorch-inference-2023-03-21-16-14-03-834\",\n",
" region_name=\"us-east-1\",\n",

View File

@ -5,9 +5,9 @@
"id": "245a954a",
"metadata": {},
"source": [
"# ArXiv API Tool\n",
"# ArXiv\n",
"\n",
"This notebook goes over how to use the `arxiv` component. \n",
"This notebook goes over how to use the `arxiv` tool with an agent. \n",
"\n",
"First, you need to install `arxiv` python package."
]
@ -110,7 +110,7 @@
"source": [
"## The ArXiv API Wrapper\n",
"\n",
"The tool wraps the API Wrapper. Below, we can explore some of the features it provides."
"The tool uses the `API Wrapper`. Below, we explore some of the features it provides."
]
},
{
@ -167,7 +167,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "840f70c9-8f80-4680-bb38-46198e931bcf",
"metadata": {},
@ -250,7 +249,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -1,25 +1,23 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# AWS Lambda API"
"# AWS Lambda"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook goes over how to use the AWS Lambda Tool component.\n",
">`Amazon AWS Lambda` is a serverless computing service provided by `Amazon Web Services` (`AWS`). It helps developers to build and run applications and services without provisioning or managing servers. This serverless architecture enables you to focus on writing and deploying code, while AWS automatically takes care of scaling, patching, and managing the infrastructure required to run your applications.\n",
"\n",
"AWS Lambda is a serverless computing service provided by Amazon Web Services (AWS), designed to allow developers to build and run applications and services without the need for provisioning or managing servers. This serverless architecture enables you to focus on writing and deploying code, while AWS automatically takes care of scaling, patching, and managing the infrastructure required to run your applications.\n",
"This notebook goes over how to use the `AWS Lambda` Tool.\n",
"\n",
"By including a `awslambda` in the list of tools provided to an Agent, you can grant your Agent the ability to invoke code running in your AWS Cloud for whatever purposes you need.\n",
"\n",
"When an Agent uses the awslambda tool, it will provide an argument of type string which will in turn be passed into the Lambda function via the event parameter.\n",
"When an Agent uses the `AWS Lambda` tool, it will provide an argument of type string which will in turn be passed into the Lambda function via the event parameter.\n",
"\n",
"First, you need to install `boto3` python package."
]
@ -38,7 +36,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -48,7 +45,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -98,7 +94,7 @@
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@ -112,10 +108,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
},
"orig_nbformat": 4
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@ -5,11 +5,13 @@
"id": "8f210ec3",
"metadata": {},
"source": [
"# Shell Tool\n",
"# Shell (bash)\n",
"\n",
"Giving agents access to the shell is powerful (though risky outside a sandboxed environment).\n",
"\n",
"The LLM can use it to execute any shell commands. A common use case for this is letting the LLM interact with your local file system."
"The LLM can use it to execute any shell commands. A common use case for this is letting the LLM interact with your local file system.\n",
"\n",
"**Note:** Shell tool does not work with Windows OS."
]
},
{
@ -184,7 +186,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.16"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -1,12 +1,12 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# DataForSeo API Wrapper\n",
"This notebook demonstrates how to use the DataForSeo API wrapper to obtain search engine results. The DataForSeo API allows users to retrieve SERP from most popular search engines like Google, Bing, Yahoo. It also allows to get SERPs from different search engine types like Maps, News, Events, etc.\n"
"# DataForSeo\n",
"\n",
"This notebook demonstrates how to use the `DataForSeo API` to obtain search engine results. The `DataForSeo API` retrieves `SERP` from most popular search engines like `Google`, `Bing`, `Yahoo`. It also allows to get SERPs from different search engine types like `Maps`, `News`, `Events`, etc.\n"
]
},
{
@ -19,12 +19,12 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setting up the API wrapper with your credentials\n",
"You can obtain your API credentials by registering on the DataForSeo website."
"## Setting up the API credentials\n",
"\n",
"You can obtain your API credentials by registering on the `DataForSeo` website."
]
},
{
@ -42,7 +42,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -59,7 +58,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -72,7 +70,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -103,7 +100,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -127,7 +123,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -151,7 +146,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -178,7 +172,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -214,7 +207,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@ -228,10 +221,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
},
"orig_nbformat": 4
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@ -4,11 +4,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# File System Tools\n",
"# File System\n",
"\n",
"LangChain provides tools for interacting with a local file system out of the box. This notebook walks through some of them.\n",
"\n",
"Note: these tools are not recommended for use outside a sandboxed environment! "
"**Note:** these tools are not recommended for use outside a sandboxed environment! "
]
},
{
@ -187,7 +187,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -5,32 +5,35 @@
"id": "dc23c48e",
"metadata": {},
"source": [
"# Google Serper API\n",
"# Google Serper\n",
"\n",
"This notebook goes over how to use the Google Serper component to search the web. First you need to sign up for a free account at [serper.dev](https://serper.dev) and get your api key."
"This notebook goes over how to use the `Google Serper` component to search the web. First you need to sign up for a free account at [serper.dev](https://serper.dev) and get your api key."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "a8acfb24",
"metadata": {
"ExecuteTime": {
"end_time": "2023-05-04T00:56:29.336521Z",
"start_time": "2023-05-04T00:56:29.334173Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
},
"pycharm": {
"is_executing": true
}
},
"outputs": [],
"source": [
"import os\n",
"import pprint\n",
"\n",
"os.environ[\"SERPER_API_KEY\"] = \"\""
],
"metadata": {
"collapsed": false,
"pycharm": {
"is_executing": true
},
"ExecuteTime": {
"end_time": "2023-05-04T00:56:29.336521Z",
"start_time": "2023-05-04T00:56:29.334173Z"
}
},
"id": "a8acfb24"
]
},
{
"cell_type": "code",
@ -75,7 +78,9 @@
"outputs": [
{
"data": {
"text/plain": "'Barack Hussein Obama II'"
"text/plain": [
"'Barack Hussein Obama II'"
]
},
"execution_count": 4,
"metadata": {},
@ -88,33 +93,41 @@
},
{
"cell_type": "markdown",
"id": "1f1c6c22",
"metadata": {},
"source": [
"## As part of a Self Ask With Search Chain"
],
"metadata": {
"collapsed": false
},
"id": "1f1c6c22"
]
},
{
"cell_type": "code",
"execution_count": 5,
"outputs": [],
"source": [
"os.environ[\"OPENAI_API_KEY\"] = \"\""
],
"id": "c1b5edd7",
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-05-04T00:54:14.311773Z",
"start_time": "2023-05-04T00:54:14.304389Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"id": "c1b5edd7"
"outputs": [],
"source": [
"os.environ[\"OPENAI_API_KEY\"] = \"\""
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "a8ccea61",
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"name": "stdout",
@ -135,7 +148,9 @@
},
{
"data": {
"text/plain": "'El Palmar, Spain'"
"text/plain": [
"'El Palmar, Spain'"
]
},
"execution_count": 5,
"metadata": {},
@ -164,26 +179,34 @@
"self_ask_with_search.run(\n",
" \"What is the hometown of the reigning men's U.S. Open champion?\"\n",
")"
],
"metadata": {
"collapsed": false
},
"id": "a8ccea61"
]
},
{
"cell_type": "markdown",
"id": "3aee3682",
"metadata": {},
"source": [
"## Obtaining results with metadata\n",
"If you would also like to obtain the results in a structured way including metadata. For this we will be using the `results` method of the wrapper."
],
"metadata": {
"collapsed": false
},
"id": "3aee3682"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "073c3fc5",
"metadata": {
"ExecuteTime": {
"end_time": "2023-05-04T00:54:22.863413Z",
"start_time": "2023-05-04T00:54:20.827395Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
},
"pycharm": {
"is_executing": true
}
},
"outputs": [
{
"name": "stdout",
@ -344,33 +367,31 @@
"search = GoogleSerperAPIWrapper()\n",
"results = search.results(\"Apple Inc.\")\n",
"pprint.pp(results)"
],
"metadata": {
"collapsed": false,
"pycharm": {
"is_executing": true
},
"ExecuteTime": {
"end_time": "2023-05-04T00:54:22.863413Z",
"start_time": "2023-05-04T00:54:20.827395Z"
}
},
"id": "073c3fc5"
]
},
{
"cell_type": "markdown",
"id": "b402c308",
"metadata": {},
"source": [
"## Searching for Google Images\n",
"We can also query Google Images using this wrapper. For example:"
],
"metadata": {
"collapsed": false
},
"id": "b402c308"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "7fb2b7e2",
"metadata": {
"ExecuteTime": {
"end_time": "2023-05-04T00:54:27.879867Z",
"start_time": "2023-05-04T00:54:26.380022Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"name": "stdout",
@ -501,30 +522,31 @@
"search = GoogleSerperAPIWrapper(type=\"images\")\n",
"results = search.results(\"Lion\")\n",
"pprint.pp(results)"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-05-04T00:54:27.879867Z",
"start_time": "2023-05-04T00:54:26.380022Z"
}
},
"id": "7fb2b7e2"
]
},
{
"cell_type": "markdown",
"id": "85a3bed3",
"metadata": {},
"source": [
"## Searching for Google News\n",
"We can also query Google News using this wrapper. For example:"
],
"metadata": {
"collapsed": false
},
"id": "85a3bed3"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "afc48b39",
"metadata": {
"ExecuteTime": {
"end_time": "2023-05-04T00:54:34.984087Z",
"start_time": "2023-05-04T00:54:33.369231Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"name": "stdout",
@ -630,29 +652,30 @@
"search = GoogleSerperAPIWrapper(type=\"news\")\n",
"results = search.results(\"Tesla Inc.\")\n",
"pprint.pp(results)"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-05-04T00:54:34.984087Z",
"start_time": "2023-05-04T00:54:33.369231Z"
}
},
"id": "afc48b39"
]
},
{
"cell_type": "markdown",
"id": "d42ee7b5",
"metadata": {},
"source": [
"If you want to only receive news articles published in the last hour, you can do the following:"
],
"metadata": {
"collapsed": false
},
"id": "d42ee7b5"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "8e3824cb",
"metadata": {
"ExecuteTime": {
"end_time": "2023-05-04T00:54:41.786864Z",
"start_time": "2023-05-04T00:54:40.691905Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"name": "stdout",
@ -701,18 +724,12 @@
"search = GoogleSerperAPIWrapper(type=\"news\", tbs=\"qdr:h\")\n",
"results = search.results(\"Tesla Inc.\")\n",
"pprint.pp(results)"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-05-04T00:54:41.786864Z",
"start_time": "2023-05-04T00:54:40.691905Z"
}
},
"id": "8e3824cb"
]
},
{
"cell_type": "markdown",
"id": "3f13e9f9",
"metadata": {},
"source": [
"Some examples of the `tbs` parameter:\n",
"\n",
@ -730,26 +747,31 @@
"`qdr:m2` (past 2 years)\n",
"\n",
"For all supported filters simply go to [Google Search](https://google.com), search for something, click on \"Tools\", add your date filter and check the URL for \"tbs=\".\n"
],
"metadata": {
"collapsed": false
},
"id": "3f13e9f9"
]
},
{
"cell_type": "markdown",
"id": "38d4402c",
"metadata": {},
"source": [
"## Searching for Google Places\n",
"We can also query Google Places using this wrapper. For example:"
],
"metadata": {
"collapsed": false
},
"id": "38d4402c"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "e7881203",
"metadata": {
"ExecuteTime": {
"end_time": "2023-05-04T00:56:07.271164Z",
"start_time": "2023-05-04T00:56:05.645847Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"name": "stdout",
@ -858,15 +880,7 @@
"search = GoogleSerperAPIWrapper(type=\"places\")\n",
"results = search.results(\"Italian restaurants in Upper East Side\")\n",
"pprint.pp(results)"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-05-04T00:56:07.271164Z",
"start_time": "2023-05-04T00:56:05.645847Z"
}
},
"id": "e7881203"
]
}
],
"metadata": {
@ -885,9 +899,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
}

View File

@ -5,11 +5,11 @@
"id": "c613812f",
"metadata": {},
"source": [
"# Gradio Tools\n",
"# Gradio\n",
"\n",
"There are many 1000s of Gradio apps on Hugging Face Spaces. This library puts them at the tips of your LLM's fingers 🦾\n",
"There are many 1000s of `Gradio` apps on `Hugging Face Spaces`. This library puts them at the tips of your LLM's fingers 🦾\n",
"\n",
"Specifically, gradio-tools is a Python library for converting Gradio apps into tools that can be leveraged by a large language model (LLM)-based agent to complete its task. For example, an LLM could use a Gradio tool to transcribe a voice recording it finds online and then summarize it for you. Or it could use a different Gradio tool to apply OCR to a document on your Google Drive and then answer questions about it.\n",
"Specifically, `gradio-tools` is a Python library for converting `Gradio` apps into tools that can be leveraged by a large language model (LLM)-based agent to complete its task. For example, an LLM could use a `Gradio` tool to transcribe a voice recording it finds online and then summarize it for you. Or it could use a different `Gradio` tool to apply OCR to a document on your Google Drive and then answer questions about it.\n",
"\n",
"It's very easy to create you own tool if you want to use a space that's not one of the pre-built tools. Please see this section of the gradio-tools documentation for information on how to do that. All contributions are welcome!"
]
@ -99,9 +99,7 @@
"cell_type": "code",
"execution_count": 13,
"id": "98e1e602",
"metadata": {
"scrolled": false
},
"metadata": {},
"outputs": [
{
"data": {
@ -244,7 +242,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -4,17 +4,17 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# GraphQL\n",
"\n",
"# GraphQL tool\n",
"This Jupyter Notebook demonstrates how to use the BaseGraphQLTool component with an Agent.\n",
">[GraphQL](https://graphql.org/) is a query language for APIs and a runtime for executing those queries against your data. `GraphQL` provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools.\n",
"\n",
"GraphQL is a query language for APIs and a runtime for executing those queries against your data. GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools.\n",
"By including a `BaseGraphQLTool` in the list of tools provided to an Agent, you can grant your Agent the ability to query data from GraphQL APIs for any purposes you need.\n",
"\n",
"By including a BaseGraphQLTool in the list of tools provided to an Agent, you can grant your Agent the ability to query data from GraphQL APIs for any purposes you need.\n",
"This Jupyter Notebook demonstrates how to use the `GraphQLAPIWrapper` component with an Agent.\n",
"\n",
"In this example, we'll be using the public Star Wars GraphQL API available at the following endpoint: https://swapi-graphql.netlify.app/.netlify/functions/index.\n",
"In this example, we'll be using the public `Star Wars GraphQL API` available at the following endpoint: https://swapi-graphql.netlify.app/.netlify/functions/index.\n",
"\n",
"First, you need to install httpx and gql Python packages."
"First, you need to install `httpx` and `gql` Python packages."
]
},
{
@ -131,7 +131,7 @@
"hash": "f85209c3c4c190dca7367d6a1e623da50a9a4392fd53313a7cf9d4bda9c4b85b"
},
"kernelspec": {
"display_name": "Python 3.9.16 ('langchain')",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@ -145,10 +145,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
},
"orig_nbformat": 4
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@ -5,9 +5,9 @@
"id": "40a27d3c-4e5c-4b96-b290-4c49d4fd7219",
"metadata": {},
"source": [
"## HuggingFace Tools\n",
"# HuggingFace Hub Tools\n",
"\n",
"[Huggingface Tools](https://huggingface.co/docs/transformers/v4.29.0/en/custom_tools) supporting text I/O can be\n",
">[Huggingface Tools](https://huggingface.co/docs/transformers/v4.29.0/en/custom_tools) that supporting text I/O can be\n",
"loaded directly using the `load_huggingface_tool` function."
]
},
@ -94,7 +94,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -1,24 +1,23 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "16763ed3",
"metadata": {},
"source": [
"# Lemon AI NLP Workflow Automation\n",
"\\\n",
"Full docs are available at: https://github.com/felixbrock/lemonai-py-client\n",
"# Lemon Agent\n",
"\n",
">[Lemon Agent](https://github.com/felixbrock/lemon-agent) helps you build powerful AI assistants in minutes and automate workflows by allowing for accurate and reliable read and write operations in tools like `Airtable`, `Hubspot`, `Discord`, `Notion`, `Slack` and `Github`.\n",
"\n",
"See [full docs here](https://github.com/felixbrock/lemonai-py-client).\n",
"\n",
"**Lemon AI helps you build powerful AI assistants in minutes and automate workflows by allowing for accurate and reliable read and write operations in tools like Airtable, Hubspot, Discord, Notion, Slack and Github.**\n",
"\n",
"Most connectors available today are focused on read-only operations, limiting the potential of LLMs. Agents, on the other hand, have a tendency to hallucinate from time to time due to missing context or instructions.\n",
"\n",
"With Lemon AI, it is possible to give your agents access to well-defined APIs for reliable read and write operations. In addition, Lemon AI functions allow you to further reduce the risk of hallucinations by providing a way to statically define workflows that the model can rely on in case of uncertainty."
"With `Lemon AI`, it is possible to give your agents access to well-defined APIs for reliable read and write operations. In addition, `Lemon AI` functions allow you to further reduce the risk of hallucinations by providing a way to statically define workflows that the model can rely on in case of uncertainty."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "4881b484-1b97-478f-b206-aec407ceff66",
"metadata": {},
@ -29,7 +28,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "ff91b41a",
"metadata": {},
@ -46,7 +44,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "340ff63d",
"metadata": {},
@ -57,7 +54,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "e845f402",
"metadata": {},
@ -66,7 +62,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "d3ae6a82",
"metadata": {},
@ -75,7 +70,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "43476a22",
"metadata": {},
@ -84,7 +78,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "cb038670",
"metadata": {},
@ -93,7 +86,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "e423ebbb",
"metadata": {},
@ -110,7 +102,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "3fdb36ce",
"metadata": {},
@ -119,7 +110,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "ebfb8b5d",
"metadata": {},
@ -140,7 +130,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "c9d082cb",
"metadata": {},
@ -189,7 +178,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "aef3e801",
"metadata": {},
@ -225,7 +213,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -1,17 +1,16 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Nuclia Understanding API tool\n",
"# Nuclia Understanding\n",
"\n",
"[Nuclia](https://nuclia.com) automatically indexes your unstructured data from any internal and external source, providing optimized search results and generative answers. It can handle video and audio transcription, image content extraction, and document parsing.\n",
">[Nuclia](https://nuclia.com) automatically indexes your unstructured data from any internal and external source, providing optimized search results and generative answers. It can handle video and audio transcription, image content extraction, and document parsing.\n",
"\n",
"The Nuclia Understanding API supports the processing of unstructured data, including text, web pages, documents, and audio/video contents. It extracts all texts wherever it is (using speech-to-text or OCR when needed), it identifies entities, it aslo extracts metadata, embedded files (like images in a PDF), and web links. It also provides a summary of the content.\n",
"The `Nuclia Understanding API` supports the processing of unstructured data, including text, web pages, documents, and audio/video contents. It extracts all texts wherever it is (using speech-to-text or OCR when needed), it identifies entities, it aslo extracts metadata, embedded files (like images in a PDF), and web links. It also provides a summary of the content.\n",
"\n",
"To use the Nuclia Understanding API, you need to have a Nuclia account. You can create one for free at [https://nuclia.cloud](https://nuclia.cloud), and then [create a NUA key](https://docs.nuclia.dev/docs/docs/using/understanding/intro)."
"To use the `Nuclia Understanding API`, you need to have a `Nuclia` account. You can create one for free at [https://nuclia.cloud](https://nuclia.cloud), and then [create a NUA key](https://docs.nuclia.dev/docs/docs/using/understanding/intro)."
]
},
{
@ -48,7 +47,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -66,7 +64,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -94,7 +91,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -121,7 +117,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -150,7 +145,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "langchain",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@ -164,10 +159,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.5"
},
"orig_nbformat": 4
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@ -5,11 +5,11 @@
"id": "245a954a",
"metadata": {},
"source": [
"# OpenWeatherMap API\n",
"# OpenWeatherMap\n",
"\n",
"This notebook goes over how to use the OpenWeatherMap component to fetch weather information.\n",
"This notebook goes over how to use the `OpenWeatherMap` component to fetch weather information.\n",
"\n",
"First, you need to sign up for an OpenWeatherMap API key:\n",
"First, you need to sign up for an `OpenWeatherMap API` key:\n",
"\n",
"1. Go to OpenWeatherMap and sign up for an API key [here](https://openweathermap.org/api/)\n",
"2. pip install pyowm\n",
@ -162,7 +162,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.16"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -5,11 +5,11 @@
"id": "64f20f38",
"metadata": {},
"source": [
"# PubMed Tool\n",
"# PubMed\n",
"\n",
"This notebook goes over how to use PubMed as a tool\n",
">[PubMed®](https://pubmed.ncbi.nlm.nih.gov/) comprises more than 35 million citations for biomedical literature from `MEDLINE`, life science journals, and online books. Citations may include links to full text content from PubMed Central and publisher web sites.\n",
"\n",
"PubMed® comprises more than 35 million citations for biomedical literature from MEDLINE, life science journals, and online books. Citations may include links to full text content from PubMed Central and publisher web sites."
"This notebook goes over how to use `PubMed` as a tool."
]
},
{
@ -78,7 +78,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -6,11 +6,11 @@
"jukit_cell_id": "DUXgyWySl5"
},
"source": [
"# SearxNG Search API\n",
"# SearxNG Search\n",
"\n",
"This notebook goes over how to use a self hosted SearxNG search API to search the web.\n",
"This notebook goes over how to use a self hosted `SearxNG` search API to search the web.\n",
"\n",
"You can [check this link](https://docs.searxng.org/dev/search_api.html) for more informations about Searx API parameters."
"You can [check this link](https://docs.searxng.org/dev/search_api.html) for more informations about `Searx API` parameters."
]
},
{
@ -611,7 +611,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -5,9 +5,9 @@
"id": "acb64858",
"metadata": {},
"source": [
"# YouTubeSearchTool\n",
"# YouTube (youtube_search)\n",
"\n",
"This notebook shows how to use a tool to search YouTube\n",
"This notebook shows how to use a tool to search `YouTube` using `youtube_search` package.\n",
"\n",
"Adapted from [https://github.com/venuv/langchain_yt_tools](https://github.com/venuv/langchain_yt_tools)"
]
@ -117,7 +117,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -5,15 +5,12 @@
"id": "16763ed3",
"metadata": {},
"source": [
"# Zapier Natural Language Actions API\n",
"\\\n",
"Full docs here: https://nla.zapier.com/start/\n",
"# Zapier Natural Language Actions\n",
"\n",
"**Zapier Natural Language Actions** gives you access to the 5k+ apps, 20k+ actions on Zapier's platform through a natural language API interface.\n",
"\n",
"NLA supports apps like Gmail, Salesforce, Trello, Slack, Asana, HubSpot, Google Sheets, Microsoft Teams, and thousands more apps: https://zapier.com/apps\n",
"\n",
"Zapier NLA handles ALL the underlying API auth and translation from natural language --> underlying API call --> return simplified output for LLMs. The key idea is you, or your users, expose a set of actions via an oauth-like setup window, which you can then query and execute via a REST API.\n",
">[Zapier Natural Language Actions](https://nla.zapier.com/start/) gives you access to the 5k+ apps, 20k+ actions on Zapier's platform through a natural language API interface.\n",
">\n",
">NLA supports apps like `Gmail`, `Salesforce`, `Trello`, `Slack`, `Asana`, `HubSpot`, `Google Sheets`, `Microsoft Teams`, and thousands more apps: https://zapier.com/apps\n",
">`Zapier NLA` handles ALL the underlying API auth and translation from natural language --> underlying API call --> return simplified output for LLMs. The key idea is you, or your users, expose a set of actions via an oauth-like setup window, which you can then query and execute via a REST API.\n",
"\n",
"NLA offers both API Key and OAuth for signing NLA API requests.\n",
"\n",
@ -21,7 +18,7 @@
"\n",
"2. User-facing (Oauth): for production scenarios where you are deploying an end-user facing application and LangChain needs access to end-user's exposed actions and connected accounts on Zapier.com\n",
"\n",
"This quick start will focus mostly on the server-side use case for brevity. Jump to [Example Using OAuth Access Token](#oauth) to see a short example how to set up Zapier for user-facing situations. Review [full docs](https://nla.zapier.com/start/) for full user-facing oauth developer support.\n",
"This quick start focus mostly on the server-side use case for brevity. Jump to [Example Using OAuth Access Token](#oauth) to see a short example how to set up Zapier for user-facing situations. Review [full docs](https://nla.zapier.com/start/) for full user-facing oauth developer support.\n",
"\n",
"This example goes over how to use the Zapier integration with a `SimpleSequentialChain`, then an `Agent`.\n",
"In code, below:"
@ -369,7 +366,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -0,0 +1,440 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Neo4j Vector Index\n",
"\n",
">[Neo4j](https://neo4j.com/) is an open-source graph database with integrated support for vector similarity search\n",
"\n",
"It supports:\n",
"- approximate nearest neighbor search\n",
"- L2 distance and cosine distance\n",
"\n",
"This notebook shows how to use the Neo4j vector index (`Neo4jVector`)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"See the [installation instruction](https://neo4j.com/docs/operations-manual/current/installation/)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: neo4j in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (5.11.0)\n",
"Requirement already satisfied: pytz in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from neo4j) (2023.3)\n",
"Requirement already satisfied: openai in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (0.27.6)\n",
"Requirement already satisfied: requests>=2.20 in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from openai) (2.31.0)\n",
"Requirement already satisfied: tqdm in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from openai) (4.66.1)\n",
"Requirement already satisfied: aiohttp in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from openai) (3.8.5)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from requests>=2.20->openai) (3.2.0)\n",
"Requirement already satisfied: idna<4,>=2.5 in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from requests>=2.20->openai) (3.4)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from requests>=2.20->openai) (2.0.4)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from requests>=2.20->openai) (2023.7.22)\n",
"Requirement already satisfied: attrs>=17.3.0 in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from aiohttp->openai) (23.1.0)\n",
"Requirement already satisfied: multidict<7.0,>=4.5 in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from aiohttp->openai) (6.0.4)\n",
"Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from aiohttp->openai) (4.0.3)\n",
"Requirement already satisfied: yarl<2.0,>=1.0 in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from aiohttp->openai) (1.9.2)\n",
"Requirement already satisfied: frozenlist>=1.1.1 in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from aiohttp->openai) (1.4.0)\n",
"Requirement already satisfied: aiosignal>=1.1.2 in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from aiohttp->openai) (1.3.1)\n",
"Requirement already satisfied: tiktoken in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (0.4.0)\n",
"Requirement already satisfied: regex>=2022.1.18 in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from tiktoken) (2023.8.8)\n",
"Requirement already satisfied: requests>=2.26.0 in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from tiktoken) (2.31.0)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from requests>=2.26.0->tiktoken) (3.2.0)\n",
"Requirement already satisfied: idna<4,>=2.5 in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from requests>=2.26.0->tiktoken) (3.4)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from requests>=2.26.0->tiktoken) (2.0.4)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /home/tomaz/anaconda3/envs/myenv/lib/python3.11/site-packages (from requests>=2.26.0->tiktoken) (2023.7.22)\n"
]
}
],
"source": [
"# Pip install necessary package\n",
"!pip install neo4j\n",
"!pip install openai\n",
"!pip install tiktoken"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"OpenAI API Key: ········\n"
]
}
],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import Neo4jVector\n",
"from langchain.document_loaders import TextLoader\n",
"from langchain.docstore.document import Document"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"loader = TextLoader(\"../../../state_of_the_union.txt\")\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# Neo4jVector requires the Neo4j database credentials\n",
"\n",
"url = \"bolt://localhost:7687\"\n",
"username = \"neo4j\"\n",
"password = \"pleaseletmein\"\n",
"\n",
"# You can also use environment variables instead of directly passing named parameters\n",
"#os.environ[\"NEO4J_URL\"] = \"bolt://localhost:7687\"\n",
"#os.environ[\"NEO4J_USERNAME\"] = \"neo4j\"\n",
"#os.environ[\"NEO4J_PASSWORD\"] = \"pleaseletmein\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Similarity Search with Cosine Distance (Default)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"# The Neo4jVector Module will connect to Neo4j and create a vector index if needed.\n",
"\n",
"db = Neo4jVector.from_documents(\n",
" docs, OpenAIEmbeddings(), url=url, username=username, password=password\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs_with_score = db.similarity_search_with_score(query)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--------------------------------------------------------------------------------\n",
"Score: 0.9077161550521851\n",
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n",
"--------------------------------------------------------------------------------\n",
"--------------------------------------------------------------------------------\n",
"Score: 0.9077161550521851\n",
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n",
"--------------------------------------------------------------------------------\n",
"--------------------------------------------------------------------------------\n",
"Score: 0.891287088394165\n",
"A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n",
"\n",
"And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \n",
"\n",
"We can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \n",
"\n",
"Weve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \n",
"\n",
"Were putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \n",
"\n",
"Were securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.\n",
"--------------------------------------------------------------------------------\n",
"--------------------------------------------------------------------------------\n",
"Score: 0.891287088394165\n",
"A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n",
"\n",
"And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \n",
"\n",
"We can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling. \n",
"\n",
"Weve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \n",
"\n",
"Were putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \n",
"\n",
"Were securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.\n",
"--------------------------------------------------------------------------------\n"
]
}
],
"source": [
"for doc, score in docs_with_score:\n",
" print(\"-\" * 80)\n",
" print(\"Score: \", score)\n",
" print(doc.page_content)\n",
" print(\"-\" * 80)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Working with vectorstore\n",
"\n",
"Above, we created a vectorstore from scratch. However, often times we want to work with an existing vectorstore.\n",
"In order to do that, we can initialize it directly."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"index_name = \"vector\" # default index name\n",
"\n",
"store = Neo4jVector.from_existing_index(\n",
" OpenAIEmbeddings(),\n",
" url=url,\n",
" username=username,\n",
" password=password,\n",
" index_name=index_name,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Add documents\n",
"We can add documents to the existing vectorstore."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['2f70679a-4416-11ee-b7c3-d46a6aa24f5b']"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"store.add_documents([Document(page_content=\"foo\")])"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"docs_with_score = store.similarity_search_with_score(\"foo\")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"(Document(page_content='foo', metadata={}), 1.0)"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs_with_score[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Retriever options\n",
"\n",
"This section shows how to use `Neo4jVector` as a retriever."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': '../../modules/state_of_the_union.txt'})"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"retriever = store.as_retriever()\n",
"retriever.get_relevant_documents(query)[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Question Answering with Sources\n",
"\n",
"This section goes over how to do question-answering with sources over an Index. It does this by using the `RetrievalQAWithSourcesChain`, which does the lookup of the documents from an Index. "
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains import RetrievalQAWithSourcesChain\n",
"from langchain.chat_models import ChatOpenAI"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"chain = RetrievalQAWithSourcesChain.from_chain_type(\n",
" ChatOpenAI(temperature=0), chain_type=\"stuff\", retriever=retriever\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'answer': \"The president honored Justice Stephen Breyer, who is retiring from the United States Supreme Court, and thanked him for his service. The president also mentioned that he nominated Circuit Court of Appeals Judge Ketanji Brown Jackson to continue Justice Breyer's legacy of excellence. \\n\",\n",
" 'sources': '../../modules/state_of_the_union.txt'}"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain(\n",
" {\"question\": \"What did the president say about Justice Breyer\"},\n",
" return_only_outputs=True,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@ -1,7 +1,6 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "683953b3",
"metadata": {},
@ -60,7 +59,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "eeead681",
"metadata": {},
@ -73,7 +71,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 1,
"id": "04a1f1a0",
"metadata": {},
"outputs": [],
@ -86,12 +84,12 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"id": "be0a4973",
"metadata": {},
"outputs": [],
"source": [
"loader = TextLoader(\"../../../state_of_the_union.txt\")\n",
"loader = TextLoader(\"../../modules/state_of_the_union.txt\")\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)"
@ -99,7 +97,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"id": "8429667e",
"metadata": {
"ExecuteTime": {
@ -118,7 +116,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "90dbf3e7",
"metadata": {},
@ -133,7 +130,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 4,
"id": "85ef3468",
"metadata": {},
"outputs": [],
@ -165,7 +162,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "1f9215c8",
"metadata": {
@ -182,7 +178,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 5,
"id": "a8c513ab",
"metadata": {
"ExecuteTime": {
@ -201,7 +197,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 6,
"id": "fc516993",
"metadata": {
"ExecuteTime": {
@ -215,13 +211,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n"
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson.\n"
]
}
],
@ -230,7 +220,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "1bda9bf5",
"metadata": {},
@ -242,7 +231,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 7,
"id": "8804a21d",
"metadata": {
"ExecuteTime": {
@ -254,13 +243,13 @@
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"found_docs = vectara.similarity_search_with_score(\n",
" query, filter=\"doc.speech = 'state-of-the-union'\"\n",
" query, filter=\"doc.speech = 'state-of-the-union'\", score_threshold=0.2,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 8,
"id": "756a6887",
"metadata": {
"ExecuteTime": {
@ -273,15 +262,9 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"Justice Breyer, thank you for your service. One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence. A former top litigator in private practice.\n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n",
"\n",
"Score: 0.4917977\n"
"Score: 0.786569\n"
]
}
],
@ -292,7 +275,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "1f9876a8",
"metadata": {},
@ -302,7 +284,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 9,
"id": "47784de5",
"metadata": {},
"outputs": [
@ -310,22 +292,43 @@
"name": "stdout",
"output_type": "stream",
"text": [
"(Document(page_content='We must forever conduct our struggle on the high plane of dignity and discipline.', metadata={'section': '1'}), 0.7962591)\n",
"(Document(page_content='We must not allow our\\ncreative protests to degenerate into physical violence. . . .', metadata={'section': '1'}), 0.25983918)\n"
"With this threshold of 1.2 we have 0 documents\n"
]
}
],
"source": [
"query = \"We must forever conduct our struggle\"\n",
"min_score = 1.2\n",
"found_docs = vectara.similarity_search_with_score(\n",
" query, filter=\"doc.speech = 'I-have-a-dream'\"\n",
" query, filter=\"doc.speech = 'I-have-a-dream'\", score_threshold=min_score,\n",
")\n",
"print(found_docs[0])\n",
"print(found_docs[1])"
"print(f\"With this threshold of {min_score} we have {len(found_docs)} documents\")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "3e22949f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"With this threshold of 0.2 we have 3 documents\n"
]
}
],
"source": [
"query = \"We must forever conduct our struggle\"\n",
"min_score = 0.2\n",
"found_docs = vectara.similarity_search_with_score(\n",
" query, filter=\"doc.speech = 'I-have-a-dream'\", score_threshold=min_score,\n",
")\n",
"print(f\"With this threshold of {min_score} we have {len(found_docs)} documents\")\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "691a82d6",
"metadata": {},
@ -349,7 +352,7 @@
{
"data": {
"text/plain": [
"VectaraRetriever(vectorstore=<langchain.vectorstores.vectara.Vectara object at 0x12772caf0>, search_type='similarity', search_kwargs={'lambda_val': 0.025, 'k': 5, 'filter': '', 'n_sentence_context': '0'})"
"VectaraRetriever(tags=['Vectara'], metadata=None, vectorstore=<langchain.vectorstores.vectara.Vectara object at 0x1586bd330>, search_type='similarity', search_kwargs={'lambda_val': 0.025, 'k': 5, 'filter': '', 'n_sentence_context': '2'})"
]
},
"execution_count": 11,
@ -376,7 +379,7 @@
{
"data": {
"text/plain": [
"Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'})"
"Document(page_content='Justice Breyer, thank you for your service. One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence. A former top litigator in private practice.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '596', 'len': '97', 'speech': 'state-of-the-union'})"
]
},
"execution_count": 12,
@ -414,7 +417,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.9"
}
},
"nbformat": 4,

View File

@ -52,7 +52,7 @@
},
"outputs": [],
"source": [
"!pip install xata==1.0.0a7 openai tiktoken langchain"
"!pip install xata openai tiktoken langchain"
]
},
{

View File

@ -7,9 +7,11 @@
"tags": []
},
"source": [
"# How to add Memory to an LLMChain\n",
"# Memory in LLMChain\n",
"\n",
"This notebook goes over how to use the Memory class with an LLMChain. For the purposes of this walkthrough, we will add the [ConversationBufferMemory](https://api.python.langchain.com/en/latest/memory/langchain.memory.buffer.ConversationBufferMemory.html#langchain.memory.buffer.ConversationBufferMemory) class, although this can be any memory class."
"This notebook goes over how to use the Memory class with an LLMChain. \n",
"\n",
"We will add the [ConversationBufferMemory](https://api.python.langchain.com/en/latest/memory/langchain.memory.buffer.ConversationBufferMemory.html#langchain.memory.buffer.ConversationBufferMemory) class, although this can be any memory class."
]
},
{
@ -321,7 +323,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -5,9 +5,9 @@
"id": "e42733c5",
"metadata": {},
"source": [
"# How to add memory to a Multi-Input Chain\n",
"# Memory in the Multi-Input Chain\n",
"\n",
"Most memory objects assume a single input. In this notebook, we go over how to add memory to a chain that has multiple inputs. As an example of such a chain, we will add memory to a question/answering chain. This chain takes as inputs both related documents and a user question."
"Most memory objects assume a single input. In this notebook, we go over how to add memory to a chain that has multiple inputs. We will add memory to a question/answering chain. This chain takes as inputs both related documents and a user question."
]
},
{
@ -178,7 +178,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -5,11 +5,11 @@
"id": "fa6802ac",
"metadata": {},
"source": [
"# How to add Memory to an Agent\n",
"# Memory in Agent\n",
"\n",
"This notebook goes over adding memory to an Agent. Before going through this notebook, please walkthrough the following notebooks, as this will build on top of both of them:\n",
"\n",
"- [Adding memory to an LLM Chain](/docs/modules/memory/how_to/adding_memory.html)\n",
"- [Memory in LLMChain](/docs/modules/memory/how_to/adding_memory.html)\n",
"- [Custom Agents](/docs/modules/agents/how_to/custom_agent.html)\n",
"\n",
"In order to add a memory to an agent we are going to the the following steps:\n",
@ -317,7 +317,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -5,13 +5,13 @@
"id": "fa6802ac",
"metadata": {},
"source": [
"# Adding Message Memory backed by a database to an Agent\n",
"# Message Memory in Agent backed by a database\n",
"\n",
"This notebook goes over adding memory to an Agent where the memory uses an external message store. Before going through this notebook, please walkthrough the following notebooks, as this will build on top of both of them:\n",
"\n",
"- [Adding memory to an LLM Chain](/docs/modules/memory/how_to/adding_memory.html)\n",
"- [Memory in LLMChain](/docs/modules/memory/how_to/adding_memory.html)\n",
"- [Custom Agents](/docs/modules/agents/how_to/custom_agent.html)\n",
"- [Agent with Memory](/docs/modules/memory/how_to/agent_with_memory.html)\n",
"- [Memory in Agent](/docs/modules/memory/how_to/agent_with_memory.html)\n",
"\n",
"In order to add a memory with an external message store to an agent we are going to do the following steps:\n",
"\n",
@ -348,7 +348,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -5,7 +5,7 @@
"id": "69e35d6f",
"metadata": {},
"source": [
"# How to customize conversational memory\n",
"# Customizing Conversational Memory\n",
"\n",
"This notebook walks through a few ways to customize conversational memory."
]
@ -373,7 +373,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -5,7 +5,8 @@
"id": "94e33ebe",
"metadata": {},
"source": [
"# How to create a custom Memory class\n",
"# Custom Memory\n",
"\n",
"Although there are a few predefined types of memory in LangChain, it is highly possible you will want to add your own type of memory that is optimal for your application. This notebook covers how to do that."
]
},
@ -295,7 +296,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -5,8 +5,9 @@
"id": "d9fec22e",
"metadata": {},
"source": [
"# How to use multiple memory classes in the same chain\n",
"It is also possible to use multiple memory classes in the same chain. To combine multiple memory classes, we can initialize the `CombinedMemory` class, and then use that."
"# Multiple Memory classes\n",
"\n",
"We can use multiple memory classes in the same chain. To combine multiple memory classes, we initialize and use the `CombinedMemory` class."
]
},
{
@ -158,7 +159,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -5,11 +5,17 @@
"id": "44c9933a",
"metadata": {},
"source": [
"# Conversation Knowledge Graph Memory\n",
"# Conversation Knowledge Graph\n",
"\n",
"This type of memory uses a knowledge graph to recreate memory.\n",
"\n",
"Let's first walk through how to use the utilities"
"This type of memory uses a knowledge graph to recreate memory.\n"
]
},
{
"cell_type": "markdown",
"id": "0c798006-ca04-4de3-83eb-cf167fb2bd01",
"metadata": {},
"source": [
"## Using memory with LLM"
]
},
{
@ -162,6 +168,7 @@
"metadata": {},
"source": [
"## Using in a chain\n",
"\n",
"Let's now use this in a chain!"
]
},
@ -348,7 +355,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -5,13 +5,22 @@
"id": "ff4be5f3",
"metadata": {},
"source": [
"# ConversationSummaryBufferMemory\n",
"# Conversation Summary Buffer\n",
"\n",
"`ConversationSummaryBufferMemory` combines the last two ideas. It keeps a buffer of recent interactions in memory, but rather than just completely flushing old interactions it compiles them into a summary and uses both. Unlike the previous implementation though, it uses token length rather than number of interactions to determine when to flush interactions.\n",
"`ConversationSummaryBufferMemory` combines the two ideas. It keeps a buffer of recent interactions in memory, but rather than just completely flushing old interactions it compiles them into a summary and uses both. \n",
"It uses token length rather than number of interactions to determine when to flush interactions.\n",
"\n",
"Let's first walk through how to use the utilities"
]
},
{
"cell_type": "markdown",
"id": "0309636e-a530-4d2a-ba07-0916ea18bb20",
"metadata": {},
"source": [
"## Using memory with LLM"
]
},
{
"cell_type": "code",
"execution_count": 1,
@ -320,7 +329,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -5,13 +5,21 @@
"id": "ff4be5f3",
"metadata": {},
"source": [
"# ConversationTokenBufferMemory\n",
"# Conversation Token Buffer\n",
"\n",
"`ConversationTokenBufferMemory` keeps a buffer of recent interactions in memory, and uses token length rather than number of interactions to determine when to flush interactions.\n",
"\n",
"Let's first walk through how to use the utilities"
]
},
{
"cell_type": "markdown",
"id": "0e528ef0-7b04-4a4a-8ff2-493c02027e83",
"metadata": {},
"source": [
"## Using memory with LLM"
]
},
{
"cell_type": "code",
"execution_count": 1,
@ -286,7 +294,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@ -62,7 +62,7 @@
"\n",
"# Set env var OPENAI_API_KEY or load from a .env file:\n",
"# import dotenv\n",
"# dotenv.load_env()"
"# dotenv.load_dotenv()"
]
},
{
@ -145,7 +145,7 @@
"source": [
"## Functions \n",
"\n",
"We can unpack what is hapening when we use the functions to calls external APIs.\n",
"We can unpack what is happening when we use the functions to call external APIs.\n",
"\n",
"Let's look at the [LangSmith trace](https://smith.langchain.com/public/76a58b85-193f-4eb7-ba40-747f0d5dd56e/r):\n",
"\n",
@ -155,10 +155,10 @@
"https://www.klarna.com/us/shopping/public/openai/v0/api-docs/\n",
"```\n",
"\n",
"* The prompt then tells the LLM to use the API spec wiith input question:\n",
"* The prompt then tells the LLM to use the API spec with input question:\n",
"\n",
"```\n",
"Use the provided API's to respond to this user query:\n",
"Use the provided APIs to respond to this user query:\n",
"What are some options for a men's large blue button down shirt\n",
"```\n",
"\n",
@ -278,7 +278,7 @@
"![Image description](/img/api_chain.png)\n",
"\n",
"* [Here](https://github.com/langchain-ai/langchain/blob/bbd22b9b761389a5e40fc45b0570e1830aabb707/libs/langchain/langchain/chains/api/base.py#L82) we make the API request with the API url.\n",
"* The `api_answer_chain` takes the response from the API and provides us with a natural langugae response:\n",
"* The `api_answer_chain` takes the response from the API and provides us with a natural language response:\n",
"\n",
"![Image description](/img/api_chain_response.png)"
]

View File

@ -54,7 +54,7 @@
"\n",
"# Set env var OPENAI_API_KEY or load from a .env file:\n",
"# import dotenv\n",
"# dotenv.load_env()"
"# dotenv.load_dotenv()"
]
},
{

View File

@ -25,7 +25,7 @@
"In particular, we can employ a [splitting strategy](https://python.langchain.com/docs/integrations/document_loaders/source_code) that does a few things:\n",
"\n",
"* Keeps each top-level function and class in the code is loaded into separate documents. \n",
"* Puts remaining into a seperate document.\n",
"* Puts remaining into a separate document.\n",
"* Retains metadata about where each split comes from\n",
"\n",
"## Quickstart"
@ -42,7 +42,7 @@
"# Set env var OPENAI_API_KEY or load from a .env file\n",
"# import dotenv\n",
"\n",
"# dotenv.load_env()"
"# dotenv.load_dotenv()"
]
},
{
@ -94,7 +94,7 @@
"We load the py code using [`LanguageParser`](https://python.langchain.com/docs/integrations/document_loaders/source_code), which will:\n",
"\n",
"* Keep top-level functions and classes together (into a single document)\n",
"* Put remaining code into a seperate document\n",
"* Put remaining code into a separate document\n",
"* Retains metadata about where each split comes from"
]
},

View File

@ -73,7 +73,7 @@
"\n",
"# Set env var OPENAI_API_KEY or load from a .env file:\n",
"# import dotenv\n",
"# dotenv.load_env()"
"# dotenv.load_dotenv()"
]
},
{

View File

@ -70,7 +70,7 @@
"# Set env var OPENAI_API_KEY and SERPAPI_API_KEY or load from a .env file\n",
"# import dotenv\n",
"\n",
"# dotenv.load_env()"
"# dotenv.load_dotenv()"
]
},
{

View File

@ -0,0 +1,154 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# FalkorDBQAChain"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook shows how to use LLMs to provide a natural language interface to FalkorDB database.\n",
"\n",
"FalkorDB is a low latency property graph database management system. You can simply run its docker locally:\n",
"\n",
"```bash\n",
"docker run -p 6379:6379 -it --rm falkordb/falkordb:edge\n",
"```\n",
"\n",
"Once launched, you can simply start creating a database on the local machine and connect to it."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.graphs import FalkorDBGraph\n",
"from langchain.chains import FalkorDBQAChain"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"graph = FalkorDBGraph(database=\"movies\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"graph.query(\n",
" \"\"\"\n",
"MERGE (m:Movie {name:\"Top Gun\"})\n",
"WITH m\n",
"UNWIND [\"Tom Cruise\", \"Val Kilmer\", \"Anthony Edwards\", \"Meg Ryan\"] AS actor\n",
"MERGE (a:Actor {name:actor})\n",
"MERGE (a)-[:ACTED_IN]->(m)\n",
"\"\"\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"graph.refresh_schema()\n",
"import os\n",
"os.environ['OPENAI_API_KEY']='API_KEY_HERE'\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"chain = FalkorDBQAChain.from_llm(\n",
" ChatOpenAI(temperature=0), graph=graph, verbose=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new FalkorDBQAChain chain...\u001b[0m\n",
"Generated Cypher:\n",
"\u001b[32;1m\u001b[1;3mMATCH (:Movie {title: 'Top Gun'})<-[:ACTED_IN]-(actor:Person)\n",
"RETURN actor.name AS output\u001b[0m\n",
"Full Context:\n",
"\u001b[32;1m\u001b[1;3m[]\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The actor who played in Top Gun is Tom Cruise.'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(\"Who played in Top Gun?\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@ -47,7 +47,7 @@
"# Set env var OPENAI_API_KEY or load from a .env file\n",
"# import dotenv\n",
"\n",
"# dotenv.load_env()"
"# dotenv.load_dotenv()"
]
},
{

View File

@ -50,7 +50,7 @@
"# Set env var OPENAI_API_KEY or load from a .env file\n",
"# import dotenv\n",
"\n",
"# dotenv.load_env()"
"# dotenv.load_dotenv()"
]
},
{

View File

@ -76,7 +76,7 @@
"# Set env var OPENAI_API_KEY or load from a .env file\n",
"# import dotenv\n",
"\n",
"# dotenv.load_env()"
"# dotenv.load_dotenv()"
]
},
{

View File

@ -44,7 +44,7 @@
"\n",
"# Set env var OPENAI_API_KEY or load from a .env file:\n",
"# import dotenv\n",
"# dotenv.load_env()"
"# dotenv.load_dotenv()"
]
},
{

View File

@ -41,7 +41,7 @@
"\n",
"# Set env var OPENAI_API_KEY or load from a .env file:\n",
"# import dotenv\n",
"# dotenv.load_env()"
"# dotenv.load_dotenv()"
]
},
{

View File

@ -33,7 +33,7 @@ class BaseRetriever(ABC):
...
```
It's that simple! You can call `get_relevant_documents` or the async `get_relevant_documents` methods to retrieve documents relevant to a query, where "relevance" is defined by
It's that simple! You can call `get_relevant_documents` or the async `aget_relevant_documents` methods to retrieve documents relevant to a query, where "relevance" is defined by
the specific retriever object you are calling.
Of course, we also help construct what we think are useful Retrievers. The main type of Retriever that we focus on is the Vectorstore retriever, and that is what the rest of this guide covers.
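For example, a minimal sketch of both call styles, assuming `retriever` is any `BaseRetriever` implementation (for instance, one obtained from `vectorstore.as_retriever()`):

```python
import asyncio

query = "What did the president say about Ketanji Brown Jackson"

# Synchronous retrieval
docs = retriever.get_relevant_documents(query)

# Asynchronous retrieval: same semantics, `a`-prefixed method name
docs = asyncio.run(retriever.aget_relevant_documents(query))
```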

View File

@ -1,3 +1,16 @@
# 🦜️🧪 LangChain Experimental
This repository holds more experimental LangChain code.
This package holds experimental LangChain code, intended for research and experimental
uses.
> [!WARNING]
> Portions of the code in this package may be dangerous if not properly deployed
> in a sandboxed environment. Please be wary of deploying experimental code
> to production unless you've taken appropriate precautions and
> have already discussed it with your security team.
Some of the code here may be marked with security notices. However,
given the exploratory and experimental nature of the code in this package,
the lack of a security notice on a piece of code does not mean that
the code in question does not require additional security considerations
in order to be safe to use.

View File

@ -131,13 +131,34 @@ class InterventionChain(_BaseStoryElementChain):
class QueryChain(_BaseStoryElementChain):
"""Query the outcome table using SQL."""
"""Query the outcome table using SQL.
*Security note*: This class implements an AI technique that generates SQL code.
If those SQL commands are executed, it's critical to ensure they use credentials
that are narrowly-scoped to only include the permissions this chain needs.
Failure to do so may result in data corruption or loss, since this chain may
attempt commands like `DROP TABLE` or `INSERT` if appropriately prompted.
The best way to guard against such negative outcomes is to (as appropriate)
limit the permissions granted to the credentials used with this chain.
"""
pydantic_model: ClassVar[Type[pydantic.BaseModel]] = QueryModel
template: ClassVar[str] = query_template # TODO: incl. table schema
class CPALChain(_BaseStoryElementChain):
"""Causal program-aided language (CPAL) chain implementation.
*Security note*: The building blocks of this class include the implementation
of an AI technique that generates SQL code. If those SQL commands
are executed, it's critical to ensure they use credentials that
are narrowly-scoped to only include the permissions this chain needs.
Failure to do so may result in data corruption or loss, since this chain may
attempt commands like `DROP TABLE` or `INSERT` if appropriately prompted.
The best way to guard against such negative outcomes is to (as appropriate)
limit the permissions granted to the credentials used with this chain.
"""
llm: BaseLanguageModel
narrative_chain: Optional[NarrativeChain] = None
causal_chain: Optional[CausalChain] = None
@ -151,7 +172,17 @@ class CPALChain(_BaseStoryElementChain):
llm: BaseLanguageModel,
**kwargs: Any,
) -> CPALChain:
"""instantiation depends on component chains"""
"""instantiation depends on component chains
*Security note*: The building blocks of this class include the implementation
of an AI technique that generates SQL code. If those SQL commands
are executed, it's critical to ensure they use credentials that
are narrowly-scoped to only include the permissions this chain needs.
Failure to do so may result in data corruption or loss, since this chain may
attempt commands like `DROP TABLE` or `INSERT` if appropriately prompted.
The best way to guard against such negative outcomes is to (as appropriate)
limit the permissions granted to the credentials used with this chain.
"""
return cls(
llm=llm,
chain=LLMChain(

View File

@ -90,6 +90,15 @@ class PALChain(Chain):
This class implements the Program-Aided Language Models (PAL) for generating code
solutions. PAL is a technique described in the paper "Program-Aided Language Models"
(https://arxiv.org/pdf/2211.10435.pdf).
*Security note*: This class implements an AI technique that generates and evaluates
Python code, which can be dangerous and requires a specially sandboxed
environment to be safely used. While this class implements some basic guardrails
by limiting available locals/globals and by parsing and inspecting
the generated Python AST using `PALValidation`, those guardrails will not
deter sophisticated attackers and are not a replacement for a proper sandbox.
Do not use this class on untrusted inputs, with elevated permissions,
or without consulting your security team about proper sandboxing!
"""
llm_chain: LLMChain

View File

@ -1,6 +1,6 @@
[tool.poetry]
name = "langchain-experimental"
version = "0.0.11"
version = "0.0.12"
description = "Building applications with LLMs through composability"
authors = []
license = "MIT"

View File

@ -686,12 +686,15 @@
cls,
agent: Union[BaseSingleActionAgent, BaseMultiActionAgent],
tools: Sequence[BaseTool],
callback_manager: Optional[BaseCallbackManager] = None,
callbacks: Callbacks = None,
**kwargs: Any,
) -> AgentExecutor:
"""Create from agent and tools."""
return cls(
agent=agent, tools=tools, callback_manager=callback_manager, **kwargs
agent=agent,
tools=tools,
callbacks=callbacks,
**kwargs,
)
@root_validator()

View File

@ -6,7 +6,7 @@ from langchain.agents.agent_toolkits.spark_sql.prompt import SQL_PREFIX, SQL_SUF
from langchain.agents.agent_toolkits.spark_sql.toolkit import SparkSQLToolkit
from langchain.agents.mrkl.base import ZeroShotAgent
from langchain.agents.mrkl.prompt import FORMAT_INSTRUCTIONS
from langchain.callbacks.base import BaseCallbackManager
from langchain.callbacks.base import BaseCallbackManager, Callbacks
from langchain.chains.llm import LLMChain
from langchain.schema.language_model import BaseLanguageModel
@ -15,6 +15,7 @@ def create_spark_sql_agent(
llm: BaseLanguageModel,
toolkit: SparkSQLToolkit,
callback_manager: Optional[BaseCallbackManager] = None,
callbacks: Callbacks = None,
prefix: str = SQL_PREFIX,
suffix: str = SQL_SUFFIX,
format_instructions: str = FORMAT_INSTRUCTIONS,
@ -41,6 +42,7 @@ def create_spark_sql_agent(
llm=llm,
prompt=prompt,
callback_manager=callback_manager,
callbacks=callbacks,
)
tool_names = [tool.name for tool in tools]
agent = ZeroShotAgent(llm_chain=llm_chain, allowed_tools=tool_names, **kwargs)
@ -48,6 +50,7 @@ def create_spark_sql_agent(
agent=agent,
tools=tools,
callback_manager=callback_manager,
callbacks=callbacks,
verbose=verbose,
max_iterations=max_iterations,
max_execution_time=max_execution_time,

View File

@ -19,6 +19,7 @@ from langchain.callbacks.flyte_callback import FlyteCallbackHandler
from langchain.callbacks.human import HumanApprovalCallbackHandler
from langchain.callbacks.infino_callback import InfinoCallbackHandler
from langchain.callbacks.labelstudio_callback import LabelStudioCallbackHandler
from langchain.callbacks.llmonitor_callback import LLMonitorCallbackHandler
from langchain.callbacks.manager import (
collect_runs,
get_openai_callback,
@ -54,6 +55,7 @@ __all__ = [
"HumanApprovalCallbackHandler",
"InfinoCallbackHandler",
"MlflowCallbackHandler",
"LLMonitorCallbackHandler",
"OpenAICallbackHandler",
"StdOutCallbackHandler",
"AsyncIteratorCallbackHandler",

View File

@ -0,0 +1,329 @@
import os
import traceback
from datetime import datetime
from typing import Any, Dict, List, Literal, Union
from uuid import UUID
import requests
from langchain.callbacks.base import BaseCallbackHandler
from langchain.schema.agent import AgentAction, AgentFinish
from langchain.schema.messages import BaseMessage
from langchain.schema.output import LLMResult
DEFAULT_API_URL = "https://app.llmonitor.com"
def _parse_lc_role(
role: str,
) -> Union[Literal["user", "ai", "system", "function"], None]:
if role == "human":
return "user"
elif role == "ai":
return "ai"
elif role == "system":
return "system"
elif role == "function":
return "function"
else:
return None
def _serialize_lc_message(message: BaseMessage) -> Dict[str, Any]:
return {"text": message.content, "role": _parse_lc_role(message.type)}
class LLMonitorCallbackHandler(BaseCallbackHandler):
"""Initializes the `LLMonitorCallbackHandler`.
#### Parameters:
- `app_id`: The app id of the app you want to report to. Defaults to
`None`, which means that `LLMONITOR_APP_ID` will be used.
- `api_url`: The url of the LLMonitor API. Defaults to `None`,
which means that either `LLMONITOR_API_URL` environment variable
or `https://app.llmonitor.com` will be used.
#### Raises:
- `ValueError`: if `app_id` is not provided either as an
argument or as an environment variable.
- `ConnectionError`: if the connection to the API fails.
#### Example:
```python
from langchain.llms import OpenAI
from langchain.callbacks import LLMonitorCallbackHandler
llmonitor_callback = LLMonitorCallbackHandler()
llm = OpenAI(callbacks=[llmonitor_callback],
metadata={"userId": "user-123"})
llm.predict("Hello, how are you?")
```
"""
__api_url: str
__app_id: str
def __init__(
self, app_id: Union[str, None] = None, api_url: Union[str, None] = None
) -> None:
super().__init__()
self.__api_url = api_url or os.getenv("LLMONITOR_API_URL") or DEFAULT_API_URL
_app_id = app_id or os.getenv("LLMONITOR_APP_ID")
if _app_id is None:
raise ValueError(
"""app_id must be provided either as an argument or as
an environment variable"""
)
self.__app_id = _app_id
try:
res = requests.get(f"{self.__api_url}/api/app/{self.__app_id}")
if not res.ok:
raise ConnectionError()
except Exception as e:
raise ConnectionError(
f"Could not connect to the LLMonitor API at {self.__api_url}"
) from e
def __send_event(self, event: Dict[str, Any]) -> None:
headers = {"Content-Type": "application/json"}
event = {**event, "app": self.__app_id, "timestamp": str(datetime.utcnow())}
data = {"events": event}
requests.post(headers=headers, url=f"{self.__api_url}/api/report", json=data)
def on_llm_start(
self,
serialized: Dict[str, Any],
prompts: List[str],
*,
run_id: UUID,
parent_run_id: Union[UUID, None] = None,
tags: Union[List[str], None] = None,
metadata: Union[Dict[str, Any], None] = None,
**kwargs: Any,
) -> None:
event = {
"event": "start",
"type": "llm",
"userId": (metadata or {}).get("userId"),
"runId": str(run_id),
"parentRunId": str(parent_run_id) if parent_run_id else None,
"input": prompts[0],
"name": kwargs.get("invocation_params", {}).get("model_name"),
"tags": tags,
"metadata": metadata,
}
self.__send_event(event)
def on_chat_model_start(
self,
serialized: Dict[str, Any],
messages: List[List[BaseMessage]],
*,
run_id: UUID,
parent_run_id: Union[UUID, None] = None,
tags: Union[List[str], None] = None,
metadata: Union[Dict[str, Any], None] = None,
**kwargs: Any,
) -> Any:
event = {
"event": "start",
"type": "llm",
"userId": (metadata or {}).get("userId"),
"runId": str(run_id),
"parentRunId": str(parent_run_id) if parent_run_id else None,
"input": [_serialize_lc_message(message[0]) for message in messages],
"name": kwargs.get("invocation_params", {}).get("model_name"),
"tags": tags,
"metadata": metadata,
}
self.__send_event(event)
def on_llm_end(
self,
response: LLMResult,
*,
run_id: UUID,
parent_run_id: Union[UUID, None] = None,
**kwargs: Any,
) -> None:
token_usage = (response.llm_output or {}).get("token_usage", {})
event = {
"event": "end",
"type": "llm",
"runId": str(run_id),
"parent_run_id": str(parent_run_id) if parent_run_id else None,
"output": {"text": response.generations[0][0].text, "role": "ai"},
"tokensUsage": {
"prompt": token_usage.get("prompt_tokens", 0),
"completion": token_usage.get("completion_tokens", 0),
},
}
self.__send_event(event)
def on_llm_error(
self,
error: Union[Exception, KeyboardInterrupt],
*,
run_id: UUID,
parent_run_id: Union[UUID, None] = None,
**kwargs: Any,
) -> Any:
event = {
"event": "error",
"type": "llm",
"runId": str(run_id),
"parent_run_id": str(parent_run_id) if parent_run_id else None,
"error": {"message": str(error), "stack": traceback.format_exc()},
}
self.__send_event(event)
def on_tool_start(
self,
serialized: Dict[str, Any],
input_str: str,
*,
run_id: UUID,
parent_run_id: Union[UUID, None] = None,
tags: Union[List[str], None] = None,
metadata: Union[Dict[str, Any], None] = None,
**kwargs: Any,
) -> None:
event = {
"event": "start",
"type": "tool",
"userId": (metadata or {}).get("userId"),
"runId": str(run_id),
"parentRunId": str(parent_run_id) if parent_run_id else None,
"name": serialized.get("name"),
"input": input_str,
"tags": tags,
"metadata": metadata,
}
self.__send_event(event)
def on_tool_end(
self,
output: str,
*,
run_id: UUID,
parent_run_id: Union[UUID, None] = None,
tags: Union[List[str], None] = None,
**kwargs: Any,
) -> None:
event = {
"event": "end",
"type": "tool",
"runId": str(run_id),
"parent_run_id": str(parent_run_id) if parent_run_id else None,
"output": output,
}
self.__send_event(event)
def on_chain_start(
self,
serialized: Dict[str, Any],
inputs: Dict[str, Any],
*,
run_id: UUID,
parent_run_id: Union[UUID, None] = None,
tags: Union[List[str], None] = None,
metadata: Union[Dict[str, Any], None] = None,
**kwargs: Any,
) -> Any:
name = serialized.get("id", [None, None, None, None])[3]
type = "chain"
agentName = (metadata or {}).get("agentName")
if agentName is not None:
type = "agent"
name = agentName
if name == "AgentExecutor" or name == "PlanAndExecute":
type = "agent"
event = {
"event": "start",
"type": type,
"userId": (metadata or {}).get("userId"),
"runId": str(run_id),
"parentRunId": str(parent_run_id) if parent_run_id else None,
"input": inputs.get("input", inputs),
"tags": tags,
"metadata": metadata,
"name": serialized.get("id", [None, None, None, None])[3],
}
self.__send_event(event)
def on_chain_end(
self,
outputs: Dict[str, Any],
*,
run_id: UUID,
parent_run_id: Union[UUID, None] = None,
**kwargs: Any,
) -> Any:
event = {
"event": "end",
"type": "chain",
"runId": str(run_id),
"output": outputs.get("output", outputs),
}
self.__send_event(event)
def on_chain_error(
self,
error: Union[Exception, KeyboardInterrupt],
*,
run_id: UUID,
parent_run_id: Union[UUID, None] = None,
**kwargs: Any,
) -> Any:
event = {
"event": "error",
"type": "chain",
"runId": str(run_id),
"parent_run_id": str(parent_run_id) if parent_run_id else None,
"error": {"message": str(error), "stack": traceback.format_exc()},
}
self.__send_event(event)
def on_agent_action(
self,
action: AgentAction,
*,
run_id: UUID,
parent_run_id: Union[UUID, None] = None,
**kwargs: Any,
) -> Any:
event = {
"event": "start",
"type": "tool",
"runId": str(run_id),
"parentRunId": str(parent_run_id) if parent_run_id else None,
"name": action.tool,
"input": action.tool_input,
}
self.__send_event(event)
def on_agent_finish(
self,
finish: AgentFinish,
*,
run_id: UUID,
parent_run_id: Union[UUID, None] = None,
**kwargs: Any,
) -> Any:
event = {
"event": "end",
"type": "agent",
"runId": str(run_id),
"parentRunId": str(parent_run_id) if parent_run_id else None,
"output": finish.return_values,
}
self.__send_event(event)
__all__ = ["LLMonitorCallbackHandler"]

View File

@ -242,7 +242,7 @@ class MlflowCallbackHandler(BaseMetadataCallbackHandler, BaseCallbackHandler):
self,
name: Optional[str] = "langchainrun-%",
experiment: Optional[str] = "langchain",
tags: Optional[Dict] = {},
tags: Optional[Dict] = None,
tracking_uri: Optional[str] = None,
) -> None:
"""Initialize callback handler."""
@ -254,7 +254,7 @@ class MlflowCallbackHandler(BaseMetadataCallbackHandler, BaseCallbackHandler):
self.name = name
self.experiment = experiment
self.tags = tags
self.tags = tags or {}
self.tracking_uri = tracking_uri
self.temp_dir = tempfile.TemporaryDirectory()

View File

@ -40,12 +40,12 @@ class PromptLayerCallbackHandler(BaseCallbackHandler):
def __init__(
self,
pl_id_callback: Optional[Callable[..., Any]] = None,
pl_tags: Optional[List[str]] = [],
pl_tags: Optional[List[str]] = None,
) -> None:
"""Initialize the PromptLayerCallbackHandler."""
_lazy_import_promptlayer()
self.pl_id_callback = pl_id_callback
self.pl_tags = pl_tags
self.pl_tags = pl_tags or []
self.runs: Dict[UUID, Dict[str, Any]] = {}
def on_chat_model_start(
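Both this change and the `MlflowCallbackHandler` change above replace mutable default arguments (`tags: Optional[Dict] = {}`, `pl_tags: Optional[List[str]] = []`) with `None` plus an `or` fallback. A minimal sketch of the pitfall being avoided (illustrative only, not LangChain code):

```python
def buggy(tags=[]):    # the default list is created once and shared by all calls
    tags.append("run")
    return tags

def fixed(tags=None):  # a fresh list is created on every call
    tags = tags or []
    tags.append("run")
    return tags

buggy(); print(buggy())  # ['run', 'run'] - state leaked across calls
fixed(); print(fixed())  # ['run']
```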

View File

@ -3,6 +3,7 @@ from __future__ import annotations
import logging
import os
import weakref
from concurrent.futures import Future, ThreadPoolExecutor, wait
from datetime import datetime
from typing import Any, Callable, Dict, List, Optional, Set, Union
@ -18,8 +19,10 @@ from langchain.schema.messages import BaseMessage
logger = logging.getLogger(__name__)
_LOGGED = set()
_TRACERS: List[LangChainTracer] = []
_TRACERS: weakref.WeakSet[LangChainTracer] = weakref.WeakSet()
_CLIENT: Optional[Client] = None
_MAX_EXECUTORS = 10 # TODO: Remove once write queue is implemented
_EXECUTORS: List[ThreadPoolExecutor] = []
def log_error_once(method: str, exception: Exception) -> None:
@ -34,8 +37,9 @@ def log_error_once(method: str, exception: Exception) -> None:
def wait_for_all_tracers() -> None:
"""Wait for all tracers to finish."""
global _TRACERS
for tracer in _TRACERS:
tracer.wait_for_futures()
for tracer in list(_TRACERS):
if tracer is not None:
tracer.wait_for_futures()
def _get_client() -> Client:
@ -68,17 +72,22 @@ class LangChainTracer(BaseTracer):
"LANGCHAIN_PROJECT", os.getenv("LANGCHAIN_SESSION", "default")
)
if use_threading:
# set max_workers to 1 to process tasks in order
self.executor: Optional[ThreadPoolExecutor] = ThreadPoolExecutor(
max_workers=1
)
global _MAX_EXECUTORS
if len(_EXECUTORS) < _MAX_EXECUTORS:
self.executor: Optional[ThreadPoolExecutor] = ThreadPoolExecutor(
max_workers=1
)
_EXECUTORS.append(self.executor)
else:
self.executor = _EXECUTORS.pop(0)
_EXECUTORS.append(self.executor)
else:
self.executor = None
self.client = client or _get_client()
self._futures: Set[Future] = set()
self.tags = tags or []
global _TRACERS
_TRACERS.append(self)
_TRACERS.add(self)
def on_chat_model_start(
self,

View File

@ -36,6 +36,7 @@ from langchain.chains.flare.base import FlareChain
from langchain.chains.graph_qa.arangodb import ArangoGraphQAChain
from langchain.chains.graph_qa.base import GraphQAChain
from langchain.chains.graph_qa.cypher import GraphCypherQAChain
from langchain.chains.graph_qa.falkordb import FalkorDBQAChain
from langchain.chains.graph_qa.hugegraph import HugeGraphQAChain
from langchain.chains.graph_qa.kuzu import KuzuQAChain
from langchain.chains.graph_qa.nebulagraph import NebulaGraphQAChain
@ -85,6 +86,7 @@ __all__ = [
"ConstitutionalChain",
"ConversationChain",
"ConversationalRetrievalChain",
"FalkorDBQAChain",
"FlareChain",
"GraphCypherQAChain",
"GraphQAChain",

View File

@ -136,6 +136,12 @@ class Chain(Serializable, Runnable[Dict[str, Any], Dict[str, Any]], ABC):
def raise_callback_manager_deprecation(cls, values: Dict) -> Dict:
"""Raise deprecation warning if callback_manager is used."""
if values.get("callback_manager") is not None:
if values.get("callbacks") is not None:
raise ValueError(
"Cannot specify both callback_manager and callbacks. "
"callback_manager is deprecated, callbacks is the preferred "
"parameter to pass in."
)
warnings.warn(
"callback_manager is deprecated. Please use callbacks instead.",
DeprecationWarning,
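With this check in place, specifying both the deprecated `callback_manager` and the newer `callbacks` argument raises a `ValueError` instead of silently preferring one. A hedged sketch of the preferred style, assuming `llm` and `prompt` are defined elsewhere:

```python
from langchain.callbacks import StdOutCallbackHandler
from langchain.chains import LLMChain

# Preferred: pass handlers via `callbacks`; `callback_manager` is deprecated.
chain = LLMChain(llm=llm, prompt=prompt, callbacks=[StdOutCallbackHandler()])
```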

View File

@ -0,0 +1,141 @@
"""Question answering over a graph."""
from __future__ import annotations
import re
from typing import Any, Dict, List, Optional
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.graph_qa.prompts import CYPHER_GENERATION_PROMPT, CYPHER_QA_PROMPT
from langchain.chains.llm import LLMChain
from langchain.graphs import FalkorDBGraph
from langchain.pydantic_v1 import Field
from langchain.schema import BasePromptTemplate
INTERMEDIATE_STEPS_KEY = "intermediate_steps"
def extract_cypher(text: str) -> str:
"""
Extract Cypher code from a text.
Args:
text: Text to extract Cypher code from.
Returns:
Cypher code extracted from the text.
"""
# The pattern to find Cypher code enclosed in triple backticks
pattern = r"```(.*?)```"
# Find all matches in the input text
matches = re.findall(pattern, text, re.DOTALL)
return matches[0] if matches else text
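# A quick illustration of the intended behavior (comment only, not library code):
#   extract_cypher("Here you go:\n```MATCH (m:Movie) RETURN m```")
#     -> "MATCH (m:Movie) RETURN m"
#   extract_cypher("MATCH (m:Movie) RETURN m")  # no backticks: returned unchanged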
class FalkorDBQAChain(Chain):
"""Chain for question-answering against a graph by generating Cypher statements."""
graph: FalkorDBGraph = Field(exclude=True)
cypher_generation_chain: LLMChain
qa_chain: LLMChain
input_key: str = "query" #: :meta private:
output_key: str = "result" #: :meta private:
top_k: int = 10
"""Number of results to return from the query"""
return_intermediate_steps: bool = False
"""Whether or not to return the intermediate steps along with the final answer."""
return_direct: bool = False
"""Whether or not to return the result of querying the graph directly."""
@property
def input_keys(self) -> List[str]:
"""Return the input keys.
:meta private:
"""
return [self.input_key]
@property
def output_keys(self) -> List[str]:
"""Return the output keys.
:meta private:
"""
_output_keys = [self.output_key]
return _output_keys
@property
def _chain_type(self) -> str:
return "graph_cypher_chain"
@classmethod
def from_llm(
cls,
llm: BaseLanguageModel,
*,
qa_prompt: BasePromptTemplate = CYPHER_QA_PROMPT,
cypher_prompt: BasePromptTemplate = CYPHER_GENERATION_PROMPT,
**kwargs: Any,
) -> FalkorDBQAChain:
"""Initialize from LLM."""
qa_chain = LLMChain(llm=llm, prompt=qa_prompt)
cypher_generation_chain = LLMChain(llm=llm, prompt=cypher_prompt)
return cls(
qa_chain=qa_chain,
cypher_generation_chain=cypher_generation_chain,
**kwargs,
)
def _call(
self,
inputs: Dict[str, Any],
run_manager: Optional[CallbackManagerForChainRun] = None,
) -> Dict[str, Any]:
"""Generate Cypher statement, use it to look up in db and answer question."""
_run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager()
callbacks = _run_manager.get_child()
question = inputs[self.input_key]
intermediate_steps: List = []
generated_cypher = self.cypher_generation_chain.run(
{"question": question, "schema": self.graph.schema}, callbacks=callbacks
)
# Extract Cypher code if it is wrapped in backticks
generated_cypher = extract_cypher(generated_cypher)
_run_manager.on_text("Generated Cypher:", end="\n", verbose=self.verbose)
_run_manager.on_text(
generated_cypher, color="green", end="\n", verbose=self.verbose
)
intermediate_steps.append({"query": generated_cypher})
# Retrieve and limit the number of results
context = self.graph.query(generated_cypher)[: self.top_k]
if self.return_direct:
final_result = context
else:
_run_manager.on_text("Full Context:", end="\n", verbose=self.verbose)
_run_manager.on_text(
str(context), color="green", end="\n", verbose=self.verbose
)
intermediate_steps.append({"context": context})
result = self.qa_chain(
{"question": question, "context": context},
callbacks=callbacks,
)
final_result = result[self.qa_chain.output_key]
chain_result: Dict[str, Any] = {self.output_key: final_result}
if self.return_intermediate_steps:
chain_result[INTERMEDIATE_STEPS_KEY] = intermediate_steps
return chain_result

View File

@ -120,9 +120,9 @@ class BaseQAWithSourcesChain(Chain, ABC):
def _split_sources(self, answer: str) -> Tuple[str, str]:
"""Split sources from answer."""
if re.search(r"SOURCES?[:\s]", answer, re.IGNORECASE):
if re.search(r"SOURCES?:", answer, re.IGNORECASE):
answer, sources = re.split(
r"SOURCES?[:\s]|QUESTION:\s", answer, flags=re.IGNORECASE
r"SOURCES?:|QUESTION:\s", answer, flags=re.IGNORECASE
)[:2]
sources = re.split(r"\n", sources)[0].strip()
else:

View File

@ -37,7 +37,7 @@ class IMessageChatLoader(chat_loaders.BaseChatLoader):
if not self.db_path.exists():
raise FileNotFoundError(f"File {self.db_path} not found")
try:
pass # type: ignore
import sqlite3 # noqa: F401
except ImportError as e:
raise ImportError(
"The sqlite3 module is required to load iMessage chats.\n"
@ -93,6 +93,7 @@ class IMessageChatLoader(chat_loaders.BaseChatLoader):
Yields:
ChatSession: Loaded chat session.
"""
import sqlite3
try:
conn = sqlite3.connect(self.db_path)

View File

@ -57,12 +57,25 @@ class ErnieBotChat(BaseChatModel):
"""
ernie_client_id: Optional[str] = None
"""Baidu application client id"""
ernie_client_secret: Optional[str] = None
"""Baidu application client secret"""
access_token: Optional[str] = None
"""access token is generated by client id and client secret,
setting this value directly will cause an error"""
model_name: str = "ERNIE-Bot-turbo"
"""model name of ernie, default is `ERNIE-Bot-turbo`.
Currently supported `ERNIE-Bot-turbo`, `ERNIE-Bot`"""
request_timeout: Optional[int] = 60
"""request timeout for chat http requests"""
streaming: Optional[bool] = False
"""streaming mode. not supported yet."""
top_p: Optional[float] = 0.8
temperature: Optional[float] = 0.95
penalty_score: Optional[float] = 1
@ -93,6 +106,7 @@ class ErnieBotChat(BaseChatModel):
raise ValueError(f"Got unknown model_name {self.model_name}")
resp = requests.post(
url,
timeout=self.request_timeout,
headers={
"Content-Type": "application/json",
},
@ -107,6 +121,7 @@ class ErnieBotChat(BaseChatModel):
base_url: str = "https://aip.baidubce.com/oauth/2.0/token"
resp = requests.post(
base_url,
timeout=10,
headers={
"Content-Type": "application/json",
"Accept": "application/json",

View File

@ -33,7 +33,7 @@ class AsyncHtmlLoader(BaseLoader):
verify_ssl: Optional[bool] = True,
proxies: Optional[dict] = None,
requests_per_second: int = 2,
requests_kwargs: Dict[str, Any] = {},
requests_kwargs: Optional[Dict[str, Any]] = None,
raise_for_status: bool = False,
):
"""Initialize with a webpage path."""
@ -67,7 +67,7 @@ class AsyncHtmlLoader(BaseLoader):
self.session.proxies.update(proxies)
self.requests_per_second = requests_per_second
self.requests_kwargs = requests_kwargs
self.requests_kwargs = requests_kwargs or {}
self.raise_for_status = raise_for_status
async def _fetch(

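This is the first of several hunks in this commit (NebulaGraph, Marqo, Redis, SingleStoreDB, Vectara, Zilliz below) that replace mutable default arguments with `None`. Python evaluates a default value once, at function definition time, so a `{}` or `[]` default is shared across every call:

```python
def collect(item, bucket=[]):  # BUG: one list object shared by all calls
    bucket.append(item)
    return bucket

print(collect(1))  # [1]
print(collect(2))  # [1, 2]  <- state leaked from the previous call


def collect_fixed(item, bucket=None):
    bucket = bucket or []  # fresh list per call; the idiom these hunks adopt
    bucket.append(item)
    return bucket
```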
View File

@ -113,27 +113,39 @@ class CubeSemanticLoader(BaseLoader):
- column_title
- column_description
- column_values
- cube_data_obj_type
"""
headers = {
"Content-Type": "application/json",
"Authorization": self.cube_api_token,
}
logger.info(f"Loading metadata from {self.cube_api_url}...")
response = requests.get(f"{self.cube_api_url}/meta", headers=headers)
response.raise_for_status()
raw_meta_json = response.json()
cubes = raw_meta_json.get("cubes", [])
cube_data_objects = raw_meta_json.get("cubes", [])
logger.info(f"Found {len(cube_data_objects)} cube data objects in metadata.")
if not cube_data_objects:
raise ValueError("No cubes found in metadata.")
docs = []
for cube in cubes:
if cube.get("type") != "view":
for cube_data_obj in cube_data_objects:
cube_data_obj_name = cube_data_obj.get("name")
cube_data_obj_type = cube_data_obj.get("type")
cube_data_obj_is_public = cube_data_obj.get("public")
measures = cube_data_obj.get("measures", [])
dimensions = cube_data_obj.get("dimensions", [])
logger.info(f"Processing {cube_data_obj_name}...")
if not cube_data_obj_is_public:
logger.info(f"Skipping {cube_data_obj_name} because it is not public.")
continue
cube_name = cube.get("name")
measures = cube.get("measures", [])
dimensions = cube.get("dimensions", [])
for item in measures + dimensions:
column_member_type = "measure" if item in measures else "dimension"
dimension_values = []
@ -148,13 +160,14 @@ class CubeSemanticLoader(BaseLoader):
dimension_values = self._get_dimension_values(item_name)
metadata = dict(
table_name=str(cube_name),
table_name=str(cube_data_obj_name),
column_name=item_name,
column_data_type=item_type,
column_title=str(item.get("title")),
column_description=str(item.get("description")),
column_member_type=column_member_type,
column_values=dimension_values,
cube_data_obj_type=cube_data_obj_type,
)
page_content = f"{str(item.get('title'))}, "

View File

@ -74,6 +74,7 @@ class SeleniumURLLoader(BaseLoader):
if self.browser.lower() == "chrome":
from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options as ChromeOptions
from selenium.webdriver.chrome.service import Service
chrome_options = ChromeOptions()
@ -87,10 +88,14 @@ class SeleniumURLLoader(BaseLoader):
chrome_options.binary_location = self.binary_location
if self.executable_path is None:
return Chrome(options=chrome_options)
return Chrome(executable_path=self.executable_path, options=chrome_options)
return Chrome(
options=chrome_options,
service=Service(executable_path=self.executable_path),
)
elif self.browser.lower() == "firefox":
from selenium.webdriver import Firefox
from selenium.webdriver.firefox.options import Options as FirefoxOptions
from selenium.webdriver.firefox.service import Service
firefox_options = FirefoxOptions()
@ -104,7 +109,8 @@ class SeleniumURLLoader(BaseLoader):
if self.executable_path is None:
return Firefox(options=firefox_options)
return Firefox(
executable_path=self.executable_path, options=firefox_options
options=firefox_options,
service=Service(executable_path=self.executable_path),
)
else:
raise ValueError("Invalid browser specified. Use 'chrome' or 'firefox'.")

View File

@ -1,6 +1,7 @@
"""**Graphs** provide a natural language interface to graph databases."""
from langchain.graphs.arangodb_graph import ArangoGraph
from langchain.graphs.falkordb_graph import FalkorDBGraph
from langchain.graphs.hugegraph import HugeGraph
from langchain.graphs.kuzu_graph import KuzuGraph
from langchain.graphs.memgraph_graph import MemgraphGraph
@ -20,4 +21,5 @@ __all__ = [
"HugeGraph",
"RdfGraph",
"ArangoGraph",
"FalkorDBGraph",
]

View File

@ -0,0 +1,67 @@
from typing import Any, Dict, List, Optional
node_properties_query = """
MATCH (n)
UNWIND labels(n) as l
UNWIND keys(n) as p
RETURN {label:l, properties: collect(distinct p)} AS output
"""
rel_properties_query = """
MATCH ()-[r]->()
UNWIND keys(r) as p
RETURN {type:type(r), properties: collect(distinct p)} AS output
"""
rel_query = """
MATCH (n)-[r]->(m)
WITH labels(n)[0] AS src, labels(m)[0] AS dst, type(r) AS type
RETURN DISTINCT "(:" + src + ")-[:" + type + "]->(:" + dst + ")" AS output
"""
class FalkorDBGraph:
"""FalkorDB wrapper for graph operations."""
def __init__(
self, database: str, host: str = "localhost", port: int = 6379
) -> None:
"""Create a new FalkorDB graph wrapper instance."""
try:
import redis
from redis.commands.graph import Graph
except ImportError:
raise ImportError(
"Could not import redis python package. "
"Please install it with `pip install redis`."
)
self._driver = redis.Redis(host=host, port=port)
self._graph = Graph(self._driver, database)
try:
self.refresh_schema()
except Exception as e:
raise ValueError(f"Could not refresh schema. Error: {e}")
@property
def get_schema(self) -> str:
"""Returns the schema of the FalkorDB database"""
return self.schema
def refresh_schema(self) -> None:
"""Refreshes the schema of the FalkorDB database"""
# Run the introspection queries; interpolating the query strings
# themselves would put Cypher source, not schema data, into the prompt.
self.schema = (
f"Node properties: {self.query(node_properties_query)}\n"
f"Relationships properties: {self.query(rel_properties_query)}\n"
f"Relationships: {self.query(rel_query)}\n"
)
def query(self, query: str, params: Optional[dict] = None) -> List[Dict[str, Any]]:
"""Query FalkorDB database."""
try:
data = self._graph.query(query, params or {})
return data.result_set
except Exception as e:
raise ValueError("Generated Cypher Statement is not valid\n" f"{e}")

View File

@ -1,6 +1,6 @@
import logging
from string import Template
from typing import Any, Dict
from typing import Any, Dict, Optional
logger = logging.getLogger(__name__)
@ -106,11 +106,12 @@ class NebulaGraph:
"""Returns the schema of the NebulaGraph database"""
return self.schema
def execute(self, query: str, params: dict = {}, retry: int = 0) -> Any:
def execute(self, query: str, params: Optional[dict] = None, retry: int = 0) -> Any:
"""Query NebulaGraph database."""
from nebula3.Exception import IOErrorException, NoValidSessionException
from nebula3.fbthrift.transport.TTransport import TTransportException
params = params or {}
try:
result = self.session_pool.execute_parameter(query, params)
if not result.is_succeeded():

View File

@ -90,8 +90,8 @@ class PromptGuard(LLM):
_run_manager = run_manager or CallbackManagerForLLMRun.get_noop_manager()
# sanitize the prompt by replacing the sensitive information with a placeholder
sanitize_response: pg.SanitizeResponse = pg.sanitize(prompt)
sanitized_prompt_value_str = sanitize_response.sanitized_text
sanitize_response: pg.SanitizeResponse = pg.sanitize([prompt])
sanitized_prompt_value_str = sanitize_response.sanitized_texts[0]
# TODO: Add in callbacks once child runs for LLMs are supported by LangSmith.
# call the LLM with the sanitized prompt and get the response

View File

@ -183,9 +183,10 @@ def make_request(
instruction: str,
conversation: str,
url: str = f"{DEFAULT_NEBULA_SERVICE_URL}{DEFAULT_NEBULA_SERVICE_PATH}",
params: Dict = {},
params: Optional[Dict] = None,
) -> Any:
"""Generate text from the model."""
params = params or {}
headers = {
"Content-Type": "application/json",
"ApiKey": f"{self.nebula_api_key}",

View File

@ -114,7 +114,13 @@ class GoogleCloudEnterpriseSearchRetriever(BaseRetriever):
def __init__(self, **data: Any) -> None:
"""Initializes private fields."""
from google.cloud.discoveryengine_v1beta import SearchServiceClient
try:
from google.cloud.discoveryengine_v1beta import SearchServiceClient
except ImportError:
raise ImportError(
"google.cloud.discoveryengine is not installed."
"Please install it with pip install google-cloud-discoveryengine"
)
super().__init__(**data)
self._client = SearchServiceClient(credentials=self.credentials)
@ -137,7 +143,7 @@ class GoogleCloudEnterpriseSearchRetriever(BaseRetriever):
document_dict = MessageToDict(
result.document._pb, preserving_proto_field_name=True
)
derived_struct_data = document_dict.get("derived_struct_data", None)
derived_struct_data = document_dict.get("derived_struct_data")
if not derived_struct_data:
continue
@ -150,7 +156,7 @@ class GoogleCloudEnterpriseSearchRetriever(BaseRetriever):
else "extractive_segments"
)
for chunk in getattr(derived_struct_data, chunk_type, []):
for chunk in derived_struct_data.get(chunk_type, []):
doc_metadata["source"] = derived_struct_data.get("link", "")
if chunk_type == "extractive_answers":

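The `getattr` → `.get` change fixes a real bug: `MessageToDict` returns a plain `dict`, and `getattr` on a dict looks up *attributes*, not keys, so it always fell through to the default:

```python
derived_struct_data = {"extractive_answers": [{"content": "..."}]}

print(getattr(derived_struct_data, "extractive_answers", []))  # [] -- always
print(derived_struct_data.get("extractive_answers", []))       # the chunks
```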
View File

@ -1,8 +1,7 @@
from typing import List
from pydantic import Field
from langchain.callbacks.manager import CallbackManagerForRetrieverRun
from langchain.pydantic_v1 import Field
from langchain.schema import BaseRetriever, BaseStore, Document
from langchain.vectorstores import VectorStore

View File

@ -1,8 +1,9 @@
"""**Schemas** are the LangChain Base Classes and Interfaces."""
from langchain.schema.agent import AgentAction, AgentFinish
from langchain.schema.chat_history import BaseChatMessageHistory
from langchain.schema.document import BaseDocumentTransformer, Document
from langchain.schema.exceptions import LangChainException
from langchain.schema.memory import BaseChatMessageHistory, BaseMemory
from langchain.schema.memory import BaseMemory
from langchain.schema.messages import (
AIMessage,
BaseMessage,
@ -40,10 +41,10 @@ Memory = BaseMemory
__all__ = [
"BaseMemory",
"BaseStore",
"BaseChatMessageHistory",
"AgentFinish",
"AgentAction",
"Document",
"BaseChatMessageHistory",
"BaseDocumentTransformer",
"BaseMessage",
"ChatMessage",

View File

@ -0,0 +1,67 @@
from __future__ import annotations
from abc import ABC, abstractmethod
from typing import List
from langchain.schema.messages import AIMessage, BaseMessage, HumanMessage
class BaseChatMessageHistory(ABC):
"""Abstract base class for storing chat message history.
See `ChatMessageHistory` for default implementation.
Example:
.. code-block:: python
class FileChatMessageHistory(BaseChatMessageHistory):
storage_path: str
session_id: str
@property
def messages(self):
with open(os.path.join(storage_path, session_id), encoding='utf-8') as f:
messages = json.loads(f.read())
return messages_from_dict(messages)
def add_message(self, message: BaseMessage) -> None:
messages = messages_to_dict(self.messages)
messages.append(message_to_dict(message))
with open(os.path.join(storage_path, session_id), 'w') as f:
json.dump(messages, f)
def clear(self):
with open(os.path.join(storage_path, session_id), 'w') as f:
f.write("[]")
"""
messages: List[BaseMessage]
"""A list of Messages stored in-memory."""
def add_user_message(self, message: str) -> None:
"""Convenience method for adding a human message string to the store.
Args:
message: The string contents of a human message.
"""
self.add_message(HumanMessage(content=message))
def add_ai_message(self, message: str) -> None:
"""Convenience method for adding an AI message string to the store.
Args:
message: The string contents of an AI message.
"""
self.add_message(AIMessage(content=message))
@abstractmethod
def add_message(self, message: BaseMessage) -> None:
"""Add a Message object to the store.
Args:
message: A BaseMessage object to store.
"""
raise NotImplementedError()
@abstractmethod
def clear(self) -> None:
"""Remove all messages from the store"""

View File

@ -4,7 +4,6 @@ from abc import ABC, abstractmethod
from typing import Any, Dict, List
from langchain.load.serializable import Serializable
from langchain.schema.messages import AIMessage, BaseMessage, HumanMessage
class BaseMemory(Serializable, ABC):
@ -58,64 +57,3 @@ class BaseMemory(Serializable, ABC):
@abstractmethod
def clear(self) -> None:
"""Clear memory contents."""
class BaseChatMessageHistory(ABC):
"""Abstract base class for storing chat message history.
See `ChatMessageHistory` for default implementation.
Example:
.. code-block:: python
class FileChatMessageHistory(BaseChatMessageHistory):
storage_path: str
session_id: str
@property
def messages(self):
with open(os.path.join(storage_path, session_id), 'r:utf-8') as f:
messages = json.loads(f.read())
return messages_from_dict(messages)
def add_message(self, message: BaseMessage) -> None:
messages = self.messages.append(_message_to_dict(message))
with open(os.path.join(storage_path, session_id), 'w') as f:
json.dump(f, messages)
def clear(self):
with open(os.path.join(storage_path, session_id), 'w') as f:
f.write("[]")
"""
messages: List[BaseMessage]
"""A list of Messages stored in-memory."""
def add_user_message(self, message: str) -> None:
"""Convenience method for adding a human message string to the store.
Args:
message: The string contents of a human message.
"""
self.add_message(HumanMessage(content=message))
def add_ai_message(self, message: str) -> None:
"""Convenience method for adding an AI message string to the store.
Args:
message: The string contents of an AI message.
"""
self.add_message(AIMessage(content=message))
@abstractmethod
def add_message(self, message: BaseMessage) -> None:
"""Add a Message object to the store.
Args:
message: A BaseMessage object to store.
"""
raise NotImplementedError()
@abstractmethod
def clear(self) -> None:
"""Remove all messages from the store"""

View File

@ -48,6 +48,8 @@ class EvalConfig(BaseModel):
for field, val in self:
if field == "evaluator_type":
continue
elif val is None:
continue
kwargs[field] = val
return kwargs

View File

@ -2,6 +2,7 @@
from __future__ import annotations
import asyncio
import inspect
import warnings
from abc import abstractmethod
from functools import partial
@ -437,7 +438,7 @@ class Tool(BaseTool):
"""Tool that takes in function or coroutine directly."""
description: str = ""
func: Callable[..., str]
func: Optional[Callable[..., str]]
"""The function to run when the tool is called."""
coroutine: Optional[Callable[..., Awaitable[str]]] = None
"""The asynchronous version of the function."""
@ -488,16 +489,18 @@ class Tool(BaseTool):
**kwargs: Any,
) -> Any:
"""Use the tool."""
new_argument_supported = signature(self.func).parameters.get("callbacks")
return (
self.func(
*args,
callbacks=run_manager.get_child() if run_manager else None,
**kwargs,
if self.func:
new_argument_supported = signature(self.func).parameters.get("callbacks")
return (
self.func(
*args,
callbacks=run_manager.get_child() if run_manager else None,
**kwargs,
)
if new_argument_supported
else self.func(*args, **kwargs)
)
if new_argument_supported
else self.func(*args, **kwargs)
)
raise NotImplementedError("Tool does not support sync")
async def _arun(
self,
@ -523,7 +526,7 @@ class Tool(BaseTool):
# TODO: this is for backwards compatibility, remove in future
def __init__(
self, name: str, func: Callable, description: str, **kwargs: Any
self, name: str, func: Optional[Callable], description: str, **kwargs: Any
) -> None:
"""Initialize tool."""
super(Tool, self).__init__(
@ -533,17 +536,23 @@ class Tool(BaseTool):
@classmethod
def from_function(
cls,
func: Callable,
func: Optional[Callable],
name: str, # We keep these required to support backwards compatibility
description: str,
return_direct: bool = False,
args_schema: Optional[Type[BaseModel]] = None,
coroutine: Optional[
Callable[..., Awaitable[Any]]
] = None, # This is last for compatibility, but should be after func
**kwargs: Any,
) -> Tool:
"""Initialize tool from a function."""
if func is None and coroutine is None:
raise ValueError("Function and/or coroutine must be provided")
return cls(
name=name,
func=func,
coroutine=coroutine,
description=description,
return_direct=return_direct,
args_schema=args_schema,
@ -557,7 +566,7 @@ class StructuredTool(BaseTool):
description: str = ""
args_schema: Type[BaseModel] = Field(..., description="The tool schema.")
"""The input arguments' schema."""
func: Callable[..., Any]
func: Optional[Callable[..., Any]]
"""The function to run when the tool is called."""
coroutine: Optional[Callable[..., Awaitable[Any]]] = None
"""The asynchronous version of the function."""
@ -592,16 +601,18 @@ class StructuredTool(BaseTool):
**kwargs: Any,
) -> Any:
"""Use the tool."""
new_argument_supported = signature(self.func).parameters.get("callbacks")
return (
self.func(
*args,
callbacks=run_manager.get_child() if run_manager else None,
**kwargs,
if self.func:
new_argument_supported = signature(self.func).parameters.get("callbacks")
return (
self.func(
*args,
callbacks=run_manager.get_child() if run_manager else None,
**kwargs,
)
if new_argument_supported
else self.func(*args, **kwargs)
)
if new_argument_supported
else self.func(*args, **kwargs)
)
raise NotImplementedError("Tool does not support sync")
async def _arun(
self,
@ -628,7 +639,8 @@ class StructuredTool(BaseTool):
@classmethod
def from_function(
cls,
func: Callable,
func: Optional[Callable] = None,
coroutine: Optional[Callable[..., Awaitable[Any]]] = None,
name: Optional[str] = None,
description: Optional[str] = None,
return_direct: bool = False,
@ -642,6 +654,7 @@ class StructuredTool(BaseTool):
Args:
func: The function from which to create a tool
coroutine: The async function from which to create a tool
name: The name of the tool. Defaults to the function name
description: The description of the tool. Defaults to the function docstring
return_direct: Whether to return the result directly or as a callback
@ -662,21 +675,31 @@ class StructuredTool(BaseTool):
tool = StructuredTool.from_function(add)
tool.run({"a": 1, "b": 2}) # 3
"""
name = name or func.__name__
description = description or func.__doc__
assert (
description is not None
), "Function must have a docstring if description not provided."
if func is not None:
source_function = func
elif coroutine is not None:
source_function = coroutine
else:
raise ValueError("Function and/or coroutine must be provided")
name = name or source_function.__name__
description = description or source_function.__doc__
if description is None:
raise ValueError(
"Function must have a docstring if description not provided."
)
# Description example:
# search_api(query: str) - Searches the API for the query.
description = f"{name}{signature(func)} - {description.strip()}"
sig = signature(source_function)
description = f"{name}{sig} - {description.strip()}"
_args_schema = args_schema
if _args_schema is None and infer_schema:
_args_schema = create_schema_from_function(f"{name}Schema", func)
_args_schema = create_schema_from_function(f"{name}Schema", source_function)
return cls(
name=name,
func=func,
coroutine=coroutine,
args_schema=_args_schema,
description=description,
return_direct=return_direct,
@ -720,10 +743,18 @@ def tool(
"""
def _make_with_name(tool_name: str) -> Callable:
def _make_tool(func: Callable) -> BaseTool:
def _make_tool(dec_func: Callable) -> BaseTool:
if inspect.iscoroutinefunction(dec_func):
coroutine = dec_func
func = None
else:
coroutine = None
func = dec_func
if infer_schema or args_schema is not None:
return StructuredTool.from_function(
func,
coroutine,
name=tool_name,
return_direct=return_direct,
args_schema=args_schema,
@ -731,12 +762,17 @@ def tool(
)
# If someone doesn't want a schema applied, we must treat it as
# a simple string->string function
assert func.__doc__ is not None, "Function must have a docstring"
if func.__doc__ is None:
raise ValueError(
"Function must have a docstring if "
"description not provided and infer_schema is False."
)
return Tool(
name=tool_name,
func=func,
description=f"{tool_name} tool",
return_direct=return_direct,
coroutine=coroutine,
)
return _make_tool
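Taken together, the tools changes make `func` optional, give `coroutine` first-class support in `from_function`, and teach the `@tool` decorator to detect coroutine functions. An async-only tool therefore becomes possible, for example:

```python
from langchain.tools.base import tool


@tool
async def fetch_price(symbol: str) -> str:
    """Look up the latest price for a ticker symbol."""
    return f"{symbol}: 42.0"  # stand-in for a real async lookup

# The decorator routes the coroutine into the `coroutine` slot, so
# `await fetch_price.arun({"symbol": "AAPL"})` works while the sync path
# raises NotImplementedError("Tool does not support sync").
```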

View File

@ -1,4 +1,3 @@
import json
from typing import Dict, Union
@ -41,9 +40,9 @@ def sanitize(
if isinstance(input, str):
# the input could be a string, so we sanitize the string
sanitize_response: pg.SanitizeResponse = pg.sanitize(input)
sanitize_response: pg.SanitizeResponse = pg.sanitize([input])
return {
"sanitized_input": sanitize_response.sanitized_text,
"sanitized_input": sanitize_response.sanitized_texts[0],
"secure_context": sanitize_response.secure_context,
}
@ -54,13 +53,12 @@ def sanitize(
# get the values from the dict
for key in input:
values.append(input[key])
input_value_str = json.dumps(values)
# sanitize the values
sanitize_values_response: pg.SanitizeResponse = pg.sanitize(input_value_str)
sanitize_values_response: pg.SanitizeResponse = pg.sanitize(values)
# reconstruct the dict with the sanitized values
sanitized_input_values = json.loads(sanitize_values_response.sanitized_text)
sanitized_input_values = sanitize_values_response.sanitized_texts
idx = 0
sanitized_input = dict()
for key in input:

View File

@ -52,6 +52,7 @@ from langchain.vectorstores.meilisearch import Meilisearch
from langchain.vectorstores.milvus import Milvus
from langchain.vectorstores.mongodb_atlas import MongoDBAtlasVectorSearch
from langchain.vectorstores.myscale import MyScale, MyScaleSettings
from langchain.vectorstores.neo4j_vector import Neo4jVector
from langchain.vectorstores.opensearch_vector_search import OpenSearchVectorSearch
from langchain.vectorstores.pgembedding import PGEmbedding
from langchain.vectorstores.pgvector import PGVector
@ -110,6 +111,7 @@ __all__ = [
"MongoDBAtlasVectorSearch",
"MyScale",
"MyScaleSettings",
"Neo4jVector",
"OpenSearchVectorSearch",
"OpenSearchVectorSearch",
"PGEmbedding",

View File

@ -62,6 +62,7 @@ class DeepLake(VectorStore):
num_workers: int = 0,
verbose: bool = True,
exec_option: Optional[str] = None,
runtime: Optional[Dict] = None,
**kwargs: Any,
) -> None:
"""Creates an empty DeepLakeVectorStore or loads an existing one.
@ -77,7 +78,7 @@ class DeepLake(VectorStore):
>>> # Create a vector store in the Deep Lake Managed Tensor Database
>>> data = DeepLake(
... path = "hub://org_id/dataset_name",
... exec_option = "tensor_db",
... runtime = {"tensor_db": True},
... )
Args:
@ -114,6 +115,10 @@ class DeepLake(VectorStore):
responsible for storage and query execution. Only for data stored in
the Deep Lake Managed Database. Use runtime = {"tensor_db": True}
during dataset creation.
runtime (Dict, optional): Parameters for creating the Vector Store in
Deep Lake's Managed Tensor Database. Not applicable when loading an
existing Vector Store. To create a Vector Store in the Managed Tensor
Database, set `runtime = {"tensor_db": True}`.
**kwargs: Other optional keyword arguments.
Raises:
@ -131,11 +136,12 @@ class DeepLake(VectorStore):
)
if (
kwargs.get("runtime") == {"tensor_db": True}
runtime == {"tensor_db": True}
and version_compare(deeplake.__version__, "3.6.7") == -1
):
raise ImportError(
"To use tensor_db option you need to update deeplake to `3.6.7`. "
"To use tensor_db option you need to update deeplake to `3.6.7` or "
"higher. "
f"Currently installed deeplake version is {deeplake.__version__}. "
)
@ -154,6 +160,7 @@ class DeepLake(VectorStore):
token=token,
exec_option=exec_option,
verbose=verbose,
runtime=runtime,
**kwargs,
)

View File

@ -736,6 +736,8 @@ class FAISS(VectorStore):
elif self.distance_strategy == DistanceStrategy.EUCLIDEAN_DISTANCE:
# Default behavior is to use euclidean distance relevancy
return self._euclidean_relevance_score_fn
elif self.distance_strategy == DistanceStrategy.COSINE:
return self._cosine_relevance_score_fn
else:
raise ValueError(
"Unknown distance strategy, must be cosine, max_inner_product,"

View File

@ -372,10 +372,10 @@ class Marqo(VectorStore):
index_name: str = "",
url: str = "http://localhost:8882",
api_key: str = "",
add_documents_settings: Optional[Dict[str, Any]] = {},
add_documents_settings: Optional[Dict[str, Any]] = None,
searchable_attributes: Optional[List[str]] = None,
page_content_builder: Optional[Callable[[Dict[str, str]], str]] = None,
index_settings: Optional[Dict[str, Any]] = {},
index_settings: Optional[Dict[str, Any]] = None,
verbose: bool = True,
**kwargs: Any,
) -> Marqo:
@ -435,7 +435,7 @@ class Marqo(VectorStore):
client = marqo.Client(url=url, api_key=api_key)
try:
client.create_index(index_name, settings_dict=index_settings)
client.create_index(index_name, settings_dict=index_settings or {})
if verbose:
print(f"Created {index_name} successfully.")
except Exception:
@ -446,7 +446,7 @@ class Marqo(VectorStore):
client,
index_name,
searchable_attributes=searchable_attributes,
add_documents_settings=add_documents_settings,
add_documents_settings=add_documents_settings or {},
page_content_builder=page_content_builder,
)
instance.add_texts(texts, metadatas)

View File

@ -0,0 +1,685 @@
from __future__ import annotations
import logging
import uuid
from typing import (
Any,
Callable,
Dict,
Iterable,
List,
Optional,
Tuple,
Type,
)
from langchain.docstore.document import Document
from langchain.embeddings.base import Embeddings
from langchain.utils import get_from_env
from langchain.vectorstores.base import VectorStore
from langchain.vectorstores.utils import DistanceStrategy
DEFAULT_DISTANCE_STRATEGY = DistanceStrategy.COSINE
distance_mapping = {
DistanceStrategy.EUCLIDEAN_DISTANCE: "euclidean",
DistanceStrategy.COSINE: "cosine",
}
def check_if_not_null(props: List[str], values: List[Any]) -> None:
for prop, value in zip(props, values):
if not value:
raise ValueError(f"Parameter `{prop}` must not be None or empty string")
def sort_by_index_name(
lst: List[Dict[str, Any]], index_name: str
) -> List[Dict[str, Any]]:
"""Sort first element to match the index_name if exists"""
return sorted(lst, key=lambda x: x.get("index_name") != index_name)
class Neo4jVector(VectorStore):
"""`Neo4j` vector index.
To use, you should have the ``neo4j`` python package installed.
Args:
url: Neo4j connection url
username: Neo4j username.
password: Neo4j password
database: Optionally provide Neo4j database
Defaults to "neo4j"
embedding: Any embedding function implementing
`langchain.embeddings.base.Embeddings` interface.
distance_strategy: The distance strategy to use. (default: COSINE)
pre_delete_collection: If True, will delete existing data if it exists.
(default: False). Useful for testing.
Example:
.. code-block:: python
from langchain.vectorstores.neo4j_vector import Neo4jVector
from langchain.embeddings.openai import OpenAIEmbeddings
url="bolt://localhost:7687"
username="neo4j"
password="pleaseletmein"
embeddings = OpenAIEmbeddings()
vectorstore = Neo4jVector.from_documents(
embedding=embeddings,
documents=docs,
url=url,
username=username,
password=password,
)
"""
def __init__(
self,
embedding: Embeddings,
*,
username: Optional[str] = None,
password: Optional[str] = None,
url: Optional[str] = None,
database: str = "neo4j",
index_name: str = "vector",
node_label: str = "Chunk",
embedding_node_property: str = "embedding",
text_node_property: str = "text",
distance_strategy: DistanceStrategy = DEFAULT_DISTANCE_STRATEGY,
logger: Optional[logging.Logger] = None,
pre_delete_collection: bool = False,
retrieval_query: str = "",
relevance_score_fn: Optional[Callable[[float], float]] = None,
) -> None:
try:
import neo4j
except ImportError:
raise ImportError(
"Could not import neo4j python package. "
"Please install it with `pip install neo4j`."
)
# Allow only cosine and euclidean distance strategies
if distance_strategy not in [
DistanceStrategy.EUCLIDEAN_DISTANCE,
DistanceStrategy.COSINE,
]:
raise ValueError(
"distance_strategy must be either 'EUCLIDEAN_DISTANCE' or 'COSINE'"
)
# Handle if the credentials are environment variables
url = get_from_env("url", "NEO4J_URL", url)
username = get_from_env("username", "NEO4J_USERNAME", username)
password = get_from_env("password", "NEO4J_PASSWORD", password)
database = get_from_env("database", "NEO4J_DATABASE", database)
self._driver = neo4j.GraphDatabase.driver(url, auth=(username, password))
self._database = database
self.schema = ""
# Verify connection
try:
self._driver.verify_connectivity()
except neo4j.exceptions.ServiceUnavailable:
raise ValueError(
"Could not connect to Neo4j database. "
"Please ensure that the url is correct"
)
except neo4j.exceptions.AuthError:
raise ValueError(
"Could not connect to Neo4j database. "
"Please ensure that the username and password are correct"
)
# Verify if the version support vector index
self.verify_version()
# Verify that required values are not null
check_if_not_null(
[
"index_name",
"node_label",
"embedding_node_property",
"text_node_property",
],
[index_name, node_label, embedding_node_property, text_node_property],
)
self.embedding = embedding
self._distance_strategy = distance_strategy
self.index_name = index_name
self.node_label = node_label
self.embedding_node_property = embedding_node_property
self.text_node_property = text_node_property
self.logger = logger or logging.getLogger(__name__)
self.override_relevance_score_fn = relevance_score_fn
self.retrieval_query = retrieval_query
# Calculate embedding dimension
self.embedding_dimension = len(embedding.embed_query("foo"))
# Delete existing data if flagged
if pre_delete_collection:
from neo4j.exceptions import DatabaseError
self.query(
f"MATCH (n:`{self.node_label}`) "
"CALL { WITH n DETACH DELETE n } "
"IN TRANSACTIONS OF 10000 ROWS;"
)
# Delete index
try:
self.query(f"DROP INDEX {self.index_name}")
except DatabaseError: # Index didn't exist yet
pass
def query(
self, query: str, *, params: Optional[dict] = None
) -> List[Dict[str, Any]]:
"""
This method sends a Cypher query to the connected Neo4j database
and returns the results as a list of dictionaries.
Args:
query (str): The Cypher query to execute.
params (dict, optional): Dictionary of query parameters. Defaults to {}.
Returns:
List[Dict[str, Any]]: List of dictionaries containing the query results.
"""
from neo4j.exceptions import CypherSyntaxError
params = params or {}
with self._driver.session(database=self._database) as session:
try:
data = session.run(query, params)
return [r.data() for r in data]
except CypherSyntaxError as e:
raise ValueError(f"Cypher Statement is not valid\n{e}")
def verify_version(self) -> None:
"""
Check if the connected Neo4j database version supports vector indexing.
Queries the Neo4j database to retrieve its version and compares it
against a target version (5.11.0) that is known to support vector
indexing. Raises a ValueError if the connected Neo4j version is
not supported.
"""
version = self.query("CALL dbms.components()")[0]["versions"][0]
if "aura" in version:
version_tuple = tuple(map(int, version.split("-")[0].split("."))) + (0,)
else:
version_tuple = tuple(map(int, version.split(".")))
target_version = (5, 11, 0)
if version_tuple < target_version:
raise ValueError(
"Version index is only supported in Neo4j version 5.11 or greater"
)
def retrieve_existing_index(self) -> Optional[int]:
"""
Check if the vector index exists in the Neo4j database
and returns its embedding dimension.
This method queries the Neo4j database for existing indexes
and attempts to retrieve the dimension of the vector index
with the specified name. If the index exists, its dimension is returned.
If the index doesn't exist, `None` is returned.
Returns:
int or None: The embedding dimension of the existing index if found.
"""
index_information = self.query(
"SHOW INDEXES YIELD name, type, labelsOrTypes, properties, options "
"WHERE type = 'VECTOR' AND (name = $index_name "
"OR (labelsOrTypes[0] = $node_label AND "
"properties[0] = $embedding_node_property)) "
"RETURN name, labelsOrTypes, properties, options ",
params={
"index_name": self.index_name,
"node_label": self.node_label,
"embedding_node_property": self.embedding_node_property,
},
)
# sort by index_name
index_information = sort_by_index_name(index_information, self.index_name)
try:
self.index_name = index_information[0]["name"]
self.node_label = index_information[0]["labelsOrTypes"][0]
self.embedding_node_property = index_information[0]["properties"][0]
embedding_dimension = index_information[0]["options"]["indexConfig"][
"vector.dimensions"
]
return embedding_dimension
except IndexError:
return None
def create_new_index(self) -> None:
"""
This method constructs a Cypher query and executes it
to create a new vector index in Neo4j.
"""
index_query = (
"CALL db.index.vector.createNodeIndex("
"$index_name,"
"$node_label,"
"$embedding_node_property,"
"toInteger($embedding_dimension),"
"$similarity_metric )"
)
parameters = {
"index_name": self.index_name,
"node_label": self.node_label,
"embedding_node_property": self.embedding_node_property,
"embedding_dimension": self.embedding_dimension,
"similarity_metric": distance_mapping[self._distance_strategy],
}
self.query(index_query, params=parameters)
@property
def embeddings(self) -> Embeddings:
return self.embedding
@classmethod
def __from(
cls,
texts: List[str],
embeddings: List[List[float]],
embedding: Embeddings,
metadatas: Optional[List[dict]] = None,
ids: Optional[List[str]] = None,
create_id_index: bool = True,
**kwargs: Any,
) -> Neo4jVector:
if ids is None:
ids = [str(uuid.uuid1()) for _ in texts]
if not metadatas:
metadatas = [{} for _ in texts]
store = cls(
embedding=embedding,
**kwargs,
)
# Check if the index already exists
embedding_dimension = store.retrieve_existing_index()
# If the index doesn't exist yet
if not embedding_dimension:
store.create_new_index()
# If the index already exists, check if embedding dimensions match
elif not store.embedding_dimension == embedding_dimension:
raise ValueError(
f"Index with name {store.index_name} already exists."
"The provided embedding function and vector index "
"dimensions do not match.\n"
f"Embedding function dimension: {store.embedding_dimension}\n"
f"Vector index dimension: {embedding_dimension}"
)
# Create unique constraint for faster import
if create_id_index:
store.query(
"CREATE CONSTRAINT IF NOT EXISTS "
f"FOR (n:`{store.node_label}`) REQUIRE n.id IS UNIQUE;"
)
store.add_embeddings(
texts=texts, embeddings=embeddings, metadatas=metadatas, ids=ids, **kwargs
)
return store
def add_embeddings(
self,
texts: Iterable[str],
embeddings: List[List[float]],
metadatas: Optional[List[dict]] = None,
ids: Optional[List[str]] = None,
**kwargs: Any,
) -> List[str]:
"""Add embeddings to the vectorstore.
Args:
texts: Iterable of strings to add to the vectorstore.
embeddings: List of list of embedding vectors.
metadatas: List of metadatas associated with the texts.
kwargs: vectorstore specific parameters
"""
if ids is None:
ids = [str(uuid.uuid1()) for _ in texts]
if not metadatas:
metadatas = [{} for _ in texts]
import_query = (
"UNWIND $data AS row "
"CALL { WITH row "
f"MERGE (c:`{self.node_label}` {{id: row.id}}) "
"WITH c, row "
f"CALL db.create.setVectorProperty(c, "
f"'{self.embedding_node_property}', row.embedding) "
"YIELD node "
f"SET c.`{self.text_node_property}` = row.text "
"SET c += row.metadata } IN TRANSACTIONS OF 1000 ROWS"
)
parameters = {
"data": [
{"text": text, "metadata": metadata, "embedding": embedding, "id": id}
for text, metadata, embedding, id in zip(
texts, metadatas, embeddings, ids
)
]
}
self.query(import_query, params=parameters)
return ids
def add_texts(
self,
texts: Iterable[str],
metadatas: Optional[List[dict]] = None,
ids: Optional[List[str]] = None,
**kwargs: Any,
) -> List[str]:
"""Run more texts through the embeddings and add to the vectorstore.
Args:
texts: Iterable of strings to add to the vectorstore.
metadatas: Optional list of metadatas associated with the texts.
kwargs: vectorstore specific parameters
Returns:
List of ids from adding the texts into the vectorstore.
"""
embeddings = self.embedding.embed_documents(list(texts))
return self.add_embeddings(
texts=texts, embeddings=embeddings, metadatas=metadatas, ids=ids, **kwargs
)
def similarity_search(
self,
query: str,
k: int = 4,
**kwargs: Any,
) -> List[Document]:
"""Run similarity search with Neo4jVector.
Args:
query (str): Query text to search for.
k (int): Number of results to return. Defaults to 4.
Returns:
List of Documents most similar to the query.
"""
embedding = self.embedding.embed_query(text=query)
return self.similarity_search_by_vector(
embedding=embedding,
k=k,
)
def similarity_search_with_score(
self, query: str, k: int = 4
) -> List[Tuple[Document, float]]:
"""Return docs most similar to query.
Args:
query: Text to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
Returns:
List of Documents most similar to the query and score for each
"""
embedding = self.embedding.embed_query(query)
docs = self.similarity_search_with_score_by_vector(embedding=embedding, k=k)
return docs
def similarity_search_with_score_by_vector(
self, embedding: List[float], k: int = 4
) -> List[Tuple[Document, float]]:
"""
Perform a similarity search in the Neo4j database using a
given vector and return the top k similar documents with their scores.
This method uses a Cypher query to find the top k documents that
are most similar to a given embedding. The similarity is measured
using a vector index in the Neo4j database. The results are returned
as a list of tuples, each containing a Document object and
its similarity score.
Args:
embedding (List[float]): The embedding vector to compare against.
k (int, optional): The number of top similar documents to retrieve.
Returns:
List[Tuple[Document, float]]: A list of tuples, each containing
a Document object and its similarity score.
"""
default_retrieval = (
f"RETURN node.`{self.text_node_property}` AS text, score, "
f"node {{.*, `{self.text_node_property}`: Null, "
f"`{self.embedding_node_property}`: Null, id: Null }} AS metadata"
)
retrieval_query = (
self.retrieval_query if self.retrieval_query else default_retrieval
)
read_query = (
"CALL db.index.vector.queryNodes($index, $k, $embedding) "
"YIELD node, score "
) + retrieval_query
parameters = {"index": self.index_name, "k": k, "embedding": embedding}
results = self.query(read_query, params=parameters)
docs = [
(
Document(
page_content=result["text"],
metadata={
k: v for k, v in result["metadata"].items() if v is not None
},
),
result["score"],
)
for result in results
]
return docs
def similarity_search_by_vector(
self,
embedding: List[float],
k: int = 4,
**kwargs: Any,
) -> List[Document]:
"""Return docs most similar to embedding vector.
Args:
embedding: Embedding to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
Returns:
List of Documents most similar to the query vector.
"""
docs_and_scores = self.similarity_search_with_score_by_vector(
embedding=embedding, k=k
)
return [doc for doc, _ in docs_and_scores]
@classmethod
def from_texts(
cls: Type[Neo4jVector],
texts: List[str],
embedding: Embeddings,
metadatas: Optional[List[dict]] = None,
distance_strategy: DistanceStrategy = DEFAULT_DISTANCE_STRATEGY,
ids: Optional[List[str]] = None,
**kwargs: Any,
) -> Neo4jVector:
"""
Return Neo4jVector initialized from texts and embeddings.
Neo4j credentials are required in the form of `url`, `username`,
and `password` and optional `database` parameters.
"""
embeddings = embedding.embed_documents(list(texts))
return cls.__from(
texts,
embeddings,
embedding,
metadatas=metadatas,
ids=ids,
distance_strategy=distance_strategy,
**kwargs,
)
@classmethod
def from_embeddings(
cls,
text_embeddings: List[Tuple[str, List[float]]],
embedding: Embeddings,
metadatas: Optional[List[dict]] = None,
distance_strategy: DistanceStrategy = DEFAULT_DISTANCE_STRATEGY,
ids: Optional[List[str]] = None,
pre_delete_collection: bool = False,
**kwargs: Any,
) -> Neo4jVector:
"""Construct Neo4jVector wrapper from raw documents and pre-
generated embeddings.
Return Neo4jVector initialized from documents and embeddings.
Neo4j credentials are required in the form of `url`, `username`,
and `password` and optional `database` parameters.
Example:
.. code-block:: python
from langchain.vectorstores.neo4j_vector import Neo4jVector
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
text_embeddings = embeddings.embed_documents(texts)
text_embedding_pairs = list(zip(texts, text_embeddings))
vectorstore = Neo4jVector.from_embeddings(
text_embedding_pairs, embeddings)
"""
texts = [t[0] for t in text_embeddings]
embeddings = [t[1] for t in text_embeddings]
return cls.__from(
texts,
embeddings,
embedding,
metadatas=metadatas,
ids=ids,
distance_strategy=distance_strategy,
pre_delete_collection=pre_delete_collection,
**kwargs,
)
@classmethod
def from_existing_index(
cls: Type[Neo4jVector],
embedding: Embeddings,
index_name: str,
**kwargs: Any,
) -> Neo4jVector:
"""
Get instance of an existing Neo4j vector index. This method will
return the instance of the store without inserting any new
embeddings.
Neo4j credentials are required in the form of `url`, `username`,
and `password` and optional `database` parameters along with
the `index_name` definition.
"""
store = cls(
embedding=embedding,
index_name=index_name,
**kwargs,
)
embedding_dimension = store.retrieve_existing_index()
if not embedding_dimension:
raise ValueError(
"The specified vector index name does not exist. "
"Make sure to check if you spelled it correctly"
)
# Check if embedding function and vector index dimensions match
if not store.embedding_dimension == embedding_dimension:
raise ValueError(
"The provided embedding function and vector index "
"dimensions do not match.\n"
f"Embedding function dimension: {store.embedding_dimension}\n"
f"Vector index dimension: {embedding_dimension}"
)
return store
@classmethod
def from_documents(
cls: Type[Neo4jVector],
documents: List[Document],
embedding: Embeddings,
distance_strategy: DistanceStrategy = DEFAULT_DISTANCE_STRATEGY,
ids: Optional[List[str]] = None,
**kwargs: Any,
) -> Neo4jVector:
"""
Return Neo4jVector initialized from documents and embeddings.
Neo4j credentials are required in the form of `url`, `username`,
and `password` and optional `database` parameters.
"""
texts = [d.page_content for d in documents]
metadatas = [d.metadata for d in documents]
return cls.from_texts(
texts=texts,
embedding=embedding,
distance_strategy=distance_strategy,
metadatas=metadatas,
ids=ids,
**kwargs,
)
def _select_relevance_score_fn(self) -> Callable[[float], float]:
"""
The 'correct' relevance function
may differ depending on a few things, including:
- the distance / similarity metric used by the VectorStore
- the scale of your embeddings (OpenAI's are unit normed. Many others are not!)
- embedding dimensionality
- etc.
"""
if self.override_relevance_score_fn is not None:
return self.override_relevance_score_fn
# Default strategy is to rely on distance strategy provided
# in vectorstore constructor
if self._distance_strategy == DistanceStrategy.COSINE:
return lambda x: x
elif self._distance_strategy == DistanceStrategy.EUCLIDEAN_DISTANCE:
return lambda x: x
else:
raise ValueError(
"No supported normalization function"
f" for distance_strategy of {self._distance_strategy}."
"Consider providing relevance_score_fn to PGVector constructor."
)
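End to end, the new store can be exercised like any other vectorstore, assuming a Neo4j 5.11+ instance reachable with these placeholder credentials:

```python
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Neo4jVector

store = Neo4jVector.from_texts(
    texts=["LangChain now supports Neo4j vector indexes."],
    embedding=OpenAIEmbeddings(),
    url="bolt://localhost:7687",
    username="neo4j",
    password="pleaseletmein",
)
print(store.similarity_search_with_score("vector indexes", k=1))
```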

View File

@ -991,7 +991,7 @@ class Redis(VectorStore):
self,
k: int,
filter: Optional[RedisFilterExpression] = None,
return_fields: List[str] = [],
return_fields: Optional[List[str]] = None,
) -> "Query":
try:
from redis.commands.search.query import Query
@ -1000,6 +1000,7 @@ class Redis(VectorStore):
"Could not import redis python package. "
"Please install it with `pip install redis`."
) from e
return_fields = return_fields or []
vector_key = self._schema.content_vector_key
base_query = f"@{vector_key}:[VECTOR_RANGE $distance_threshold $vector]"
@ -1020,7 +1021,7 @@ class Redis(VectorStore):
self,
k: int,
filter: Optional[RedisFilterExpression] = None,
return_fields: List[str] = [],
return_fields: Optional[List[str]] = None,
) -> "Query":
"""Prepare query for vector search.
@ -1038,6 +1039,7 @@ class Redis(VectorStore):
"Could not import redis python package. "
"Please install it with `pip install redis`."
) from e
return_fields = return_fields or []
query_prefix = "*"
if filter:
query_prefix = f"{str(filter)}"

View File

@ -345,8 +345,9 @@ class SingleStoreDB(VectorStore):
def build_where_clause(
where_clause_values: List[Any],
sub_filter: dict,
prefix_args: List[str] = [],
prefix_args: Optional[List[str]] = None,
) -> None:
prefix_args = prefix_args or []
for key in sub_filter.keys():
if isinstance(sub_filter[key], dict):
build_where_clause(

View File

@ -245,6 +245,7 @@ class Vectara(VectorStore):
k: int = 5,
lambda_val: float = 0.025,
filter: Optional[str] = None,
score_threshold: Optional[float] = None,
n_sentence_context: int = 2,
**kwargs: Any,
) -> List[Tuple[Document, float]]:
@ -258,6 +259,9 @@ class Vectara(VectorStore):
filter can be "doc.rating > 3.0 and part.lang = 'deu'"} see
https://docs.vectara.com/docs/search-apis/sql/filter-overview
for more details.
score_threshold: minimal score threshold for the result.
If defined, results with score less than this value will be
filtered out.
n_sentence_context: number of sentences before/after the matching segment
to add, defaults to 2
@ -305,7 +309,14 @@ class Vectara(VectorStore):
result = response.json()
responses = result["responseSet"][0]["response"]
if score_threshold:
responses = [
r
for r in result["responseSet"][0]["response"]
if r["score"] > score_threshold
]
else:
responses = result["responseSet"][0]["response"]
documents = result["responseSet"][0]["document"]
metadatas = []
@ -316,7 +327,7 @@ class Vectara(VectorStore):
md.update(doc_md)
metadatas.append(md)
docs = [
docs_with_score = [
(
Document(
page_content=x["text"],
@ -327,7 +338,7 @@ class Vectara(VectorStore):
for x, md in zip(responses, metadatas)
]
return docs
return docs_with_score
def similarity_search(
self,
@ -358,6 +369,7 @@ class Vectara(VectorStore):
k=k,
lambda_val=lambda_val,
filter=filter,
score_threshold=None,
n_sentence_context=n_sentence_context,
**kwargs,
)
@ -451,7 +463,7 @@ class VectaraRetriever(VectorStoreRetriever):
self,
texts: List[str],
metadatas: Optional[List[dict]] = None,
doc_metadata: Optional[dict] = {},
doc_metadata: Optional[dict] = None,
) -> None:
"""Add text to the Vectara vectorstore.
@ -459,4 +471,4 @@ class VectaraRetriever(VectorStoreRetriever):
texts (List[str]): The text
metadatas (List[dict]): Metadata dicts, must line up with existing store
"""
self.vectorstore.add_texts(texts, metadatas, doc_metadata)
self.vectorstore.add_texts(texts, metadatas, doc_metadata or {})
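With the new parameter, low-relevance matches are filtered out of the returned list before they reach the caller, e.g. (credentials are placeholders):

```python
from langchain.vectorstores import Vectara

vectara = Vectara(
    vectara_customer_id="<customer_id>",
    vectara_corpus_id="<corpus_id>",
    vectara_api_key="<api_key>",
)
results = vectara.similarity_search_with_score(
    "What is LangChain?",
    k=5,
    score_threshold=0.2,  # keep only results scoring above 0.2
)
```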

View File

@ -1,7 +1,7 @@
from __future__ import annotations
import logging
from typing import Any, List, Optional
from typing import Any, Dict, List, Optional
from langchain.embeddings.base import Embeddings
from langchain.vectorstores.milvus import Milvus
@ -140,7 +140,7 @@ class Zilliz(Milvus):
embedding: Embeddings,
metadatas: Optional[List[dict]] = None,
collection_name: str = "LangChainCollection",
connection_args: dict[str, Any] = {},
connection_args: Optional[Dict[str, Any]] = None,
consistency_level: str = "Session",
index_params: Optional[dict] = None,
search_params: Optional[dict] = None,
@ -173,7 +173,7 @@ class Zilliz(Milvus):
vector_db = cls(
embedding_function=embedding,
collection_name=collection_name,
connection_args=connection_args,
connection_args=connection_args or {},
consistency_level=consistency_level,
index_params=index_params,
search_params=search_params,

View File

@ -1,6 +1,6 @@
[tool.poetry]
name = "langchain"
version = "0.0.275"
version = "0.0.276"
description = "Building applications with LLMs through composability"
authors = []
license = "MIT"

Some files were not shown because too many files have changed in this diff.