Mirror of https://github.com/hwchase17/langchain.git (synced 2026-02-10 03:00:59 +00:00)

Compare commits (26 commits):
9447925d94, 47adc7f32b, 16fc0a866e, e499caa9cd, 29c873dd69, 4ff2f4499e, 1f1679e960, 5e3a321f71, 820da64983, 67b6e6c2e3, 6247259438, 0091947efd, e958f76160, 3981d736df, fb1d67edf6, 4f347cbcb9, 4591bc0b01, f535e8a99e, 766b650fdc, 9daff60698, c8be0a9f70, f4b3c90886, b71ae52e65, 39c44817ae, 4feda41ab6, 71c2ec6782
README.md (20 changed lines)
@@ -14,18 +14,20 @@

Looking for the JS/TS library? Check out [LangChain.js](https://github.com/langchain-ai/langchainjs).

To help you ship LangChain apps to production faster, check out [LangSmith](https://smith.langchain.com).
[LangSmith](https://smith.langchain.com) is a unified developer platform for building, testing, and monitoring LLM applications.
Fill out [this form](https://www.langchain.com/contact-sales) to speak with our sales team.

## Quick Install

With pip:

```bash
pip install langchain
```

With conda:

```bash
conda install langchain -c conda-forge
```
@@ -36,12 +38,13 @@ conda install langchain -c conda-forge

For these applications, LangChain simplifies the entire application lifecycle:

- **Open-source libraries**: Build your applications using LangChain's open-source [building blocks](https://python.langchain.com/v0.2/docs/concepts#langchain-expression-language-lcel), [components](https://python.langchain.com/v0.2/docs/concepts), and [third-party integrations](https://python.langchain.com/v0.2/docs/integrations/platforms/).
Use [LangGraph](/docs/concepts/#langgraph) to build stateful agents with first-class streaming and human-in-the-loop support.
- **Productionization**: Inspect, monitor, and evaluate your apps with [LangSmith](https://docs.smith.langchain.com/) so that you can constantly optimize and deploy with confidence.
- **Deployment**: Turn your LangGraph applications into production-ready APIs and Assistants with [LangGraph Cloud](https://langchain-ai.github.io/langgraph/cloud/).

### Open-source libraries

- **`langchain-core`**: Base abstractions and LangChain Expression Language.
- **`langchain-community`**: Third party integrations.
  - Some integrations have been further split into **partner packages** that only rely on **`langchain-core`**. Examples include **`langchain_openai`** and **`langchain_anthropic`** (see the import sketch after this list).

@@ -49,9 +52,11 @@ Use [LangGraph](/docs/concepts/#langgraph) to build stateful agents with first-c

- **[`LangGraph`](https://langchain-ai.github.io/langgraph/)**: A library for building robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph. Integrates smoothly with LangChain, but can be used without it.
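For example, chat models from partner packages are imported directly from those packages. A minimal sketch; the model names are illustrative and assume the packages are installed with API keys configured:

```python
# Partner packages rely only on langchain-core, so you install just what you need:
#   pip install langchain-openai langchain-anthropic
from langchain_anthropic import ChatAnthropic
from langchain_openai import ChatOpenAI

openai_llm = ChatOpenAI(model="gpt-4o-mini")  # example model name
anthropic_llm = ChatAnthropic(model="claude-3-5-sonnet-20240620")  # example model name
```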
### Productionization:

- **[LangSmith](https://docs.smith.langchain.com/)**: A developer platform that lets you debug, test, evaluate, and monitor chains built on any LLM framework and seamlessly integrates with LangChain.

### Deployment:

- **[LangGraph Cloud](https://langchain-ai.github.io/langgraph/cloud/)**: Turn your LangGraph applications into production-ready APIs and Assistants.

![Diagram outlining the hierarchical organization of the LangChain framework, displaying the interconnected parts across multiple layers.](docs/static/svg/langchain_stack_062024.svg "LangChain Architecture Overview")
@@ -76,15 +81,17 @@ Use [LangGraph](/docs/concepts/#langgraph) to build stateful agents with first-c

And much more! Head to the [Tutorials](https://python.langchain.com/v0.2/docs/tutorials/) section of the docs for more.

## 🚀 How does LangChain help?

The main value props of the LangChain libraries are:

1. **Components**: composable building blocks, tools and integrations for working with language models. Components are modular and easy-to-use, whether you are using the rest of the LangChain framework or not
2. **Off-the-shelf chains**: built-in assemblages of components for accomplishing higher-level tasks

Off-the-shelf chains make it easy to get started. Components make it easy to customize existing chains and build new ones.

## LangChain Expression Language (LCEL)

LCEL is the foundation of many of LangChain's components, and is a declarative way to compose chains. LCEL was designed from day 1 to support putting prototypes in production, with no code changes, from the simplest “prompt + LLM” chain to the most complex chains.
LCEL is a key part of LangChain, allowing you to build and organize chains of processes in a straightforward, declarative manner. It was designed to support taking prototypes directly into production without needing to alter any code. This means you can use LCEL to set up everything from basic "prompt + LLM" setups to intricate, multi-step workflows.

- **[Overview](https://python.langchain.com/v0.2/docs/concepts/#langchain-expression-language-lcel)**: LCEL and its benefits
- **[Interface](https://python.langchain.com/v0.2/docs/concepts/#runnable-interface)**: The standard Runnable interface for LCEL objects
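For example, a minimal LCEL chain pipes a prompt into a model and an output parser with the `|` operator. This is a sketch only; the model name is illustrative and assumes `langchain-openai` is installed with an API key set:

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Each component is a Runnable; "|" composes them declaratively.
prompt = ChatPromptTemplate.from_template("Tell me a short joke about {topic}")
chain = prompt | ChatOpenAI(model="gpt-4o-mini") | StrOutputParser()

print(chain.invoke({"topic": "bears"}))
```

The same chain can be streamed or batched through the standard Runnable interface (`chain.stream(...)`, `chain.batch(...)`) without code changes.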
@@ -123,7 +130,6 @@ Please see [here](https://python.langchain.com) for full documentation, which in

- [🦜🕸️ LangGraph](https://langchain-ai.github.io/langgraph/): Create stateful, multi-actor applications with LLMs. Integrates smoothly with LangChain, but can be used without it.
- [🦜🏓 LangServe](https://python.langchain.com/docs/langserve): Deploy LangChain runnables and chains as REST APIs.

## 💁 Contributing

As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.
@@ -18,7 +18,7 @@ for dir; do \
if find "$$dir" -maxdepth 1 -type f \( -name "pyproject.toml" -o -name "setup.py" \) | grep -q .; then \
echo "$$dir"; \
fi \
done' sh {} + | grep -vE "airbyte|ibm|couchbase" | tr '\n' ' ')
done' sh {} + | grep -vE "airbyte|ibm|couchbase|databricks" | tr '\n' ' ')

PORT ?= 3001
File diff suppressed because one or more lines are too long
@@ -18,7 +18,8 @@
|
||||
"\n",
|
||||
"\n",
|
||||
"<Compatibility packagesAndVersions={[\n",
|
||||
" [\"langsmith\", \"0.1.100\"],\n",
|
||||
" [\"langsmith\", \"0.1.101\"],\n",
|
||||
" [\"langchain-core\", \"0.2.34\"],\n",
|
||||
"]} />\n",
|
||||
"\n",
|
||||
"\n",
|
||||
@@ -72,7 +73,7 @@
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%pip install -qU langsmith>=0.1.100 langchain langchain-openai langchain-benchmarks"
|
||||
"%pip install -qU \"langsmith>=0.1.101\" \"langchain-core>=0.2.34\" langchain langchain-openai langchain-benchmarks"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -115,6 +116,8 @@
|
||||
"id": "5767d171",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Querying dataset\n",
|
||||
"\n",
|
||||
"Indexing can take a few seconds. Once the dataset is indexed, we can search for similar examples. Note that the input to the `similar_examples` method must have the same schema as the examples inputs. In this case our example inputs are a dictionary with a \"question\" key:"
|
||||
]
|
||||
},
|
||||
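A minimal sketch of the `similar_examples` call described in the cell above. The dataset ID and query are placeholders, and the keyword names are assumptions based on the langsmith SDK version pinned earlier in this notebook:

```python
from langsmith import Client

client = Client()
dataset_id = "..."  # placeholder: ID of the dataset indexed above

# The input dict must match the schema of the dataset's example inputs
# (a "question" key in this notebook).
examples = client.similar_examples(
    {"question": "What is LangChain Expression Language?"},  # placeholder query
    limit=3,
    dataset_id=dataset_id,
)
for example in examples:
    print(example.inputs["question"])
```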
@@ -222,6 +225,8 @@
|
||||
"id": "e852c8ef",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Creating dynamic few-shot prompts\n",
|
||||
"\n",
|
||||
"The search returns the examples whose inputs are most similar to the query input. We can use this for few-shot prompting a model like so:"
|
||||
]
|
||||
},
|
||||
@@ -320,7 +325,7 @@
|
||||
"id": "94489b4a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Looking at the LangSmith trace, we can see that relevant examples were pulled in in the `similar_examples` step and passed as messages to ChatOpenAI: https://smith.langchain.com/public/9585e30f-765a-4ed9-b964-2211420cd2f8/r."
|
||||
"Looking at the LangSmith trace, we can see that relevant examples were pulled in in the `similar_examples` step and passed as messages to ChatOpenAI: https://smith.langchain.com/public/9585e30f-765a-4ed9-b964-2211420cd2f8/r/fdea98d6-e90f-49d4-ac22-dfd012e9e0d9."
|
||||
]
|
||||
}
|
||||
],
|
||||
|
||||
docs/docs/integrations/caches/redis_llm_caching.ipynb (new file, 424 lines)
@@ -0,0 +1,424 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Redis Cache for LangChain\n",
|
||||
"\n",
|
||||
"This notebook demonstrates how to use the `RedisCache` and `RedisSemanticCache` classes from the langchain-redis package to implement caching for LLM responses."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Setup\n",
|
||||
"\n",
|
||||
"First, let's install the required dependencies and ensure we have a Redis instance running."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%pip install -U langchain-core langchain-redis langchain-openai redis"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Ensure you have a Redis server running. You can start one using Docker with:\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"docker run -d -p 6379:6379 redis:latest\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"Or install and run Redis locally according to your operating system's instructions."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Connecting to Redis at: redis://redis:6379\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import os\n",
|
||||
"\n",
|
||||
"# Use the environment variable if set, otherwise default to localhost\n",
|
||||
"REDIS_URL = os.getenv(\"REDIS_URL\", \"redis://localhost:6379\")\n",
|
||||
"print(f\"Connecting to Redis at: {REDIS_URL}\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Importing Required Libraries"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import time\n",
|
||||
"\n",
|
||||
"from langchain.globals import set_llm_cache\n",
|
||||
"from langchain.schema import Generation\n",
|
||||
"from langchain_openai import OpenAI, OpenAIEmbeddings\n",
|
||||
"from langchain_redis import RedisCache, RedisSemanticCache"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import langchain_core\n",
|
||||
"import langchain_openai\n",
|
||||
"import openai\n",
|
||||
"import redis"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Set OpenAI API key"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"OpenAI API key not found in environment variables.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Please enter your OpenAI API key: ········\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"OpenAI API key has been set for this session.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from getpass import getpass\n",
|
||||
"\n",
|
||||
"# Check if OPENAI_API_KEY is already set in the environment\n",
|
||||
"openai_api_key = os.getenv(\"OPENAI_API_KEY\")\n",
|
||||
"\n",
|
||||
"if not openai_api_key:\n",
|
||||
" print(\"OpenAI API key not found in environment variables.\")\n",
|
||||
" openai_api_key = getpass(\"Please enter your OpenAI API key: \")\n",
|
||||
"\n",
|
||||
" # Set the API key for the current session\n",
|
||||
" os.environ[\"OPENAI_API_KEY\"] = openai_api_key\n",
|
||||
" print(\"OpenAI API key has been set for this session.\")\n",
|
||||
"else:\n",
|
||||
" print(\"OpenAI API key found in environment variables.\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Using RedisCache"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"First call (not cached):\n",
|
||||
"Result: \n",
|
||||
"\n",
|
||||
"Caching is the process of storing frequently accessed data in a temporary storage location for faster retrieval. This helps to reduce the time and resources needed to access the data from its original source. Caching is commonly used in computer systems, web browsers, and databases to improve performance and efficiency.\n",
|
||||
"Time: 1.16 seconds\n",
|
||||
"\n",
|
||||
"Second call (cached):\n",
|
||||
"Result: \n",
|
||||
"\n",
|
||||
"Caching is the process of storing frequently accessed data in a temporary storage location for faster retrieval. This helps to reduce the time and resources needed to access the data from its original source. Caching is commonly used in computer systems, web browsers, and databases to improve performance and efficiency.\n",
|
||||
"Time: 0.05 seconds\n",
|
||||
"\n",
|
||||
"Speed improvement: 25.40x faster\n",
|
||||
"Cache cleared\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Initialize RedisCache\n",
|
||||
"redis_cache = RedisCache(redis_url=REDIS_URL)\n",
|
||||
"\n",
|
||||
"# Set the cache for LangChain to use\n",
|
||||
"set_llm_cache(redis_cache)\n",
|
||||
"\n",
|
||||
"# Initialize the language model\n",
|
||||
"llm = OpenAI(temperature=0)\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"# Function to measure execution time\n",
|
||||
"def timed_completion(prompt):\n",
|
||||
" start_time = time.time()\n",
|
||||
" result = llm.invoke(prompt)\n",
|
||||
" end_time = time.time()\n",
|
||||
" return result, end_time - start_time\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"# First call (not cached)\n",
|
||||
"prompt = \"Explain the concept of caching in three sentences.\"\n",
|
||||
"result1, time1 = timed_completion(prompt)\n",
|
||||
"print(f\"First call (not cached):\\nResult: {result1}\\nTime: {time1:.2f} seconds\\n\")\n",
|
||||
"\n",
|
||||
"# Second call (should be cached)\n",
|
||||
"result2, time2 = timed_completion(prompt)\n",
|
||||
"print(f\"Second call (cached):\\nResult: {result2}\\nTime: {time2:.2f} seconds\\n\")\n",
|
||||
"\n",
|
||||
"print(f\"Speed improvement: {time1 / time2:.2f}x faster\")\n",
|
||||
"\n",
|
||||
"# Clear the cache\n",
|
||||
"redis_cache.clear()\n",
|
||||
"print(\"Cache cleared\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Using RedisSemanticCache"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Original query:\n",
|
||||
"Prompt: What is the capital of France?\n",
|
||||
"Result: \n",
|
||||
"\n",
|
||||
"The capital of France is Paris.\n",
|
||||
"Time: 1.52 seconds\n",
|
||||
"\n",
|
||||
"Similar query:\n",
|
||||
"Prompt: Can you tell me the capital city of France?\n",
|
||||
"Result: \n",
|
||||
"\n",
|
||||
"The capital of France is Paris.\n",
|
||||
"Time: 0.29 seconds\n",
|
||||
"\n",
|
||||
"Speed improvement: 5.22x faster\n",
|
||||
"Semantic cache cleared\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Initialize RedisSemanticCache\n",
|
||||
"embeddings = OpenAIEmbeddings()\n",
|
||||
"semantic_cache = RedisSemanticCache(\n",
|
||||
" redis_url=REDIS_URL, embeddings=embeddings, distance_threshold=0.2\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"# Set the cache for LangChain to use\n",
|
||||
"set_llm_cache(semantic_cache)\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"# Function to test semantic cache\n",
|
||||
"def test_semantic_cache(prompt):\n",
|
||||
" start_time = time.time()\n",
|
||||
" result = llm.invoke(prompt)\n",
|
||||
" end_time = time.time()\n",
|
||||
" return result, end_time - start_time\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"# Original query\n",
|
||||
"original_prompt = \"What is the capital of France?\"\n",
|
||||
"result1, time1 = test_semantic_cache(original_prompt)\n",
|
||||
"print(\n",
|
||||
" f\"Original query:\\nPrompt: {original_prompt}\\nResult: {result1}\\nTime: {time1:.2f} seconds\\n\"\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"# Semantically similar query\n",
|
||||
"similar_prompt = \"Can you tell me the capital city of France?\"\n",
|
||||
"result2, time2 = test_semantic_cache(similar_prompt)\n",
|
||||
"print(\n",
|
||||
" f\"Similar query:\\nPrompt: {similar_prompt}\\nResult: {result2}\\nTime: {time2:.2f} seconds\\n\"\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"print(f\"Speed improvement: {time1 / time2:.2f}x faster\")\n",
|
||||
"\n",
|
||||
"# Clear the semantic cache\n",
|
||||
"semantic_cache.clear()\n",
|
||||
"print(\"Semantic cache cleared\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Advanced Usage"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Custom TTL (Time-To-Live)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Cached result: Cached response\n",
|
||||
"Waiting for TTL to expire...\n",
|
||||
"Result after TTL: Not found (expired)\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Initialize RedisCache with custom TTL\n",
|
||||
"ttl_cache = RedisCache(redis_url=REDIS_URL, ttl=5) # 60 seconds TTL\n",
|
||||
"\n",
|
||||
"# Update a cache entry\n",
|
||||
"ttl_cache.update(\"test_prompt\", \"test_llm\", [Generation(text=\"Cached response\")])\n",
|
||||
"\n",
|
||||
"# Retrieve the cached entry\n",
|
||||
"cached_result = ttl_cache.lookup(\"test_prompt\", \"test_llm\")\n",
|
||||
"print(f\"Cached result: {cached_result[0].text if cached_result else 'Not found'}\")\n",
|
||||
"\n",
|
||||
"# Wait for TTL to expire\n",
|
||||
"print(\"Waiting for TTL to expire...\")\n",
|
||||
"time.sleep(6)\n",
|
||||
"\n",
|
||||
"# Try to retrieve the expired entry\n",
|
||||
"expired_result = ttl_cache.lookup(\"test_prompt\", \"test_llm\")\n",
|
||||
"print(\n",
|
||||
" f\"Result after TTL: {expired_result[0].text if expired_result else 'Not found (expired)'}\"\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Customizing RedisSemanticCache"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Original result: \n",
|
||||
"\n",
|
||||
"The largest planet in our solar system is Jupiter.\n",
|
||||
"Similar query result: \n",
|
||||
"\n",
|
||||
"The largest planet in our solar system is Jupiter.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Initialize RedisSemanticCache with custom settings\n",
|
||||
"custom_semantic_cache = RedisSemanticCache(\n",
|
||||
" redis_url=REDIS_URL,\n",
|
||||
" embeddings=embeddings,\n",
|
||||
" distance_threshold=0.1, # Stricter similarity threshold\n",
|
||||
" ttl=3600, # 1 hour TTL\n",
|
||||
" name=\"custom_cache\", # Custom cache name\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"# Test the custom semantic cache\n",
|
||||
"set_llm_cache(custom_semantic_cache)\n",
|
||||
"\n",
|
||||
"test_prompt = \"What's the largest planet in our solar system?\"\n",
|
||||
"result, _ = test_semantic_cache(test_prompt)\n",
|
||||
"print(f\"Original result: {result}\")\n",
|
||||
"\n",
|
||||
"# Try a slightly different query\n",
|
||||
"similar_test_prompt = \"Which planet is the biggest in the solar system?\"\n",
|
||||
"similar_result, _ = test_semantic_cache(similar_test_prompt)\n",
|
||||
"print(f\"Similar query result: {similar_result}\")\n",
|
||||
"\n",
|
||||
"# Clean up\n",
|
||||
"custom_semantic_cache.clear()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Conclusion\n",
|
||||
"\n",
|
||||
"This notebook demonstrated the usage of `RedisCache` and `RedisSemanticCache` from the langchain-redis package. These caching mechanisms can significantly improve the performance of LLM-based applications by reducing redundant API calls and leveraging semantic similarity for intelligent caching. The Redis-based implementation provides a fast, scalable, and flexible solution for caching in distributed systems."
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.9"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 4
|
||||
}
|
||||
@@ -95,7 +95,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"execution_count": 1,
|
||||
"id": "cb09c344-1836-4e0c-acf8-11d13ac1dbae",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -223,34 +223,28 @@
|
||||
"id": "d1ee55bc-ffc8-4cfa-801c-993953a08cfd",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## ***Beta***: Bedrock Converse API\n",
|
||||
"## Bedrock Converse API\n",
|
||||
"\n",
|
||||
"AWS has recently recently the Bedrock Converse API which provides a unified conversational interface for Bedrock models. This API does not yet support custom models. You can see a list of all [models that are supported here](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference.html). To improve reliability the ChatBedrock integration will switch to using the Bedrock Converse API as soon as it has feature parity with the existing Bedrock API. Until then a separate [ChatBedrockConverse](https://api.python.langchain.com/en/latest/chat_models/langchain_aws.chat_models.bedrock_converse.ChatBedrockConverse.html#langchain_aws.chat_models.bedrock_converse.ChatBedrockConverse) integration has been released in beta for users who do not need to use custom models.\n",
|
||||
"AWS has recently released the Bedrock Converse API which provides a unified conversational interface for Bedrock models. This API does not yet support custom models. You can see a list of all [models that are supported here](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference.html). To improve reliability the ChatBedrock integration will switch to using the Bedrock Converse API as soon as it has feature parity with the existing Bedrock API. Until then a separate [ChatBedrockConverse](https://python.langchain.com/v0.2/api_reference/aws/chat_models/langchain_aws.chat_models.bedrock_converse.ChatBedrockConverse.html) integration has been released.\n",
|
||||
"\n",
|
||||
"We recommend using `ChatBedrockConverse` for users who do not need to use custom models.\n",
|
||||
"\n",
|
||||
"You can use it like so:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"execution_count": 3,
|
||||
"id": "ae728e59-94d4-40cf-9d24-25ad8723fc59",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"/Users/bagatur/langchain/libs/core/langchain_core/_api/beta_decorator.py:87: LangChainBetaWarning: The class `ChatBedrockConverse` is in beta. It is actively being worked on, so the API may change.\n",
|
||||
" warn_beta(\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"AIMessage(content=\"Voici la traduction en français :\\n\\nJ'aime la programmation.\", response_metadata={'ResponseMetadata': {'RequestId': '122fb1c8-c3c5-4b06-941e-c95d210bfbc7', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Mon, 01 Jul 2024 21:48:25 GMT', 'content-type': 'application/json', 'content-length': '243', 'connection': 'keep-alive', 'x-amzn-requestid': '122fb1c8-c3c5-4b06-941e-c95d210bfbc7'}, 'RetryAttempts': 0}, 'stopReason': 'end_turn', 'metrics': {'latencyMs': 830}}, id='run-0e3df22f-fcd8-4fbb-a4fb-565227e7e430-0', usage_metadata={'input_tokens': 29, 'output_tokens': 21, 'total_tokens': 50})"
|
||||
"AIMessage(content=\"Voici la traduction en français :\\n\\nJ'aime la programmation.\", response_metadata={'ResponseMetadata': {'RequestId': '4fcbfbe9-f916-4df2-b0bd-ea1147b550aa', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Wed, 21 Aug 2024 17:23:49 GMT', 'content-type': 'application/json', 'content-length': '243', 'connection': 'keep-alive', 'x-amzn-requestid': '4fcbfbe9-f916-4df2-b0bd-ea1147b550aa'}, 'RetryAttempts': 0}, 'stopReason': 'end_turn', 'metrics': {'latencyMs': 672}}, id='run-77ee9810-e32b-45dc-9ccb-6692253b1f45-0', usage_metadata={'input_tokens': 29, 'output_tokens': 21, 'total_tokens': 50})"
|
||||
]
|
||||
},
|
||||
"execution_count": 8,
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
@@ -268,6 +262,87 @@
|
||||
"llm.invoke(messages)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "4da16f3e-e80b-48c0-8036-c1cc5f7c8c05",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Streaming\n",
|
||||
"\n",
|
||||
"Note that `ChatBedrockConverse` emits content blocks while streaming:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "7794b32e-d8de-4973-bf0f-39807dc745f0",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"content=[] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[{'type': 'text', 'text': 'Vo', 'index': 0}] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[{'type': 'text', 'text': 'ici', 'index': 0}] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[{'type': 'text', 'text': ' la', 'index': 0}] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[{'type': 'text', 'text': ' tra', 'index': 0}] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[{'type': 'text', 'text': 'duction', 'index': 0}] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[{'type': 'text', 'text': ' en', 'index': 0}] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[{'type': 'text', 'text': ' français', 'index': 0}] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[{'type': 'text', 'text': ' :', 'index': 0}] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[{'type': 'text', 'text': '\\n\\nJ', 'index': 0}] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[{'type': 'text', 'text': \"'\", 'index': 0}] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[{'type': 'text', 'text': 'a', 'index': 0}] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[{'type': 'text', 'text': 'ime', 'index': 0}] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[{'type': 'text', 'text': ' la', 'index': 0}] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[{'type': 'text', 'text': ' programm', 'index': 0}] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[{'type': 'text', 'text': 'ation', 'index': 0}] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[{'type': 'text', 'text': '.', 'index': 0}] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[{'index': 0}] id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[] response_metadata={'stopReason': 'end_turn'} id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8'\n",
|
||||
"content=[] response_metadata={'metrics': {'latencyMs': 713}} id='run-2c92c5af-d771-4cc2-98d9-c11bbd30a1d8' usage_metadata={'input_tokens': 29, 'output_tokens': 21, 'total_tokens': 50}\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"for chunk in llm.stream(messages):\n",
|
||||
" print(chunk)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "0ef05abb-9c04-4dc3-995e-f857779644d5",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"An output parser can be used to filter to text, if desired:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"id": "2a4e743f-ea7d-4e5a-9b12-f9992362de8b",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"|Vo|ici| la| tra|duction| en| français| :|\n",
|
||||
"\n",
|
||||
"J|'|a|ime| la| programm|ation|.||||"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from langchain_core.output_parsers import StrOutputParser\n",
|
||||
"\n",
|
||||
"chain = llm | StrOutputParser()\n",
|
||||
"\n",
|
||||
"for chunk in chain.stream(messages):\n",
|
||||
" print(chunk, end=\"|\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "3a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
|
||||
@@ -297,7 +372,7 @@
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.9"
|
||||
"version": "3.10.4"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
|
||||
@@ -31,7 +31,7 @@
|
||||
"\n",
|
||||
"| Class | Package | Local | Serializable | Package downloads | Package latest |\n",
|
||||
"| :--- | :--- | :---: | :---: | :---: | :---: |\n",
|
||||
"| [ChatDatabricks](https://api.python.langchain.com/en/latest/chat_models/langchain_community.chat_models.databricks.ChatDatabricks.html) | [langchain-community](https://api.python.langchain.com/en/latest/community_api_reference.html) | ❌ | beta |  |  |\n",
|
||||
"| [ChatDatabricks](https://api.python.langchain.com/en/latest/chat_models/langchain_community.chat_models.databricks.ChatDatabricks.html) | [langchain-databricks](https://api.python.langchain.com/en/latest/databricks_api_reference.html) | ❌ | beta |  |  |\n",
|
||||
"\n",
|
||||
"### Model features\n",
|
||||
"| [Tool calling](/docs/how_to/tool_calling/) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n",
|
||||
@@ -99,7 +99,7 @@
|
||||
"source": [
|
||||
"### Installation\n",
|
||||
"\n",
|
||||
"The LangChain Databricks integration lives in the `langchain-community` package. Also, `mlflow >= 2.9 ` is required to run the code in this notebook."
|
||||
"The LangChain Databricks integration lives in the `langchain-databricks` package."
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -108,7 +108,7 @@
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%pip install -qU langchain-community mlflow>=2.9.0"
|
||||
"%pip install -qU langchain-databricks"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -133,7 +133,7 @@
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_community.chat_models import ChatDatabricks\n",
|
||||
"from langchain_databricks import ChatDatabricks\n",
|
||||
"\n",
|
||||
"chat_model = ChatDatabricks(\n",
|
||||
" endpoint=\"databricks-dbrx-instruct\",\n",
|
||||
@@ -245,9 +245,7 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Invocation (streaming)\n",
|
||||
"\n",
|
||||
"`ChatDatabricks` supports streaming response by `stream` method since `langchain-community>=0.2.1`."
|
||||
"## Invocation (streaming)"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -299,7 +297,7 @@
|
||||
"* An LLM was registered and deployed to [a Databricks serving endpoint](https://docs.databricks.com/machine-learning/model-serving/index.html) via MLflow. The endpoint must have OpenAI-compatible chat input/output format ([reference](https://mlflow.org/docs/latest/llms/deployments/index.html#chat))\n",
|
||||
"* You have [\"Can Query\" permission](https://docs.databricks.com/security/auth-authz/access-control/serving-endpoint-acl.html) to the endpoint.\n",
|
||||
"\n",
|
||||
"Once the endpoint is ready, the usage pattern is completely same as Foundation Models."
|
||||
"Once the endpoint is ready, the usage pattern is identical to that of Foundation Models."
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -332,7 +330,7 @@
|
||||
"\n",
|
||||
"First, create a new Databricks serving endpoint that proxies requests to the target external model. The endpoint creation should be fairy quick for proxying external models.\n",
|
||||
"\n",
|
||||
"This requires registering OpenAI API Key in Databricks secret manager with the following comment:\n",
|
||||
"This requires registering your OpenAI API Key within the Databricks secret manager as follows:\n",
|
||||
"```sh\n",
|
||||
"# Replace `<scope>` with your scope\n",
|
||||
"databricks secrets create-scope <scope>\n",
|
||||
@@ -417,8 +415,6 @@
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_community.chat_models.databricks import ChatDatabricks\n",
|
||||
"\n",
|
||||
"llm = ChatDatabricks(endpoint=\"databricks-meta-llama-3-70b-instruct\")\n",
|
||||
"tools = [\n",
|
||||
" {\n",
|
||||
@@ -461,7 +457,7 @@
|
||||
"source": [
|
||||
"## API reference\n",
|
||||
"\n",
|
||||
"For detailed documentation of all ChatDatabricks features and configurations head to the API reference: https://api.python.langchain.com/en/latest/chat_models/langchain_community.chat_models.ChatDatabricks.html"
|
||||
"For detailed documentation of all ChatDatabricks features and configurations head to the API reference: https://api.python.langchain.com/en/latest/chat_models/langchain_databricks.chat_models.ChatDatabricks.html"
|
||||
]
|
||||
}
|
||||
],
|
||||
|
||||
@@ -457,7 +457,9 @@
|
||||
"tags": []
|
||||
},
|
||||
"source": [
|
||||
"## `Redis` Cache"
|
||||
"## `Redis` Cache\n",
|
||||
"\n",
|
||||
"See the main [Redis cache docs](/docs/integrations/caches/redis_llm_caching/) for detail."
|
||||
]
|
||||
},
|
||||
{
|
||||
|
||||
@@ -2,171 +2,347 @@
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "91c6a7ef",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Redis\n",
|
||||
"# Redis Chat Message History\n",
|
||||
"\n",
|
||||
">[Redis (Remote Dictionary Server)](https://en.wikipedia.org/wiki/Redis) is an open-source in-memory storage, used as a distributed, in-memory key–value database, cache and message broker, with optional durability. Because it holds all data in memory and because of its design, `Redis` offers low-latency reads and writes, making it particularly suitable for use cases that require a cache. Redis is the most popular NoSQL database, and one of the most popular databases overall.\n",
|
||||
">[Redis (Remote Dictionary Server)](https://en.wikipedia.org/wiki/Redis) is an open-source in-memory storage, used as a distributed, in-memory key–value database, cache and message broker, with optional durability. `Redis` offers low-latency reads and writes. Redis is the most popular NoSQL database, and one of the most popular databases overall.\n",
|
||||
"\n",
|
||||
"This notebook goes over how to use `Redis` to store chat message history."
|
||||
"This notebook demonstrates how to use the `RedisChatMessageHistory` class from the langchain-redis package to store and manage chat message history using Redis."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "897a4682-f9fc-488b-98f3-ae2acad84600",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Setup\n",
|
||||
"First we need to install dependencies, and start a redis instance using commands like: `redis-server`."
|
||||
"\n",
|
||||
"First, we need to install the required dependencies and ensure we have a Redis instance running."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "cda8b56d-baf7-49a2-91a2-4d424a8519cb",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"pip install -U langchain-community redis"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "b11090e7-284b-4ed2-9790-ce0d35638717",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_community.chat_message_histories import RedisChatMessageHistory"
|
||||
"%pip install -qU langchain-redis langchain-openai redis"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "20b99474-75ea-422e-9809-fbdb9d103afc",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Store and Retrieve Messages"
|
||||
"Make sure you have a Redis server running. You can start one using Docker with the following command:\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"docker run -d -p 6379:6379 redis:latest\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"Or install and run Redis locally according to the instructions for your operating system."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Connecting to Redis at: redis://redis:6379\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import os\n",
|
||||
"\n",
|
||||
"# Use the environment variable if set, otherwise default to localhost\n",
|
||||
"REDIS_URL = os.getenv(\"REDIS_URL\", \"redis://localhost:6379\")\n",
|
||||
"print(f\"Connecting to Redis at: {REDIS_URL}\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Importing Required Libraries"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "d15e3302",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"history = RedisChatMessageHistory(\"foo\", url=\"redis://localhost:6379\")\n",
|
||||
"\n",
|
||||
"history.add_user_message(\"hi!\")\n",
|
||||
"\n",
|
||||
"history.add_ai_message(\"whats up?\")"
|
||||
"from langchain_core.chat_history import BaseChatMessageHistory\n",
|
||||
"from langchain_core.messages import AIMessage, HumanMessage\n",
|
||||
"from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder\n",
|
||||
"from langchain_core.runnables.history import RunnableWithMessageHistory\n",
|
||||
"from langchain_openai import ChatOpenAI\n",
|
||||
"from langchain_redis import RedisChatMessageHistory"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Basic Usage of RedisChatMessageHistory"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "64fc465e",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"[HumanMessage(content='hi!'), AIMessage(content='whats up?')]"
|
||||
]
|
||||
},
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Chat History:\n",
|
||||
"HumanMessage: Hello, AI assistant!\n",
|
||||
"AIMessage: Hello! How can I assist you today?\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"history.messages"
|
||||
"# Initialize RedisChatMessageHistory\n",
|
||||
"history = RedisChatMessageHistory(session_id=\"user_123\", redis_url=REDIS_URL)\n",
|
||||
"\n",
|
||||
"# Add messages to the history\n",
|
||||
"history.add_user_message(\"Hello, AI assistant!\")\n",
|
||||
"history.add_ai_message(\"Hello! How can I assist you today?\")\n",
|
||||
"\n",
|
||||
"# Retrieve messages\n",
|
||||
"print(\"Chat History:\")\n",
|
||||
"for message in history.messages:\n",
|
||||
" print(f\"{type(message).__name__}: {message.content}\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "465fdd8c-b093-4d19-a55a-30f3b646432b",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Using in the Chains"
|
||||
"## Using RedisChatMessageHistory with Language Models"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "94d65d2f-e9bb-4b47-a86d-dd6b1b5e8247",
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"pip install -U langchain-openai"
|
||||
"### Set OpenAI API key"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "ace3e7b2-5e3e-4966-b549-04952a6a9a09",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"OpenAI API key not found in environment variables.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Please enter your OpenAI API key: ········\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"OpenAI API key has been set for this session.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from typing import Optional\n",
|
||||
"from getpass import getpass\n",
|
||||
"\n",
|
||||
"from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder\n",
|
||||
"from langchain_core.runnables.history import RunnableWithMessageHistory\n",
|
||||
"from langchain_openai import ChatOpenAI"
|
||||
"# Check if OPENAI_API_KEY is already set in the environment\n",
|
||||
"openai_api_key = os.getenv(\"OPENAI_API_KEY\")\n",
|
||||
"\n",
|
||||
"if not openai_api_key:\n",
|
||||
" print(\"OpenAI API key not found in environment variables.\")\n",
|
||||
" openai_api_key = getpass(\"Please enter your OpenAI API key: \")\n",
|
||||
"\n",
|
||||
" # Set the API key for the current session\n",
|
||||
" os.environ[\"OPENAI_API_KEY\"] = openai_api_key\n",
|
||||
" print(\"OpenAI API key has been set for this session.\")\n",
|
||||
"else:\n",
|
||||
" print(\"OpenAI API key found in environment variables.\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "5c1fba0d-d06a-4695-ba14-c42a3461ada1",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"AIMessage(content='Your name is Bob, as you mentioned earlier. Is there anything specific you would like assistance with, Bob?')"
|
||||
]
|
||||
},
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"AI Response 1: Hello Alice! How can I assist you today?\n",
|
||||
"AI Response 2: Your name is Alice.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Create a prompt template\n",
|
||||
"prompt = ChatPromptTemplate.from_messages(\n",
|
||||
" [\n",
|
||||
" (\"system\", \"You're an assistant。\"),\n",
|
||||
" (\"system\", \"You are a helpful AI assistant.\"),\n",
|
||||
" MessagesPlaceholder(variable_name=\"history\"),\n",
|
||||
" (\"human\", \"{question}\"),\n",
|
||||
" (\"human\", \"{input}\"),\n",
|
||||
" ]\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"chain = prompt | ChatOpenAI()\n",
|
||||
"# Initialize the language model\n",
|
||||
"llm = ChatOpenAI()\n",
|
||||
"\n",
|
||||
"# Create the conversational chain\n",
|
||||
"chain = prompt | llm\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"# Function to get or create a RedisChatMessageHistory instance\n",
|
||||
"def get_redis_history(session_id: str) -> BaseChatMessageHistory:\n",
|
||||
" return RedisChatMessageHistory(session_id, redis_url=REDIS_URL)\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"# Create a runnable with message history\n",
|
||||
"chain_with_history = RunnableWithMessageHistory(\n",
|
||||
" chain,\n",
|
||||
" lambda session_id: RedisChatMessageHistory(\n",
|
||||
" session_id, url=\"redis://localhost:6379\"\n",
|
||||
" ),\n",
|
||||
" input_messages_key=\"question\",\n",
|
||||
" history_messages_key=\"history\",\n",
|
||||
" chain, get_redis_history, input_messages_key=\"input\", history_messages_key=\"history\"\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"config = {\"configurable\": {\"session_id\": \"foo\"}}\n",
|
||||
"# Use the chain in a conversation\n",
|
||||
"response1 = chain_with_history.invoke(\n",
|
||||
" {\"input\": \"Hi, my name is Alice.\"},\n",
|
||||
" config={\"configurable\": {\"session_id\": \"alice_123\"}},\n",
|
||||
")\n",
|
||||
"print(\"AI Response 1:\", response1.content)\n",
|
||||
"\n",
|
||||
"chain_with_history.invoke({\"question\": \"Hi! I'm bob\"}, config=config)\n",
|
||||
"\n",
|
||||
"chain_with_history.invoke({\"question\": \"Whats my name\"}, config=config)"
|
||||
"response2 = chain_with_history.invoke(\n",
|
||||
" {\"input\": \"What's my name?\"}, config={\"configurable\": {\"session_id\": \"alice_123\"}}\n",
|
||||
")\n",
|
||||
"print(\"AI Response 2:\", response2.content)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Advanced Features"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Custom Redis Configuration"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "76ce3f6b-f4c7-4d27-8031-60f7dd756695",
|
||||
"execution_count": 7,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Custom History: [HumanMessage(content='This is a message with custom configuration.')]\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Initialize with custom Redis configuration\n",
|
||||
"custom_history = RedisChatMessageHistory(\n",
|
||||
" \"user_456\",\n",
|
||||
" redis_url=REDIS_URL,\n",
|
||||
" key_prefix=\"custom_prefix:\",\n",
|
||||
" ttl=3600, # Set TTL to 1 hour\n",
|
||||
" index_name=\"custom_index\",\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"custom_history.add_user_message(\"This is a message with custom configuration.\")\n",
|
||||
"print(\"Custom History:\", custom_history.messages)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Searching Messages"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Search Results:\n",
|
||||
"human: Tell me about artificial intelligence....\n",
|
||||
"ai: Artificial Intelligence (AI) is a branch of comput...\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Add more messages\n",
|
||||
"history.add_user_message(\"Tell me about artificial intelligence.\")\n",
|
||||
"history.add_ai_message(\n",
|
||||
" \"Artificial Intelligence (AI) is a branch of computer science...\"\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"# Search for messages containing a specific term\n",
|
||||
"search_results = history.search_messages(\"artificial intelligence\")\n",
|
||||
"print(\"Search Results:\")\n",
|
||||
"for result in search_results:\n",
|
||||
" print(f\"{result['type']}: {result['content'][:50]}...\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Clearing History"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Messages after clearing: []\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Clear the chat history\n",
|
||||
"history.clear()\n",
|
||||
"print(\"Messages after clearing:\", history.messages)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Conclusion\n",
|
||||
"\n",
|
||||
"This notebook demonstrated the key features of `RedisChatMessageHistory` from the langchain-redis package. It showed how to initialize and use the chat history, integrate it with language models, and utilize advanced features like custom configurations and message searching. Redis provides a fast and scalable solution for managing chat history in AI applications."
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
@@ -185,9 +361,9 @@
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.10.12"
|
||||
"version": "3.11.9"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
"nbformat_minor": 4
|
||||
}
|
||||
|
||||
@@ -34,6 +34,15 @@ See a [usage example](/docs/integrations/chat/bedrock).
|
||||
from langchain_aws import ChatBedrock
|
||||
```
|
||||
|
||||
### Bedrock Converse
|
||||
AWS has recently released the Bedrock Converse API which provides a unified conversational interface for Bedrock models. This API does not yet support custom models. You can see a list of all [models that are supported here](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference.html). To improve reliability the ChatBedrock integration will switch to using the Bedrock Converse API as soon as it has feature parity with the existing Bedrock API. Until then a separate [ChatBedrockConverse](https://python.langchain.com/v0.2/api_reference/aws/chat_models/langchain_aws.chat_models.bedrock_converse.ChatBedrockConverse.html) integration has been released.
|
||||
|
||||
We recommend using `ChatBedrockConverse` for users who do not need to use custom models. See the [docs](/docs/integrations/chat/bedrock/#bedrock-converse-api) and [API reference](https://python.langchain.com/v0.2/api_reference/aws/chat_models/langchain_aws.chat_models.bedrock_converse.ChatBedrockConverse.html) for more detail.
|
||||
|
||||
```python
|
||||
from langchain_aws import ChatBedrockConverse
|
||||
```
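A minimal usage sketch (the model ID is an example; assumes AWS credentials are configured and the model is enabled in your region):

```python
from langchain_aws import ChatBedrockConverse

# Example model ID; substitute any Converse-supported model available to you.
llm = ChatBedrockConverse(model="anthropic.claude-3-sonnet-20240229-v1:0")
response = llm.invoke("Translate 'I love programming' into French.")
print(response.content)
```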
|
||||
|
||||
## LLMs
|
||||
|
||||
### Bedrock
|
||||
|
||||
@@ -172,3 +172,14 @@ If you wish to use OAuth2 with the authorization_code flow, please use `BoxAuthT
|
||||
from langchain_box.document_loaders import BoxLoader
|
||||
|
||||
```
|
||||
|
||||
## Retrievers
|
||||
|
||||
### BoxRetriever
|
||||
|
||||
[See usage example](/docs/integrations/retrievers/box)
|
||||
|
||||
```python
|
||||
from langchain_box.retrievers import BoxRetriever
|
||||
|
||||
```
|
||||
|
||||
docs/docs/integrations/providers/dria.mdx (new file, 25 lines)
@@ -0,0 +1,25 @@
|
||||
# Dria
|
||||
|
||||
>[Dria](https://dria.co/) is a hub of public RAG models for developers to
|
||||
> both contribute and utilize a shared embedding lake.
|
||||
|
||||
See more details about the LangChain integration with Dria
|
||||
at [this page](https://dria.co/docs/integrations/langchain).
|
||||
|
||||
## Installation and Setup
|
||||
|
||||
You have to install a Python package:
|
||||
|
||||
```bash
|
||||
pip install dria
|
||||
```
|
||||
|
||||
You have to get an API key from Dria. You can get it by signing up at [Dria](https://dria.co/).
|
||||
|
||||
## Retrievers
|
||||
|
||||
See a [usage example](/docs/integrations/retrievers/dria_index).
|
||||
|
||||
```python
|
||||
from langchain_community.retrievers import DriaRetriever
|
||||
```
|
||||
docs/docs/integrations/providers/duckduckgo_search.mdx (new file, 25 lines)
@@ -0,0 +1,25 @@
|
||||
# DuckDuckGo Search
|
||||
|
||||
>[DuckDuckGo Search](https://github.com/deedy5/duckduckgo_search) is a package that
|
||||
> searches for words, documents, images, videos, news, maps and text
|
||||
> translation using the `DuckDuckGo.com` search engine. It downloads files
> and images to a local hard drive.
|
||||
|
||||
## Installation and Setup
|
||||
|
||||
You have to install a Python package:
|
||||
|
||||
```bash
|
||||
pip install duckduckgo-search
|
||||
```
|
||||
|
||||
## Tools
|
||||
|
||||
See a [usage example](/docs/integrations/tools/ddg).
|
||||
|
||||
There are two tools available:
|
||||
|
||||
```python
|
||||
from langchain_community.tools import DuckDuckGoSearchRun
|
||||
from langchain_community.tools import DuckDuckGoSearchResults
|
||||
```
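A minimal usage sketch (no API key is required; `DuckDuckGoSearchRun` returns a text summary, while `DuckDuckGoSearchResults` returns results with metadata):

```python
from langchain_community.tools import DuckDuckGoSearchRun

search = DuckDuckGoSearchRun()
# Runs a DuckDuckGo query and prints the aggregated result snippets
print(search.invoke("Obama's first name"))
```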
|
||||
docs/docs/integrations/providers/e2b.mdx (new file, 20 lines)
@@ -0,0 +1,20 @@
|
||||
# E2B
|
||||
|
||||
>[E2B](https://e2b.dev/) provides open-source secure sandboxes
|
||||
> for AI-generated code execution. See more [here](https://github.com/e2b-dev).
|
||||
|
||||
## Installation and Setup
|
||||
|
||||
You have to install a Python package:
|
||||
|
||||
```bash
|
||||
pip install e2b_code_interpreter
|
||||
```
|
||||
|
||||
## Tool
|
||||
|
||||
See a [usage example](/docs/integrations/tools/e2b_data_analysis).
|
||||
|
||||
```python
|
||||
from langchain_community.tools import E2BDataAnalysisTool
|
||||
```
|
||||
@@ -6,11 +6,14 @@

## Installation and Setup

### Setup Elasticsearch

There are two ways to get started with Elasticsearch:

#### Install Elasticsearch on your local machine via docker
#### Install Elasticsearch on your local machine via Docker

Example: Run a single-node Elasticsearch instance with security disabled. This is not recommended for production use.
Example: Run a single-node Elasticsearch instance with security disabled.
This is not recommended for production use.

```bash
docker run -p 9200:9200 -e "discovery.type=single-node" -e "xpack.security.enabled=false" -e "xpack.security.http.ssl.enabled=false" docker.elastic.co/elasticsearch/elasticsearch:8.9.0
@@ -18,7 +21,7 @@ Example: Run a single-node Elasticsearch instance with security disabled. This i

#### Deploy Elasticsearch on Elastic Cloud

Elastic Cloud is a managed Elasticsearch service. Signup for a [free trial](https://cloud.elastic.co/registration?utm_source=langchain&utm_content=documentation).
`Elastic Cloud` is a managed Elasticsearch service. Signup for a [free trial](https://cloud.elastic.co/registration?utm_source=langchain&utm_content=documentation).

### Install Client

@@ -43,7 +46,34 @@ See a [usage example](/docs/integrations/vectorstores/elasticsearch).
from langchain_elasticsearch import ElasticsearchStore
```

### Third-party integrations

#### EcloudESVectorStore

```python
from langchain_community.vectorstores.ecloud_vector_search import EcloudESVectorStore
```

## Retrievers

### ElasticsearchRetriever

The `ElasticsearchRetriever` enables flexible access to all Elasticsearch features
through the Query DSL.

See a [usage example](/docs/integrations/retrievers/elasticsearch_retriever).

```python
from langchain_elasticsearch import ElasticsearchRetriever
```

### BM25

See a [usage example](/docs/integrations/retrievers/elastic_search_bm25).

```python
from langchain_community.retrievers import ElasticSearchBM25Retriever
```
## Memory

See a [usage example](/docs/integrations/memory/elasticsearch_chat_message_history).
@@ -67,3 +97,12 @@ See a [usage example](/docs/integrations/stores/elasticsearch).

```python
from langchain_elasticsearch import ElasticsearchEmbeddingsCache
```

## Chain

It is a chain for interacting with Elasticsearch Database.

```python
from langchain.chains.elasticsearch_database import ElasticsearchDatabaseChain
```

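Tying the pieces above together, a minimal sketch that writes to and queries the single-node Docker instance started earlier; `DeterministicFakeEmbedding` merely stands in for a real embedding model so the snippet stays self-contained, and the index name is arbitrary:

```python
from langchain_core.embeddings import DeterministicFakeEmbedding
from langchain_elasticsearch import ElasticsearchStore

# Point the store at the local Elasticsearch started with the docker command above.
store = ElasticsearchStore(
    es_url="http://localhost:9200",
    index_name="langchain-demo",
    embedding=DeterministicFakeEmbedding(size=256),
)
store.add_texts(["Elasticsearch is a distributed search and analytics engine."])

# Query the index back; swap in real embeddings for meaningful similarity scores.
print(store.similarity_search("search engine", k=1))
```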
323
docs/docs/integrations/retrievers/box.ipynb
Normal file
323
docs/docs/integrations/retrievers/box.ipynb
Normal file
File diff suppressed because one or more lines are too long
@@ -23,7 +23,7 @@
"\n",
"3. **Comment on Issue**- posts a comment on a specific issue.\n",
"\n",
"4. **Create Pull Request**- creates a pull request from the bot's working branch to the base branch.\n",
"4. **Create Merge Request**- creates a merge request from the bot's working branch to the base branch.\n",
"\n",
"5. **Create File**- creates a new file in the repository.\n",
"\n",
@@ -82,7 +82,7 @@
"* **GITLAB_PERSONAL_ACCESS_TOKEN**- The personal access token you created in the last step\n",
"* **GITLAB_REPOSITORY**- The name of the Gitlab repository you want your bot to act upon. Must follow the format {username}/{repo-name}.\n",
"* **GITLAB_BRANCH**- The branch where the bot will make its commits. Defaults to 'main.'\n",
"* **GITLAB_BASE_BRANCH**- The base branch of your repo, usually either 'main' or 'master.' This is where pull requests will base from. Defaults to 'main.'\n"
"* **GITLAB_BASE_BRANCH**- The base branch of your repo, usually either 'main' or 'master.' This is where merge requests will base from. Defaults to 'main.'\n"
]
},
{
@@ -185,14 +185,14 @@
"</html>\n",
">>>> NEW\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mUpdated file game.html\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to create a pull request to submit my changes.\n",
"Action: Create Pull Request\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to create a merge request to submit my changes.\n",
"Action: Create Merge Request\n",
"Action Input: Add tic-tac-toe game\n",
"\n",
"added tic-tac-toe game, closes issue #15\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mSuccessfully created PR number 12\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mSuccessfully created MR number 12\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: I have created a pull request with number 12 that solves issue 15.\u001b[0m\n",
"Final Answer: I have created a merge request with number 12 that solves issue 15.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -200,7 +200,7 @@
{
"data": {
"text/plain": [
"'I have created a pull request with number 12 that solves issue 15.'"
"'I have created a merge request with number 12 that solves issue 15.'"
]
},
"execution_count": 8,
@@ -210,7 +210,7 @@
],
"source": [
"agent.run(\n",
" \"You have the software engineering capabilities of a Google Principle engineer. You are tasked with completing issues on a gitlab repository. Please look at the open issues and complete them by creating pull requests that solve the issues.\"\n",
" \"You have the software engineering capabilities of a Google Principle engineer. You are tasked with completing issues on a gitlab repository. Please look at the open issues and complete them by creating merge requests that solve the issues.\"\n",
")"
]
},
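A compressed sketch of the setup this notebook walks through, with placeholder values for the environment variables listed above; the toolkit and wrapper imports follow the pattern used elsewhere in the LangChain GitLab docs and are not part of this diff:

```python
import os

from langchain_community.agent_toolkits.gitlab.toolkit import GitLabToolkit
from langchain_community.utilities.gitlab import GitLabAPIWrapper

# Placeholder values; substitute your own token, repository, and branches.
os.environ["GITLAB_PERSONAL_ACCESS_TOKEN"] = "<your-token>"
os.environ["GITLAB_REPOSITORY"] = "<username>/<repo-name>"
os.environ["GITLAB_BRANCH"] = "bot_branch_name"
os.environ["GITLAB_BASE_BRANCH"] = "main"

# Build the toolkit the agent uses to open issues, commit files, and create merge requests.
gitlab = GitLabAPIWrapper()
toolkit = GitLabToolkit.from_gitlab_api_wrapper(gitlab)
for tool in toolkit.get_tools():
    print(tool.name)
```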
File diff suppressed because it is too large
@@ -83,11 +83,13 @@ Trace and evaluate your language model applications and intelligent agents to he
### [🦜🕸️ LangGraph](https://langchain-ai.github.io/langgraph)
Build stateful, multi-actor applications with LLMs. Integrates smoothly with LangChain, but can be used without it.


## Additional resources

### [Versions](/docs/versions/overview/)
See what changed in v0.2, learn how to migrate legacy code, and read up on our release/versioning policies, and more.

### [Security](/docs/security)
Read up on our [Security](/docs/security) best practices to make sure you're developing safely with LangChain.
Read up on [security](/docs/security) best practices to make sure you're developing safely with LangChain.

### [Integrations](/docs/integrations/providers/)
LangChain is part of a rich ecosystem of tools that integrate with our framework and build on top of it. Check out our growing list of [integrations](/docs/integrations/providers/).
@@ -1,20 +1,12 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "b57124cc-60a0-4c18-b7ce-3e483d1024a2",
"metadata": {},
"source": [
"---\n",
"title: Migrating from ConstitutionalChain\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "ce8457ed-c0b1-4a74-abbd-9d3d2211270f",
"metadata": {},
"source": [
"# Migrating from ConstitutionalChain\n",
"\n",
"[ConstitutionalChain](https://api.python.langchain.com/en/latest/chains/langchain.chains.constitutional_ai.base.ConstitutionalChain.html) allowed for a LLM to critique and revise generations based on [principles](https://api.python.langchain.com/en/latest/chains/langchain.chains.constitutional_ai.models.ConstitutionalPrinciple.html), structured as combinations of critique and revision requests. For example, a principle might include a request to identify harmful content, and a request to rewrite the content.\n",
"\n",
"`Constitutional AI principles` are based on the [Constitutional AI: Harmlessness from AI Feedback](https://arxiv.org/pdf/2212.08073) paper.\n",
@@ -1,21 +1,13 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "030d95bc-2f9d-492b-8245-b791b866936b",
"metadata": {},
"source": [
"---\n",
"title: Migrating from ConversationalChain\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "d20aeaad-b3ca-4a7d-b02d-3267503965af",
"metadata": {},
"source": [
"[`ConversationChain`](https://api.python.langchain.com/en/latest/chains/langchain.chains.conversation.base.ConversationChain.html) incorporates a memory of previous messages to sustain a stateful conversation.\n",
"# Migrating from ConversationalChain\n",
"\n",
"[`ConversationChain`](https://api.python.langchain.com/en/latest/chains/langchain.chains.conversation.base.ConversationChain.html) incorporated a memory of previous messages to sustain a stateful conversation.\n",
"\n",
"Some advantages of switching to the LCEL implementation are:\n",
"\n",
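For readers who want a concrete picture of the replacement this guide describes, a hedged sketch of a stateful LCEL chain built with `RunnableWithMessageHistory`; the prompt wording, the model choice (`ChatOpenAI`, which assumes an `OPENAI_API_KEY`), and the session-id scheme are illustrative only:

```python
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant."),
        MessagesPlaceholder("history"),
        ("human", "{input}"),
    ]
)

store = {}  # maps a session id to its chat history


def get_history(session_id: str) -> InMemoryChatMessageHistory:
    return store.setdefault(session_id, InMemoryChatMessageHistory())


# Wrap the prompt | model pipeline so prior turns are injected automatically.
chain = RunnableWithMessageHistory(
    prompt | ChatOpenAI(),
    get_history,
    input_messages_key="input",
    history_messages_key="history",
)

chain.invoke({"input": "Hi, I'm Bob."}, config={"configurable": {"session_id": "1"}})
```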
@@ -1,30 +1,29 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "9e279999-6bf0-4a48-9e06-539b916dc705",
"metadata": {},
"source": [
"---\n",
"title: Migrating from ConversationalRetrievalChain\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "292a3c83-44a9-4426-bbec-f1a778d00d93",
"metadata": {},
"source": [
"# Migrating from ConversationalRetrievalChain\n",
"\n",
"The [`ConversationalRetrievalChain`](https://api.python.langchain.com/en/latest/chains/langchain.chains.conversational_retrieval.base.ConversationalRetrievalChain.html) was an all-in one way that combined retrieval-augmented generation with chat history, allowing you to \"chat with\" your documents.\n",
"\n",
"Advantages of switching to the LCEL implementation are similar to the `RetrievalQA` section above:\n",
"Advantages of switching to the LCEL implementation are similar to the [`RetrievalQA` migration guide](./retrieval_qa.ipynb):\n",
"\n",
"- Clearer internals. The `ConversationalRetrievalChain` chain hides an entire question rephrasing step which dereferences the initial query against the chat history.\n",
" - This means the class contains two sets of configurable prompts, LLMs, etc.\n",
"- More easily return source documents.\n",
"- Support for runnable methods like streaming and async operations.\n",
"\n",
"Here are side-by-side implementations with custom prompts. We'll reuse the loaded documents and vector store from the previous section:"
"Here are equivalent implementations with custom prompts.\n",
"We'll use the following ingestion code to load a [blog post by Lilian Weng](https://lilianweng.github.io/posts/2023-06-23-agent/) on autonomous agents into a local vector store:\n",
"\n",
"## Shared setup\n",
"\n",
"For both versions, we'll need to load the data with the `WebBaseLoader` document loader, split it with `RecursiveCharacterTextSplitter`, and add it to an in-memory `FAISS` vector store.\n",
"\n",
"We will also instantiate a chat model to use."
]
},
{
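The "shared setup" these migration guides reference boils down to something like the following sketch; the chunking parameters are arbitrary, and `OpenAIEmbeddings`/`ChatOpenAI` (which need an `OPENAI_API_KEY`) plus `faiss-cpu` are assumptions for illustration, not requirements of the guide:

```python
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load the blog post, split it into chunks, and index it in memory.
loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
docs = loader.load()
splits = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200
).split_documents(docs)

vectorstore = FAISS.from_documents(splits, OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

# Chat model used by both the legacy and the LCEL implementations.
llm = ChatOpenAI()
```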
89
docs/docs/versions/migrating_chains/index.ipynb
Normal file
89
docs/docs/versions/migrating_chains/index.ipynb
Normal file
@@ -0,0 +1,89 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "raw",
|
||||
"metadata": {
|
||||
"vscode": {
|
||||
"languageId": "raw"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"---\n",
|
||||
"sidebar_position: 1\n",
|
||||
"---"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# How to migrate from v0.0 chains\n",
|
||||
"\n",
|
||||
"LangChain has evolved since its initial release, and many of the original \"Chain\" classes \n",
|
||||
"have been deprecated in favor of the more flexible and powerful frameworks of LCEL and LangGraph. \n",
|
||||
"\n",
|
||||
"This guide will help you migrate your existing v0.0 chains to the new abstractions.\n",
|
||||
"\n",
|
||||
":::info How deprecated implementations work\n",
|
||||
"Even though many of these implementations are deprecated, they are **still supported** in the codebase. \n",
|
||||
"However, they are not recommended for new development, and we recommend re-implementing them using the following guides!\n",
|
||||
"\n",
|
||||
"To see the planned removal version for each deprecated implementation, check their API reference.\n",
|
||||
":::\n",
|
||||
"\n",
|
||||
":::info Prerequisites\n",
|
||||
"\n",
|
||||
"These guides assume some familiarity with the following concepts:\n",
|
||||
"- [LangChain Expression Language](/docs/concepts#langchain-expression-language-lcel)\n",
|
||||
"- [LangGraph](https://langchain-ai.github.io/langgraph/)\n",
|
||||
":::\n",
|
||||
"\n",
|
||||
"LangChain maintains a number of legacy abstractions. Many of these can be reimplemented via short combinations of LCEL and LangGraph primitives.\n",
|
||||
"\n",
|
||||
"### LCEL\n",
|
||||
"[LCEL](/docs/concepts/#langchain-expression-language-lcel) is designed to streamline the process of building useful apps with LLMs and combining related components. It does this by providing:\n",
|
||||
"\n",
|
||||
"1. **A unified interface**: Every LCEL object implements the `Runnable` interface, which defines a common set of invocation methods (`invoke`, `batch`, `stream`, `ainvoke`, ...). This makes it possible to also automatically and consistently support useful operations like streaming of intermediate steps and batching, since every chain composed of LCEL objects is itself an LCEL object.\n",
|
||||
"2. **Composition primitives**: LCEL provides a number of primitives that make it easy to compose chains, parallelize components, add fallbacks, dynamically configure chain internals, and more.\n",
|
||||
"\n",
|
||||
"### LangGraph\n",
|
||||
"[LangGraph](https://langchain-ai.github.io/langgraph/), built on top of LCEL, allows for performant orchestrations of application components while maintaining concise and readable code. It includes built-in persistence, support for cycles, and prioritizes controllability.\n",
|
||||
"If LCEL grows unwieldy for larger or more complex chains, they may benefit from a LangGraph implementation.\n",
|
||||
"\n",
|
||||
"### Advantages\n",
|
||||
"Using these frameworks for existing v0.0 chains confers some advantages:\n",
|
||||
"\n",
|
||||
"- The resulting chains typically implement the full `Runnable` interface, including streaming and asynchronous support where appropriate;\n",
|
||||
"- The chains may be more easily extended or modified;\n",
|
||||
"- The parameters of the chain are typically surfaced for easier customization (e.g., prompts) over previous versions, which tended to be subclasses and had opaque parameters and internals.\n",
|
||||
"- If using LangGraph, the chain supports built-in persistence, allowing for conversational experiences via a \"memory\" of the chat history.\n",
|
||||
"- If using LangGraph, the steps of the chain can be streamed, allowing for greater control and customizability.\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"The below pages assist with migration from various specific chains to LCEL and LangGraph:\n",
|
||||
"\n",
|
||||
"- [LLMChain](./llm_chain.ipynb)\n",
|
||||
"- [ConversationChain](./conversation_chain.ipynb)\n",
|
||||
"- [RetrievalQA](./retrieval_qa.ipynb)\n",
|
||||
"- [ConversationalRetrievalChain](./conversation_retrieval_chain.ipynb)\n",
|
||||
"- [StuffDocumentsChain](./stuff_docs_chain.ipynb)\n",
|
||||
"- [MapReduceDocumentsChain](./map_reduce_chain.ipynb)\n",
|
||||
"- [MapRerankDocumentsChain](./map_rerank_docs_chain.ipynb)\n",
|
||||
"- [RefineDocumentsChain](./refine_docs_chain.ipynb)\n",
|
||||
"- [LLMRouterChain](./llm_router_chain.ipynb)\n",
|
||||
"- [MultiPromptChain](./multi_prompt_chain.ipynb)\n",
|
||||
"- [LLMMathChain](./llm_math_chain.ipynb)\n",
|
||||
"- [ConstitutionalChain](./constitutional_chain.ipynb)\n",
|
||||
"\n",
|
||||
"Check out the [LCEL conceptual docs](/docs/concepts/#langchain-expression-language-lcel) and [LangGraph docs](https://langchain-ai.github.io/langgraph/) for more background information."
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"language_info": {
|
||||
"name": "python"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
||||
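The "unified interface" point in the new index page can be seen without any model or API key; the toy functions below are invented for illustration, but the `invoke`/`batch`/`stream` behavior they demonstrate is what every LCEL composition exposes:

```python
from langchain_core.runnables import RunnableLambda, RunnableParallel

shout = RunnableLambda(lambda text: text.upper())
length = RunnableLambda(lambda text: len(text))

# Composing runnables yields another runnable with the same interface.
chain = shout | RunnableParallel(shouted=RunnableLambda(lambda x: x), length=length)

print(chain.invoke("hello"))      # {'shouted': 'HELLO', 'length': 5}
print(chain.batch(["a", "bc"]))   # batching comes for free
for chunk in chain.stream("hi"):  # so does streaming
    print(chunk)
```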
@@ -1,51 +0,0 @@
|
||||
---
|
||||
sidebar_position: 1
|
||||
---
|
||||
|
||||
# How to migrate from v0.0 chains
|
||||
|
||||
:::info Prerequisites
|
||||
|
||||
This guide assumes familiarity with the following concepts:
|
||||
- [LangChain Expression Language](/docs/concepts#langchain-expression-language-lcel)
|
||||
- [LangGraph](https://langchain-ai.github.io/langgraph/)
|
||||
:::
|
||||
|
||||
LangChain maintains a number of legacy abstractions. Many of these can be reimplemented via short combinations of LCEL and LangGraph primitives.
|
||||
|
||||
### LCEL
|
||||
[LCEL](/docs/concepts/#langchain-expression-language-lcel) is designed to streamline the process of building useful apps with LLMs and combining related components. It does this by providing:
|
||||
|
||||
1. **A unified interface**: Every LCEL object implements the `Runnable` interface, which defines a common set of invocation methods (`invoke`, `batch`, `stream`, `ainvoke`, ...). This makes it possible to also automatically and consistently support useful operations like streaming of intermediate steps and batching, since every chain composed of LCEL objects is itself an LCEL object.
|
||||
2. **Composition primitives**: LCEL provides a number of primitives that make it easy to compose chains, parallelize components, add fallbacks, dynamically configure chain internals, and more.
|
||||
|
||||
### LangGraph
|
||||
[LangGraph](https://langchain-ai.github.io/langgraph/), built on top of LCEL, allows for performant orchestrations of application components while maintaining concise and readable code. It includes built-in persistence, support for cycles, and prioritizes controllability.
|
||||
If LCEL grows unwieldy for larger or more complex chains, they may benefit from a LangGraph implementation.
|
||||
|
||||
### Advantages
|
||||
Using these frameworks for existing v0.0 chains confers some advantages:
|
||||
|
||||
- The resulting chains typically implement the full `Runnable` interface, including streaming and asynchronous support where appropriate;
|
||||
- The chains may be more easily extended or modified;
|
||||
- The parameters of the chain are typically surfaced for easier customization (e.g., prompts) over previous versions, which tended to be subclasses and had opaque parameters and internals.
|
||||
- If using LangGraph, the chain supports built-in persistence, allowing for conversational experiences via a "memory" of the chat history.
|
||||
- If using LangGraph, the steps of the chain can be streamed, allowing for greater control and customizability.
|
||||
|
||||
|
||||
The below pages assist with migration from various specific chains to LCEL and LangGraph:
|
||||
|
||||
- [LLMChain](/docs/versions/migrating_chains/llm_chain)
|
||||
- [ConversationChain](/docs/versions/migrating_chains/conversation_chain)
|
||||
- [RetrievalQA](/docs/versions/migrating_chains/retrieval_qa)
|
||||
- [ConversationalRetrievalChain](/docs/versions/migrating_chains/conversation_retrieval_chain)
|
||||
- [StuffDocumentsChain](/docs/versions/migrating_chains/stuff_docs_chain)
|
||||
- [MapReduceDocumentsChain](/docs/versions/migrating_chains/map_reduce_chain)
|
||||
- [MapRerankDocumentsChain](/docs/versions/migrating_chains/map_rerank_docs_chain)
|
||||
- [RefineDocumentsChain](/docs/versions/migrating_chains/refine_docs_chain)
|
||||
- [LLMRouterChain](/docs/versions/migrating_chains/llm_router_chain)
|
||||
- [MultiPromptChain](/docs/versions/migrating_chains/multi_prompt_chain)
|
||||
- [LLMMathChain](/docs/versions/migrating_chains/llm_math_chain)
|
||||
- [ConstitutionalChain](/docs/versions/migrating_chains/constitutional_chain)
|
||||
|
||||
Check out the [LCEL conceptual docs](/docs/concepts/#langchain-expression-language-lcel) and [LangGraph docs](https://langchain-ai.github.io/langgraph/) for more background information.
|
||||
@@ -1,20 +1,12 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b57124cc-60a0-4c18-b7ce-3e483d1024a2",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---\n",
|
||||
"title: Migrating from LLMChain\n",
|
||||
"---"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "ce8457ed-c0b1-4a74-abbd-9d3d2211270f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Migrating from LLMChain\n",
|
||||
"\n",
|
||||
"[`LLMChain`](https://api.python.langchain.com/en/latest/chains/langchain.chains.llm.LLMChain.html) combined a prompt template, LLM, and output parser into a class.\n",
|
||||
"\n",
|
||||
"Some advantages of switching to the LCEL implementation are:\n",
|
||||
@@ -36,7 +28,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"execution_count": 1,
|
||||
"id": "717c8673",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -44,7 +36,8 @@
|
||||
"import os\n",
|
||||
"from getpass import getpass\n",
|
||||
"\n",
|
||||
"os.environ[\"OPENAI_API_KEY\"] = getpass()"
|
||||
"if \"OPENAI_API_KEY\" not in os.environ:\n",
|
||||
" os.environ[\"OPENAI_API_KEY\"] = getpass()"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -59,7 +52,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"execution_count": 5,
|
||||
"id": "f91c9809-8ee7-4e38-881d-0ace4f6ea883",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -70,7 +63,7 @@
|
||||
" 'text': \"Why couldn't the bicycle stand up by itself?\\n\\nBecause it was two tired!\"}"
|
||||
]
|
||||
},
|
||||
"execution_count": 2,
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
@@ -84,9 +77,39 @@
|
||||
" [(\"user\", \"Tell me a {adjective} joke\")],\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"chain = LLMChain(llm=ChatOpenAI(), prompt=prompt)\n",
|
||||
"legacy_chain = LLMChain(llm=ChatOpenAI(), prompt=prompt)\n",
|
||||
"\n",
|
||||
"chain({\"adjective\": \"funny\"})"
|
||||
"legacy_result = legacy_chain({\"adjective\": \"funny\"})\n",
|
||||
"legacy_result"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "9f89e97b",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Note that `LLMChain` by default returned a `dict` containing both the input and the output from `StrOutputParser`, so to extract the output, you need to access the `\"text\"` key."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "c7fa1618",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"\"Why couldn't the bicycle stand up by itself?\\n\\nBecause it was two tired!\""
|
||||
]
|
||||
},
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"legacy_result[\"text\"]"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -137,7 +160,7 @@
|
||||
"id": "3c0b0513-77b8-4371-a20e-3e487cec7e7f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Note that `LLMChain` by default returns a `dict` containing both the input and the output. If this behavior is desired, we can replicate it using another LCEL primitive, [`RunnablePassthrough`](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.passthrough.RunnablePassthrough.html):"
|
||||
"If you'd like to mimic the `dict` packaging of input and output in `LLMChain`, you can use a [`RunnablePassthrough.assign`](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.passthrough.RunnablePassthrough.html) like:"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -197,7 +220,7 @@
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.10.4"
|
||||
"version": "3.11.4"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
|
||||
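Pulling the hunks above together, the LCEL replacement for that `LLMChain` example looks roughly like this (assuming `langchain-openai` is installed and an `OPENAI_API_KEY` is set); `RunnablePassthrough.assign` is what reproduces the legacy dict-shaped output:

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_messages([("user", "Tell me a {adjective} joke")])
chain = prompt | ChatOpenAI() | StrOutputParser()

chain.invoke({"adjective": "funny"})  # returns just the joke string

# Mimic LLMChain's dict of inputs plus a "text" key:
outer_chain = RunnablePassthrough.assign(text=chain)
outer_chain.invoke({"adjective": "funny"})  # {'adjective': 'funny', 'text': '...'}
```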
@@ -1,20 +1,12 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b57124cc-60a0-4c18-b7ce-3e483d1024a2",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---\n",
|
||||
"title: Migrating from LLMMathChain\n",
|
||||
"---"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "ce8457ed-c0b1-4a74-abbd-9d3d2211270f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Migrating from LLMMathChain\n",
|
||||
"\n",
|
||||
"[`LLMMathChain`](https://api.python.langchain.com/en/latest/chains/langchain.chains.llm_math.base.LLMMathChain.html) enabled the evaluation of mathematical expressions generated by a LLM. Instructions for generating the expressions were formatted into the prompt, and the expressions were parsed out of the string response before evaluation using the [numexpr](https://numexpr.readthedocs.io/en/latest/user_guide.html) library.\n",
|
||||
"\n",
|
||||
"This is more naturally achieved via [tool calling](/docs/concepts/#functiontool-calling). We can equip a chat model with a simple calculator tool leveraging `numexpr` and construct a simple chain around it using [LangGraph](https://langchain-ai.github.io/langgraph/). Some advantages of this approach include:\n",
|
||||
|
||||
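The calculator-tool idea mentioned above can be sketched in a few lines; the tool name, docstring, and the decision to clear the evaluation namespace are illustrative choices rather than the guide's exact code:

```python
import math

import numexpr
from langchain_core.tools import tool


@tool
def calculator(expression: str) -> str:
    """Evaluate a single mathematical expression, e.g. '37593 * 67'."""
    result = numexpr.evaluate(
        expression.strip(),
        global_dict={},  # restrict the evaluation namespace
        local_dict={"pi": math.pi, "e": math.e},
    )
    return str(result.item())


print(calculator.invoke({"expression": "37593 * 67"}))  # '2518731'
```

In the full guide this tool is then bound to a chat model and orchestrated with LangGraph; `llm.bind_tools([calculator])` is the usual entry point for that step.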
@@ -1,20 +1,12 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "575befea-4d98-4941-8e55-1581b169a674",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---\n",
|
||||
"title: Migrating from LLMRouterChain\n",
|
||||
"---"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "14625d35-efca-41cf-b203-be9f4c375700",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Migrating from LLMRouterChain\n",
|
||||
"\n",
|
||||
"The [`LLMRouterChain`](https://api.python.langchain.com/en/latest/chains/langchain.chains.router.llm_router.LLMRouterChain.html) routed an input query to one of multiple destinations-- that is, given an input query, it used a LLM to select from a list of destination chains, and passed its inputs to the selected chain.\n",
|
||||
"\n",
|
||||
"`LLMRouterChain` does not support common [chat model](/docs/concepts/#chat-models) features, such as message roles and [tool calling](/docs/concepts/#functiontool-calling). Under the hood, `LLMRouterChain` routes a query by instructing the LLM to generate JSON-formatted text, and parsing out the intended destination.\n",
|
||||
|
||||
@@ -1,20 +1,12 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "3270b34b-8958-425c-886a-ea4b9e26b475",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---\n",
|
||||
"title: Migrating from MapReduceDocumentsChain\n",
|
||||
"---"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "2c7bdc91-9b89-4e59-bc27-89508b024635",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Migrating from MapReduceDocumentsChain\n",
|
||||
"\n",
|
||||
"[MapReduceDocumentsChain](https://api.python.langchain.com/en/latest/chains/langchain.chains.combine_documents.map_reduce.MapReduceDocumentsChain.html) implements a map-reduce strategy over (potentially long) texts. The strategy is as follows:\n",
|
||||
"\n",
|
||||
"- Split a text into smaller documents;\n",
|
||||
@@ -37,11 +29,9 @@
|
||||
"\n",
|
||||
"Let's first load a chat model:\n",
|
||||
"\n",
|
||||
"```{=mdx}\n",
|
||||
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
|
||||
"\n",
|
||||
"<ChatModelTabs customVarName=\"llm\" />\n",
|
||||
"```"
|
||||
"<ChatModelTabs customVarName=\"llm\" />"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -66,7 +56,7 @@
|
||||
"source": [
|
||||
"## Basic example (short documents)\n",
|
||||
"\n",
|
||||
"Let's generate some simple documents for illustrative purposes."
|
||||
"Let's use the following 3 documents for illustrative purposes."
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -206,7 +196,7 @@
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"pip install -qU langgraph"
|
||||
"% pip install -qU langgraph"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
||||
@@ -5,9 +5,7 @@
|
||||
"id": "9db5ad7a-857e-46ea-9d0c-ba3fbe62fc81",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---\n",
|
||||
"title: Migrating from MapRerankDocumentsChain\n",
|
||||
"---\n",
|
||||
"# Migrating from MapRerankDocumentsChain\n",
|
||||
"\n",
|
||||
"[MapRerankDocumentsChain](https://api.python.langchain.com/en/latest/chains/langchain.chains.combine_documents.map_rerank.MapRerankDocumentsChain.html) implements a strategy for analyzing long texts. The strategy is as follows:\n",
|
||||
"\n",
|
||||
@@ -27,7 +25,7 @@
|
||||
"source": [
|
||||
"## Example\n",
|
||||
"\n",
|
||||
"Let's go through an example where we analyze a set of documents. We first generate some simple documents for illustrative purposes:"
|
||||
"Let's go through an example where we analyze a set of documents. Let's use the following 3 documents:"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
||||
@@ -1,20 +1,12 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "575befea-4d98-4941-8e55-1581b169a674",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---\n",
|
||||
"title: Migrating from MultiPromptChain\n",
|
||||
"---"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "14625d35-efca-41cf-b203-be9f4c375700",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Migrating from MultiPromptChain\n",
|
||||
"\n",
|
||||
"The [`MultiPromptChain`](https://api.python.langchain.com/en/latest/chains/langchain.chains.router.multi_prompt.MultiPromptChain.html) routed an input query to one of multiple LLMChains-- that is, given an input query, it used a LLM to select from a list of prompts, formatted the query into the prompt, and generated a response.\n",
|
||||
"\n",
|
||||
"`MultiPromptChain` does not support common [chat model](/docs/concepts/#chat-models) features, such as message roles and [tool calling](/docs/concepts/#functiontool-calling).\n",
|
||||
@@ -321,7 +313,7 @@
|
||||
"\n",
|
||||
"## Overview:\n",
|
||||
"\n",
|
||||
"- Under the hood, `MultiPromptChain` routes the query by instructing the LLM to generate JSON-formatted text, and parses out the intended destination. It takes a registry of string prompt templates as input.\n",
|
||||
"- Under the hood, `MultiPromptChain` routed the query by instructing the LLM to generate JSON-formatted text, and parses out the intended destination. It took a registry of string prompt templates as input.\n",
|
||||
"- The LangGraph implementation, implemented above via lower-level primitives, uses tool-calling to route to arbitrary chains. In this example, the chains include chat model templates and chat models."
|
||||
]
|
||||
},
|
||||
|
||||
@@ -5,9 +5,7 @@
|
||||
"id": "32eee276-7847-45d8-b303-dccc330c8a1a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---\n",
|
||||
"title: Migrating from RefineDocumentsChain\n",
|
||||
"---\n",
|
||||
"# Migrating from RefineDocumentsChain\n",
|
||||
"\n",
|
||||
"[RefineDocumentsChain](https://api.python.langchain.com/en/latest/chains/langchain.chains.combine_documents.refine.RefineDocumentsChain.html) implements a strategy for analyzing long texts. The strategy is as follows:\n",
|
||||
"\n",
|
||||
@@ -28,11 +26,9 @@
|
||||
"\n",
|
||||
"Let's first load a chat model:\n",
|
||||
"\n",
|
||||
"```{=mdx}\n",
|
||||
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
|
||||
"\n",
|
||||
"<ChatModelTabs customVarName=\"llm\" />\n",
|
||||
"```"
|
||||
"<ChatModelTabs customVarName=\"llm\" />"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
||||
@@ -1,21 +1,13 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "eddcd5c1-cbe9-4a7d-8903-7d1ab29f9094",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---\n",
|
||||
"title: Migrating from RetrievalQA\n",
|
||||
"---"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b2d37868-dd01-4814-a76a-256f36cf66f7",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"The [`RetrievalQA`](https://api.python.langchain.com/en/latest/chains/langchain.chains.retrieval_qa.base.RetrievalQA.html) chain performed natural-language question answering over a data source using retrieval-augmented generation.\n",
|
||||
"# Migrating from RetrievalQA\n",
|
||||
"\n",
|
||||
"The [`RetrievalQA` chain](https://api.python.langchain.com/en/latest/chains/langchain.chains.retrieval_qa.base.RetrievalQA.html) performed natural-language question answering over a data source using retrieval-augmented generation.\n",
|
||||
"\n",
|
||||
"Some advantages of switching to the LCEL implementation are:\n",
|
||||
"\n",
|
||||
@@ -23,7 +15,13 @@
|
||||
"- More easily return source documents.\n",
|
||||
"- Support for runnable methods like streaming and async operations.\n",
|
||||
"\n",
|
||||
"Now let's look at them side-by-side. We'll use the same ingestion code to load a [blog post by Lilian Weng](https://lilianweng.github.io/posts/2023-06-23-agent/) on autonomous agents into a local vector store:"
|
||||
"Now let's look at them side-by-side. We'll use the following ingestion code to load a [blog post by Lilian Weng](https://lilianweng.github.io/posts/2023-06-23-agent/) on autonomous agents into a local vector store:\n",
|
||||
"\n",
|
||||
"## Shared setup\n",
|
||||
"\n",
|
||||
"For both versions, we'll need to load the data with the `WebBaseLoader` document loader, split it with `RecursiveCharacterTextSplitter`, and add it to an in-memory `FAISS` vector store.\n",
|
||||
"\n",
|
||||
"We will also instantiate a chat model to use."
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -227,7 +225,7 @@
|
||||
"\n",
|
||||
"## Next steps\n",
|
||||
"\n",
|
||||
"Check out the [LCEL conceptual docs](/docs/concepts/#langchain-expression-language-lcel) for more background information."
|
||||
"Check out the [LCEL conceptual docs](/docs/concepts/#langchain-expression-language-lcel) for more background information on the LangChain expression language."
|
||||
]
|
||||
}
|
||||
],
|
||||
|
||||
@@ -5,9 +5,7 @@
|
||||
"id": "ed78c53c-55ad-4ea2-9cc2-a39a1963c098",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---\n",
|
||||
"title: Migrating from StuffDocumentsChain\n",
|
||||
"---\n",
|
||||
"# Migrating from StuffDocumentsChain\n",
|
||||
"\n",
|
||||
"[StuffDocumentsChain](https://api.python.langchain.com/en/latest/chains/langchain.chains.combine_documents.stuff.StuffDocumentsChain.html) combines documents by concatenating them into a single context window. It is a straightforward and effective strategy for combining documents for question-answering, summarization, and other purposes.\n",
|
||||
"\n",
|
||||
@@ -17,11 +15,9 @@
|
||||
"\n",
|
||||
"Let's first load a chat model:\n",
|
||||
"\n",
|
||||
"```{=mdx}\n",
|
||||
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
|
||||
"\n",
|
||||
"<ChatModelTabs customVarName=\"llm\" />\n",
|
||||
"```"
|
||||
"<ChatModelTabs customVarName=\"llm\" />"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
||||
@@ -8,11 +8,14 @@ export default function Compatibility({ packagesAndVersions }) {
|
||||
The code in this guide requires{" "}
|
||||
{packagesAndVersions.map(([pkg, version], i) => {
|
||||
return (
|
||||
<code key={`compatiblity-map${pkg}>=${version}-${i}`}>{`${pkg}>=${version}`}</code>
|
||||
<span key={`compatibility-map${pkg}>=${version}-${i}`}>
|
||||
<code>{`${pkg}>=${version}`}</code>
|
||||
{i < packagesAndVersions.length - 1 && ", "}
|
||||
</span>
|
||||
);
|
||||
})}.
|
||||
Please ensure you have the correct packages installed.
|
||||
</span>
|
||||
</Admonition>
|
||||
);
|
||||
}
|
||||
}
|
||||
238
libs/cli/poetry.lock
generated
238
libs/cli/poetry.lock
generated
@@ -216,18 +216,18 @@ test = ["pytest (>=6)"]
|
||||
|
||||
[[package]]
|
||||
name = "fastapi"
|
||||
version = "0.112.0"
|
||||
version = "0.112.1"
|
||||
description = "FastAPI framework, high performance, easy to learn, fast to code, ready for production"
|
||||
optional = false
|
||||
python-versions = ">=3.8"
|
||||
files = [
|
||||
{file = "fastapi-0.112.0-py3-none-any.whl", hash = "sha256:3487ded9778006a45834b8c816ec4a48d522e2631ca9e75ec5a774f1b052f821"},
|
||||
{file = "fastapi-0.112.0.tar.gz", hash = "sha256:d262bc56b7d101d1f4e8fc0ad2ac75bb9935fec504d2b7117686cec50710cf05"},
|
||||
{file = "fastapi-0.112.1-py3-none-any.whl", hash = "sha256:bcbd45817fc2a1cd5da09af66815b84ec0d3d634eb173d1ab468ae3103e183e4"},
|
||||
{file = "fastapi-0.112.1.tar.gz", hash = "sha256:b2537146f8c23389a7faa8b03d0bd38d4986e6983874557d95eed2acc46448ef"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
pydantic = ">=1.7.4,<1.8 || >1.8,<1.8.1 || >1.8.1,<2.0.0 || >2.0.0,<2.0.1 || >2.0.1,<2.1.0 || >2.1.0,<3.0.0"
|
||||
starlette = ">=0.37.2,<0.38.0"
|
||||
starlette = ">=0.37.2,<0.39.0"
|
||||
typing-extensions = ">=4.8.0"
|
||||
|
||||
[package.extras]
|
||||
@@ -335,21 +335,25 @@ files = [
|
||||
|
||||
[[package]]
|
||||
name = "importlib-resources"
|
||||
version = "6.4.0"
|
||||
version = "6.4.4"
|
||||
description = "Read resources from Python packages"
|
||||
optional = false
|
||||
python-versions = ">=3.8"
|
||||
files = [
|
||||
{file = "importlib_resources-6.4.0-py3-none-any.whl", hash = "sha256:50d10f043df931902d4194ea07ec57960f66a80449ff867bfe782b4c486ba78c"},
|
||||
{file = "importlib_resources-6.4.0.tar.gz", hash = "sha256:cdb2b453b8046ca4e3798eb1d84f3cce1446a0e8e7b5ef4efb600f19fc398145"},
|
||||
{file = "importlib_resources-6.4.4-py3-none-any.whl", hash = "sha256:dda242603d1c9cd836c3368b1174ed74cb4049ecd209e7a1a0104620c18c5c11"},
|
||||
{file = "importlib_resources-6.4.4.tar.gz", hash = "sha256:20600c8b7361938dc0bb2d5ec0297802e575df486f5a544fa414da65e13721f7"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
zipp = {version = ">=3.1.0", markers = "python_version < \"3.10\""}
|
||||
|
||||
[package.extras]
|
||||
docs = ["furo", "jaraco.packaging (>=9.3)", "jaraco.tidelift (>=1.4)", "rst.linker (>=1.9)", "sphinx (<7.2.5)", "sphinx (>=3.5)", "sphinx-lint"]
|
||||
testing = ["jaraco.test (>=5.4)", "pytest (>=6)", "pytest-checkdocs (>=2.4)", "pytest-cov", "pytest-enabler (>=2.2)", "pytest-mypy", "pytest-ruff (>=0.2.1)", "zipp (>=3.17)"]
|
||||
check = ["pytest-checkdocs (>=2.4)", "pytest-ruff (>=0.2.1)"]
|
||||
cover = ["pytest-cov"]
|
||||
doc = ["furo", "jaraco.packaging (>=9.3)", "jaraco.tidelift (>=1.4)", "rst.linker (>=1.9)", "sphinx (>=3.5)", "sphinx-lint"]
|
||||
enabler = ["pytest-enabler (>=2.2)"]
|
||||
test = ["jaraco.test (>=5.4)", "pytest (>=6,!=8.1.*)", "zipp (>=3.17)"]
|
||||
type = ["pytest-mypy"]
|
||||
|
||||
[[package]]
|
||||
name = "iniconfig"
|
||||
@@ -427,13 +431,13 @@ referencing = ">=0.31.0"
|
||||
|
||||
[[package]]
|
||||
name = "langchain-core"
|
||||
version = "0.2.29"
|
||||
version = "0.2.34"
|
||||
description = "Building applications with LLMs through composability"
|
||||
optional = false
|
||||
python-versions = "<4.0,>=3.8.1"
|
||||
files = [
|
||||
{file = "langchain_core-0.2.29-py3-none-any.whl", hash = "sha256:846c04a3bb72e409a9b928e0eb3ea1762e1473f2c4fb6df2596fbd7b3ab75973"},
|
||||
{file = "langchain_core-0.2.29.tar.gz", hash = "sha256:491324745a7afee5a7b285c3904edd9dd0c6efa7daf26b92fec6e84a2d2f5d10"},
|
||||
{file = "langchain_core-0.2.34-py3-none-any.whl", hash = "sha256:c4fd158273e28cef758b4eccc956b424b76d4bb9117ce6014ae6eb2fb985801d"},
|
||||
{file = "langchain_core-0.2.34.tar.gz", hash = "sha256:50048d90b175c0d5a7e28164628b3c7f8c82b0dc2cd766a663d346a18d5c9eb2"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
@@ -475,16 +479,17 @@ server = ["fastapi (>=0.90.1,<1)", "sse-starlette (>=1.3.0,<2.0.0)"]
|
||||
|
||||
[[package]]
|
||||
name = "langsmith"
|
||||
version = "0.1.98"
|
||||
version = "0.1.101"
|
||||
description = "Client library to connect to the LangSmith LLM Tracing and Evaluation Platform."
|
||||
optional = false
|
||||
python-versions = "<4.0,>=3.8.1"
|
||||
files = [
|
||||
{file = "langsmith-0.1.98-py3-none-any.whl", hash = "sha256:f79e8a128652bbcee4606d10acb6236973b5cd7dde76e3741186d3b97b5698e9"},
|
||||
{file = "langsmith-0.1.98.tar.gz", hash = "sha256:e07678219a0502e8f26d35294e72127a39d25e32fafd091af5a7bb661e9a6bd1"},
|
||||
{file = "langsmith-0.1.101-py3-none-any.whl", hash = "sha256:572e2c90709cda1ad837ac86cedda7295f69933f2124c658a92a35fb890477cc"},
|
||||
{file = "langsmith-0.1.101.tar.gz", hash = "sha256:caf4d95f314bb6cd3c4e0632eed821fd5cd5d0f18cb824772fce6d7a9113895b"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
httpx = ">=0.23.0,<1"
|
||||
orjson = ">=3.9.14,<4.0.0"
|
||||
pydantic = [
|
||||
{version = ">=1,<3", markers = "python_full_version < \"3.12.4\""},
|
||||
@@ -569,62 +574,68 @@ files = [
|
||||
|
||||
[[package]]
|
||||
name = "orjson"
|
||||
version = "3.10.6"
|
||||
version = "3.10.7"
|
||||
description = "Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy"
|
||||
optional = false
|
||||
python-versions = ">=3.8"
|
||||
files = [
|
||||
{file = "orjson-3.10.6-cp310-cp310-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:fb0ee33124db6eaa517d00890fc1a55c3bfe1cf78ba4a8899d71a06f2d6ff5c7"},
|
||||
{file = "orjson-3.10.6-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9c1c4b53b24a4c06547ce43e5fee6ec4e0d8fe2d597f4647fc033fd205707365"},
|
||||
{file = "orjson-3.10.6-cp310-cp310-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:eadc8fd310edb4bdbd333374f2c8fec6794bbbae99b592f448d8214a5e4050c0"},
|
||||
{file = "orjson-3.10.6-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:61272a5aec2b2661f4fa2b37c907ce9701e821b2c1285d5c3ab0207ebd358d38"},
|
||||
{file = "orjson-3.10.6-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:57985ee7e91d6214c837936dc1608f40f330a6b88bb13f5a57ce5257807da143"},
|
||||
{file = "orjson-3.10.6-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:633a3b31d9d7c9f02d49c4ab4d0a86065c4a6f6adc297d63d272e043472acab5"},
|
||||
{file = "orjson-3.10.6-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:1c680b269d33ec444afe2bdc647c9eb73166fa47a16d9a75ee56a374f4a45f43"},
|
||||
{file = "orjson-3.10.6-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:f759503a97a6ace19e55461395ab0d618b5a117e8d0fbb20e70cfd68a47327f2"},
|
||||
{file = "orjson-3.10.6-cp310-none-win32.whl", hash = "sha256:95a0cce17f969fb5391762e5719575217bd10ac5a189d1979442ee54456393f3"},
|
||||
{file = "orjson-3.10.6-cp310-none-win_amd64.whl", hash = "sha256:df25d9271270ba2133cc88ee83c318372bdc0f2cd6f32e7a450809a111efc45c"},
|
||||
{file = "orjson-3.10.6-cp311-cp311-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:b1ec490e10d2a77c345def52599311849fc063ae0e67cf4f84528073152bb2ba"},
|
||||
{file = "orjson-3.10.6-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:55d43d3feb8f19d07e9f01e5b9be4f28801cf7c60d0fa0d279951b18fae1932b"},
|
||||
{file = "orjson-3.10.6-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:ac3045267e98fe749408eee1593a142e02357c5c99be0802185ef2170086a863"},
|
||||
{file = "orjson-3.10.6-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:c27bc6a28ae95923350ab382c57113abd38f3928af3c80be6f2ba7eb8d8db0b0"},
|
||||
{file = "orjson-3.10.6-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:d27456491ca79532d11e507cadca37fb8c9324a3976294f68fb1eff2dc6ced5a"},
|
||||
{file = "orjson-3.10.6-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:05ac3d3916023745aa3b3b388e91b9166be1ca02b7c7e41045da6d12985685f0"},
|
||||
{file = "orjson-3.10.6-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:1335d4ef59ab85cab66fe73fd7a4e881c298ee7f63ede918b7faa1b27cbe5212"},
|
||||
{file = "orjson-3.10.6-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:4bbc6d0af24c1575edc79994c20e1b29e6fb3c6a570371306db0993ecf144dc5"},
|
||||
{file = "orjson-3.10.6-cp311-none-win32.whl", hash = "sha256:450e39ab1f7694465060a0550b3f6d328d20297bf2e06aa947b97c21e5241fbd"},
|
||||
{file = "orjson-3.10.6-cp311-none-win_amd64.whl", hash = "sha256:227df19441372610b20e05bdb906e1742ec2ad7a66ac8350dcfd29a63014a83b"},
|
||||
{file = "orjson-3.10.6-cp312-cp312-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:ea2977b21f8d5d9b758bb3f344a75e55ca78e3ff85595d248eee813ae23ecdfb"},
|
||||
{file = "orjson-3.10.6-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:b6f3d167d13a16ed263b52dbfedff52c962bfd3d270b46b7518365bcc2121eed"},
|
||||
{file = "orjson-3.10.6-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:f710f346e4c44a4e8bdf23daa974faede58f83334289df80bc9cd12fe82573c7"},
|
||||
{file = "orjson-3.10.6-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:7275664f84e027dcb1ad5200b8b18373e9c669b2a9ec33d410c40f5ccf4b257e"},
|
||||
{file = "orjson-3.10.6-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:0943e4c701196b23c240b3d10ed8ecd674f03089198cf503105b474a4f77f21f"},
|
||||
{file = "orjson-3.10.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:446dee5a491b5bc7d8f825d80d9637e7af43f86a331207b9c9610e2f93fee22a"},
|
||||
{file = "orjson-3.10.6-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:64c81456d2a050d380786413786b057983892db105516639cb5d3ee3c7fd5148"},
|
||||
{file = "orjson-3.10.6-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:960db0e31c4e52fa0fc3ecbaea5b2d3b58f379e32a95ae6b0ebeaa25b93dfd34"},
|
||||
{file = "orjson-3.10.6-cp312-none-win32.whl", hash = "sha256:a6ea7afb5b30b2317e0bee03c8d34c8181bc5a36f2afd4d0952f378972c4efd5"},
|
||||
{file = "orjson-3.10.6-cp312-none-win_amd64.whl", hash = "sha256:874ce88264b7e655dde4aeaacdc8fd772a7962faadfb41abe63e2a4861abc3dc"},
|
||||
{file = "orjson-3.10.6-cp38-cp38-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:66680eae4c4e7fc193d91cfc1353ad6d01b4801ae9b5314f17e11ba55e934183"},
|
||||
{file = "orjson-3.10.6-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:caff75b425db5ef8e8f23af93c80f072f97b4fb3afd4af44482905c9f588da28"},
|
||||
{file = "orjson-3.10.6-cp38-cp38-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:3722fddb821b6036fd2a3c814f6bd9b57a89dc6337b9924ecd614ebce3271394"},
|
||||
{file = "orjson-3.10.6-cp38-cp38-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:c2c116072a8533f2fec435fde4d134610f806bdac20188c7bd2081f3e9e0133f"},
|
||||
{file = "orjson-3.10.6-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:6eeb13218c8cf34c61912e9df2de2853f1d009de0e46ea09ccdf3d757896af0a"},
|
||||
{file = "orjson-3.10.6-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:965a916373382674e323c957d560b953d81d7a8603fbeee26f7b8248638bd48b"},
|
||||
{file = "orjson-3.10.6-cp38-cp38-musllinux_1_2_aarch64.whl", hash = "sha256:03c95484d53ed8e479cade8628c9cea00fd9d67f5554764a1110e0d5aa2de96e"},
|
||||
{file = "orjson-3.10.6-cp38-cp38-musllinux_1_2_x86_64.whl", hash = "sha256:e060748a04cccf1e0a6f2358dffea9c080b849a4a68c28b1b907f272b5127e9b"},
|
||||
{file = "orjson-3.10.6-cp38-none-win32.whl", hash = "sha256:738dbe3ef909c4b019d69afc19caf6b5ed0e2f1c786b5d6215fbb7539246e4c6"},
|
||||
{file = "orjson-3.10.6-cp38-none-win_amd64.whl", hash = "sha256:d40f839dddf6a7d77114fe6b8a70218556408c71d4d6e29413bb5f150a692ff7"},
|
||||
{file = "orjson-3.10.6-cp39-cp39-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:697a35a083c4f834807a6232b3e62c8b280f7a44ad0b759fd4dce748951e70db"},
|
||||
{file = "orjson-3.10.6-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:fd502f96bf5ea9a61cbc0b2b5900d0dd68aa0da197179042bdd2be67e51a1e4b"},
|
||||
{file = "orjson-3.10.6-cp39-cp39-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:f215789fb1667cdc874c1b8af6a84dc939fd802bf293a8334fce185c79cd359b"},
|
||||
{file = "orjson-3.10.6-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:a2debd8ddce948a8c0938c8c93ade191d2f4ba4649a54302a7da905a81f00b56"},
|
||||
{file = "orjson-3.10.6-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:5410111d7b6681d4b0d65e0f58a13be588d01b473822483f77f513c7f93bd3b2"},
|
||||
{file = "orjson-3.10.6-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bb1f28a137337fdc18384079fa5726810681055b32b92253fa15ae5656e1dddb"},
|
||||
{file = "orjson-3.10.6-cp39-cp39-musllinux_1_2_aarch64.whl", hash = "sha256:bf2fbbce5fe7cd1aa177ea3eab2b8e6a6bc6e8592e4279ed3db2d62e57c0e1b2"},
|
||||
{file = "orjson-3.10.6-cp39-cp39-musllinux_1_2_x86_64.whl", hash = "sha256:79b9b9e33bd4c517445a62b90ca0cc279b0f1f3970655c3df9e608bc3f91741a"},
|
||||
{file = "orjson-3.10.6-cp39-none-win32.whl", hash = "sha256:30b0a09a2014e621b1adf66a4f705f0809358350a757508ee80209b2d8dae219"},
|
||||
{file = "orjson-3.10.6-cp39-none-win_amd64.whl", hash = "sha256:49e3bc615652617d463069f91b867a4458114c5b104e13b7ae6872e5f79d0844"},
|
||||
{file = "orjson-3.10.6.tar.gz", hash = "sha256:e54b63d0a7c6c54a5f5f726bc93a2078111ef060fec4ecbf34c5db800ca3b3a7"},
|
||||
{file = "orjson-3.10.7-cp310-cp310-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:74f4544f5a6405b90da8ea724d15ac9c36da4d72a738c64685003337401f5c12"},
|
||||
{file = "orjson-3.10.7-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:34a566f22c28222b08875b18b0dfbf8a947e69df21a9ed5c51a6bf91cfb944ac"},
|
||||
{file = "orjson-3.10.7-cp310-cp310-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:bf6ba8ebc8ef5792e2337fb0419f8009729335bb400ece005606336b7fd7bab7"},
|
||||
{file = "orjson-3.10.7-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:ac7cf6222b29fbda9e3a472b41e6a5538b48f2c8f99261eecd60aafbdb60690c"},
|
||||
{file = "orjson-3.10.7-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:de817e2f5fc75a9e7dd350c4b0f54617b280e26d1631811a43e7e968fa71e3e9"},
|
||||
{file = "orjson-3.10.7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:348bdd16b32556cf8d7257b17cf2bdb7ab7976af4af41ebe79f9796c218f7e91"},
|
||||
{file = "orjson-3.10.7-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:479fd0844ddc3ca77e0fd99644c7fe2de8e8be1efcd57705b5c92e5186e8a250"},
|
||||
{file = "orjson-3.10.7-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:fdf5197a21dd660cf19dfd2a3ce79574588f8f5e2dbf21bda9ee2d2b46924d84"},
|
||||
{file = "orjson-3.10.7-cp310-none-win32.whl", hash = "sha256:d374d36726746c81a49f3ff8daa2898dccab6596864ebe43d50733275c629175"},
|
||||
{file = "orjson-3.10.7-cp310-none-win_amd64.whl", hash = "sha256:cb61938aec8b0ffb6eef484d480188a1777e67b05d58e41b435c74b9d84e0b9c"},
|
||||
{file = "orjson-3.10.7-cp311-cp311-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:7db8539039698ddfb9a524b4dd19508256107568cdad24f3682d5773e60504a2"},
|
||||
{file = "orjson-3.10.7-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:480f455222cb7a1dea35c57a67578848537d2602b46c464472c995297117fa09"},
|
||||
{file = "orjson-3.10.7-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:8a9c9b168b3a19e37fe2778c0003359f07822c90fdff8f98d9d2a91b3144d8e0"},
|
||||
{file = "orjson-3.10.7-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:8de062de550f63185e4c1c54151bdddfc5625e37daf0aa1e75d2a1293e3b7d9a"},
|
||||
{file = "orjson-3.10.7-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:6b0dd04483499d1de9c8f6203f8975caf17a6000b9c0c54630cef02e44ee624e"},
|
||||
{file = "orjson-3.10.7-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:b58d3795dafa334fc8fd46f7c5dc013e6ad06fd5b9a4cc98cb1456e7d3558bd6"},
|
||||
{file = "orjson-3.10.7-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:33cfb96c24034a878d83d1a9415799a73dc77480e6c40417e5dda0710d559ee6"},
|
||||
{file = "orjson-3.10.7-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:e724cebe1fadc2b23c6f7415bad5ee6239e00a69f30ee423f319c6af70e2a5c0"},
|
||||
{file = "orjson-3.10.7-cp311-none-win32.whl", hash = "sha256:82763b46053727a7168d29c772ed5c870fdae2f61aa8a25994c7984a19b1021f"},
|
||||
{file = "orjson-3.10.7-cp311-none-win_amd64.whl", hash = "sha256:eb8d384a24778abf29afb8e41d68fdd9a156cf6e5390c04cc07bbc24b89e98b5"},
|
||||
{file = "orjson-3.10.7-cp312-cp312-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:44a96f2d4c3af51bfac6bc4ef7b182aa33f2f054fd7f34cc0ee9a320d051d41f"},
|
||||
{file = "orjson-3.10.7-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:76ac14cd57df0572453543f8f2575e2d01ae9e790c21f57627803f5e79b0d3c3"},
|
||||
{file = "orjson-3.10.7-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:bdbb61dcc365dd9be94e8f7df91975edc9364d6a78c8f7adb69c1cdff318ec93"},
|
||||
{file = "orjson-3.10.7-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:b48b3db6bb6e0a08fa8c83b47bc169623f801e5cc4f24442ab2b6617da3b5313"},
|
||||
{file = "orjson-3.10.7-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:23820a1563a1d386414fef15c249040042b8e5d07b40ab3fe3efbfbbcbcb8864"},
|
||||
{file = "orjson-3.10.7-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a0c6a008e91d10a2564edbb6ee5069a9e66df3fbe11c9a005cb411f441fd2c09"},
|
||||
{file = "orjson-3.10.7-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:d352ee8ac1926d6193f602cbe36b1643bbd1bbcb25e3c1a657a4390f3000c9a5"},
|
||||
{file = "orjson-3.10.7-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:d2d9f990623f15c0ae7ac608103c33dfe1486d2ed974ac3f40b693bad1a22a7b"},
|
||||
{file = "orjson-3.10.7-cp312-none-win32.whl", hash = "sha256:7c4c17f8157bd520cdb7195f75ddbd31671997cbe10aee559c2d613592e7d7eb"},
|
||||
{file = "orjson-3.10.7-cp312-none-win_amd64.whl", hash = "sha256:1d9c0e733e02ada3ed6098a10a8ee0052dd55774de3d9110d29868d24b17faa1"},
|
||||
{file = "orjson-3.10.7-cp313-cp313-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:77d325ed866876c0fa6492598ec01fe30e803272a6e8b10e992288b009cbe149"},
|
||||
{file = "orjson-3.10.7-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9ea2c232deedcb605e853ae1db2cc94f7390ac776743b699b50b071b02bea6fe"},
|
||||
{file = "orjson-3.10.7-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:3dcfbede6737fdbef3ce9c37af3fb6142e8e1ebc10336daa05872bfb1d87839c"},
|
||||
{file = "orjson-3.10.7-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:11748c135f281203f4ee695b7f80bb1358a82a63905f9f0b794769483ea854ad"},
|
||||
{file = "orjson-3.10.7-cp313-none-win32.whl", hash = "sha256:a7e19150d215c7a13f39eb787d84db274298d3f83d85463e61d277bbd7f401d2"},
|
||||
{file = "orjson-3.10.7-cp313-none-win_amd64.whl", hash = "sha256:eef44224729e9525d5261cc8d28d6b11cafc90e6bd0be2157bde69a52ec83024"},
|
||||
{file = "orjson-3.10.7-cp38-cp38-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:6ea2b2258eff652c82652d5e0f02bd5e0463a6a52abb78e49ac288827aaa1469"},
|
||||
{file = "orjson-3.10.7-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:430ee4d85841e1483d487e7b81401785a5dfd69db5de01314538f31f8fbf7ee1"},
|
||||
{file = "orjson-3.10.7-cp38-cp38-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:4b6146e439af4c2472c56f8540d799a67a81226e11992008cb47e1267a9b3225"},
|
||||
{file = "orjson-3.10.7-cp38-cp38-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:084e537806b458911137f76097e53ce7bf5806dda33ddf6aaa66a028f8d43a23"},
|
||||
{file = "orjson-3.10.7-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:4829cf2195838e3f93b70fd3b4292156fc5e097aac3739859ac0dcc722b27ac0"},
|
||||
{file = "orjson-3.10.7-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:1193b2416cbad1a769f868b1749535d5da47626ac29445803dae7cc64b3f5c98"},
|
||||
{file = "orjson-3.10.7-cp38-cp38-musllinux_1_2_aarch64.whl", hash = "sha256:4e6c3da13e5a57e4b3dca2de059f243ebec705857522f188f0180ae88badd354"},
|
||||
{file = "orjson-3.10.7-cp38-cp38-musllinux_1_2_x86_64.whl", hash = "sha256:c31008598424dfbe52ce8c5b47e0752dca918a4fdc4a2a32004efd9fab41d866"},
|
||||
{file = "orjson-3.10.7-cp38-none-win32.whl", hash = "sha256:7122a99831f9e7fe977dc45784d3b2edc821c172d545e6420c375e5a935f5a1c"},
|
||||
{file = "orjson-3.10.7-cp38-none-win_amd64.whl", hash = "sha256:a763bc0e58504cc803739e7df040685816145a6f3c8a589787084b54ebc9f16e"},
|
||||
{file = "orjson-3.10.7-cp39-cp39-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:e76be12658a6fa376fcd331b1ea4e58f5a06fd0220653450f0d415b8fd0fbe20"},
|
||||
{file = "orjson-3.10.7-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ed350d6978d28b92939bfeb1a0570c523f6170efc3f0a0ef1f1df287cd4f4960"},
|
||||
{file = "orjson-3.10.7-cp39-cp39-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:144888c76f8520e39bfa121b31fd637e18d4cc2f115727865fdf9fa325b10412"},
|
||||
{file = "orjson-3.10.7-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:09b2d92fd95ad2402188cf51573acde57eb269eddabaa60f69ea0d733e789fe9"},
|
||||
{file = "orjson-3.10.7-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:5b24a579123fa884f3a3caadaed7b75eb5715ee2b17ab5c66ac97d29b18fe57f"},
|
||||
{file = "orjson-3.10.7-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e72591bcfe7512353bd609875ab38050efe3d55e18934e2f18950c108334b4ff"},
|
||||
{file = "orjson-3.10.7-cp39-cp39-musllinux_1_2_aarch64.whl", hash = "sha256:f4db56635b58cd1a200b0a23744ff44206ee6aa428185e2b6c4a65b3197abdcd"},
|
||||
{file = "orjson-3.10.7-cp39-cp39-musllinux_1_2_x86_64.whl", hash = "sha256:0fa5886854673222618638c6df7718ea7fe2f3f2384c452c9ccedc70b4a510a5"},
|
||||
{file = "orjson-3.10.7-cp39-none-win32.whl", hash = "sha256:8272527d08450ab16eb405f47e0f4ef0e5ff5981c3d82afe0efd25dcbef2bcd2"},
|
||||
{file = "orjson-3.10.7-cp39-none-win_amd64.whl", hash = "sha256:974683d4618c0c7dbf4f69c95a979734bf183d0658611760017f6e70a145af58"},
|
||||
{file = "orjson-3.10.7.tar.gz", hash = "sha256:75ef0640403f945f3a1f9f6400686560dbfb0fb5b16589ad62cd477043c4eee3"},
|
||||
]
|
||||
|
||||
[[package]]
|
||||
@@ -1143,19 +1154,19 @@ files = [
|
||||
|
||||
[[package]]
|
||||
name = "setuptools"
|
||||
version = "72.1.0"
|
||||
version = "73.0.1"
|
||||
description = "Easily download, build, install, upgrade, and uninstall Python packages"
|
||||
optional = false
|
||||
python-versions = ">=3.8"
|
||||
files = [
|
||||
{file = "setuptools-72.1.0-py3-none-any.whl", hash = "sha256:5a03e1860cf56bb6ef48ce186b0e557fdba433237481a9a625176c2831be15d1"},
|
||||
{file = "setuptools-72.1.0.tar.gz", hash = "sha256:8d243eff56d095e5817f796ede6ae32941278f542e0f941867cc05ae52b162ec"},
|
||||
{file = "setuptools-73.0.1-py3-none-any.whl", hash = "sha256:b208925fcb9f7af924ed2dc04708ea89791e24bde0d3020b27df0e116088b34e"},
|
||||
{file = "setuptools-73.0.1.tar.gz", hash = "sha256:d59a3e788ab7e012ab2c4baed1b376da6366883ee20d7a5fc426816e3d7b1193"},
|
||||
]
|
||||
|
||||
[package.extras]
|
||||
core = ["importlib-metadata (>=6)", "importlib-resources (>=5.10.2)", "jaraco.text (>=3.7)", "more-itertools (>=8.8)", "ordered-set (>=3.1.1)", "packaging (>=24)", "platformdirs (>=2.6.2)", "tomli (>=2.0.1)", "wheel (>=0.43.0)"]
|
||||
doc = ["furo", "jaraco.packaging (>=9.3)", "jaraco.tidelift (>=1.4)", "pygments-github-lexers (==0.0.5)", "pyproject-hooks (!=1.1)", "rst.linker (>=1.9)", "sphinx (>=3.5)", "sphinx-favicon", "sphinx-inline-tabs", "sphinx-lint", "sphinx-notfound-page (>=1,<2)", "sphinx-reredirects", "sphinxcontrib-towncrier"]
|
||||
test = ["build[virtualenv] (>=1.0.3)", "filelock (>=3.4.0)", "importlib-metadata", "ini2toml[lite] (>=0.14)", "jaraco.develop (>=7.21)", "jaraco.envs (>=2.2)", "jaraco.path (>=3.2.0)", "jaraco.test", "mypy (==1.11.*)", "packaging (>=23.2)", "pip (>=19.1)", "pyproject-hooks (!=1.1)", "pytest (>=6,!=8.1.*)", "pytest-checkdocs (>=2.4)", "pytest-cov", "pytest-enabler (>=2.2)", "pytest-home (>=0.5)", "pytest-mypy", "pytest-perf", "pytest-ruff (<0.4)", "pytest-ruff (>=0.2.1)", "pytest-ruff (>=0.3.2)", "pytest-subprocess", "pytest-timeout", "pytest-xdist (>=3)", "tomli", "tomli-w (>=1.0.0)", "virtualenv (>=13.0.0)", "wheel"]
|
||||
core = ["importlib-metadata (>=6)", "importlib-resources (>=5.10.2)", "jaraco.text (>=3.7)", "more-itertools (>=8.8)", "packaging (>=24)", "platformdirs (>=2.6.2)", "tomli (>=2.0.1)", "wheel (>=0.43.0)"]
|
||||
doc = ["furo", "jaraco.packaging (>=9.3)", "jaraco.tidelift (>=1.4)", "pygments-github-lexers (==0.0.5)", "pyproject-hooks (!=1.1)", "rst.linker (>=1.9)", "sphinx (>=3.5)", "sphinx-favicon", "sphinx-inline-tabs", "sphinx-lint", "sphinx-notfound-page (>=1,<2)", "sphinx-reredirects", "sphinxcontrib-towncrier", "towncrier (<24.7)"]
|
||||
test = ["build[virtualenv] (>=1.0.3)", "filelock (>=3.4.0)", "importlib-metadata", "ini2toml[lite] (>=0.14)", "jaraco.develop (>=7.21)", "jaraco.envs (>=2.2)", "jaraco.path (>=3.2.0)", "jaraco.test", "mypy (==1.11.*)", "packaging (>=23.2)", "pip (>=19.1)", "pyproject-hooks (!=1.1)", "pytest (>=6,!=8.1.*)", "pytest-checkdocs (>=2.4)", "pytest-cov", "pytest-enabler (>=2.2)", "pytest-home (>=0.5)", "pytest-mypy", "pytest-perf", "pytest-ruff (<0.4)", "pytest-ruff (>=0.2.1)", "pytest-ruff (>=0.3.2)", "pytest-subprocess", "pytest-timeout", "pytest-xdist (>=3)", "tomli", "tomli-w (>=1.0.0)", "virtualenv (>=13.0.0)", "wheel (>=0.44.0)"]
|
||||
|
||||
[[package]]
|
||||
name = "shellingham"
|
||||
@@ -1209,13 +1220,13 @@ uvicorn = "*"
|
||||
|
||||
[[package]]
|
||||
name = "starlette"
|
||||
version = "0.37.2"
|
||||
version = "0.38.2"
|
||||
description = "The little ASGI library that shines."
|
||||
optional = false
|
||||
python-versions = ">=3.8"
|
||||
files = [
|
||||
{file = "starlette-0.37.2-py3-none-any.whl", hash = "sha256:6fe59f29268538e5d0d182f2791a479a0c64638e6935d1c6989e63fb2699c6ee"},
|
||||
{file = "starlette-0.37.2.tar.gz", hash = "sha256:9af890290133b79fc3db55474ade20f6220a364a0402e0b556e7cd5e1e093823"},
|
||||
{file = "starlette-0.38.2-py3-none-any.whl", hash = "sha256:4ec6a59df6bbafdab5f567754481657f7ed90dc9d69b0c9ff017907dd54faeff"},
|
||||
{file = "starlette-0.38.2.tar.gz", hash = "sha256:c7c0441065252160993a1a37cf2a73bb64d271b17303e0b0c1eb7191cfb12d75"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
@@ -1346,43 +1357,46 @@ standard = ["colorama (>=0.4)", "httptools (>=0.5.0)", "python-dotenv (>=0.13)",
|
||||
|
||||
[[package]]
|
||||
name = "watchdog"
|
||||
version = "4.0.1"
|
||||
version = "4.0.2"
|
||||
description = "Filesystem events monitoring"
|
||||
optional = false
|
||||
python-versions = ">=3.8"
|
||||
files = [
|
||||
{file = "watchdog-4.0.1-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:da2dfdaa8006eb6a71051795856bedd97e5b03e57da96f98e375682c48850645"},
|
||||
{file = "watchdog-4.0.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:e93f451f2dfa433d97765ca2634628b789b49ba8b504fdde5837cdcf25fdb53b"},
|
||||
{file = "watchdog-4.0.1-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:ef0107bbb6a55f5be727cfc2ef945d5676b97bffb8425650dadbb184be9f9a2b"},
|
||||
{file = "watchdog-4.0.1-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:17e32f147d8bf9657e0922c0940bcde863b894cd871dbb694beb6704cfbd2fb5"},
|
||||
{file = "watchdog-4.0.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:03e70d2df2258fb6cb0e95bbdbe06c16e608af94a3ffbd2b90c3f1e83eb10767"},
|
||||
{file = "watchdog-4.0.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:123587af84260c991dc5f62a6e7ef3d1c57dfddc99faacee508c71d287248459"},
|
||||
{file = "watchdog-4.0.1-cp312-cp312-macosx_10_9_universal2.whl", hash = "sha256:093b23e6906a8b97051191a4a0c73a77ecc958121d42346274c6af6520dec175"},
|
||||
{file = "watchdog-4.0.1-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:611be3904f9843f0529c35a3ff3fd617449463cb4b73b1633950b3d97fa4bfb7"},
|
||||
{file = "watchdog-4.0.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:62c613ad689ddcb11707f030e722fa929f322ef7e4f18f5335d2b73c61a85c28"},
|
||||
{file = "watchdog-4.0.1-cp38-cp38-macosx_10_9_universal2.whl", hash = "sha256:d4925e4bf7b9bddd1c3de13c9b8a2cdb89a468f640e66fbfabaf735bd85b3e35"},
|
||||
{file = "watchdog-4.0.1-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:cad0bbd66cd59fc474b4a4376bc5ac3fc698723510cbb64091c2a793b18654db"},
|
||||
{file = "watchdog-4.0.1-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:a3c2c317a8fb53e5b3d25790553796105501a235343f5d2bf23bb8649c2c8709"},
|
||||
{file = "watchdog-4.0.1-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:c9904904b6564d4ee8a1ed820db76185a3c96e05560c776c79a6ce5ab71888ba"},
|
||||
{file = "watchdog-4.0.1-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:667f3c579e813fcbad1b784db7a1aaa96524bed53437e119f6a2f5de4db04235"},
|
||||
{file = "watchdog-4.0.1-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:d10a681c9a1d5a77e75c48a3b8e1a9f2ae2928eda463e8d33660437705659682"},
|
||||
{file = "watchdog-4.0.1-pp310-pypy310_pp73-macosx_10_9_x86_64.whl", hash = "sha256:0144c0ea9997b92615af1d94afc0c217e07ce2c14912c7b1a5731776329fcfc7"},
|
||||
{file = "watchdog-4.0.1-pp310-pypy310_pp73-macosx_11_0_arm64.whl", hash = "sha256:998d2be6976a0ee3a81fb8e2777900c28641fb5bfbd0c84717d89bca0addcdc5"},
|
||||
{file = "watchdog-4.0.1-pp38-pypy38_pp73-macosx_10_9_x86_64.whl", hash = "sha256:e7921319fe4430b11278d924ef66d4daa469fafb1da679a2e48c935fa27af193"},
|
||||
{file = "watchdog-4.0.1-pp38-pypy38_pp73-macosx_11_0_arm64.whl", hash = "sha256:f0de0f284248ab40188f23380b03b59126d1479cd59940f2a34f8852db710625"},
|
||||
{file = "watchdog-4.0.1-pp39-pypy39_pp73-macosx_10_9_x86_64.whl", hash = "sha256:bca36be5707e81b9e6ce3208d92d95540d4ca244c006b61511753583c81c70dd"},
|
||||
{file = "watchdog-4.0.1-pp39-pypy39_pp73-macosx_11_0_arm64.whl", hash = "sha256:ab998f567ebdf6b1da7dc1e5accfaa7c6992244629c0fdaef062f43249bd8dee"},
|
||||
{file = "watchdog-4.0.1-py3-none-manylinux2014_aarch64.whl", hash = "sha256:dddba7ca1c807045323b6af4ff80f5ddc4d654c8bce8317dde1bd96b128ed253"},
|
||||
{file = "watchdog-4.0.1-py3-none-manylinux2014_armv7l.whl", hash = "sha256:4513ec234c68b14d4161440e07f995f231be21a09329051e67a2118a7a612d2d"},
|
||||
{file = "watchdog-4.0.1-py3-none-manylinux2014_i686.whl", hash = "sha256:4107ac5ab936a63952dea2a46a734a23230aa2f6f9db1291bf171dac3ebd53c6"},
|
||||
{file = "watchdog-4.0.1-py3-none-manylinux2014_ppc64.whl", hash = "sha256:6e8c70d2cd745daec2a08734d9f63092b793ad97612470a0ee4cbb8f5f705c57"},
|
||||
{file = "watchdog-4.0.1-py3-none-manylinux2014_ppc64le.whl", hash = "sha256:f27279d060e2ab24c0aa98363ff906d2386aa6c4dc2f1a374655d4e02a6c5e5e"},
|
||||
{file = "watchdog-4.0.1-py3-none-manylinux2014_s390x.whl", hash = "sha256:f8affdf3c0f0466e69f5b3917cdd042f89c8c63aebdb9f7c078996f607cdb0f5"},
|
||||
{file = "watchdog-4.0.1-py3-none-manylinux2014_x86_64.whl", hash = "sha256:ac7041b385f04c047fcc2951dc001671dee1b7e0615cde772e84b01fbf68ee84"},
|
||||
{file = "watchdog-4.0.1-py3-none-win32.whl", hash = "sha256:206afc3d964f9a233e6ad34618ec60b9837d0582b500b63687e34011e15bb429"},
|
||||
{file = "watchdog-4.0.1-py3-none-win_amd64.whl", hash = "sha256:7577b3c43e5909623149f76b099ac49a1a01ca4e167d1785c76eb52fa585745a"},
|
||||
{file = "watchdog-4.0.1-py3-none-win_ia64.whl", hash = "sha256:d7b9f5f3299e8dd230880b6c55504a1f69cf1e4316275d1b215ebdd8187ec88d"},
|
||||
{file = "watchdog-4.0.1.tar.gz", hash = "sha256:eebaacf674fa25511e8867028d281e602ee6500045b57f43b08778082f7f8b44"},
|
||||
{file = "watchdog-4.0.2-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:ede7f010f2239b97cc79e6cb3c249e72962404ae3865860855d5cbe708b0fd22"},
|
||||
{file = "watchdog-4.0.2-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:a2cffa171445b0efa0726c561eca9a27d00a1f2b83846dbd5a4f639c4f8ca8e1"},
|
||||
{file = "watchdog-4.0.2-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:c50f148b31b03fbadd6d0b5980e38b558046b127dc483e5e4505fcef250f9503"},
|
||||
{file = "watchdog-4.0.2-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:7c7d4bf585ad501c5f6c980e7be9c4f15604c7cc150e942d82083b31a7548930"},
|
||||
{file = "watchdog-4.0.2-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:914285126ad0b6eb2258bbbcb7b288d9dfd655ae88fa28945be05a7b475a800b"},
|
||||
{file = "watchdog-4.0.2-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:984306dc4720da5498b16fc037b36ac443816125a3705dfde4fd90652d8028ef"},
|
||||
{file = "watchdog-4.0.2-cp312-cp312-macosx_10_9_universal2.whl", hash = "sha256:1cdcfd8142f604630deef34722d695fb455d04ab7cfe9963055df1fc69e6727a"},
|
||||
{file = "watchdog-4.0.2-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:d7ab624ff2f663f98cd03c8b7eedc09375a911794dfea6bf2a359fcc266bff29"},
|
||||
{file = "watchdog-4.0.2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:132937547a716027bd5714383dfc40dc66c26769f1ce8a72a859d6a48f371f3a"},
|
||||
{file = "watchdog-4.0.2-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:cd67c7df93eb58f360c43802acc945fa8da70c675b6fa37a241e17ca698ca49b"},
|
||||
{file = "watchdog-4.0.2-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:bcfd02377be80ef3b6bc4ce481ef3959640458d6feaae0bd43dd90a43da90a7d"},
|
||||
{file = "watchdog-4.0.2-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:980b71510f59c884d684b3663d46e7a14b457c9611c481e5cef08f4dd022eed7"},
|
||||
{file = "watchdog-4.0.2-cp38-cp38-macosx_10_9_universal2.whl", hash = "sha256:aa160781cafff2719b663c8a506156e9289d111d80f3387cf3af49cedee1f040"},
|
||||
{file = "watchdog-4.0.2-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:f6ee8dedd255087bc7fe82adf046f0b75479b989185fb0bdf9a98b612170eac7"},
|
||||
{file = "watchdog-4.0.2-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:0b4359067d30d5b864e09c8597b112fe0a0a59321a0f331498b013fb097406b4"},
|
||||
{file = "watchdog-4.0.2-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:770eef5372f146997638d737c9a3c597a3b41037cfbc5c41538fc27c09c3a3f9"},
|
||||
{file = "watchdog-4.0.2-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:eeea812f38536a0aa859972d50c76e37f4456474b02bd93674d1947cf1e39578"},
|
||||
{file = "watchdog-4.0.2-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:b2c45f6e1e57ebb4687690c05bc3a2c1fb6ab260550c4290b8abb1335e0fd08b"},
|
||||
{file = "watchdog-4.0.2-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:10b6683df70d340ac3279eff0b2766813f00f35a1d37515d2c99959ada8f05fa"},
|
||||
{file = "watchdog-4.0.2-pp310-pypy310_pp73-macosx_11_0_arm64.whl", hash = "sha256:f7c739888c20f99824f7aa9d31ac8a97353e22d0c0e54703a547a218f6637eb3"},
|
||||
{file = "watchdog-4.0.2-pp38-pypy38_pp73-macosx_10_9_x86_64.whl", hash = "sha256:c100d09ac72a8a08ddbf0629ddfa0b8ee41740f9051429baa8e31bb903ad7508"},
|
||||
{file = "watchdog-4.0.2-pp38-pypy38_pp73-macosx_11_0_arm64.whl", hash = "sha256:f5315a8c8dd6dd9425b974515081fc0aadca1d1d61e078d2246509fd756141ee"},
|
||||
{file = "watchdog-4.0.2-pp39-pypy39_pp73-macosx_10_15_x86_64.whl", hash = "sha256:2d468028a77b42cc685ed694a7a550a8d1771bb05193ba7b24006b8241a571a1"},
|
||||
{file = "watchdog-4.0.2-pp39-pypy39_pp73-macosx_11_0_arm64.whl", hash = "sha256:f15edcae3830ff20e55d1f4e743e92970c847bcddc8b7509bcd172aa04de506e"},
|
||||
{file = "watchdog-4.0.2-py3-none-manylinux2014_aarch64.whl", hash = "sha256:936acba76d636f70db8f3c66e76aa6cb5136a936fc2a5088b9ce1c7a3508fc83"},
|
||||
{file = "watchdog-4.0.2-py3-none-manylinux2014_armv7l.whl", hash = "sha256:e252f8ca942a870f38cf785aef420285431311652d871409a64e2a0a52a2174c"},
|
||||
{file = "watchdog-4.0.2-py3-none-manylinux2014_i686.whl", hash = "sha256:0e83619a2d5d436a7e58a1aea957a3c1ccbf9782c43c0b4fed80580e5e4acd1a"},
|
||||
{file = "watchdog-4.0.2-py3-none-manylinux2014_ppc64.whl", hash = "sha256:88456d65f207b39f1981bf772e473799fcdc10801062c36fd5ad9f9d1d463a73"},
|
||||
{file = "watchdog-4.0.2-py3-none-manylinux2014_ppc64le.whl", hash = "sha256:32be97f3b75693a93c683787a87a0dc8db98bb84701539954eef991fb35f5fbc"},
|
||||
{file = "watchdog-4.0.2-py3-none-manylinux2014_s390x.whl", hash = "sha256:c82253cfc9be68e3e49282831afad2c1f6593af80c0daf1287f6a92657986757"},
|
||||
{file = "watchdog-4.0.2-py3-none-manylinux2014_x86_64.whl", hash = "sha256:c0b14488bd336c5b1845cee83d3e631a1f8b4e9c5091ec539406e4a324f882d8"},
|
||||
{file = "watchdog-4.0.2-py3-none-win32.whl", hash = "sha256:0d8a7e523ef03757a5aa29f591437d64d0d894635f8a50f370fe37f913ce4e19"},
|
||||
{file = "watchdog-4.0.2-py3-none-win_amd64.whl", hash = "sha256:c344453ef3bf875a535b0488e3ad28e341adbd5a9ffb0f7d62cefacc8824ef2b"},
|
||||
{file = "watchdog-4.0.2-py3-none-win_ia64.whl", hash = "sha256:baececaa8edff42cd16558a639a9b0ddf425f93d892e8392a56bf904f5eff22c"},
|
||||
{file = "watchdog-4.0.2.tar.gz", hash = "sha256:b4dfbb6c49221be4535623ea4474a4d6ee0a9cef4a80b20c28db4d858b64e270"},
|
||||
]
|
||||
|
||||
[package.extras]
|
||||
@@ -1404,13 +1418,13 @@ test = ["pytest (>=6.0.0)", "setuptools (>=65)"]
|
||||
|
||||
[[package]]
|
||||
name = "zipp"
|
||||
version = "3.19.2"
|
||||
version = "3.20.0"
|
||||
description = "Backport of pathlib-compatible object wrapper for zip files"
|
||||
optional = false
|
||||
python-versions = ">=3.8"
|
||||
files = [
|
||||
{file = "zipp-3.19.2-py3-none-any.whl", hash = "sha256:f091755f667055f2d02b32c53771a7a6c8b47e1fdbc4b72a8b9072b3eef8015c"},
|
||||
{file = "zipp-3.19.2.tar.gz", hash = "sha256:bf1dcf6450f873a13e952a29504887c89e6de7506209e5b1bcc3460135d4de19"},
|
||||
{file = "zipp-3.20.0-py3-none-any.whl", hash = "sha256:58da6168be89f0be59beb194da1250516fdaa062ccebd30127ac65d30045e10d"},
|
||||
{file = "zipp-3.20.0.tar.gz", hash = "sha256:0145e43d89664cfe1a2e533adc75adafed82fe2da404b4bbb6b026c0157bdb31"},
|
||||
]
|
||||
|
||||
[package.extras]
|
||||
|
||||
@@ -1,6 +1,6 @@
[tool.poetry]
name = "langchain-cli"
version = "0.0.29"
version = "0.0.30"
description = "CLI for interacting with LangChain"
authors = ["Erick Friis <erick@langchain.dev>"]
readme = "README.md"

@@ -5,12 +5,9 @@ against a vector database.

import datetime
import inspect
import json
import logging
from http import HTTPStatus
from typing import Any, Dict, List, Optional, Tuple
from typing import Any, Dict, List, Optional

import requests  # type: ignore
from langchain.chains.base import Chain
from langchain.chains.combine_documents.base import BaseCombineDocumentsChain
from langchain_core.callbacks import (
@@ -29,16 +26,14 @@ from langchain_community.chains.pebblo_retrieval.enforcement_filters import (
from langchain_community.chains.pebblo_retrieval.models import (
    App,
    AuthContext,
    Qa,
    ChainInfo,
    Model,
    SemanticContext,
    VectorDB,
)
from langchain_community.chains.pebblo_retrieval.utilities import (
    APP_DISCOVER_URL,
    CLASSIFIER_URL,
    PEBBLO_CLOUD_URL,
    PLUGIN_VERSION,
    PROMPT_GOV_URL,
    PROMPT_URL,
    PebbloRetrievalAPIWrapper,
    get_runtime,
)

@@ -72,16 +67,18 @@ class PebbloRetrievalQA(Chain):
    """Description of app."""
    api_key: Optional[str] = None  #: :meta private:
    """Pebblo cloud API key for app."""
    classifier_url: str = CLASSIFIER_URL  #: :meta private:
    classifier_url: Optional[str] = None  #: :meta private:
    """Classifier endpoint."""
    classifier_location: str = "local"  #: :meta private:
    """Classifier location. It could be either of 'local' or 'pebblo-cloud'."""
    _discover_sent: bool = False  #: :meta private:
    """Flag to check if discover payload has been sent."""
    _prompt_sent: bool = False  #: :meta private:
    """Flag to check if prompt payload has been sent."""
    enable_prompt_gov: bool = True  #: :meta private:
    """Flag to check if prompt governance is enabled or not"""
    pb_client: PebbloRetrievalAPIWrapper = Field(
        default_factory=PebbloRetrievalAPIWrapper
    )
    """Pebblo Retrieval API client"""

    def _call(
        self,
@@ -100,12 +97,11 @@ class PebbloRetrievalQA(Chain):
            answer, docs = res['result'], res['source_documents']
        """
        prompt_time = datetime.datetime.now().isoformat()
        PebbloRetrievalQA.set_prompt_sent(value=False)
        _run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager()
        question = inputs[self.input_key]
        auth_context = inputs.get(self.auth_context_key, {})
        semantic_context = inputs.get(self.semantic_context_key, {})
        _, prompt_entities = self._check_prompt_validity(question)
        auth_context = inputs.get(self.auth_context_key)
        semantic_context = inputs.get(self.semantic_context_key)
        _, prompt_entities = self.pb_client.check_prompt_validity(question)

        accepts_run_manager = (
            "run_manager" in inspect.signature(self._get_docs).parameters
@@ -120,43 +116,17 @@ class PebbloRetrievalQA(Chain):
            input_documents=docs, question=question, callbacks=_run_manager.get_child()
        )

        qa = {
            "name": self.app_name,
            "context": [
                {
                    "retrieved_from": doc.metadata.get(
                        "full_path", doc.metadata.get("source")
                    ),
                    "doc": doc.page_content,
                    "vector_db": self.retriever.vectorstore.__class__.__name__,
                    **(
                        {"pb_checksum": doc.metadata.get("pb_checksum")}
                        if doc.metadata.get("pb_checksum")
                        else {}
                    ),
                }
                for doc in docs
                if isinstance(doc, Document)
            ],
            "prompt": {
                "data": question,
                "entities": prompt_entities.get("entities", {}),
                "entityCount": prompt_entities.get("entityCount", 0),
                "prompt_gov_enabled": self.enable_prompt_gov,
            },
            "response": {
                "data": answer,
            },
            "prompt_time": prompt_time,
            "user": auth_context.user_id if auth_context else "unknown",
            "user_identities": auth_context.user_auth
            if auth_context and hasattr(auth_context, "user_auth")
            else [],
            "classifier_location": self.classifier_location,
        }

        qa_payload = Qa(**qa)
        self._send_prompt(qa_payload)
        self.pb_client.send_prompt(
            self.app_name,
            self.retriever,
            question,
            answer,
            auth_context,
            docs,
            prompt_entities,
            prompt_time,
            self.enable_prompt_gov,
        )

        if self.return_source_documents:
            return {self.output_key: answer, "source_documents": docs}
@@ -187,7 +157,7 @@ class PebbloRetrievalQA(Chain):
            "run_manager" in inspect.signature(self._aget_docs).parameters
        )

        _, prompt_entities = self._check_prompt_validity(question)
        _, prompt_entities = self.pb_client.check_prompt_validity(question)

        if accepts_run_manager:
            docs = await self._aget_docs(
@@ -243,7 +213,7 @@ class PebbloRetrievalQA(Chain):
        chain_type: str = "stuff",
        chain_type_kwargs: Optional[dict] = None,
        api_key: Optional[str] = None,
        classifier_url: str = CLASSIFIER_URL,
        classifier_url: Optional[str] = None,
        classifier_location: str = "local",
        **kwargs: Any,
    ) -> "PebbloRetrievalQA":
@@ -263,14 +233,14 @@ class PebbloRetrievalQA(Chain):
            llm=llm,
            **kwargs,
        )

        PebbloRetrievalQA._send_discover(
            app,
        # initialize Pebblo API client
        pb_client = PebbloRetrievalAPIWrapper(
            api_key=api_key,
            classifier_url=classifier_url,
            classifier_location=classifier_location,
            classifier_url=classifier_url,
        )

        # send app discovery request
        pb_client.send_app_discover(app)
        return cls(
            combine_documents_chain=combine_documents_chain,
            app_name=app_name,
@@ -279,6 +249,7 @@ class PebbloRetrievalQA(Chain):
            api_key=api_key,
            classifier_url=classifier_url,
            classifier_location=classifier_location,
            pb_client=pb_client,
            **kwargs,
        )

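For orientation, a minimal usage sketch of the refactored constructor (illustration only, not part of the diff; the LLM, retriever, and app metadata are placeholders, and any keyword argument not visible in the hunk above is an assumption):

```python
# Hedged sketch: assumes an existing chat model `llm` and vector store `vectordb`.
from langchain_community.chains import PebbloRetrievalQA

qa_chain = PebbloRetrievalQA.from_chain_type(
    llm=llm,
    retriever=vectordb.as_retriever(),
    app_name="pebblo-rag-app",         # placeholder app metadata
    owner="Jane Doe",
    description="Demo RAG app",
    chain_type="stuff",
    classifier_location="local",       # or "pebblo-cloud" together with api_key
)
```

With this change the classmethod builds a `PebbloRetrievalAPIWrapper` once, sends the app-discovery request through it, and passes the same client into the chain instance, so the chain itself no longer owns any HTTP logic.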
@@ -346,259 +317,36 @@ class PebbloRetrievalQA(Chain):
|
||||
)
|
||||
return app
|
||||
|
||||
@staticmethod
|
||||
def _send_discover(
|
||||
app: App,
|
||||
api_key: Optional[str],
|
||||
classifier_url: str,
|
||||
classifier_location: str,
|
||||
) -> None: # type: ignore
|
||||
"""Send app discovery payload to pebblo-server. Internal method."""
|
||||
headers = {
|
||||
"Accept": "application/json",
|
||||
"Content-Type": "application/json",
|
||||
}
|
||||
payload = app.dict(exclude_unset=True)
|
||||
if classifier_location == "local":
|
||||
app_discover_url = f"{classifier_url}{APP_DISCOVER_URL}"
|
||||
try:
|
||||
pebblo_resp = requests.post(
|
||||
app_discover_url, headers=headers, json=payload, timeout=20
|
||||
)
|
||||
logger.debug("discover-payload: %s", payload)
|
||||
logger.debug(
|
||||
"send_discover[local]: request url %s, body %s len %s\
|
||||
response status %s body %s",
|
||||
pebblo_resp.request.url,
|
||||
str(pebblo_resp.request.body),
|
||||
str(
|
||||
len(
|
||||
pebblo_resp.request.body if pebblo_resp.request.body else []
|
||||
)
|
||||
),
|
||||
str(pebblo_resp.status_code),
|
||||
pebblo_resp.json(),
|
||||
)
|
||||
if pebblo_resp.status_code in [HTTPStatus.OK, HTTPStatus.BAD_GATEWAY]:
|
||||
PebbloRetrievalQA.set_discover_sent()
|
||||
else:
|
||||
logger.warning(
|
||||
"Received unexpected HTTP response code:"
|
||||
+ f"{pebblo_resp.status_code}"
|
||||
)
|
||||
except requests.exceptions.RequestException:
|
||||
logger.warning("Unable to reach pebblo server.")
|
||||
except Exception as e:
|
||||
logger.warning("An Exception caught in _send_discover: local %s", e)
|
||||
|
||||
if api_key:
|
||||
try:
|
||||
headers.update({"x-api-key": api_key})
|
||||
pebblo_cloud_url = f"{PEBBLO_CLOUD_URL}{APP_DISCOVER_URL}"
|
||||
pebblo_cloud_response = requests.post(
|
||||
pebblo_cloud_url, headers=headers, json=payload, timeout=20
|
||||
)
|
||||
|
||||
logger.debug(
|
||||
"send_discover[cloud]: request url %s, body %s len %s\
|
||||
response status %s body %s",
|
||||
pebblo_cloud_response.request.url,
|
||||
str(pebblo_cloud_response.request.body),
|
||||
str(
|
||||
len(
|
||||
pebblo_cloud_response.request.body
|
||||
if pebblo_cloud_response.request.body
|
||||
else []
|
||||
)
|
||||
),
|
||||
str(pebblo_cloud_response.status_code),
|
||||
pebblo_cloud_response.json(),
|
||||
)
|
||||
except requests.exceptions.RequestException:
|
||||
logger.warning("Unable to reach Pebblo cloud server.")
|
||||
except Exception as e:
|
||||
logger.warning("An Exception caught in _send_discover: cloud %s", e)
|
||||
|
||||
@classmethod
|
||||
def set_discover_sent(cls) -> None:
|
||||
cls._discover_sent = True
|
||||
|
||||
@classmethod
|
||||
def set_prompt_sent(cls, value: bool = True) -> None:
|
||||
cls._prompt_sent = value
|
||||
|
||||
def _send_prompt(self, qa_payload: Qa) -> None:
|
||||
headers = {
|
||||
"Accept": "application/json",
|
||||
"Content-Type": "application/json",
|
||||
}
|
||||
app_discover_url = f"{self.classifier_url}{PROMPT_URL}"
|
||||
pebblo_resp = None
|
||||
payload = qa_payload.dict(exclude_unset=True)
|
||||
if self.classifier_location == "local":
|
||||
try:
|
||||
pebblo_resp = requests.post(
|
||||
app_discover_url,
|
||||
headers=headers,
|
||||
json=payload,
|
||||
timeout=20,
|
||||
)
|
||||
logger.debug("prompt-payload: %s", payload)
|
||||
logger.debug(
|
||||
"send_prompt[local]: request url %s, body %s len %s\
|
||||
response status %s body %s",
|
||||
pebblo_resp.request.url,
|
||||
str(pebblo_resp.request.body),
|
||||
str(
|
||||
len(
|
||||
pebblo_resp.request.body if pebblo_resp.request.body else []
|
||||
)
|
||||
),
|
||||
str(pebblo_resp.status_code),
|
||||
pebblo_resp.json(),
|
||||
)
|
||||
if pebblo_resp.status_code in [HTTPStatus.OK, HTTPStatus.BAD_GATEWAY]:
|
||||
PebbloRetrievalQA.set_prompt_sent()
|
||||
else:
|
||||
logger.warning(
|
||||
"Received unexpected HTTP response code:"
|
||||
+ f"{pebblo_resp.status_code}"
|
||||
)
|
||||
except requests.exceptions.RequestException:
|
||||
logger.warning("Unable to reach pebblo server.")
|
||||
except Exception as e:
|
||||
logger.warning("An Exception caught in _send_discover: local %s", e)
|
||||
|
||||
# If classifier location is local, then response, context and prompt
|
||||
# should be fetched from pebblo_resp and replaced in payload.
|
||||
if self.api_key:
|
||||
if self.classifier_location == "local":
|
||||
if pebblo_resp:
|
||||
resp = json.loads(pebblo_resp.text)
|
||||
if resp:
|
||||
payload["response"].update(
|
||||
resp.get("retrieval_data", {}).get("response", {})
|
||||
)
|
||||
payload["response"].pop("data")
|
||||
payload["prompt"].update(
|
||||
resp.get("retrieval_data", {}).get("prompt", {})
|
||||
)
|
||||
payload["prompt"].pop("data")
|
||||
context = payload["context"]
|
||||
for context_data in context:
|
||||
context_data.pop("doc")
|
||||
payload["context"] = context
|
||||
else:
|
||||
payload["response"] = {}
|
||||
payload["prompt"] = {}
|
||||
payload["context"] = []
|
||||
headers.update({"x-api-key": self.api_key})
|
||||
pebblo_cloud_url = f"{PEBBLO_CLOUD_URL}{PROMPT_URL}"
|
||||
try:
|
||||
pebblo_cloud_response = requests.post(
|
||||
pebblo_cloud_url,
|
||||
headers=headers,
|
||||
json=payload,
|
||||
timeout=20,
|
||||
)
|
||||
|
||||
logger.debug(
|
||||
"send_prompt[cloud]: request url %s, body %s len %s\
|
||||
response status %s body %s",
|
||||
pebblo_cloud_response.request.url,
|
||||
str(pebblo_cloud_response.request.body),
|
||||
str(
|
||||
len(
|
||||
pebblo_cloud_response.request.body
|
||||
if pebblo_cloud_response.request.body
|
||||
else []
|
||||
)
|
||||
),
|
||||
str(pebblo_cloud_response.status_code),
|
||||
pebblo_cloud_response.json(),
|
||||
)
|
||||
except requests.exceptions.RequestException:
|
||||
logger.warning("Unable to reach Pebblo cloud server.")
|
||||
except Exception as e:
|
||||
logger.warning("An Exception caught in _send_prompt: cloud %s", e)
|
||||
elif self.classifier_location == "pebblo-cloud":
|
||||
logger.warning("API key is missing for sending prompt to Pebblo cloud.")
|
||||
raise NameError("API key is missing for sending prompt to Pebblo cloud.")
|
||||
|
||||
def _check_prompt_validity(self, question: str) -> Tuple[bool, Dict[str, Any]]:
|
||||
def get_chain_details(
|
||||
cls, llm: BaseLanguageModel, **kwargs: Any
|
||||
) -> List[ChainInfo]:
|
||||
"""
|
||||
Check the validity of the given prompt using a remote classification service.
|
||||
|
||||
This method sends a prompt to a remote classifier service and return entities
|
||||
present in prompt or not.
|
||||
Get chain details.
|
||||
|
||||
Args:
|
||||
question (str): The prompt question to be validated.
|
||||
llm (BaseLanguageModel): Language model instance.
|
||||
**kwargs: Additional keyword arguments.
|
||||
|
||||
Returns:
|
||||
bool: True if the prompt is valid (does not contain deny list entities),
|
||||
False otherwise.
|
||||
dict: The entities present in the prompt
|
||||
List[ChainInfo]: Chain details.
|
||||
"""
|
||||
|
||||
headers = {
|
||||
"Accept": "application/json",
|
||||
"Content-Type": "application/json",
|
||||
}
|
||||
prompt_payload = {"prompt": question}
|
||||
is_valid_prompt: bool = True
|
||||
prompt_gov_api_url = f"{self.classifier_url}{PROMPT_GOV_URL}"
|
||||
pebblo_resp = None
|
||||
prompt_entities: dict = {"entities": {}, "entityCount": 0}
|
||||
if self.classifier_location == "local":
|
||||
try:
|
||||
pebblo_resp = requests.post(
|
||||
prompt_gov_api_url,
|
||||
headers=headers,
|
||||
json=prompt_payload,
|
||||
timeout=20,
|
||||
)
|
||||
|
||||
logger.debug("prompt-payload: %s", prompt_payload)
|
||||
logger.debug(
|
||||
"send_prompt[local]: request url %s, body %s len %s\
|
||||
response status %s body %s",
|
||||
pebblo_resp.request.url,
|
||||
str(pebblo_resp.request.body),
|
||||
str(
|
||||
len(
|
||||
pebblo_resp.request.body if pebblo_resp.request.body else []
|
||||
)
|
||||
),
|
||||
str(pebblo_resp.status_code),
|
||||
pebblo_resp.json(),
|
||||
)
|
||||
logger.debug(f"pebblo_resp.json() {pebblo_resp.json()}")
|
||||
prompt_entities["entities"] = pebblo_resp.json().get("entities", {})
|
||||
prompt_entities["entityCount"] = pebblo_resp.json().get(
|
||||
"entityCount", 0
|
||||
)
|
||||
|
||||
except requests.exceptions.RequestException:
|
||||
logger.warning("Unable to reach pebblo server.")
|
||||
except Exception as e:
|
||||
logger.warning("An Exception caught in _send_discover: local %s", e)
|
||||
return is_valid_prompt, prompt_entities
|
||||
|
||||
@classmethod
|
||||
def get_chain_details(cls, llm: BaseLanguageModel, **kwargs): # type: ignore
|
||||
llm_dict = llm.__dict__
|
||||
chain = [
|
||||
{
|
||||
"name": cls.__name__,
|
||||
"model": {
|
||||
"name": llm_dict.get("model_name", llm_dict.get("model")),
|
||||
"vendor": llm.__class__.__name__,
|
||||
},
|
||||
"vector_dbs": [
|
||||
{
|
||||
"name": kwargs["retriever"].vectorstore.__class__.__name__,
|
||||
"embedding_model": str(
|
||||
chains = [
|
||||
ChainInfo(
|
||||
name=cls.__name__,
|
||||
model=Model(
|
||||
name=llm_dict.get("model_name", llm_dict.get("model")),
|
||||
vendor=llm.__class__.__name__,
|
||||
),
|
||||
vector_dbs=[
|
||||
VectorDB(
|
||||
name=kwargs["retriever"].vectorstore.__class__.__name__,
|
||||
embedding_model=str(
|
||||
kwargs["retriever"].vectorstore._embeddings.model
|
||||
)
|
||||
if hasattr(kwargs["retriever"].vectorstore, "_embeddings")
|
||||
@@ -607,8 +355,8 @@ class PebbloRetrievalQA(Chain):
|
||||
if hasattr(kwargs["retriever"].vectorstore, "_embedding")
|
||||
else None
|
||||
),
|
||||
}
|
||||
)
|
||||
],
|
||||
},
|
||||
),
|
||||
]
|
||||
return chain
|
||||
return chains
|
||||
|
||||
@@ -109,7 +109,7 @@ class VectorDB(BaseModel):
    embedding_model: Optional[str] = None


class Chains(BaseModel):
class ChainInfo(BaseModel):
    name: str
    model: Optional[Model]
    vector_dbs: Optional[List[VectorDB]]
@@ -121,7 +121,7 @@ class App(BaseModel):
    description: Optional[str]
    runtime: Runtime
    framework: Framework
    chains: List[Chains]
    chains: List[ChainInfo]
    plugin_version: str


@@ -134,9 +134,9 @@ class Context(BaseModel):

class Prompt(BaseModel):
    data: Optional[Union[list, str]]
    entityCount: Optional[int]
    entities: Optional[dict]
    prompt_gov_enabled: Optional[bool]
    entityCount: Optional[int] = None
    entities: Optional[dict] = None
    prompt_gov_enabled: Optional[bool] = None


class Qa(BaseModel):

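The switch from bare `Optional[...]` annotations to explicit `= None` defaults on `Prompt` is worth noting: under `langchain_core.pydantic_v1` an un-defaulted `Optional` field is already treated as optional, but spelling the default out keeps the model unambiguous under stricter pydantic semantics where `Optional` alone does not imply a default. A minimal sketch (field names mirror the model above; not part of the diff):

```python
from typing import Optional, Union

from langchain_core.pydantic_v1 import BaseModel


class Prompt(BaseModel):
    data: Optional[Union[list, str]]
    entityCount: Optional[int] = None
    entities: Optional[dict] = None
    prompt_gov_enabled: Optional[bool] = None


# Omitted fields default to None instead of raising a validation error.
Prompt(data="What is our refund policy?")
```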
@@ -1,22 +1,43 @@
import json
import logging
import os
import platform
from typing import Tuple
from enum import Enum
from http import HTTPStatus
from typing import Any, Dict, List, Optional, Tuple

from langchain_core.documents import Document
from langchain_core.env import get_runtime_environment
from langchain_core.pydantic_v1 import BaseModel
from langchain_core.utils import get_from_dict_or_env
from langchain_core.vectorstores import VectorStoreRetriever
from requests import Response, request
from requests.exceptions import RequestException

from langchain_community.chains.pebblo_retrieval.models import Framework, Runtime
from langchain_community.chains.pebblo_retrieval.models import (
    App,
    AuthContext,
    Context,
    Framework,
    Prompt,
    Qa,
    Runtime,
)

logger = logging.getLogger(__name__)

PLUGIN_VERSION = "0.1.1"

CLASSIFIER_URL = os.getenv("PEBBLO_CLASSIFIER_URL", "http://localhost:8000")
PEBBLO_CLOUD_URL = os.getenv("PEBBLO_CLOUD_URL", "https://api.daxa.ai")
_DEFAULT_CLASSIFIER_URL = "http://localhost:8000"
_DEFAULT_PEBBLO_CLOUD_URL = "https://api.daxa.ai"

PROMPT_URL = "/v1/prompt"
PROMPT_GOV_URL = "/v1/prompt/governance"
APP_DISCOVER_URL = "/v1/app/discover"


class Routes(str, Enum):
    """Routes available for the Pebblo API as enumerator."""

    retrieval_app_discover = "/v1/app/discover"
    prompt = "/v1/prompt"
    prompt_governance = "/v1/prompt/governance"


def get_runtime() -> Tuple[Framework, Runtime]:
@@ -64,3 +85,308 @@ def get_ip() -> str:
    except Exception:
        public_ip = socket.gethostbyname("localhost")
    return public_ip


class PebbloRetrievalAPIWrapper(BaseModel):
    """Wrapper for Pebblo Retrieval API."""

    api_key: Optional[str]  # Use SecretStr
    """API key for Pebblo Cloud"""
    classifier_location: str = "local"
    """Location of the classifier, local or cloud. Defaults to 'local'"""
    classifier_url: Optional[str]
    """URL of the Pebblo Classifier"""
    cloud_url: Optional[str]
    """URL of the Pebblo Cloud"""

    def __init__(self, **kwargs: Any):
        """Validate that api key in environment."""
        kwargs["api_key"] = get_from_dict_or_env(
            kwargs, "api_key", "PEBBLO_API_KEY", ""
        )
        kwargs["classifier_url"] = get_from_dict_or_env(
            kwargs, "classifier_url", "PEBBLO_CLASSIFIER_URL", _DEFAULT_CLASSIFIER_URL
        )
        kwargs["cloud_url"] = get_from_dict_or_env(
            kwargs, "cloud_url", "PEBBLO_CLOUD_URL", _DEFAULT_PEBBLO_CLOUD_URL
        )
        super().__init__(**kwargs)

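A short sketch of how the new wrapper is configured (illustration only, not part of the diff): explicit keyword arguments win, otherwise `get_from_dict_or_env` falls back to the `PEBBLO_API_KEY`, `PEBBLO_CLASSIFIER_URL`, and `PEBBLO_CLOUD_URL` environment variables, and finally to the defaults defined above.

```python
from langchain_community.chains.pebblo_retrieval.utilities import (
    PebbloRetrievalAPIWrapper,
)

# With no kwargs, values are read from the environment or the module defaults.
pb_client = PebbloRetrievalAPIWrapper(
    classifier_location="local",
    classifier_url="http://localhost:8000",  # default local classifier endpoint
)
```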
    def send_app_discover(self, app: App) -> None:
        """
        Send app discovery request to Pebblo server & cloud.

        Args:
            app (App): App instance to be discovered.
        """
        pebblo_resp = None
        payload = app.dict(exclude_unset=True)

        if self.classifier_location == "local":
            # Send app details to local classifier
            headers = self._make_headers()
            app_discover_url = f"{self.classifier_url}{Routes.retrieval_app_discover}"
            pebblo_resp = self.make_request("POST", app_discover_url, headers, payload)

        if self.api_key:
            # Send app details to Pebblo cloud if api_key is present
            headers = self._make_headers(cloud_request=True)
            if pebblo_resp:
                pebblo_server_version = json.loads(pebblo_resp.text).get(
                    "pebblo_server_version"
                )
                payload.update({"pebblo_server_version": pebblo_server_version})

            payload.update({"pebblo_client_version": PLUGIN_VERSION})
            pebblo_cloud_url = f"{self.cloud_url}{Routes.retrieval_app_discover}"
            _ = self.make_request("POST", pebblo_cloud_url, headers, payload)

def send_prompt(
|
||||
self,
|
||||
app_name: str,
|
||||
retriever: VectorStoreRetriever,
|
||||
question: str,
|
||||
answer: str,
|
||||
auth_context: Optional[AuthContext],
|
||||
docs: List[Document],
|
||||
prompt_entities: Dict[str, Any],
|
||||
prompt_time: str,
|
||||
prompt_gov_enabled: bool = False,
|
||||
) -> None:
|
||||
"""
|
||||
Send prompt to Pebblo server for classification.
|
||||
Then send prompt to Daxa cloud(If api_key is present).
|
||||
|
||||
Args:
|
||||
app_name (str): Name of the app.
|
||||
retriever (VectorStoreRetriever): Retriever instance.
|
||||
question (str): Question asked in the prompt.
|
||||
answer (str): Answer generated by the model.
|
||||
auth_context (Optional[AuthContext]): Authentication context.
|
||||
docs (List[Document]): List of documents retrieved.
|
||||
prompt_entities (Dict[str, Any]): Entities present in the prompt.
|
||||
prompt_time (str): Time when the prompt was generated.
|
||||
prompt_gov_enabled (bool): Whether prompt governance is enabled.
|
||||
"""
|
||||
pebblo_resp = None
|
||||
payload = self.build_prompt_qa_payload(
|
||||
app_name,
|
||||
retriever,
|
||||
question,
|
||||
answer,
|
||||
auth_context,
|
||||
docs,
|
||||
prompt_entities,
|
||||
prompt_time,
|
||||
prompt_gov_enabled,
|
||||
)
|
||||
|
||||
if self.classifier_location == "local":
|
||||
# Send prompt to local classifier
|
||||
headers = self._make_headers()
|
||||
prompt_url = f"{self.classifier_url}{Routes.prompt}"
|
||||
pebblo_resp = self.make_request("POST", prompt_url, headers, payload)
|
||||
|
||||
if self.api_key:
|
||||
# Send prompt to Pebblo cloud if api_key is present
|
||||
if self.classifier_location == "local":
|
||||
# If classifier location is local, then response, context and prompt
|
||||
# should be fetched from pebblo_resp and replaced in payload.
|
||||
pebblo_resp = pebblo_resp.json() if pebblo_resp else None
|
||||
self.update_cloud_payload(payload, pebblo_resp)
|
||||
|
||||
headers = self._make_headers(cloud_request=True)
|
||||
pebblo_cloud_prompt_url = f"{self.cloud_url}{Routes.prompt}"
|
||||
_ = self.make_request("POST", pebblo_cloud_prompt_url, headers, payload)
|
||||
elif self.classifier_location == "pebblo-cloud":
|
||||
logger.warning("API key is missing for sending prompt to Pebblo cloud.")
|
||||
raise NameError("API key is missing for sending prompt to Pebblo cloud.")
|
||||
|
||||
    def check_prompt_validity(self, question: str) -> Tuple[bool, Dict[str, Any]]:
        """
        Check the validity of the given prompt using a remote classification service.

        This method sends a prompt to a remote classifier service and return entities
        present in prompt or not.

        Args:
            question (str): The prompt question to be validated.

        Returns:
            bool: True if the prompt is valid (does not contain deny list entities),
                False otherwise.
            dict: The entities present in the prompt
        """
        prompt_payload = {"prompt": question}
        prompt_entities: dict = {"entities": {}, "entityCount": 0}
        is_valid_prompt: bool = True
        if self.classifier_location == "local":
            headers = self._make_headers()
            prompt_gov_api_url = f"{self.classifier_url}{Routes.prompt_governance}"
            pebblo_resp = self.make_request(
                "POST", prompt_gov_api_url, headers, prompt_payload
            )
            if pebblo_resp:
                logger.debug(f"pebblo_resp.json() {pebblo_resp.json()}")
                prompt_entities["entities"] = pebblo_resp.json().get("entities", {})
                prompt_entities["entityCount"] = pebblo_resp.json().get(
                    "entityCount", 0
                )
        return is_valid_prompt, prompt_entities

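Continuing the configuration sketch above, a call against a locally running classifier would look roughly like this (not part of the diff; the prompt text is a placeholder, and the return shape follows the method body):

```python
is_valid, prompt_entities = pb_client.check_prompt_validity(
    "Summarize the contract for customer John Doe"
)
# prompt_entities is a dict like {"entities": {...}, "entityCount": <int>}
print(prompt_entities["entityCount"])
```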
    def _make_headers(self, cloud_request: bool = False) -> dict:
        """
        Generate headers for the request.

        args:
            cloud_request (bool): flag indicating whether the request is for Pebblo
                cloud.
        returns:
            dict: Headers for the request.

        """
        headers = {
            "Accept": "application/json",
            "Content-Type": "application/json",
        }
        if cloud_request:
            # Add API key for Pebblo cloud request
            if self.api_key:
                headers.update({"x-api-key": self.api_key})
            else:
                logger.warning("API key is missing for Pebblo cloud request.")
        return headers

    @staticmethod
    def make_request(
        method: str,
        url: str,
        headers: dict,
        payload: Optional[dict] = None,
        timeout: int = 20,
    ) -> Optional[Response]:
        """
        Make a request to the Pebblo server/cloud API.

        Args:
            method (str): HTTP method (GET, POST, PUT, DELETE, etc.).
            url (str): URL for the request.
            headers (dict): Headers for the request.
            payload (Optional[dict]): Payload for the request (for POST, PUT, etc.).
            timeout (int): Timeout for the request in seconds.

        Returns:
            Optional[Response]: Response object if the request is successful.
        """
        try:
            response = request(
                method=method, url=url, headers=headers, json=payload, timeout=timeout
            )
            logger.debug(
                "Request: method %s, url %s, len %s response status %s",
                method,
                response.request.url,
                str(len(response.request.body if response.request.body else [])),
                str(response.status_code),
            )

            if response.status_code >= HTTPStatus.INTERNAL_SERVER_ERROR:
                logger.warning(f"Pebblo Server: Error {response.status_code}")
            elif response.status_code >= HTTPStatus.BAD_REQUEST:
                logger.warning(f"Pebblo received an invalid payload: {response.text}")
            elif response.status_code != HTTPStatus.OK:
                logger.warning(
                    f"Pebblo returned an unexpected response code: "
                    f"{response.status_code}"
                )

            return response
        except RequestException:
            logger.warning("Unable to reach server %s", url)
        except Exception as e:
            logger.warning("An Exception caught in make_request: %s", e)
        return None

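`make_request` centralizes the HTTP plumbing that was previously repeated in every `_send_*` helper: a single `requests.request` call, debug logging, warnings split by 5xx, 4xx, and other non-200 codes, and a `None` return when the server is unreachable. A hedged example of calling it directly (not part of the diff; the endpoint path comes from the `Routes` enum above, headers and payload are placeholders):

```python
headers = {"Accept": "application/json", "Content-Type": "application/json"}
resp = PebbloRetrievalAPIWrapper.make_request(
    "POST",
    "http://localhost:8000/v1/prompt/governance",
    headers,
    payload={"prompt": "Does this prompt mention an SSN like 123-45-6789?"},
)
if resp is not None and resp.status_code == 200:
    print(resp.json())
```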
@staticmethod
|
||||
def update_cloud_payload(payload: dict, pebblo_resp: Optional[dict]) -> None:
|
||||
"""
|
||||
Update the payload with response, prompt and context from Pebblo response.
|
||||
|
||||
Args:
|
||||
payload (dict): Payload to be updated.
|
||||
pebblo_resp (Optional[dict]): Response from Pebblo server.
|
||||
"""
|
||||
if pebblo_resp:
|
||||
# Update response, prompt and context from pebblo response
|
||||
response = payload.get("response", {})
|
||||
response.update(pebblo_resp.get("retrieval_data", {}).get("response", {}))
|
||||
response.pop("data", None)
|
||||
prompt = payload.get("prompt", {})
|
||||
prompt.update(pebblo_resp.get("retrieval_data", {}).get("prompt", {}))
|
||||
prompt.pop("data", None)
|
||||
context = payload.get("context", [])
|
||||
for context_data in context:
|
||||
context_data.pop("doc", None)
|
||||
else:
|
||||
payload["response"] = {}
|
||||
payload["prompt"] = {}
|
||||
payload["context"] = []
|
||||
|
||||
def build_prompt_qa_payload(
|
||||
self,
|
||||
app_name: str,
|
||||
retriever: VectorStoreRetriever,
|
||||
question: str,
|
||||
answer: str,
|
||||
auth_context: Optional[AuthContext],
|
||||
docs: List[Document],
|
||||
prompt_entities: Dict[str, Any],
|
||||
prompt_time: str,
|
||||
prompt_gov_enabled: bool = False,
|
||||
) -> dict:
|
||||
"""
|
||||
Build the QA payload for the prompt.
|
||||
|
||||
Args:
|
||||
app_name (str): Name of the app.
|
||||
retriever (VectorStoreRetriever): Retriever instance.
|
||||
question (str): Question asked in the prompt.
|
||||
answer (str): Answer generated by the model.
|
||||
auth_context (Optional[AuthContext]): Authentication context.
|
||||
docs (List[Document]): List of documents retrieved.
|
||||
prompt_entities (Dict[str, Any]): Entities present in the prompt.
|
||||
prompt_time (str): Time when the prompt was generated.
|
||||
prompt_gov_enabled (bool): Whether prompt governance is enabled.
|
||||
|
||||
Returns:
|
||||
dict: The QA payload for the prompt.
|
||||
"""
|
||||
qa = Qa(
|
||||
name=app_name,
|
||||
context=[
|
||||
Context(
|
||||
retrieved_from=doc.metadata.get(
|
||||
"full_path", doc.metadata.get("source")
|
||||
),
|
||||
doc=doc.page_content,
|
||||
vector_db=retriever.vectorstore.__class__.__name__,
|
||||
pb_checksum=doc.metadata.get("pb_checksum"),
|
||||
)
|
||||
for doc in docs
|
||||
if isinstance(doc, Document)
|
||||
],
|
||||
prompt=Prompt(
|
||||
data=question,
|
||||
entities=prompt_entities.get("entities", {}),
|
||||
entityCount=prompt_entities.get("entityCount", 0),
|
||||
prompt_gov_enabled=prompt_gov_enabled,
|
||||
),
|
||||
response=Prompt(data=answer),
|
||||
prompt_time=prompt_time,
|
||||
user=auth_context.user_id if auth_context else "unknown",
|
||||
user_identities=auth_context.user_auth
|
||||
if auth_context and hasattr(auth_context, "user_auth")
|
||||
else [],
|
||||
classifier_location=self.classifier_location,
|
||||
)
|
||||
return qa.dict(exclude_unset=True)
|
||||
|
||||
@@ -449,7 +449,9 @@ class ChatDeepInfra(BaseChatModel):

    def _handle_status(self, code: int, text: Any) -> None:
        if code >= 500:
            raise ChatDeepInfraException(f"DeepInfra Server: Error {code}")
            raise ChatDeepInfraException(
                f"DeepInfra Server error status {code}: {text}"
            )
        elif code >= 400:
            raise ValueError(f"DeepInfra received an invalid payload: {text}")
        elif code != 200:

@@ -117,7 +117,12 @@ def _get_jwt_token(api_key: str) -> str:
    Returns:
        The JWT token.
    """
    import jwt
    try:
        import jwt
    except ImportError:
        raise ImportError(
            "jwt package not found, please install it with" "`pip install pyjwt`"
        )

    try:
        id, secret = api_key.split(".")
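Wrapping the `import jwt` in a try/except turns PyJWT into an optional dependency with an actionable error message. For reference, a hedged sketch of the kind of token this helper builds with PyJWT; only the `"<id>.<secret>"` split is taken from the diff, while the claim names and TTL expected by the ZhipuAI endpoint are assumptions here:

```python
import time

import jwt  # pip install pyjwt


def make_token(api_key: str, ttl_seconds: int = 3600) -> str:
    key_id, secret = api_key.split(".")  # api_key is formatted as "<id>.<secret>"
    now_ms = int(round(time.time() * 1000))
    payload = {
        "api_key": key_id,                 # assumed claim name
        "exp": now_ms + ttl_seconds * 1000,
        "timestamp": now_ms,               # assumed claim name
    }
    return jwt.encode(payload, secret, algorithm="HS256")
```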
@@ -323,6 +328,67 @@ class ChatZhipuAI(BaseChatModel):
|
||||
|
||||
[AIMessage(content='I enjoy programming.', response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 23, 'total_tokens': 29}, 'model_name': 'glm-4', 'finish_reason': 'stop'}, id='run-ba06af9d-4baa-40b2-9298-be9c62aa0849-0')]
|
||||
|
||||
Tool calling:
|
||||
.. code-block:: python
|
||||
|
||||
from langchain_core.pydantic_v1 import BaseModel, Field
|
||||
|
||||
|
||||
class GetWeather(BaseModel):
|
||||
'''Get the current weather in a given location'''
|
||||
|
||||
location: str = Field(
|
||||
..., description="The city and state, e.g. San Francisco, CA"
|
||||
)
|
||||
|
||||
|
||||
class GetPopulation(BaseModel):
|
||||
'''Get the current population in a given location'''
|
||||
|
||||
location: str = Field(
|
||||
..., description="The city and state, e.g. San Francisco, CA"
|
||||
)
|
||||
|
||||
chat_with_tools = zhipuai_chat.bind_tools([GetWeather, GetPopulation])
|
||||
ai_msg = chat_with_tools.invoke(
|
||||
"Which city is hotter today and which is bigger: LA or NY?"
|
||||
)
|
||||
ai_msg.tool_calls
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
[
|
||||
{
|
||||
'name': 'GetWeather',
|
||||
'args': {'location': 'Los Angeles, CA'},
|
||||
'id': 'call_202408222146464ea49ec8731145a9',
|
||||
'type': 'tool_call'
|
||||
}
|
||||
]
|
||||
|
||||
Structured output:
|
||||
.. code-block:: python
|
||||
|
||||
from typing import Optional
|
||||
|
||||
from langchain_core.pydantic_v1 import BaseModel, Field
|
||||
|
||||
|
||||
class Joke(BaseModel):
|
||||
'''Joke to tell user.'''
|
||||
|
||||
setup: str = Field(description="The setup of the joke")
|
||||
punchline: str = Field(description="The punchline to the joke")
|
||||
rating: Optional[int] = Field(description="How funny the joke is, from 1 to 10")
|
||||
|
||||
|
||||
structured_chat = zhipuai_chat.with_structured_output(Joke)
|
||||
structured_chat.invoke("Tell me a joke about cats")
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
Joke(setup='What do cats like to eat for breakfast?', punchline='Mice Krispies!', rating=None)
|
||||
|
||||
Response metadata
|
||||
.. code-block:: python
|
||||
|
||||
|
||||
@@ -1,31 +1,25 @@
"""Pebblo's safe dataloader is a wrapper for document loaders"""

import json
import logging
import os
import uuid
from http import HTTPStatus
from typing import Any, Dict, Iterator, List, Optional
from typing import Dict, Iterator, List, Optional

import requests  # type: ignore
from langchain_core.documents import Document

from langchain_community.document_loaders.base import BaseLoader
from langchain_community.utilities.pebblo import (
    APP_DISCOVER_URL,
    BATCH_SIZE_BYTES,
    CLASSIFIER_URL,
    LOADER_DOC_URL,
    PEBBLO_CLOUD_URL,
    PLUGIN_VERSION,
    App,
    Doc,
    IndexedDocument,
    PebbloLoaderAPIWrapper,
    generate_size_based_batches,
    get_full_path,
    get_loader_full_path,
    get_loader_type,
    get_runtime,
    get_source_size,
)

logger = logging.getLogger(__name__)
@@ -37,7 +31,6 @@ class PebbloSafeLoader(BaseLoader):
|
||||
"""
|
||||
|
||||
_discover_sent: bool = False
|
||||
_loader_sent: bool = False
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
@@ -54,22 +47,17 @@ class PebbloSafeLoader(BaseLoader):
|
||||
if not name or not isinstance(name, str):
|
||||
raise NameError("Must specify a valid name.")
|
||||
self.app_name = name
|
||||
self.api_key = os.environ.get("PEBBLO_API_KEY") or api_key
|
||||
self.load_id = str(uuid.uuid4())
|
||||
self.loader = langchain_loader
|
||||
self.load_semantic = os.environ.get("PEBBLO_LOAD_SEMANTIC") or load_semantic
|
||||
self.owner = owner
|
||||
self.description = description
|
||||
self.source_path = get_loader_full_path(self.loader)
|
||||
self.source_owner = PebbloSafeLoader.get_file_owner_from_path(self.source_path)
|
||||
self.docs: List[Document] = []
|
||||
self.docs_with_id: List[IndexedDocument] = []
|
||||
loader_name = str(type(self.loader)).split(".")[-1].split("'")[0]
|
||||
self.source_type = get_loader_type(loader_name)
|
||||
self.source_path_size = self.get_source_size(self.source_path)
|
||||
self.source_aggregate_size = 0
|
||||
self.classifier_url = classifier_url or CLASSIFIER_URL
|
||||
self.classifier_location = classifier_location
|
||||
self.source_path_size = get_source_size(self.source_path)
|
||||
self.batch_size = BATCH_SIZE_BYTES
|
||||
self.loader_details = {
|
||||
"loader": loader_name,
|
||||
@@ -83,7 +71,13 @@ class PebbloSafeLoader(BaseLoader):
        }
        # generate app
        self.app = self._get_app_details()
        self._send_discover()
        # initialize Pebblo Loader API client
        self.pb_client = PebbloLoaderAPIWrapper(
            api_key=api_key,
            classifier_location=classifier_location,
            classifier_url=classifier_url,
        )
        self.pb_client.send_loader_discover(self.app)

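After this change the loader delegates all Pebblo traffic to `PebbloLoaderAPIWrapper` instead of issuing requests itself. A minimal usage sketch (not part of the diff; the CSV path and app metadata are placeholders):

```python
from langchain_community.document_loaders import CSVLoader
from langchain_community.document_loaders.pebblo import PebbloSafeLoader

loader = PebbloSafeLoader(
    CSVLoader("./data/customers.csv"),
    name="acme-support-rag",          # app name reported to Pebblo
    owner="Jane Doe",
    description="Support-ticket ingestion for RAG",
)
documents = loader.load()
```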
def load(self) -> List[Document]:
|
||||
"""Load Documents.
|
||||
@@ -113,7 +107,12 @@ class PebbloSafeLoader(BaseLoader):
|
||||
is_last_batch: bool = i == total_batches - 1
|
||||
self.docs = batch
|
||||
self.docs_with_id = self._index_docs()
|
||||
classified_docs = self._classify_doc(loading_end=is_last_batch)
|
||||
classified_docs = self.pb_client.classify_documents(
|
||||
self.docs_with_id,
|
||||
self.app,
|
||||
self.loader_details,
|
||||
loading_end=is_last_batch,
|
||||
)
|
||||
self._add_pebblo_specific_metadata(classified_docs)
|
||||
if self.load_semantic:
|
||||
batch_processed_docs = self._add_semantic_to_docs(classified_docs)
|
||||
@@ -147,7 +146,9 @@ class PebbloSafeLoader(BaseLoader):
|
||||
break
|
||||
self.docs = list((doc,))
|
||||
self.docs_with_id = self._index_docs()
|
||||
classified_doc = self._classify_doc()
|
||||
classified_doc = self.pb_client.classify_documents(
|
||||
self.docs_with_id, self.app, self.loader_details
|
||||
)
|
||||
self._add_pebblo_specific_metadata(classified_doc)
|
||||
if self.load_semantic:
|
||||
self.docs = self._add_semantic_to_docs(classified_doc)
|
||||
@@ -159,263 +160,6 @@ class PebbloSafeLoader(BaseLoader):
|
||||
def set_discover_sent(cls) -> None:
|
||||
cls._discover_sent = True
|

@classmethod
def set_loader_sent(cls) -> None:
cls._loader_sent = True

def _classify_doc(self, loading_end: bool = False) -> dict:
"""Send documents fetched from loader to pebblo-server. Then send
classified documents to Daxa cloud(If api_key is present). Internal method.

Args:

loading_end (bool, optional): Flag indicating the halt of data
loading by loader. Defaults to False.
"""
headers = {
"Accept": "application/json",
"Content-Type": "application/json",
}
if loading_end is True:
PebbloSafeLoader.set_loader_sent()
doc_content = [doc.dict() for doc in self.docs_with_id]
docs = []
for doc in doc_content:
doc_metadata = doc.get("metadata", {})
doc_authorized_identities = doc_metadata.get("authorized_identities", [])
doc_source_path = get_full_path(
doc_metadata.get(
"full_path", doc_metadata.get("source", self.source_path)
)
)
doc_source_owner = doc_metadata.get(
"owner", PebbloSafeLoader.get_file_owner_from_path(doc_source_path)
)
doc_source_size = doc_metadata.get(
"size", self.get_source_size(doc_source_path)
)
page_content = str(doc.get("page_content"))
page_content_size = self.calculate_content_size(page_content)
self.source_aggregate_size += page_content_size
doc_id = doc.get("pb_id", None) or 0
docs.append(
{
"doc": page_content,
"source_path": doc_source_path,
"pb_id": doc_id,
"last_modified": doc.get("metadata", {}).get("last_modified"),
"file_owner": doc_source_owner,
**(
{"authorized_identities": doc_authorized_identities}
if doc_authorized_identities
else {}
),
**(
{"source_path_size": doc_source_size}
if doc_source_size is not None
else {}
),
}
)
payload: Dict[str, Any] = {
"name": self.app_name,
"owner": self.owner,
"docs": docs,
"plugin_version": PLUGIN_VERSION,
"load_id": self.load_id,
"loader_details": self.loader_details,
"loading_end": "false",
"source_owner": self.source_owner,
"classifier_location": self.classifier_location,
}
if loading_end is True:
payload["loading_end"] = "true"
if "loader_details" in payload:
payload["loader_details"]["source_aggregate_size"] = (
self.source_aggregate_size
)
payload = Doc(**payload).dict(exclude_unset=True)
classified_docs = {}
# Raw payload to be sent to classifier
if self.classifier_location == "local":
load_doc_url = f"{self.classifier_url}{LOADER_DOC_URL}"
try:
pebblo_resp = requests.post(
load_doc_url, headers=headers, json=payload, timeout=300
)

# Updating the structure of pebblo response docs for efficient searching
for classified_doc in json.loads(pebblo_resp.text).get("docs", []):
classified_docs.update({classified_doc["pb_id"]: classified_doc})
if pebblo_resp.status_code not in [
HTTPStatus.OK,
HTTPStatus.BAD_GATEWAY,
]:
logger.warning(
"Received unexpected HTTP response code: %s",
pebblo_resp.status_code,
)
logger.debug(
"send_loader_doc[local]: request url %s, body %s len %s\
response status %s body %s",
pebblo_resp.request.url,
str(pebblo_resp.request.body),
str(
len(
pebblo_resp.request.body if pebblo_resp.request.body else []
)
),
str(pebblo_resp.status_code),
pebblo_resp.json(),
)
except requests.exceptions.RequestException:
logger.warning("Unable to reach pebblo server.")
except Exception as e:
logger.warning("An Exception caught in _send_loader_doc: local %s", e)

if self.api_key:
if self.classifier_location == "local":
docs = payload["docs"]
for doc_data in docs:
classified_data = classified_docs.get(doc_data["pb_id"], {})
doc_data.update(
{
"pb_checksum": classified_data.get("pb_checksum", None),
"loader_source_path": classified_data.get(
"loader_source_path", None
),
"entities": classified_data.get("entities", {}),
"topics": classified_data.get("topics", {}),
}
)
doc_data.pop("doc")

headers.update({"x-api-key": self.api_key})
pebblo_cloud_url = f"{PEBBLO_CLOUD_URL}{LOADER_DOC_URL}"
try:
pebblo_cloud_response = requests.post(
pebblo_cloud_url, headers=headers, json=payload, timeout=20
)
logger.debug(
"send_loader_doc[cloud]: request url %s, body %s len %s\
response status %s body %s",
pebblo_cloud_response.request.url,
str(pebblo_cloud_response.request.body),
str(
len(
pebblo_cloud_response.request.body
if pebblo_cloud_response.request.body
else []
)
),
str(pebblo_cloud_response.status_code),
pebblo_cloud_response.json(),
)
except requests.exceptions.RequestException:
logger.warning("Unable to reach Pebblo cloud server.")
except Exception as e:
logger.warning("An Exception caught in _send_loader_doc: cloud %s", e)
elif self.classifier_location == "pebblo-cloud":
logger.warning("API key is missing for sending docs to Pebblo cloud.")
raise NameError("API key is missing for sending docs to Pebblo cloud.")

return classified_docs

@staticmethod
def calculate_content_size(page_content: str) -> int:
"""Calculate the content size in bytes:
- Encode the string to bytes using a specific encoding (e.g., UTF-8)
- Get the length of the encoded bytes.

Args:
page_content (str): Data string.

Returns:
int: Size of string in bytes.
"""

# Encode the content to bytes using UTF-8
encoded_content = page_content.encode("utf-8")
size = len(encoded_content)
return size
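As a quick illustration of the byte-size helper above (not part of the diff), multi-byte characters make the UTF-8 byte count differ from the character count:

```python
# Hypothetical check of calculate_content_size-style sizing: UTF-8 bytes, not characters.
text = "héllo"
assert len(text) == 5                  # 5 characters
assert len(text.encode("utf-8")) == 6  # "é" encodes to 2 bytes
```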

def _send_discover(self) -> None:
"""Send app discovery payload to pebblo-server. Internal method."""
pebblo_resp = None
headers = {
"Accept": "application/json",
"Content-Type": "application/json",
}
payload = self.app.dict(exclude_unset=True)
# Raw discover payload to be sent to classifier
if self.classifier_location == "local":
app_discover_url = f"{self.classifier_url}{APP_DISCOVER_URL}"
try:
pebblo_resp = requests.post(
app_discover_url, headers=headers, json=payload, timeout=20
)
logger.debug(
"send_discover[local]: request url %s, body %s len %s\
response status %s body %s",
pebblo_resp.request.url,
str(pebblo_resp.request.body),
str(
len(
pebblo_resp.request.body if pebblo_resp.request.body else []
)
),
str(pebblo_resp.status_code),
pebblo_resp.json(),
)
if pebblo_resp.status_code in [HTTPStatus.OK, HTTPStatus.BAD_GATEWAY]:
PebbloSafeLoader.set_discover_sent()
else:
logger.warning(
f"Received unexpected HTTP response code:\
{pebblo_resp.status_code}"
)
except requests.exceptions.RequestException:
logger.warning("Unable to reach pebblo server.")
except Exception as e:
logger.warning("An Exception caught in _send_discover: local %s", e)

if self.api_key:
try:
headers.update({"x-api-key": self.api_key})
# If the pebblo_resp is None,
# then the pebblo server version is not available
if pebblo_resp:
pebblo_server_version = json.loads(pebblo_resp.text).get(
"pebblo_server_version"
)
payload.update({"pebblo_server_version": pebblo_server_version})

payload.update({"pebblo_client_version": PLUGIN_VERSION})
pebblo_cloud_url = f"{PEBBLO_CLOUD_URL}{APP_DISCOVER_URL}"
pebblo_cloud_response = requests.post(
pebblo_cloud_url, headers=headers, json=payload, timeout=20
)

logger.debug(
"send_discover[cloud]: request url %s, body %s len %s\
response status %s body %s",
pebblo_cloud_response.request.url,
str(pebblo_cloud_response.request.body),
str(
len(
pebblo_cloud_response.request.body
if pebblo_cloud_response.request.body
else []
)
),
str(pebblo_cloud_response.status_code),
pebblo_cloud_response.json(),
)
except requests.exceptions.RequestException:
logger.warning("Unable to reach Pebblo cloud server.")
except Exception as e:
logger.warning("An Exception caught in _send_discover: cloud %s", e)

def _get_app_details(self) -> App:
"""Fetch app details. Internal method.

@@ -434,49 +178,6 @@ class PebbloSafeLoader(BaseLoader):
)
return app

@staticmethod
def get_file_owner_from_path(file_path: str) -> str:
"""Fetch owner of local file path.

Args:
file_path (str): Local file path.

Returns:
str: Name of owner.
"""
try:
import pwd

file_owner_uid = os.stat(file_path).st_uid
file_owner_name = pwd.getpwuid(file_owner_uid).pw_name
except Exception:
file_owner_name = "unknown"
return file_owner_name

def get_source_size(self, source_path: str) -> int:
"""Fetch size of source path. Source can be a directory or a file.

Args:
source_path (str): Local path of data source.

Returns:
int: Source size in bytes.
"""
if not source_path:
return 0
size = 0
if os.path.isfile(source_path):
size = os.path.getsize(source_path)
elif os.path.isdir(source_path):
total_size = 0
for dirpath, _, filenames in os.walk(source_path):
for f in filenames:
fp = os.path.join(dirpath, f)
if not os.path.islink(fp):
total_size += os.path.getsize(fp)
size = total_size
return size

def _index_docs(self) -> List[IndexedDocument]:
"""
Indexes the documents and returns a list of IndexedDocument objects.

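The methods above assume the loader was constructed with an app name, an owner, and an optional Pebblo Cloud API key. A minimal usage sketch follows; the constructor argument names are assumptions for illustration and are not taken from this diff:

```python
# Hypothetical wiring: wrap an existing loader so loaded documents are classified by Pebblo.
from langchain_community.document_loaders import CSVLoader, PebbloSafeLoader

loader = PebbloSafeLoader(
    CSVLoader("./data/corp_sensitive.csv"),  # any supported underlying loader
    name="acme-rag-app",                     # assumed app-name parameter
    owner="Data Platform Team",              # assumed owner parameter
    api_key=None,                            # omit to keep classification local
)
docs = loader.load()  # documents are posted to the local Pebblo classifier as they load
```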
@@ -1,25 +1,29 @@
from __future__ import annotations

import json
import logging
import os
import pathlib
import platform
from typing import List, Optional, Tuple
from enum import Enum
from http import HTTPStatus
from typing import Any, Dict, List, Optional, Tuple

from langchain_core.documents import Document
from langchain_core.env import get_runtime_environment
from langchain_core.pydantic_v1 import BaseModel
from langchain_core.utils import get_from_dict_or_env
from requests import Response, request
from requests.exceptions import RequestException

from langchain_community.document_loaders.base import BaseLoader

logger = logging.getLogger(__name__)

PLUGIN_VERSION = "0.1.1"
CLASSIFIER_URL = os.getenv("PEBBLO_CLASSIFIER_URL", "http://localhost:8000")
PEBBLO_CLOUD_URL = os.getenv("PEBBLO_CLOUD_URL", "https://api.daxa.ai")

LOADER_DOC_URL = "/v1/loader/doc"
APP_DISCOVER_URL = "/v1/app/discover"
_DEFAULT_CLASSIFIER_URL = "http://localhost:8000"
_DEFAULT_PEBBLO_CLOUD_URL = "https://api.daxa.ai"
BATCH_SIZE_BYTES = 100 * 1024  # 100 KB

# Supported loaders for Pebblo safe data loading
@@ -59,9 +63,15 @@ LOADER_TYPE_MAPPING = {
"cloud-folder": cloud_folder,
}

SUPPORTED_LOADERS = (*file_loader, *dir_loader, *in_memory)

logger = logging.getLogger(__name__)
class Routes(str, Enum):
"""Routes available for the Pebblo API as enumerator."""

loader_doc = "/v1/loader/doc"
loader_app_discover = "/v1/app/discover"
retrieval_app_discover = "/v1/app/discover"
prompt = "/v1/prompt"
prompt_governance = "/v1/prompt/governance"


class IndexedDocument(Document):
@@ -342,3 +352,386 @@ def generate_size_based_batches(
batches.append(current_batch)

return batches


def get_file_owner_from_path(file_path: str) -> str:
"""Fetch owner of local file path.

Args:
file_path (str): Local file path.

Returns:
str: Name of owner.
"""
try:
import pwd

file_owner_uid = os.stat(file_path).st_uid
file_owner_name = pwd.getpwuid(file_owner_uid).pw_name
except Exception:
file_owner_name = "unknown"
return file_owner_name


def get_source_size(source_path: str) -> int:
"""Fetch size of source path. Source can be a directory or a file.

Args:
source_path (str): Local path of data source.

Returns:
int: Source size in bytes.
"""
if not source_path:
return 0
size = 0
if os.path.isfile(source_path):
size = os.path.getsize(source_path)
elif os.path.isdir(source_path):
total_size = 0
for dirpath, _, filenames in os.walk(source_path):
for f in filenames:
fp = os.path.join(dirpath, f)
if not os.path.islink(fp):
total_size += os.path.getsize(fp)
size = total_size
return size


def calculate_content_size(data: str) -> int:
"""Calculate the content size in bytes:
- Encode the string to bytes using a specific encoding (e.g., UTF-8)
- Get the length of the encoded bytes.

Args:
data (str): Data string.

Returns:
int: Size of string in bytes.
"""
encoded_content = data.encode("utf-8")
size = len(encoded_content)
return size


class PebbloLoaderAPIWrapper(BaseModel):
"""Wrapper for Pebblo Loader API."""

api_key: Optional[str]  # Use SecretStr
"""API key for Pebblo Cloud"""
classifier_location: str = "local"
"""Location of the classifier, local or cloud. Defaults to 'local'"""
classifier_url: Optional[str]
"""URL of the Pebblo Classifier"""
cloud_url: Optional[str]
"""URL of the Pebblo Cloud"""

def __init__(self, **kwargs: Any):
"""Validate that api key in environment."""
kwargs["api_key"] = get_from_dict_or_env(
kwargs, "api_key", "PEBBLO_API_KEY", ""
)
kwargs["classifier_url"] = get_from_dict_or_env(
kwargs, "classifier_url", "PEBBLO_CLASSIFIER_URL", _DEFAULT_CLASSIFIER_URL
)
kwargs["cloud_url"] = get_from_dict_or_env(
kwargs, "cloud_url", "PEBBLO_CLOUD_URL", _DEFAULT_PEBBLO_CLOUD_URL
)
super().__init__(**kwargs)

def send_loader_discover(self, app: App) -> None:
"""
Send app discovery request to Pebblo server & cloud.

Args:
app (App): App instance to be discovered.
"""
pebblo_resp = None
payload = app.dict(exclude_unset=True)

if self.classifier_location == "local":
# Send app details to local classifier
headers = self._make_headers()
app_discover_url = f"{self.classifier_url}{Routes.loader_app_discover}"
pebblo_resp = self.make_request("POST", app_discover_url, headers, payload)

if self.api_key:
# Send app details to Pebblo cloud if api_key is present
headers = self._make_headers(cloud_request=True)
if pebblo_resp:
pebblo_server_version = json.loads(pebblo_resp.text).get(
"pebblo_server_version"
)
payload.update({"pebblo_server_version": pebblo_server_version})

payload.update({"pebblo_client_version": PLUGIN_VERSION})
pebblo_cloud_url = f"{self.cloud_url}{Routes.loader_app_discover}"
_ = self.make_request("POST", pebblo_cloud_url, headers, payload)

def classify_documents(
self,
docs_with_id: List[IndexedDocument],
app: App,
loader_details: dict,
loading_end: bool = False,
) -> dict:
"""
Send documents to Pebblo server for classification.
Then send classified documents to Daxa cloud(If api_key is present).

Args:
docs_with_id (List[IndexedDocument]): List of documents to be classified.
app (App): App instance.
loader_details (dict): Loader details.
loading_end (bool): Boolean, indicating the halt of data loading by loader.
"""
source_path = loader_details.get("source_path", "")
source_owner = get_file_owner_from_path(source_path)
# Prepare docs for classification
docs, source_aggregate_size = self.prepare_docs_for_classification(
docs_with_id, source_path
)
# Build payload for classification
payload = self.build_classification_payload(
app, docs, loader_details, source_owner, source_aggregate_size, loading_end
)

classified_docs = {}
if self.classifier_location == "local":
# Send docs to local classifier
headers = self._make_headers()
load_doc_url = f"{self.classifier_url}{Routes.loader_doc}"
try:
pebblo_resp = self.make_request(
"POST", load_doc_url, headers, payload, 300
)

if pebblo_resp:
# Updating structure of pebblo response docs for efficient searching
for classified_doc in json.loads(pebblo_resp.text).get("docs", []):
classified_docs.update(
{classified_doc["pb_id"]: classified_doc}
)
except Exception as e:
logger.warning("An Exception caught in classify_documents: local %s", e)

if self.api_key:
# Send docs to Pebblo cloud if api_key is present
if self.classifier_location == "local":
# If local classifier is used add the classified information
# and remove doc content
self.update_doc_data(payload["docs"], classified_docs)
self.send_docs_to_pebblo_cloud(payload)
elif self.classifier_location == "pebblo-cloud":
logger.warning("API key is missing for sending docs to Pebblo cloud.")
raise NameError("API key is missing for sending docs to Pebblo cloud.")

return classified_docs

def send_docs_to_pebblo_cloud(self, payload: dict) -> None:
"""
Send documents to Pebblo cloud.

Args:
payload (dict): The payload containing documents to be sent.
"""
headers = self._make_headers(cloud_request=True)
pebblo_cloud_url = f"{self.cloud_url}{Routes.loader_doc}"
try:
_ = self.make_request("POST", pebblo_cloud_url, headers, payload)
except Exception as e:
logger.warning("An Exception caught in classify_documents: cloud %s", e)

def _make_headers(self, cloud_request: bool = False) -> dict:
"""
Generate headers for the request.

args:
cloud_request (bool): flag indicating whether the request is for Pebblo
cloud.
returns:
dict: Headers for the request.

"""
headers = {
"Accept": "application/json",
"Content-Type": "application/json",
}
if cloud_request:
# Add API key for Pebblo cloud request
if self.api_key:
headers.update({"x-api-key": self.api_key})
else:
logger.warning("API key is missing for Pebblo cloud request.")
return headers

def build_classification_payload(
self,
app: App,
docs: List[dict],
loader_details: dict,
source_owner: str,
source_aggregate_size: int,
loading_end: bool,
) -> dict:
"""
Build the payload for document classification.

Args:
app (App): App instance.
docs (List[dict]): List of documents to be classified.
loader_details (dict): Loader details.
source_owner (str): Owner of the source.
source_aggregate_size (int): Aggregate size of the source.
loading_end (bool): Boolean indicating the halt of data loading by loader.

Returns:
dict: Payload for document classification.
"""
payload: Dict[str, Any] = {
"name": app.name,
"owner": app.owner,
"docs": docs,
"plugin_version": PLUGIN_VERSION,
"load_id": app.load_id,
"loader_details": loader_details,
"loading_end": "false",
"source_owner": source_owner,
"classifier_location": self.classifier_location,
}
if loading_end is True:
payload["loading_end"] = "true"
if "loader_details" in payload:
payload["loader_details"]["source_aggregate_size"] = (
source_aggregate_size
)
payload = Doc(**payload).dict(exclude_unset=True)
return payload

@staticmethod
def make_request(
method: str,
url: str,
headers: dict,
payload: Optional[dict] = None,
timeout: int = 20,
) -> Optional[Response]:
"""
Make a request to the Pebblo API

Args:
method (str): HTTP method (GET, POST, PUT, DELETE, etc.).
url (str): URL for the request.
headers (dict): Headers for the request.
payload (Optional[dict]): Payload for the request (for POST, PUT, etc.).
timeout (int): Timeout for the request in seconds.

Returns:
Optional[Response]: Response object if the request is successful.
"""
try:
response = request(
method=method, url=url, headers=headers, json=payload, timeout=timeout
)
logger.debug(
"Request: method %s, url %s, len %s response status %s",
method,
response.request.url,
str(len(response.request.body if response.request.body else [])),
str(response.status_code),
)

if response.status_code >= HTTPStatus.INTERNAL_SERVER_ERROR:
logger.warning(f"Pebblo Server: Error {response.status_code}")
elif response.status_code >= HTTPStatus.BAD_REQUEST:
logger.warning(f"Pebblo received an invalid payload: {response.text}")
elif response.status_code != HTTPStatus.OK:
logger.warning(
f"Pebblo returned an unexpected response code: "
f"{response.status_code}"
)

return response
except RequestException:
logger.warning("Unable to reach server %s", url)
except Exception as e:
logger.warning("An Exception caught in make_request: %s", e)
return None

@staticmethod
def prepare_docs_for_classification(
docs_with_id: List[IndexedDocument], source_path: str
) -> Tuple[List[dict], int]:
"""
Prepare documents for classification.

Args:
docs_with_id (List[IndexedDocument]): List of documents to be classified.
source_path (str): Source path of the documents.

Returns:
Tuple[List[dict], int]: Documents and the aggregate size of the source.
"""
docs = []
source_aggregate_size = 0
doc_content = [doc.dict() for doc in docs_with_id]
for doc in doc_content:
doc_metadata = doc.get("metadata", {})
doc_authorized_identities = doc_metadata.get("authorized_identities", [])
doc_source_path = get_full_path(
doc_metadata.get(
"full_path",
doc_metadata.get("source", source_path),
)
)
doc_source_owner = doc_metadata.get(
"owner", get_file_owner_from_path(doc_source_path)
)
doc_source_size = doc_metadata.get("size", get_source_size(doc_source_path))
page_content = str(doc.get("page_content"))
page_content_size = calculate_content_size(page_content)
source_aggregate_size += page_content_size
doc_id = doc.get("pb_id", None) or 0
docs.append(
{
"doc": page_content,
"source_path": doc_source_path,
"pb_id": doc_id,
"last_modified": doc.get("metadata", {}).get("last_modified"),
"file_owner": doc_source_owner,
**(
{"authorized_identities": doc_authorized_identities}
if doc_authorized_identities
else {}
),
**(
{"source_path_size": doc_source_size}
if doc_source_size is not None
else {}
),
}
)
return docs, source_aggregate_size

@staticmethod
def update_doc_data(docs: List[dict], classified_docs: dict) -> None:
"""
Update the document data with classified information.

Args:
docs (List[dict]): List of document data to be updated.
classified_docs (dict): The dictionary containing classified documents.
"""
for doc_data in docs:
classified_data = classified_docs.get(doc_data["pb_id"], {})
# Update the document data with classified information
doc_data.update(
{
"pb_checksum": classified_data.get("pb_checksum"),
"loader_source_path": classified_data.get("loader_source_path"),
"entities": classified_data.get("entities", {}),
"topics": classified_data.get("topics", {}),
}
)
# Remove the document content
doc_data.pop("doc")
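Read end to end, the wrapper above centralizes all Pebblo HTTP traffic for a loader. A minimal sketch of how a loader might drive it; the `app`, `indexed_docs`, and `loader_details` values are assumed placeholders, not taken from this diff:

```python
# Hypothetical driver for PebbloLoaderAPIWrapper; the app/loader_details shapes are assumptions.
pb_client = PebbloLoaderAPIWrapper(classifier_location="local")  # api_key/URLs may come from env vars

pb_client.send_loader_discover(app)           # one-time app discovery call
classified = pb_client.classify_documents(
    docs_with_id=indexed_docs,                # List[IndexedDocument] built by the loader
    app=app,
    loader_details={"source_path": "./data"},
    loading_end=True,                         # final batch: report the aggregate source size
)
```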
@@ -587,7 +587,11 @@ class Neo4jVector(VectorStore):
pass

def query(
self, query: str, *, params: Optional[dict] = None
self,
query: str,
*,
params: Optional[dict] = None,
retry_on_session_expired: bool = True,
) -> List[Dict[str, Any]]:
"""
This method sends a Cypher query to the connected Neo4j database
@@ -600,7 +604,7 @@ class Neo4jVector(VectorStore):
Returns:
List[Dict[str, Any]]: List of dictionaries containing the query results.
"""
from neo4j.exceptions import CypherSyntaxError
from neo4j.exceptions import CypherSyntaxError, SessionExpired

params = params or {}
with self._driver.session(database=self._database) as session:
@@ -609,6 +613,15 @@ class Neo4jVector(VectorStore):
return [r.data() for r in data]
except CypherSyntaxError as e:
raise ValueError(f"Cypher Statement is not valid\n{e}")
except (
SessionExpired
) as e:  # Session expired is a transient error that can be retried
if retry_on_session_expired:
return self.query(
query, params=params, retry_on_session_expired=False
)
else:
raise e

def verify_version(self) -> None:
"""

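The Neo4j change above retries the query exactly once when the session has expired, then re-raises. The same one-shot retry pattern, sketched generically (names are illustrative, not from this diff):

```python
# Generic one-shot retry for a transient error, mirroring the retry_on_session_expired flag above.
def run_with_retry(fn, is_transient, retry: bool = True):
    try:
        return fn()
    except Exception as e:
        if retry and is_transient(e):
            return run_with_retry(fn, is_transient, retry=False)  # retry once, then give up
        raise
```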
@@ -144,4 +144,5 @@ def test_pebblo_safe_loader_api_key() -> None:
)

# Assert
assert loader.api_key == api_key
assert loader.pb_client.api_key == api_key
assert loader.pb_client.classifier_location == "local"

@@ -1,3 +1,4 @@
|
||||
import json
|
||||
from typing import Dict, List, Type
|
||||
|
||||
import pytest
|
||||
@@ -12,6 +13,7 @@ from langchain_core.messages import (
|
||||
ToolMessage,
|
||||
)
|
||||
from langchain_core.messages.utils import (
|
||||
convert_to_messages,
|
||||
filter_messages,
|
||||
merge_message_runs,
|
||||
trim_messages,
|
||||
@@ -357,3 +359,176 @@ def dummy_token_counter(messages: List[BaseMessage]) -> int:
|
||||
class FakeTokenCountingModel(FakeChatModel):
|
||||
def get_num_tokens_from_messages(self, messages: List[BaseMessage]) -> int:
|
||||
return dummy_token_counter(messages)
|
||||
|
||||
|
||||
def test_convert_to_messages() -> None:
|
||||
message_like: List = [
|
||||
# BaseMessage
|
||||
SystemMessage("1"),
|
||||
HumanMessage([{"type": "image_url", "image_url": {"url": "2.1"}}], name="2.2"),
|
||||
AIMessage(
|
||||
[
|
||||
{"type": "text", "text": "3.1"},
|
||||
{
|
||||
"type": "tool_use",
|
||||
"id": "3.2",
|
||||
"name": "3.3",
|
||||
"input": {"3.4": "3.5"},
|
||||
},
|
||||
]
|
||||
),
|
||||
AIMessage(
|
||||
[
|
||||
{"type": "text", "text": "4.1"},
|
||||
{
|
||||
"type": "tool_use",
|
||||
"id": "4.2",
|
||||
"name": "4.3",
|
||||
"input": {"4.4": "4.5"},
|
||||
},
|
||||
],
|
||||
tool_calls=[
|
||||
{
|
||||
"name": "4.3",
|
||||
"args": {"4.4": "4.5"},
|
||||
"id": "4.2",
|
||||
"type": "tool_call",
|
||||
}
|
||||
],
|
||||
),
|
||||
ToolMessage("5.1", tool_call_id="5.2", name="5.3"),
|
||||
# OpenAI dict
|
||||
{"role": "system", "content": "6"},
|
||||
{
|
||||
"role": "user",
|
||||
"content": [{"type": "image_url", "image_url": {"url": "7.1"}}],
|
||||
"name": "7.2",
|
||||
},
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": [{"type": "text", "text": "8.1"}],
|
||||
"tool_calls": [
|
||||
{
|
||||
"type": "function",
|
||||
"function": {
|
||||
"arguments": json.dumps({"8.2": "8.3"}),
|
||||
"name": "8.4",
|
||||
},
|
||||
"id": "8.5",
|
||||
}
|
||||
],
|
||||
"name": "8.6",
|
||||
},
|
||||
{"role": "tool", "content": "10.1", "tool_call_id": "10.2"},
|
||||
# Tuple/List
|
||||
("system", "11.1"),
|
||||
("human", [{"type": "image_url", "image_url": {"url": "12.1"}}]),
|
||||
(
|
||||
"ai",
|
||||
[
|
||||
{"type": "text", "text": "13.1"},
|
||||
{
|
||||
"type": "tool_use",
|
||||
"id": "13.2",
|
||||
"name": "13.3",
|
||||
"input": {"13.4": "13.5"},
|
||||
},
|
||||
],
|
||||
),
|
||||
# String
|
||||
"14.1",
|
||||
# LangChain dict
|
||||
{
|
||||
"role": "ai",
|
||||
"content": [{"type": "text", "text": "15.1"}],
|
||||
"tool_calls": [{"args": {"15.2": "15.3"}, "name": "15.4", "id": "15.5"}],
|
||||
"name": "15.6",
|
||||
},
|
||||
]
|
||||
expected = [
|
||||
SystemMessage(content="1"),
|
||||
HumanMessage(
|
||||
content=[{"type": "image_url", "image_url": {"url": "2.1"}}], name="2.2"
|
||||
),
|
||||
AIMessage(
|
||||
content=[
|
||||
{"type": "text", "text": "3.1"},
|
||||
{
|
||||
"type": "tool_use",
|
||||
"id": "3.2",
|
||||
"name": "3.3",
|
||||
"input": {"3.4": "3.5"},
|
||||
},
|
||||
]
|
||||
),
|
||||
AIMessage(
|
||||
content=[
|
||||
{"type": "text", "text": "4.1"},
|
||||
{
|
||||
"type": "tool_use",
|
||||
"id": "4.2",
|
||||
"name": "4.3",
|
||||
"input": {"4.4": "4.5"},
|
||||
},
|
||||
],
|
||||
tool_calls=[
|
||||
{
|
||||
"name": "4.3",
|
||||
"args": {"4.4": "4.5"},
|
||||
"id": "4.2",
|
||||
"type": "tool_call",
|
||||
}
|
||||
],
|
||||
),
|
||||
ToolMessage(content="5.1", name="5.3", tool_call_id="5.2"),
|
||||
SystemMessage(content="6"),
|
||||
HumanMessage(
|
||||
content=[{"type": "image_url", "image_url": {"url": "7.1"}}], name="7.2"
|
||||
),
|
||||
AIMessage(
|
||||
content=[{"type": "text", "text": "8.1"}],
|
||||
name="8.6",
|
||||
tool_calls=[
|
||||
{
|
||||
"name": "8.4",
|
||||
"args": {"8.2": "8.3"},
|
||||
"id": "8.5",
|
||||
"type": "tool_call",
|
||||
}
|
||||
],
|
||||
),
|
||||
ToolMessage(content="10.1", tool_call_id="10.2"),
|
||||
SystemMessage(content="11.1"),
|
||||
HumanMessage(content=[{"type": "image_url", "image_url": {"url": "12.1"}}]),
|
||||
AIMessage(
|
||||
content=[
|
||||
{"type": "text", "text": "13.1"},
|
||||
{
|
||||
"type": "tool_use",
|
||||
"id": "13.2",
|
||||
"name": "13.3",
|
||||
"input": {"13.4": "13.5"},
|
||||
},
|
||||
]
|
||||
),
|
||||
HumanMessage(content="14.1"),
|
||||
AIMessage(
|
||||
content=[{"type": "text", "text": "15.1"}],
|
||||
name="15.6",
|
||||
tool_calls=[
|
||||
{
|
||||
"name": "15.4",
|
||||
"args": {"15.2": "15.3"},
|
||||
"id": "15.5",
|
||||
"type": "tool_call",
|
||||
}
|
||||
],
|
||||
),
|
||||
]
|
||||
actual = convert_to_messages(message_like)
|
||||
assert expected == actual
|
||||
|
||||
|
||||
@pytest.mark.xfail(reason="AI message does not support refusal key yet.")
|
||||
def test_convert_to_messages_openai_refusal() -> None:
|
||||
convert_to_messages([{"role": "assistant", "refusal": "9.1"}])
|
||||
|
||||
@@ -1,7 +1,8 @@
from importlib import metadata

from langchain_box.document_loaders import BoxLoader
from langchain_box.utilities import BoxAPIWrapper, BoxAuth, BoxAuthType
from langchain_box.retrievers import BoxRetriever
from langchain_box.utilities import BoxAuth, BoxAuthType, _BoxAPIWrapper

try:
__version__ = metadata.version(__package__)
@@ -12,8 +13,9 @@ del metadata  # optional, avoids polluting the results of dir(__package__)

__all__ = [
"BoxLoader",
"BoxRetriever",
"BoxAuth",
"BoxAuthType",
"BoxAPIWrapper",
"_BoxAPIWrapper",
"__version__",
]

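After this change the package exports `BoxRetriever` while the API wrapper becomes private (`_BoxAPIWrapper`). A quick check of the intended public imports, for illustration only:

```python
# Illustrative imports against the new __all__; _BoxAPIWrapper is internal and rarely needed directly.
from langchain_box import BoxLoader, BoxRetriever, BoxAuth, BoxAuthType
```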
@@ -3,14 +3,14 @@ from typing import Any, Dict, Iterator, List, Optional
|
||||
from box_sdk_gen import FileBaseTypeField # type: ignore
|
||||
from langchain_core.document_loaders.base import BaseLoader
|
||||
from langchain_core.documents import Document
|
||||
from langchain_core.pydantic_v1 import BaseModel, ConfigDict, root_validator
|
||||
from langchain_core.pydantic_v1 import BaseModel, root_validator
|
||||
from langchain_core.utils import get_from_dict_or_env
|
||||
|
||||
from langchain_box.utilities import BoxAPIWrapper, BoxAuth
|
||||
from langchain_box.utilities import BoxAuth, _BoxAPIWrapper
|
||||
|
||||
|
||||
class BoxLoader(BaseLoader, BaseModel):
|
||||
"""
|
||||
BoxLoader
|
||||
"""BoxLoader.
|
||||
|
||||
This class will help you load files from your Box instance. You must have a
|
||||
Box account. If you need one, you can sign up for a free developer account.
|
||||
@@ -33,18 +33,18 @@ class BoxLoader(BaseLoader, BaseModel):
|
||||
pip install -U langchain-box
|
||||
export BOX_DEVELOPER_TOKEN="your-api-key"
|
||||
|
||||
|
||||
This loader returns ``Document `` objects built from text representations of files
|
||||
in Box. It will skip any document without a text representation available. You can
|
||||
provide either a ``List[str]`` containing Box file IDS, or you can provide a
|
||||
``str`` contining a Box folder ID. If providing a folder ID, you can also enable
|
||||
recursive mode to get the full tree under that folder.
|
||||
|
||||
:::info
|
||||
.. note::
|
||||
A Box instance can contain Petabytes of files, and folders can contain millions
|
||||
of files. Be intentional when choosing what folders you choose to index. And we
|
||||
recommend never getting all files from folder 0 recursively. Folder ID 0 is your
|
||||
root folder.
|
||||
:::
|
||||
|
||||
Instantiate:
|
||||
|
||||
@@ -121,32 +121,36 @@ class BoxLoader(BaseLoader, BaseModel):
|
||||
Terrarium: $120\nTotal: $920')
|
||||
"""
|
||||
|
||||
model_config = ConfigDict(use_enum_values=True)
|
||||
|
||||
"""String containing the Box Developer Token generated in the developer console"""
|
||||
box_developer_token: Optional[str] = None
|
||||
"""Configured langchain_box.utilities.BoxAuth object"""
|
||||
"""String containing the Box Developer Token generated in the developer console"""
|
||||
|
||||
box_auth: Optional[BoxAuth] = None
|
||||
"""List[str] containing Box file ids"""
|
||||
"""Configured langchain_box.utilities.BoxAuth object"""
|
||||
|
||||
box_file_ids: Optional[List[str]] = None
|
||||
"""String containing box folder id to load files from"""
|
||||
"""List[str] containing Box file ids"""
|
||||
|
||||
box_folder_id: Optional[str] = None
|
||||
"""String containing box folder id to load files from"""
|
||||
|
||||
recursive: Optional[bool] = False
|
||||
"""If getting files by folder id, recursive is a bool to determine if you wish
|
||||
to traverse subfolders to return child documents. Default is False"""
|
||||
recursive: Optional[bool] = False
|
||||
|
||||
character_limit: Optional[int] = -1
|
||||
"""character_limit is an int that caps the number of characters to
|
||||
return per document."""
|
||||
character_limit: Optional[int] = -1
|
||||
|
||||
box: Optional[BoxAPIWrapper]
|
||||
_box: Optional[_BoxAPIWrapper]
|
||||
|
||||
class Config:
|
||||
arbitrary_types_allowed = True
|
||||
extra = "allow"
|
||||
use_enum_values = True
|
||||
|
||||
@root_validator(allow_reuse=True)
|
||||
def validate_box_loader_inputs(cls, values: Dict[str, Any]) -> Dict[str, Any]:
|
||||
box = None
|
||||
_box = None
|
||||
|
||||
"""Validate that has either box_file_ids or box_folder_id."""
|
||||
if not values.get("box_file_ids") and not values.get("box_folder_id"):
|
||||
@@ -159,19 +163,30 @@ class BoxLoader(BaseLoader, BaseModel):
|
||||
)
|
||||
|
||||
"""Validate that we have either a box_developer_token or box_auth."""
|
||||
if not values.get("box_auth") and not values.get("box_developer_token"):
|
||||
raise ValueError(
|
||||
"you must provide box_developer_token or a box_auth "
|
||||
"generated with langchain_box.utilities.BoxAuth"
|
||||
if not values.get("box_auth"):
|
||||
if not get_from_dict_or_env(
|
||||
values, "box_developer_token", "BOX_DEVELOPER_TOKEN"
|
||||
):
|
||||
raise ValueError(
|
||||
"you must provide box_developer_token or a box_auth "
|
||||
"generated with langchain_box.utilities.BoxAuth"
|
||||
)
|
||||
else:
|
||||
token = get_from_dict_or_env(
|
||||
values, "box_developer_token", "BOX_DEVELOPER_TOKEN"
|
||||
)
|
||||
|
||||
_box = _BoxAPIWrapper( # type: ignore[call-arg]
|
||||
box_developer_token=token,
|
||||
character_limit=values.get("character_limit"),
|
||||
)
|
||||
else:
|
||||
_box = _BoxAPIWrapper( # type: ignore[call-arg]
|
||||
box_auth=values.get("box_auth"),
|
||||
character_limit=values.get("character_limit"),
|
||||
)
|
||||
|
||||
box = BoxAPIWrapper( # type: ignore[call-arg]
|
||||
box_developer_token=values.get("box_developer_token"),
|
||||
box_auth=values.get("box_auth"),
|
||||
character_limit=values.get("character_limit"),
|
||||
)
|
||||
|
||||
values["box"] = box
|
||||
values["_box"] = _box
|
||||
|
||||
return values
|
||||
|
||||
@@ -181,7 +196,7 @@ class BoxLoader(BaseLoader, BaseModel):
|
||||
for file in folder_content:
|
||||
try:
|
||||
if file.type == FileBaseTypeField.FILE:
|
||||
doc = self.box.get_document_by_file_id(file.id)
|
||||
doc = self._box.get_document_by_file_id(file.id)
|
||||
|
||||
if doc is not None:
|
||||
yield doc
|
||||
@@ -199,7 +214,7 @@ class BoxLoader(BaseLoader, BaseModel):
|
||||
if self.box_file_ids:
|
||||
for file_id in self.box_file_ids:
|
||||
try:
|
||||
file = self.box.get_document_by_file_id(file_id) # type: ignore[union-attr]
|
||||
file = self._box.get_document_by_file_id(file_id) # type: ignore[union-attr]
|
||||
|
||||
if file is not None:
|
||||
yield file
|
||||
|
||||
5
libs/partners/box/langchain_box/retrievers/__init__.py
Normal file
5
libs/partners/box/langchain_box/retrievers/__init__.py
Normal file
@@ -0,0 +1,5 @@
|
||||
"""Box Document Loaders."""
|
||||
|
||||
from langchain_box.retrievers.box import BoxRetriever
|
||||
|
||||
__all__ = ["BoxRetriever"]
|
||||
158
libs/partners/box/langchain_box/retrievers/box.py
Normal file
158
libs/partners/box/langchain_box/retrievers/box.py
Normal file
@@ -0,0 +1,158 @@
|
||||
from typing import Any, Dict, List, Optional
|
||||
|
||||
from langchain_core.callbacks import CallbackManagerForRetrieverRun
|
||||
from langchain_core.documents import Document
|
||||
from langchain_core.pydantic_v1 import root_validator
|
||||
from langchain_core.retrievers import BaseRetriever
|
||||
|
||||
from langchain_box.utilities import BoxAuth, _BoxAPIWrapper
|
||||
|
||||
|
||||
class BoxRetriever(BaseRetriever):
|
||||
"""Box retriever.
|
||||
|
||||
`BoxRetriever` provides the ability to retrieve content from
|
||||
your Box instance in a couple of ways.
|
||||
|
||||
1. You can use the Box full-text search to retrieve the
|
||||
complete document(s) that match your search query, as
|
||||
`List[Document]`
|
||||
2. You can use the Box AI Platform API to retrieve the results
|
||||
from a Box AI prompt. This can be a `Document` containing
|
||||
the result of the prompt, or you can retrieve the citations
|
||||
used to generate the prompt to include in your vectorstore.
|
||||
|
||||
Setup:
|
||||
Install ``langchain-box``:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
pip install -U langchain-box
|
||||
|
||||
Instantiate:
|
||||
|
||||
To use search:
|
||||
.. code-block:: python
|
||||
|
||||
from langchain_box.retrievers import BoxRetriever
|
||||
|
||||
retriever = BoxRetriever()
|
||||
|
||||
To use Box AI:
|
||||
.. code-block:: python
|
||||
|
||||
from langchain_box.retrievers import BoxRetriever
|
||||
|
||||
file_ids=["12345","67890"]
|
||||
|
||||
retriever = BoxRetriever(file_ids)
|
||||
|
||||
|
||||
Usage:
|
||||
.. code-block:: python
|
||||
|
||||
retriever = BoxRetriever()
|
||||
retriever.invoke("victor")
|
||||
print(docs[0].page_content[:100])
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
[
|
||||
Document(
|
||||
metadata={
|
||||
'source': 'url',
|
||||
'title': 'FIVE_FEET_AND_RISING_by_Peter_Sollett_pdf'
|
||||
},
|
||||
page_content='\\n3/20/23, 5:31 PM F...'
|
||||
)
|
||||
]
|
||||
|
||||
Use within a chain:
|
||||
.. code-block:: python
|
||||
|
||||
from langchain_core.output_parsers import StrOutputParser
|
||||
from langchain_core.prompts import ChatPromptTemplate
|
||||
from langchain_core.runnables import RunnablePassthrough
|
||||
from langchain_openai import ChatOpenAI
|
||||
|
||||
retriever = BoxRetriever(box_developer_token=box_developer_token, character_limit=10000)
|
||||
|
||||
context="You are an actor reading scripts to learn about your role in an upcoming movie."
|
||||
question="describe the character Victor"
|
||||
|
||||
prompt = ChatPromptTemplate.from_template(
|
||||
\"""Answer the question based only on the context provided.
|
||||
|
||||
Context: {context}
|
||||
|
||||
Question: {question}\"""
|
||||
)
|
||||
|
||||
def format_docs(docs):
|
||||
return "\\n\\n".join(doc.page_content for doc in docs)
|
||||
|
||||
chain = (
|
||||
{"context": retriever | format_docs, "question": RunnablePassthrough()}
|
||||
| prompt
|
||||
| llm
|
||||
| StrOutputParser()
|
||||
)
|
||||
|
||||
chain.invoke("Victor") # search query to find files in Box
|
||||
)
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
'Victor is a skinny 12-year-old with sloppy hair who is seen
|
||||
sleeping on his fire escape in the sun. He is hesitant to go to
|
||||
the pool with his friend Carlos because he is afraid of getting
|
||||
in trouble for not letting his mother cut his hair. Ultimately,
|
||||
he decides to go to the pool with Carlos.'
|
||||
""" # noqa: E501
|
||||
|
||||
box_developer_token: Optional[str] = None
|
||||
"""String containing the Box Developer Token generated in the developer console"""
|
||||
|
||||
box_auth: Optional[BoxAuth] = None
|
||||
"""Configured langchain_box.utilities.BoxAuth object"""
|
||||
|
||||
box_file_ids: Optional[List[str]] = None
|
||||
"""List[str] containing Box file ids"""
|
||||
character_limit: Optional[int] = -1
|
||||
"""character_limit is an int that caps the number of characters to
|
||||
return per document."""
|
||||
|
||||
_box: Optional[_BoxAPIWrapper]
|
||||
|
||||
class Config:
|
||||
arbitrary_types_allowed = True
|
||||
extra = "allow"
|
||||
|
||||
@root_validator(allow_reuse=True)
|
||||
def validate_box_loader_inputs(cls, values: Dict[str, Any]) -> Dict[str, Any]:
|
||||
_box = None
|
||||
|
||||
"""Validate that we have either a box_developer_token or box_auth."""
|
||||
if not values.get("box_auth") and not values.get("box_developer_token"):
|
||||
raise ValueError(
|
||||
"you must provide box_developer_token or a box_auth "
|
||||
"generated with langchain_box.utilities.BoxAuth"
|
||||
)
|
||||
|
||||
_box = _BoxAPIWrapper( # type: ignore[call-arg]
|
||||
box_developer_token=values.get("box_developer_token"),
|
||||
box_auth=values.get("box_auth"),
|
||||
character_limit=values.get("character_limit"),
|
||||
)
|
||||
|
||||
values["_box"] = _box
|
||||
|
||||
return values
|
||||
|
||||
def _get_relevant_documents(
|
||||
self, query: str, *, run_manager: CallbackManagerForRetrieverRun
|
||||
) -> List[Document]:
|
||||
if self.box_file_ids: # If using Box AI
|
||||
return self._box.ask_box_ai(query=query, box_file_ids=self.box_file_ids) # type: ignore[union-attr]
|
||||
else: # If using Search
|
||||
return self._box.search_box(query=query) # type: ignore[union-attr]
|
||||
@@ -1,5 +1,5 @@
"""Box API Utilities."""

from langchain_box.utilities.box import BoxAPIWrapper, BoxAuth, BoxAuthType
from langchain_box.utilities.box import BoxAuth, BoxAuthType, _BoxAPIWrapper

__all__ = ["BoxAuth", "BoxAuthType", "BoxAPIWrapper"]
__all__ = ["BoxAuth", "BoxAuthType", "_BoxAPIWrapper"]

@@ -1,7 +1,7 @@
|
||||
"""Util that calls Box APIs."""
|
||||
|
||||
from enum import Enum
|
||||
from typing import Any, Dict, Optional
|
||||
from typing import Any, Dict, List, Optional
|
||||
|
||||
import box_sdk_gen # type: ignore
|
||||
import requests
|
||||
@@ -11,6 +11,13 @@ from langchain_core.utils import get_from_dict_or_env
|
||||
|
||||
|
||||
class DocumentFiles(Enum):
|
||||
"""DocumentFiles(Enum).
|
||||
|
||||
An enum containing all of the supported extensions for files
|
||||
Box considers Documents. These files should have text
|
||||
representations.
|
||||
"""
|
||||
|
||||
DOC = "doc"
|
||||
DOCX = "docx"
|
||||
GDOC = "gdoc"
|
||||
@@ -89,6 +96,12 @@ class DocumentFiles(Enum):
|
||||
|
||||
|
||||
class ImageFiles(Enum):
|
||||
"""ImageFiles(Enum).
|
||||
|
||||
An enum containing all of the supported extensions for files
|
||||
Box considers images.
|
||||
"""
|
||||
|
||||
ARW = "arw"
|
||||
BMP = "bmp"
|
||||
CR2 = "cr2"
|
||||
@@ -115,8 +128,9 @@ class ImageFiles(Enum):
|
||||
SVS = "svs"
|
||||
|
||||
|
||||
"""
|
||||
BoxAuthType
|
||||
class BoxAuthType(Enum):
|
||||
"""BoxAuthType(Enum).
|
||||
|
||||
an enum to tell BoxLoader how you wish to autheticate your Box connection.
|
||||
|
||||
Options are:
|
||||
@@ -128,22 +142,23 @@ class ImageFiles(Enum):
|
||||
and `box_enterprise_id` or optionally `box_user_id`.
|
||||
JWT - Use JWT for authentication. Config should be stored on the file
|
||||
system accessible to your app.
|
||||
provide `box_jwt_path`. Optionally, provide `box_user_id` to
|
||||
provide `box_jwt_path`. Optionally, provide `box_user_id` to
|
||||
act as a specific user
|
||||
"""
|
||||
|
||||
|
||||
class BoxAuthType(Enum):
|
||||
"""Use a developer token or a token retrieved from box-sdk-gen"""
|
||||
"""
|
||||
|
||||
TOKEN = "token"
|
||||
"""Use `client_credentials` type grant"""
|
||||
"""Use a developer token or a token retrieved from box-sdk-gen"""
|
||||
|
||||
CCG = "ccg"
|
||||
"""Use JWT bearer token auth"""
|
||||
"""Use `client_credentials` type grant"""
|
||||
|
||||
JWT = "jwt"
|
||||
"""Use JWT bearer token auth"""
|
||||
|
||||
|
||||
"""
|
||||
class BoxAuth(BaseModel):
|
||||
"""BoxAuth.
|
||||
|
||||
`BoxAuth` supports the following authentication methods:
|
||||
|
||||
* Token — either a developer token or any token generated through the Box SDK
|
||||
@@ -152,16 +167,15 @@ class BoxAuthType(Enum):
|
||||
* CCG with a service account
|
||||
* CCG with a specified user
|
||||
|
||||
:::note
|
||||
If using JWT authentication, you will need to download the configuration from the
|
||||
Box developer console after generating your public/private key pair. Place this
|
||||
file in your application directory structure somewhere. You will use the path to
|
||||
.. note::
|
||||
If using JWT authentication, you will need to download the configuration from the
|
||||
Box developer console after generating your public/private key pair. Place this
|
||||
file in your application directory structure somewhere. You will use the path to
|
||||
this file when using the `BoxAuth` helper class.
|
||||
:::
|
||||
|
||||
For more information, learn about how to
|
||||
For more information, learn about how to
|
||||
[set up a Box application](https://developer.box.com/guides/getting-started/first-application/),
|
||||
and check out the
|
||||
and check out the
|
||||
[Box authentication guide](https://developer.box.com/guides/authentication/select/)
|
||||
for more about our different authentication options.
|
||||
|
||||
@@ -169,7 +183,7 @@ class BoxAuthType(Enum):
|
||||
|
||||
To instantiate, you must provide a ``langchain_box.utilities.BoxAuthType``.
|
||||
|
||||
BoxAuthType is an enum to tell BoxLoader how you wish to autheticate your
|
||||
BoxAuthType is an enum to tell BoxLoader how you wish to autheticate your
|
||||
Box connection.
|
||||
|
||||
Options are:
|
||||
@@ -181,7 +195,7 @@ class BoxAuthType(Enum):
|
||||
and `box_enterprise_id` or optionally `box_user_id`.
|
||||
JWT - Use JWT for authentication. Config should be stored on the file
|
||||
system accessible to your app.
|
||||
provide `box_jwt_path`. Optionally, provide `box_user_id` to
|
||||
provide `box_jwt_path`. Optionally, provide `box_user_id` to
|
||||
act as a specific user
|
||||
|
||||
.. code-block:: python
|
||||
@@ -198,36 +212,40 @@ class BoxAuthType(Enum):
|
||||
...
|
||||
)
|
||||
|
||||
To see examples for each supported authentication methodology, visit the
|
||||
[Box providers](/docs/integrations/providers/box) page. If you want to
|
||||
use OAuth 2.0 `authorization_code` flow, use
|
||||
[box-sdk-gen](https://github.com/box/box-python-sdk-gen) SDK, get your
|
||||
To see examples for each supported authentication methodology, visit the
|
||||
[Box providers](/docs/integrations/providers/box) page. If you want to
|
||||
use OAuth 2.0 `authorization_code` flow, use
|
||||
[box-sdk-gen](https://github.com/box/box-python-sdk-gen) SDK, get your
|
||||
token, and use `BoxAuthType.TOKEN` type.
|
||||
"""
|
||||
|
||||
|
||||
class BoxAuth(BaseModel):
|
||||
"""Authentication type to use. Must pass BoxAuthType enum"""
|
||||
"""
|
||||
|
||||
auth_type: BoxAuthType
|
||||
""" If using BoxAuthType.TOKEN, provide your token here"""
|
||||
"""langchain_box.utilities.BoxAuthType. Enum describing how to
|
||||
authenticate against Box"""
|
||||
|
||||
box_developer_token: Optional[str] = None
|
||||
""" If using BoxAuthType.TOKEN, provide your token here"""
|
||||
|
||||
box_jwt_path: Optional[str] = None
|
||||
"""If using BoxAuthType.JWT, provide local path to your
|
||||
JWT configuration file"""
|
||||
box_jwt_path: Optional[str] = None
|
||||
"""If using BoxAuthType.CCG, provide your app's client ID"""
|
||||
|
||||
box_client_id: Optional[str] = None
|
||||
"""If using BoxAuthType.CCG, provide your app's client secret"""
|
||||
"""If using BoxAuthType.CCG, provide your app's client ID"""
|
||||
|
||||
box_client_secret: Optional[str] = None
|
||||
"""If using BoxAuthType.CCG, provide your app's client secret"""
|
||||
|
||||
box_enterprise_id: Optional[str] = None
|
||||
"""If using BoxAuthType.CCG, provide your enterprise ID.
|
||||
Only required if you are not sending `box_user_id`"""
|
||||
box_enterprise_id: Optional[str] = None
|
||||
|
||||
box_user_id: Optional[str] = None
|
||||
"""If using BoxAuthType.CCG or BoxAuthType.JWT, providing
|
||||
`box_user_id` will act on behalf of a specific user"""
|
||||
box_user_id: Optional[str] = None
|
||||
|
||||
box_client: Optional[box_sdk_gen.BoxClient] = None
|
||||
custom_header: Dict = dict({"x-box-ai-library": "langchain"})
|
||||
_box_client: Optional[box_sdk_gen.BoxClient] = None
|
||||
_custom_header: Dict = dict({"x-box-ai-library": "langchain"})
|
||||
|
||||
class Config:
|
||||
arbitrary_types_allowed = True
|
||||
@@ -276,16 +294,16 @@ class BoxAuth(BaseModel):
|
||||
|
||||
return values
|
||||
|
||||
def authorize(self) -> None:
|
||||
def _authorize(self) -> None:
|
||||
match self.auth_type:
|
||||
case "token":
|
||||
try:
|
||||
auth = box_sdk_gen.BoxDeveloperTokenAuth(
|
||||
token=self.box_developer_token
|
||||
)
|
||||
self.box_client = box_sdk_gen.BoxClient(
|
||||
self._box_client = box_sdk_gen.BoxClient(
|
||||
auth=auth
|
||||
).with_extra_headers(extra_headers=self.custom_header)
|
||||
).with_extra_headers(extra_headers=self._custom_header)
|
||||
|
||||
except box_sdk_gen.BoxSDKError as bse:
|
||||
raise RuntimeError(
|
||||
@@ -304,15 +322,15 @@ class BoxAuth(BaseModel):
|
||||
)
|
||||
auth = box_sdk_gen.BoxJWTAuth(config=jwt_config)
|
||||
|
||||
self.box_client = box_sdk_gen.BoxClient(
|
||||
self._box_client = box_sdk_gen.BoxClient(
|
||||
auth=auth
|
||||
).with_extra_headers(extra_headers=self.custom_header)
|
||||
).with_extra_headers(extra_headers=self._custom_header)
|
||||
|
||||
if self.box_user_id is not None:
|
||||
user_auth = auth.with_user_subject(self.box_user_id)
|
||||
self.box_client = box_sdk_gen.BoxClient(
|
||||
self._box_client = box_sdk_gen.BoxClient(
|
||||
auth=user_auth
|
||||
).with_extra_headers(extra_headers=self.custom_header)
|
||||
).with_extra_headers(extra_headers=self._custom_header)
|
||||
|
||||
except box_sdk_gen.BoxSDKError as bse:
|
||||
raise RuntimeError(
|
||||
@@ -340,9 +358,9 @@ class BoxAuth(BaseModel):
|
||||
)
|
||||
auth = box_sdk_gen.BoxCCGAuth(config=ccg_config)
|
||||
|
||||
self.box_client = box_sdk_gen.BoxClient(
|
||||
self._box_client = box_sdk_gen.BoxClient(
|
||||
auth=auth
|
||||
).with_extra_headers(extra_headers=self.custom_header)
|
||||
).with_extra_headers(extra_headers=self._custom_header)
|
||||
|
||||
except box_sdk_gen.BoxSDKError as bse:
|
||||
raise RuntimeError(
|
||||
@@ -363,25 +381,26 @@ class BoxAuth(BaseModel):
|
||||
|
||||
def get_client(self) -> box_sdk_gen.BoxClient:
|
||||
"""Instantiate the Box SDK."""
|
||||
if self.box_client is None:
|
||||
self.authorize()
|
||||
if self._box_client is None:
|
||||
self._authorize()
|
||||
|
||||
return self.box_client
|
||||
return self._box_client
|
||||
|
||||
|
||||
class BoxAPIWrapper(BaseModel):
|
||||
class _BoxAPIWrapper(BaseModel):
|
||||
"""Wrapper for Box API."""
|
||||
|
||||
"""String containing the Box Developer Token generated in the developer console"""
|
||||
box_developer_token: Optional[str] = None
|
||||
"""Configured langchain_box.utilities.BoxAuth object"""
|
||||
"""String containing the Box Developer Token generated in the developer console"""
|
||||
|
||||
box_auth: Optional[BoxAuth] = None
|
||||
"""Configured langchain_box.utilities.BoxAuth object"""
|
||||
|
||||
character_limit: Optional[int] = -1
|
||||
"""character_limit is an int that caps the number of characters to
|
||||
return per document."""
|
||||
character_limit: Optional[int] = -1
|
||||
|
||||
box: Optional[box_sdk_gen.BoxClient]
|
||||
file_count: int = 0
|
||||
_box: Optional[box_sdk_gen.BoxClient]
|
||||
|
||||
class Config:
|
||||
arbitrary_types_allowed = True
|
||||
@@ -390,7 +409,7 @@ class BoxAPIWrapper(BaseModel):
|
||||
|
||||
@root_validator(allow_reuse=True)
|
||||
def validate_box_api_inputs(cls, values: Dict[str, Any]) -> Dict[str, Any]:
|
||||
values["box"] = None
|
||||
values["_box"] = None
|
||||
|
||||
"""Validate that TOKEN auth type provides box_developer_token."""
|
||||
if not values.get("box_auth"):
|
||||
@@ -402,7 +421,7 @@ class BoxAPIWrapper(BaseModel):
|
||||
)
|
||||
else:
|
||||
box_auth = values.get("box_auth")
|
||||
values["box"] = box_auth.get_client() # type: ignore[union-attr]
|
||||
values["_box"] = box_auth.get_client() # type: ignore[union-attr]
|
||||
|
||||
return values
|
||||
|
||||
@@ -411,11 +430,11 @@ class BoxAPIWrapper(BaseModel):
|
||||
auth_type=BoxAuthType.TOKEN, box_developer_token=self.box_developer_token
|
||||
)
|
||||
|
||||
self.box = box_auth.get_client()
|
||||
self._box = box_auth.get_client()
|
||||
|
||||
def _do_request(self, url: str) -> Any:
|
||||
try:
|
||||
access_token = self.box.auth.retrieve_token().access_token # type: ignore[union-attr]
|
||||
access_token = self._box.auth.retrieve_token().access_token # type: ignore[union-attr]
|
||||
except box_sdk_gen.BoxSDKError as bse:
|
||||
raise RuntimeError(f"Error getting client from jwt token: {bse.message}")
|
||||
|
||||
@@ -423,38 +442,17 @@ class BoxAPIWrapper(BaseModel):
|
||||
resp.raise_for_status()
|
||||
return resp.content
|
||||
|
||||
def get_folder_items(self, folder_id: str) -> box_sdk_gen.Items:
|
||||
"""Get all the items in a folder. Accepts folder_id as str.
|
||||
returns box_sdk_gen.Items"""
|
||||
if self.box is None:
|
||||
self.get_box_client()
|
||||
|
||||
try:
|
||||
folder_contents = self.box.folders.get_folder_items( # type: ignore[union-attr]
|
||||
folder_id, fields=["id", "type", "name"]
|
||||
)
|
||||
except box_sdk_gen.BoxAPIError as bae:
|
||||
raise RuntimeError(
|
||||
f"BoxAPIError: Error getting folder content: {bae.message}"
|
||||
)
|
||||
except box_sdk_gen.BoxSDKError as bse:
|
||||
raise RuntimeError(
|
||||
f"BoxSDKError: Error getting folder content: {bse.message}"
|
||||
)
|
||||
|
||||
return folder_contents.entries
|
||||
|
||||
def get_text_representation(self, file_id: str = "") -> tuple[str, str, str]:
|
||||
def _get_text_representation(self, file_id: str = "") -> tuple[str, str, str]:
|
||||
try:
|
||||
from box_sdk_gen import BoxAPIError, BoxSDKError
|
||||
except ImportError:
|
||||
raise ImportError("You must run `pip install box-sdk-gen`")
|
||||
|
||||
if self.box is None:
|
||||
if self._box is None:
|
||||
self.get_box_client()
|
||||
|
||||
try:
|
||||
file = self.box.files.get_file_by_id( # type: ignore[union-attr]
|
||||
file = self._box.files.get_file_by_id( # type: ignore[union-attr]
|
||||
file_id,
|
||||
x_rep_hints="[extracted_text]",
|
||||
fields=["name", "representations", "type"],
|
||||
@@ -486,8 +484,10 @@ class BoxAPIWrapper(BaseModel):
|
||||
except requests.exceptions.HTTPError:
|
||||
return None, None, None # type: ignore[return-value]
|
||||
|
||||
if self.character_limit > 0: # type: ignore[operator]
|
||||
content = raw_content[0 : self.character_limit]
|
||||
if (
|
||||
self.character_limit is not None and self.character_limit > 0 # type: ignore[operator]
|
||||
):
|
||||
content = raw_content[0 : (self.character_limit - 1)]
|
||||
else:
|
||||
content = raw_content
|
||||
|
||||
@@ -499,16 +499,16 @@ class BoxAPIWrapper(BaseModel):
|
||||
"""Load a file from a Box id. Accepts file_id as str.
|
||||
Returns `Document`"""
|
||||
|
||||
if self.box is None:
|
||||
if self._box is None:
|
||||
self.get_box_client()
|
||||
|
||||
file = self.box.files.get_file_by_id( # type: ignore[union-attr]
|
||||
file = self._box.files.get_file_by_id( # type: ignore[union-attr]
|
||||
file_id, fields=["name", "type", "extension"]
|
||||
)
|
||||
|
||||
if file.type == "file":
|
||||
if hasattr(DocumentFiles, file.extension.upper()):
|
||||
file_name, content, url = self.get_text_representation(file_id=file_id)
|
||||
file_name, content, url = self._get_text_representation(file_id=file_id)
|
||||
|
||||
if file_name is None or content is None or url is None:
|
||||
return None
|
||||
@@ -523,3 +523,95 @@ class BoxAPIWrapper(BaseModel):
|
||||
return None
|
||||
|
||||
return None
|
||||
|
||||
    def get_folder_items(self, folder_id: str) -> box_sdk_gen.Items:
        """Get all the items in a folder. Accepts folder_id as str.
        returns box_sdk_gen.Items"""
        if self._box is None:
            self.get_box_client()

        try:
            folder_contents = self._box.folders.get_folder_items(  # type: ignore[union-attr]
                folder_id, fields=["id", "type", "name"]
            )
        except box_sdk_gen.BoxAPIError as bae:
            raise RuntimeError(
                f"BoxAPIError: Error getting folder content: {bae.message}"
            )
        except box_sdk_gen.BoxSDKError as bse:
            raise RuntimeError(
                f"BoxSDKError: Error getting folder content: {bse.message}"
            )

        return folder_contents.entries

    def search_box(self, query: str) -> List[Document]:
        if self._box is None:
            self.get_box_client()

        files = []

        try:
            results = self._box.search.search_for_content(  # type: ignore[union-attr]
                query=query, fields=["id", "type", "extension"]
            )

            if results.entries is None or len(results.entries) <= 0:
                return None  # type: ignore[return-value]

            for file in results.entries:
                if (
                    file is not None
                    and file.type == "file"
                    and hasattr(DocumentFiles, file.extension.upper())
                ):
                    doc = self.get_document_by_file_id(file.id)

                    if doc is not None:
                        files.append(doc)

            return files
        except box_sdk_gen.BoxAPIError as bae:
            raise RuntimeError(
                f"BoxAPIError: Error getting search results: {bae.message}"
            )
        except box_sdk_gen.BoxSDKError as bse:
            raise RuntimeError(
                f"BoxSDKError: Error getting search results: {bse.message}"
            )

    def ask_box_ai(self, query: str, box_file_ids: List[str]) -> List[Document]:
        if self._box is None:
            self.get_box_client()

        ai_mode = box_sdk_gen.CreateAiAskMode.SINGLE_ITEM_QA.value

        if len(box_file_ids) > 1:
            ai_mode = box_sdk_gen.CreateAiAskMode.MULTIPLE_ITEM_QA.value
        elif len(box_file_ids) <= 0:
            raise ValueError("BOX_AI_ASK requires at least one file ID")

        items = []

        for file_id in box_file_ids:
            item = box_sdk_gen.CreateAiAskItems(
                id=file_id, type=box_sdk_gen.CreateAiAskItemsTypeField.FILE.value
            )
            items.append(item)

        try:
            response = self._box.ai.create_ai_ask(ai_mode, query, items)  # type: ignore[union-attr]
        except box_sdk_gen.BoxAPIError as bae:
            raise RuntimeError(
                f"BoxAPIError: Error getting Box AI result: {bae.message}"
            )
        except box_sdk_gen.BoxSDKError as bse:
            raise RuntimeError(
                f"BoxSDKError: Error getting Box AI result: {bse.message}"
            )

        content = response.answer

        metadata = {"source": "Box AI", "title": f"Box AI {query}"}

        return [Document(page_content=content, metadata=metadata)]

@@ -1,42 +1,3 @@
|
||||
from langchain_core.documents import Document
|
||||
from pytest_mock import MockerFixture
|
||||
|
||||
from langchain_box.document_loaders import BoxLoader
|
||||
|
||||
|
||||
# test Document retrieval
|
||||
def test_file_load(mocker: MockerFixture) -> None:
|
||||
mocker.patch(
|
||||
"langchain_box.utilities.BoxAPIWrapper.get_document_by_file_id", return_value=[]
|
||||
)
|
||||
|
||||
loader = BoxLoader( # type: ignore[call-arg]
|
||||
box_developer_token="box_developer_token",
|
||||
box_file_ids=["box_file_ids"],
|
||||
)
|
||||
|
||||
documents = loader.load()
|
||||
assert documents
|
||||
|
||||
mocker.patch(
|
||||
"langchain_box.utilities.BoxAPIWrapper.get_document_by_file_id",
|
||||
return_value=(
|
||||
Document(
|
||||
page_content="Test file mode\ndocument contents",
|
||||
metadata={"title": "Testing Files"},
|
||||
)
|
||||
),
|
||||
)
|
||||
|
||||
loader = BoxLoader( # type: ignore[call-arg]
|
||||
box_developer_token="box_developer_token",
|
||||
box_file_ids=["box_file_ids"],
|
||||
)
|
||||
|
||||
documents = loader.load()
|
||||
assert documents == [
|
||||
Document(
|
||||
page_content="Test file mode\ndocument contents",
|
||||
metadata={"title": "Testing Files"},
|
||||
)
|
||||
]
|
||||
"""
|
||||
TODO: build live integration tests
|
||||
"""
|
||||
|
||||
@@ -0,0 +1,3 @@
|
||||
"""
|
||||
TODO: build live integration tests
|
||||
"""
|
||||
@@ -1,47 +1,3 @@
|
||||
from unittest.mock import Mock
|
||||
|
||||
import pytest
|
||||
from langchain_core.documents import Document
|
||||
from pytest_mock import MockerFixture
|
||||
|
||||
from langchain_box.utilities import BoxAPIWrapper
|
||||
|
||||
|
||||
@pytest.fixture()
|
||||
def mock_worker(mocker: MockerFixture) -> None:
|
||||
mocker.patch("langchain_box.utilities.BoxAuth.authorize", return_value=Mock())
|
||||
mocker.patch("langchain_box.utilities.BoxAuth.get_client", return_value=Mock())
|
||||
mocker.patch(
|
||||
"langchain_box.utilities.BoxAPIWrapper.get_text_representation",
|
||||
return_value=("filename", "content", "url"),
|
||||
)
|
||||
|
||||
|
||||
def test_get_documents_by_file_ids(mock_worker, mocker: MockerFixture) -> None: # type: ignore[no-untyped-def]
|
||||
mocker.patch(
|
||||
"langchain_box.utilities.BoxAPIWrapper.get_document_by_file_id",
|
||||
return_value=(
|
||||
Document(
|
||||
page_content="content", metadata={"source": "url", "title": "filename"}
|
||||
)
|
||||
),
|
||||
)
|
||||
|
||||
box = BoxAPIWrapper(box_developer_token="box_developer_token") # type: ignore[call-arg]
|
||||
|
||||
documents = box.get_document_by_file_id("box_file_id")
|
||||
assert documents == Document(
|
||||
page_content="content", metadata={"source": "url", "title": "filename"}
|
||||
)
|
||||
|
||||
|
||||
def test_get_documents_by_folder_id(mock_worker, mocker: MockerFixture) -> None: # type: ignore[no-untyped-def]
|
||||
mocker.patch(
|
||||
"langchain_box.utilities.BoxAPIWrapper.get_folder_items",
|
||||
return_value=([{"id": "file_id", "type": "file"}]),
|
||||
)
|
||||
|
||||
box = BoxAPIWrapper(box_developer_token="box_developer_token") # type: ignore[call-arg]
|
||||
|
||||
folder_contents = box.get_folder_items("box_folder_id")
|
||||
assert folder_contents == [{"id": "file_id", "type": "file"}]
|
||||
"""
|
||||
TODO: build live integration tests
|
||||
"""
|
||||
|
||||
@@ -1,4 +1,6 @@
|
||||
import pytest
|
||||
from langchain_core.documents import Document
|
||||
from pytest_mock import MockerFixture
|
||||
|
||||
from langchain_box.document_loaders import BoxLoader
|
||||
from langchain_box.utilities import BoxAuth, BoxAuthType
|
||||
@@ -56,3 +58,42 @@ def test_failed_initialization_files_and_folders() -> None:
|
||||
box_folder_id="box_folder_id",
|
||||
box_file_ids=["box_file_ids"],
|
||||
)
|
||||
|
||||
|
||||
# test Document retrieval
|
||||
def test_file_load(mocker: MockerFixture) -> None:
|
||||
mocker.patch(
|
||||
"langchain_box.utilities._BoxAPIWrapper.get_document_by_file_id",
|
||||
return_value=[],
|
||||
)
|
||||
|
||||
loader = BoxLoader( # type: ignore[call-arg]
|
||||
box_developer_token="box_developer_token",
|
||||
box_file_ids=["box_file_ids"],
|
||||
)
|
||||
|
||||
documents = loader.load()
|
||||
assert documents
|
||||
|
||||
mocker.patch(
|
||||
"langchain_box.utilities._BoxAPIWrapper.get_document_by_file_id",
|
||||
return_value=(
|
||||
Document(
|
||||
page_content="Test file mode\ndocument contents",
|
||||
metadata={"title": "Testing Files"},
|
||||
)
|
||||
),
|
||||
)
|
||||
|
||||
loader = BoxLoader( # type: ignore[call-arg]
|
||||
box_developer_token="box_developer_token",
|
||||
box_file_ids=["box_file_ids"],
|
||||
)
|
||||
|
||||
documents = loader.load()
|
||||
assert documents == [
|
||||
Document(
|
||||
page_content="Test file mode\ndocument contents",
|
||||
metadata={"title": "Testing Files"},
|
||||
)
|
||||
]
|
||||
|
||||
@@ -0,0 +1,89 @@
|
||||
import pytest
|
||||
from langchain_core.documents import Document
|
||||
from pytest_mock import MockerFixture
|
||||
|
||||
from langchain_box.retrievers import BoxRetriever
|
||||
from langchain_box.utilities import BoxAuth, BoxAuthType
|
||||
|
||||
|
||||
# Test auth types
|
||||
def test_direct_token_initialization() -> None:
|
||||
retriever = BoxRetriever( # type: ignore[call-arg]
|
||||
box_developer_token="box_developer_token",
|
||||
box_file_ids=["box_file_ids"],
|
||||
)
|
||||
|
||||
assert retriever.box_developer_token == "box_developer_token"
|
||||
assert retriever.box_file_ids == ["box_file_ids"]
|
||||
|
||||
|
||||
def test_failed_direct_token_initialization() -> None:
|
||||
with pytest.raises(ValueError):
|
||||
retriever = BoxRetriever(box_file_ids=["box_file_ids"]) # type: ignore[call-arg] # noqa: F841
|
||||
|
||||
|
||||
def test_auth_initialization() -> None:
|
||||
auth = BoxAuth(
|
||||
auth_type=BoxAuthType.TOKEN, box_developer_token="box_developer_token"
|
||||
)
|
||||
|
||||
retriever = BoxRetriever( # type: ignore[call-arg]
|
||||
box_auth=auth,
|
||||
box_file_ids=["box_file_ids"],
|
||||
)
|
||||
|
||||
assert retriever.box_file_ids == ["box_file_ids"]
|
||||
|
||||
|
||||
# test search retrieval
|
||||
def test_search(mocker: MockerFixture) -> None:
|
||||
mocker.patch(
|
||||
"langchain_box.utilities._BoxAPIWrapper.search_box",
|
||||
return_value=(
|
||||
[
|
||||
Document(
|
||||
page_content="Test file mode\ndocument contents",
|
||||
metadata={"title": "Testing Files"},
|
||||
)
|
||||
]
|
||||
),
|
||||
)
|
||||
|
||||
retriever = BoxRetriever( # type: ignore[call-arg]
|
||||
box_developer_token="box_developer_token"
|
||||
)
|
||||
|
||||
documents = retriever.invoke("query")
|
||||
assert documents == [
|
||||
Document(
|
||||
page_content="Test file mode\ndocument contents",
|
||||
metadata={"title": "Testing Files"},
|
||||
)
|
||||
]
|
||||
|
||||
|
||||
# test ai retrieval
|
||||
def test_ai(mocker: MockerFixture) -> None:
|
||||
mocker.patch(
|
||||
"langchain_box.utilities._BoxAPIWrapper.ask_box_ai",
|
||||
return_value=(
|
||||
[
|
||||
Document(
|
||||
page_content="Test file mode\ndocument contents",
|
||||
metadata={"title": "Testing Files"},
|
||||
)
|
||||
]
|
||||
),
|
||||
)
|
||||
|
||||
retriever = BoxRetriever( # type: ignore[call-arg]
|
||||
box_developer_token="box_developer_token", box_file_ids=["box_file_ids"]
|
||||
)
|
||||
|
||||
documents = retriever.invoke("query")
|
||||
assert documents == [
|
||||
Document(
|
||||
page_content="Test file mode\ndocument contents",
|
||||
metadata={"title": "Testing Files"},
|
||||
)
|
||||
]
|
||||
@@ -2,9 +2,10 @@ from langchain_box import __all__
|
||||
|
||||
EXPECTED_ALL = [
|
||||
"BoxLoader",
|
||||
"BoxRetriever",
|
||||
"BoxAuth",
|
||||
"BoxAuthType",
|
||||
"BoxAPIWrapper",
|
||||
"_BoxAPIWrapper",
|
||||
"__version__",
|
||||
]
|
||||
|
||||
|
||||
@@ -1,7 +1,21 @@
|
||||
import pytest
|
||||
from pydantic.v1.error_wrappers import ValidationError
|
||||
from unittest.mock import Mock
|
||||
|
||||
from langchain_box.utilities import BoxAPIWrapper, BoxAuth, BoxAuthType
|
||||
import pytest
|
||||
from langchain_core.documents import Document
|
||||
from pydantic.v1.error_wrappers import ValidationError
|
||||
from pytest_mock import MockerFixture
|
||||
|
||||
from langchain_box.utilities import BoxAuth, BoxAuthType, _BoxAPIWrapper
|
||||
|
||||
|
||||
@pytest.fixture()
|
||||
def mock_worker(mocker: MockerFixture) -> None:
|
||||
mocker.patch("langchain_box.utilities.BoxAuth._authorize", return_value=Mock())
|
||||
mocker.patch("langchain_box.utilities.BoxAuth.get_client", return_value=Mock())
|
||||
mocker.patch(
|
||||
"langchain_box.utilities._BoxAPIWrapper._get_text_representation",
|
||||
return_value=("filename", "content", "url"),
|
||||
)
|
||||
|
||||
|
||||
# Test auth types
|
||||
@@ -79,7 +93,7 @@ def test_failed_ccg_initialization() -> None:
|
||||
|
||||
|
||||
def test_direct_token_initialization() -> None:
|
||||
box = BoxAPIWrapper( # type: ignore[call-arg]
|
||||
box = _BoxAPIWrapper( # type: ignore[call-arg]
|
||||
box_developer_token="box_developer_token"
|
||||
)
|
||||
|
||||
@@ -91,11 +105,126 @@ def test_auth_initialization() -> None:
|
||||
auth_type=BoxAuthType.TOKEN, box_developer_token="box_developer_token"
|
||||
)
|
||||
|
||||
box = BoxAPIWrapper(box_auth=auth) # type: ignore[call-arg] # noqa: F841
|
||||
box = _BoxAPIWrapper(box_auth=auth) # type: ignore[call-arg] # noqa: F841
|
||||
|
||||
assert auth.box_developer_token == "box_developer_token"
|
||||
|
||||
|
||||
def test_failed_initialization_no_auth() -> None:
|
||||
with pytest.raises(ValidationError):
|
||||
box = BoxAPIWrapper() # type: ignore[call-arg] # noqa: F841
|
||||
box = _BoxAPIWrapper() # type: ignore[call-arg] # noqa: F841
|
||||
|
||||
|
||||
def test_get_documents_by_file_ids(mock_worker, mocker: MockerFixture) -> None: # type: ignore[no-untyped-def]
|
||||
mocker.patch(
|
||||
"langchain_box.utilities._BoxAPIWrapper.get_document_by_file_id",
|
||||
return_value=(
|
||||
Document(
|
||||
page_content="content", metadata={"source": "url", "title": "filename"}
|
||||
)
|
||||
),
|
||||
)
|
||||
|
||||
box = _BoxAPIWrapper(box_developer_token="box_developer_token") # type: ignore[call-arg]
|
||||
|
||||
documents = box.get_document_by_file_id("box_file_id")
|
||||
assert documents == Document(
|
||||
page_content="content", metadata={"source": "url", "title": "filename"}
|
||||
)
|
||||
|
||||
|
||||
def test_get_documents_by_folder_id(mock_worker, mocker: MockerFixture) -> None: # type: ignore[no-untyped-def]
|
||||
mocker.patch(
|
||||
"langchain_box.utilities._BoxAPIWrapper.get_folder_items",
|
||||
return_value=([{"id": "file_id", "type": "file"}]),
|
||||
)
|
||||
|
||||
box = _BoxAPIWrapper(box_developer_token="box_developer_token") # type: ignore[call-arg]
|
||||
|
||||
folder_contents = box.get_folder_items("box_folder_id")
|
||||
assert folder_contents == [{"id": "file_id", "type": "file"}]
|
||||
|
||||
|
||||
def test_box_search(mock_worker, mocker: MockerFixture) -> None: # type: ignore[no-untyped-def]
|
||||
mocker.patch(
|
||||
"langchain_box.utilities._BoxAPIWrapper.search_box",
|
||||
return_value=(
|
||||
[
|
||||
Document(
|
||||
page_content="Test file mode\ndocument contents",
|
||||
metadata={"title": "Testing Files"},
|
||||
)
|
||||
]
|
||||
),
|
||||
)
|
||||
|
||||
box = _BoxAPIWrapper(box_developer_token="box_developer_token") # type: ignore[call-arg]
|
||||
|
||||
documents = box.search_box("query")
|
||||
assert documents == [
|
||||
Document(
|
||||
page_content="Test file mode\ndocument contents",
|
||||
metadata={"title": "Testing Files"},
|
||||
)
|
||||
]
|
||||
|
||||
|
||||
def test_ask_box_ai_single_file(mock_worker, mocker: MockerFixture) -> None: # type: ignore[no-untyped-def]
|
||||
mocker.patch(
|
||||
"langchain_box.utilities._BoxAPIWrapper.ask_box_ai",
|
||||
return_value=(
|
||||
[
|
||||
Document(
|
||||
page_content="Test file mode\ndocument contents",
|
||||
metadata={"title": "Testing Files"},
|
||||
)
|
||||
]
|
||||
),
|
||||
)
|
||||
|
||||
box = _BoxAPIWrapper( # type: ignore[call-arg]
|
||||
box_developer_token="box_developer_token", box_file_ids=["box_file_ids"]
|
||||
)
|
||||
|
||||
documents = box.ask_box_ai("query") # type: ignore[call-arg]
|
||||
assert documents == [
|
||||
Document(
|
||||
page_content="Test file mode\ndocument contents",
|
||||
metadata={"title": "Testing Files"},
|
||||
)
|
||||
]
|
||||
|
||||
|
||||
def test_ask_box_ai_multiple_files(mock_worker, mocker: MockerFixture) -> None: # type: ignore[no-untyped-def]
|
||||
mocker.patch(
|
||||
"langchain_box.utilities._BoxAPIWrapper.ask_box_ai",
|
||||
return_value=(
|
||||
[
|
||||
Document(
|
||||
page_content="Test file 1 mode\ndocument contents",
|
||||
metadata={"title": "Test File 1"},
|
||||
),
|
||||
Document(
|
||||
page_content="Test file 2 mode\ndocument contents",
|
||||
metadata={"title": "Test File 2"},
|
||||
),
|
||||
]
|
||||
),
|
||||
)
|
||||
|
||||
box = _BoxAPIWrapper( # type: ignore[call-arg]
|
||||
box_developer_token="box_developer_token",
|
||||
box_file_ids=["box_file_id 1", "box_file_id 2"],
|
||||
)
|
||||
|
||||
documents = box.ask_box_ai("query") # type: ignore[call-arg]
|
||||
assert documents == [
|
||||
Document(
|
||||
page_content="Test file 1 mode\ndocument contents",
|
||||
metadata={"title": "Test File 1"},
|
||||
),
|
||||
Document(
|
||||
page_content="Test file 2 mode\ndocument contents",
|
||||
metadata={"title": "Test File 2"},
|
||||
),
|
||||
]
|
||||
|
||||
24  libs/partners/chroma/poetry.lock  generated
@@ -892,7 +892,7 @@ adal = ["adal (>=1.0.2)"]
|
||||
|
||||
[[package]]
|
||||
name = "langchain-core"
|
||||
version = "0.2.33"
|
||||
version = "0.2.34"
|
||||
description = "Building applications with LLMs through composability"
|
||||
optional = false
|
||||
python-versions = ">=3.8.1,<4.0"
|
||||
@@ -917,13 +917,13 @@ url = "../../core"
|
||||
|
||||
[[package]]
|
||||
name = "langsmith"
|
||||
version = "0.1.100"
|
||||
version = "0.1.101"
|
||||
description = "Client library to connect to the LangSmith LLM Tracing and Evaluation Platform."
|
||||
optional = false
|
||||
python-versions = "<4.0,>=3.8.1"
|
||||
files = [
|
||||
{file = "langsmith-0.1.100-py3-none-any.whl", hash = "sha256:cae44a884a4166c4d8b9cc5ff99f5d520337bd90b9dadfe3706ed31415d559a7"},
|
||||
{file = "langsmith-0.1.100.tar.gz", hash = "sha256:20ff0126253a5a1d621635a3bc44ccacc036e855f52185ae983420f14eb6c605"},
|
||||
{file = "langsmith-0.1.101-py3-none-any.whl", hash = "sha256:572e2c90709cda1ad837ac86cedda7295f69933f2124c658a92a35fb890477cc"},
|
||||
{file = "langsmith-0.1.101.tar.gz", hash = "sha256:caf4d95f314bb6cd3c4e0632eed821fd5cd5d0f18cb824772fce6d7a9113895b"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
@@ -1556,13 +1556,13 @@ testing = ["pytest", "pytest-benchmark"]
|
||||
|
||||
[[package]]
|
||||
name = "posthog"
|
||||
version = "3.5.0"
|
||||
version = "3.5.2"
|
||||
description = "Integrate PostHog into any python application."
|
||||
optional = false
|
||||
python-versions = "*"
|
||||
files = [
|
||||
{file = "posthog-3.5.0-py2.py3-none-any.whl", hash = "sha256:3c672be7ba6f95d555ea207d4486c171d06657eb34b3ce25eb043bfe7b6b5b76"},
|
||||
{file = "posthog-3.5.0.tar.gz", hash = "sha256:8f7e3b2c6e8714d0c0c542a2109b83a7549f63b7113a133ab2763a89245ef2ef"},
|
||||
{file = "posthog-3.5.2-py2.py3-none-any.whl", hash = "sha256:605b3d92369971cc99290b1fcc8534cbddac3726ef7972caa993454a5ecfb644"},
|
||||
{file = "posthog-3.5.2.tar.gz", hash = "sha256:a383a80c1f47e0243f5ce359e81e06e2e7b37eb39d1d6f8d01c3e64ed29df2ee"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
@@ -2138,13 +2138,13 @@ dev = ["hypothesis (>=6.70.0)", "pytest (>=7.1.0)"]
|
||||
|
||||
[[package]]
|
||||
name = "syrupy"
|
||||
version = "4.6.1"
|
||||
version = "4.6.4"
|
||||
description = "Pytest Snapshot Test Utility"
|
||||
optional = false
|
||||
python-versions = ">=3.8.1,<4"
|
||||
python-versions = ">=3.8.1"
|
||||
files = [
|
||||
{file = "syrupy-4.6.1-py3-none-any.whl", hash = "sha256:203e52f9cb9fa749cf683f29bd68f02c16c3bc7e7e5fe8f2fc59bdfe488ce133"},
|
||||
{file = "syrupy-4.6.1.tar.gz", hash = "sha256:37a835c9ce7857eeef86d62145885e10b3cb9615bc6abeb4ce404b3f18e1bb36"},
|
||||
{file = "syrupy-4.6.4-py3-none-any.whl", hash = "sha256:5a0e47b187d32b58555b0de6d25bc7bb875e7d60c7a41bd2721f5d44975dcf85"},
|
||||
{file = "syrupy-4.6.4.tar.gz", hash = "sha256:a6facc6a45f1cff598adacb030d9573ed62863521755abd5c5d6d665f848d6cc"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
@@ -2796,4 +2796,4 @@ test = ["big-O", "importlib-resources", "jaraco.functools", "jaraco.itertools",
|
||||
[metadata]
|
||||
lock-version = "2.0"
|
||||
python-versions = ">=3.8.1,<4"
|
||||
content-hash = "e45811e74878a9b652fef6ee06b10ad2d9e2cc33071bc8413bf2450aa17e47b7"
|
||||
content-hash = "b6bafda889d07ec7a6d23da03123de6bbd79405f359512df9133d12d5b72a93b"
|
||||
|
||||
@@ -46,6 +46,10 @@ markers = [
|
||||
[tool.poetry.dependencies.chromadb]
|
||||
version = ">=0.4.0,<0.6.0,!=0.5.4,!=0.5.5"
|
||||
|
||||
[tool.poetry.dependencies.fastapi]
|
||||
version = ">=0.95.2,<1"
|
||||
optional = true
|
||||
|
||||
[tool.poetry.group.test]
|
||||
optional = true
|
||||
|
||||
|
||||
1  libs/partners/databricks/.gitignore  vendored  Normal file
@@ -0,0 +1 @@
__pycache__
21  libs/partners/databricks/LICENSE  Normal file
@@ -0,0 +1,21 @@
|
||||
MIT License
|
||||
|
||||
Copyright (c) 2024 LangChain, Inc.
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to deal
|
||||
in the Software without restriction, including without limitation the rights
|
||||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
copies of the Software, and to permit persons to whom the Software is
|
||||
furnished to do so, subject to the following conditions:
|
||||
|
||||
The above copyright notice and this permission notice shall be included in all
|
||||
copies or substantial portions of the Software.
|
||||
|
||||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
SOFTWARE.
|
||||
62  libs/partners/databricks/Makefile  Normal file
@@ -0,0 +1,62 @@
|
||||
.PHONY: all format lint test tests integration_tests docker_tests help extended_tests
|
||||
|
||||
# Default target executed when no arguments are given to make.
|
||||
all: help
|
||||
|
||||
# Define a variable for the test file path.
|
||||
TEST_FILE ?= tests/unit_tests/
|
||||
integration_test integration_tests: TEST_FILE = tests/integration_tests/
|
||||
|
||||
|
||||
# unit tests are run with the --disable-socket flag to prevent network calls
|
||||
test tests:
|
||||
poetry run pytest --disable-socket --allow-unix-socket $(TEST_FILE)
|
||||
|
||||
# integration tests are run without the --disable-socket flag to allow network calls
|
||||
integration_test integration_tests:
|
||||
poetry run pytest $(TEST_FILE)
|
||||
|
||||
######################
|
||||
# LINTING AND FORMATTING
|
||||
######################
|
||||
|
||||
# Define a variable for Python and notebook files.
|
||||
PYTHON_FILES=.
|
||||
MYPY_CACHE=.mypy_cache
|
||||
lint format: PYTHON_FILES=.
|
||||
lint_diff format_diff: PYTHON_FILES=$(shell git diff --relative=libs/partners/databricks --name-only --diff-filter=d master | grep -E '\.py$$|\.ipynb$$')
|
||||
lint_package: PYTHON_FILES=langchain_databricks
|
||||
lint_tests: PYTHON_FILES=tests
|
||||
lint_tests: MYPY_CACHE=.mypy_cache_test
|
||||
|
||||
lint lint_diff lint_package lint_tests:
|
||||
poetry run ruff check .
|
||||
poetry run ruff format $(PYTHON_FILES) --diff
|
||||
poetry run ruff check --select I $(PYTHON_FILES)
|
||||
mkdir -p $(MYPY_CACHE); poetry run mypy $(PYTHON_FILES) --cache-dir $(MYPY_CACHE)
|
||||
|
||||
format format_diff:
|
||||
poetry run ruff format $(PYTHON_FILES)
|
||||
poetry run ruff check --select I --fix $(PYTHON_FILES)
|
||||
|
||||
spell_check:
|
||||
poetry run codespell --toml pyproject.toml
|
||||
|
||||
spell_fix:
|
||||
poetry run codespell --toml pyproject.toml -w
|
||||
|
||||
check_imports: $(shell find langchain_databricks -name '*.py')
|
||||
poetry run python ./scripts/check_imports.py $^
|
||||
|
||||
######################
|
||||
# HELP
|
||||
######################
|
||||
|
||||
help:
|
||||
@echo '----'
|
||||
@echo 'check_imports - check imports'
|
||||
@echo 'format - run code formatters'
|
||||
@echo 'lint - run linters'
|
||||
@echo 'test - run unit tests'
|
||||
@echo 'tests - run unit tests'
|
||||
@echo 'test TEST_FILE=<test_file> - run all tests in file'
|
||||
24  libs/partners/databricks/README.md  Normal file
@@ -0,0 +1,24 @@
# langchain-databricks

This package contains the LangChain integration with Databricks.

## Installation

```bash
pip install -U langchain-databricks
```

You should configure credentials by setting the following environment variables:

* TODO: fill this out

## Chat Models

The `ChatDatabricks` class exposes chat models from Databricks.

```python
from langchain_databricks import ChatDatabricks

llm = ChatDatabricks()
llm.invoke("Sing a ballad of LangChain.")
```
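For a fuller initialization against a specific serving endpoint, the following sketch mirrors the `ChatDatabricks` docstring examples that appear later in this diff; the endpoint name `databricks-meta-llama-3-1-405b-instruct` is taken from those examples and should be replaced with an endpoint available in your workspace:

```python
from langchain_databricks import ChatDatabricks

# Endpoint name mirrors the docstring example; swap in your own serving endpoint.
llm = ChatDatabricks(
    endpoint="databricks-meta-llama-3-1-405b-instruct",
    temperature=0,
    max_tokens=500,
)

# invoke() returns an AIMessage; .content holds the generated text.
print(llm.invoke("Sing a ballad of LangChain.").content)
```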
15  libs/partners/databricks/langchain_databricks/__init__.py  Normal file
@@ -0,0 +1,15 @@
from importlib import metadata

from langchain_databricks.chat_models import ChatDatabricks

try:
    __version__ = metadata.version(__package__)
except metadata.PackageNotFoundError:
    # Case where package metadata is not available.
    __version__ = ""
del metadata  # optional, avoids polluting the results of dir(__package__)

__all__ = [
    "ChatDatabricks",
    "__version__",
]
573  libs/partners/databricks/langchain_databricks/chat_models.py  Normal file
@@ -0,0 +1,573 @@
|
||||
"""Databricks chat models."""
|
||||
|
||||
import json
|
||||
import logging
|
||||
from typing import (
|
||||
Any,
|
||||
Callable,
|
||||
Dict,
|
||||
Iterator,
|
||||
List,
|
||||
Literal,
|
||||
Mapping,
|
||||
Optional,
|
||||
Sequence,
|
||||
Type,
|
||||
Union,
|
||||
)
|
||||
from urllib.parse import urlparse
|
||||
|
||||
from langchain_core.callbacks import CallbackManagerForLLMRun
|
||||
from langchain_core.language_models import BaseChatModel
|
||||
from langchain_core.language_models.base import LanguageModelInput
|
||||
from langchain_core.messages import (
|
||||
AIMessage,
|
||||
AIMessageChunk,
|
||||
BaseMessage,
|
||||
BaseMessageChunk,
|
||||
ChatMessage,
|
||||
ChatMessageChunk,
|
||||
FunctionMessage,
|
||||
HumanMessage,
|
||||
HumanMessageChunk,
|
||||
SystemMessage,
|
||||
SystemMessageChunk,
|
||||
ToolMessage,
|
||||
ToolMessageChunk,
|
||||
)
|
||||
from langchain_core.messages.tool import tool_call_chunk
|
||||
from langchain_core.output_parsers.openai_tools import (
|
||||
make_invalid_tool_call,
|
||||
parse_tool_call,
|
||||
)
|
||||
from langchain_core.outputs import ChatGeneration, ChatGenerationChunk, ChatResult
|
||||
from langchain_core.pydantic_v1 import (
|
||||
BaseModel,
|
||||
Field,
|
||||
PrivateAttr,
|
||||
)
|
||||
from langchain_core.runnables import Runnable
|
||||
from langchain_core.tools import BaseTool
|
||||
from langchain_core.utils.function_calling import convert_to_openai_tool
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class ChatDatabricks(BaseChatModel):
|
||||
"""Databricks chat model integration.
|
||||
|
||||
Setup:
|
||||
Install ``langchain-databricks``.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
pip install -U langchain-databricks
|
||||
|
||||
If you are running outside Databricks, set the Databricks workspace hostname and personal access token as environment variables:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
export DATABRICKS_HOSTNAME="https://your-databricks-workspace"
|
||||
export DATABRICKS_TOKEN="your-personal-access-token"
|
||||
|
||||
Key init args — completion params:
|
||||
endpoint: str
|
||||
Name of Databricks Model Serving endpoint to query.
|
||||
target_uri: str
|
||||
The target URI to use. Defaults to ``databricks``.
|
||||
temperature: float
|
||||
Sampling temperature. Higher values make the model more creative.
|
||||
n: Optional[int]
|
||||
The number of completion choices to generate.
|
||||
stop: Optional[List[str]]
|
||||
List of strings to stop generation at.
|
||||
max_tokens: Optional[int]
|
||||
Max number of tokens to generate.
|
||||
extra_params: Optional[Dict[str, Any]]
|
||||
Any extra parameters to pass to the endpoint.
|
||||
|
||||
Instantiate:
|
||||
.. code-block:: python
|
||||
|
||||
from langchain_databricks import ChatDatabricks
|
||||
llm = ChatDatabricks(
|
||||
endpoint="databricks-meta-llama-3-1-405b-instruct",
|
||||
temperature=0,
|
||||
max_tokens=500,
|
||||
)
|
||||
|
||||
Invoke:
|
||||
.. code-block:: python
|
||||
|
||||
messages = [
|
||||
("system", "You are a helpful translator. Translate the user sentence to French."),
|
||||
("human", "I love programming."),
|
||||
]
|
||||
llm.invoke(messages)
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
AIMessage(
|
||||
content="J'adore la programmation.",
|
||||
response_metadata={
|
||||
'prompt_tokens': 32,
|
||||
'completion_tokens': 9,
|
||||
'total_tokens': 41
|
||||
},
|
||||
id='run-64eebbdd-88a8-4a25-b508-21e9a5f146c5-0'
|
||||
)
|
||||
|
||||
Stream:
|
||||
.. code-block:: python
|
||||
|
||||
for chunk in llm.stream(messages):
|
||||
print(chunk)
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
content='J' id='run-609b8f47-e580-4691-9ee4-e2109f53155e'
|
||||
content="'" id='run-609b8f47-e580-4691-9ee4-e2109f53155e'
|
||||
content='ad' id='run-609b8f47-e580-4691-9ee4-e2109f53155e'
|
||||
content='ore' id='run-609b8f47-e580-4691-9ee4-e2109f53155e'
|
||||
content=' la' id='run-609b8f47-e580-4691-9ee4-e2109f53155e'
|
||||
content=' programm' id='run-609b8f47-e580-4691-9ee4-e2109f53155e'
|
||||
content='ation' id='run-609b8f47-e580-4691-9ee4-e2109f53155e'
|
||||
content='.' id='run-609b8f47-e580-4691-9ee4-e2109f53155e'
|
||||
content='' response_metadata={'finish_reason': 'stop'} id='run-609b8f47-e580-4691-9ee4-e2109f53155e'
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
stream = llm.stream(messages)
|
||||
full = next(stream)
|
||||
for chunk in stream:
|
||||
full += chunk
|
||||
full
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
AIMessageChunk(
|
||||
content="J'adore la programmation.",
|
||||
response_metadata={
|
||||
'finish_reason': 'stop'
|
||||
},
|
||||
id='run-4cef851f-6223-424f-ad26-4a54e5852aa5'
|
||||
)
|
||||
|
||||
Async:
|
||||
.. code-block:: python
|
||||
|
||||
await llm.ainvoke(messages)
|
||||
|
||||
# stream:
|
||||
# async for chunk in llm.astream(messages)
|
||||
|
||||
# batch:
|
||||
# await llm.abatch([messages])
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
AIMessage(
|
||||
content="J'adore la programmation.",
|
||||
response_metadata={
|
||||
'prompt_tokens': 32,
|
||||
'completion_tokens': 9,
|
||||
'total_tokens': 41
|
||||
},
|
||||
id='run-e4bb043e-772b-4e1d-9f98-77ccc00c0271-0'
|
||||
)
|
||||
|
||||
Tool calling:
|
||||
.. code-block:: python
|
||||
|
||||
from langchain_core.pydantic_v1 import BaseModel, Field
|
||||
|
||||
class GetWeather(BaseModel):
|
||||
'''Get the current weather in a given location'''
|
||||
|
||||
location: str = Field(..., description="The city and state, e.g. San Francisco, CA")
|
||||
|
||||
class GetPopulation(BaseModel):
|
||||
'''Get the current population in a given location'''
|
||||
|
||||
location: str = Field(..., description="The city and state, e.g. San Francisco, CA")
|
||||
|
||||
llm_with_tools = llm.bind_tools([GetWeather, GetPopulation])
|
||||
ai_msg = llm_with_tools.invoke("Which city is hotter today and which is bigger: LA or NY?")
|
||||
ai_msg.tool_calls
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
[
|
||||
{
|
||||
'name': 'GetWeather',
|
||||
'args': {
|
||||
'location': 'Los Angeles, CA'
|
||||
},
|
||||
'id': 'call_ea0a6004-8e64-4ae8-a192-a40e295bfa24',
|
||||
'type': 'tool_call'
|
||||
}
|
||||
]
|
||||
|
||||
To use tool calls, your model endpoint must support the ``tools`` parameter. See [Function calling on Databricks](https://python.langchain.com/v0.2/docs/integrations/chat/databricks/#function-calling-on-databricks) for more information.
|
||||
|
||||
""" # noqa: E501
|
||||
|
||||
endpoint: str
|
||||
"""Name of Databricks Model Serving endpoint to query."""
|
||||
target_uri: str = "databricks"
|
||||
"""The target URI to use. Defaults to ``databricks``."""
|
||||
temperature: float = 0.0
|
||||
"""Sampling temperature. Higher values make the model more creative."""
|
||||
n: int = 1
|
||||
"""The number of completion choices to generate."""
|
||||
stop: Optional[List[str]] = None
|
||||
"""List of strings to stop generation at."""
|
||||
max_tokens: Optional[int] = None
|
||||
"""The maximum number of tokens to generate."""
|
||||
extra_params: dict = Field(default_factory=dict)
|
||||
"""Any extra parameters to pass to the endpoint."""
|
||||
_client: Any = PrivateAttr()
|
||||
|
||||
def __init__(self, **kwargs: Any):
|
||||
super().__init__(**kwargs)
|
||||
self._validate_uri()
|
||||
try:
|
||||
from mlflow.deployments import get_deploy_client # type: ignore
|
||||
|
||||
self._client = get_deploy_client(self.target_uri)
|
||||
except ImportError as e:
|
||||
raise ImportError(
|
||||
"Failed to create the client. Please run `pip install mlflow` to "
|
||||
"install required dependencies."
|
||||
) from e
|
||||
|
||||
def _validate_uri(self) -> None:
|
||||
if self.target_uri == "databricks":
|
||||
return
|
||||
|
||||
if urlparse(self.target_uri).scheme != "databricks":
|
||||
raise ValueError(
|
||||
"Invalid target URI. The target URI must be a valid databricks URI."
|
||||
)
|
||||
|
||||
@property
|
||||
def _default_params(self) -> Dict[str, Any]:
|
||||
params: Dict[str, Any] = {
|
||||
"target_uri": self.target_uri,
|
||||
"endpoint": self.endpoint,
|
||||
"temperature": self.temperature,
|
||||
"n": self.n,
|
||||
"stop": self.stop,
|
||||
"max_tokens": self.max_tokens,
|
||||
"extra_params": self.extra_params,
|
||||
}
|
||||
return params
|
||||
|
||||
def _generate(
|
||||
self,
|
||||
messages: List[BaseMessage],
|
||||
stop: Optional[List[str]] = None,
|
||||
run_manager: Optional[CallbackManagerForLLMRun] = None,
|
||||
**kwargs: Any,
|
||||
) -> ChatResult:
|
||||
data = self._prepare_inputs(messages, stop, **kwargs)
|
||||
resp = self._client.predict(endpoint=self.endpoint, inputs=data)
|
||||
return self._convert_response_to_chat_result(resp)
|
||||
|
||||
def _prepare_inputs(
|
||||
self,
|
||||
messages: List[BaseMessage],
|
||||
stop: Optional[List[str]] = None,
|
||||
**kwargs: Any,
|
||||
) -> Dict[str, Any]:
|
||||
data: Dict[str, Any] = {
|
||||
"messages": [_convert_message_to_dict(msg) for msg in messages],
|
||||
"temperature": self.temperature,
|
||||
"n": self.n,
|
||||
**self.extra_params,
|
||||
**kwargs,
|
||||
}
|
||||
if stop := self.stop or stop:
|
||||
data["stop"] = stop
|
||||
if self.max_tokens is not None:
|
||||
data["max_tokens"] = self.max_tokens
|
||||
|
||||
return data
|
||||
|
||||
def _convert_response_to_chat_result(
|
||||
self, response: Mapping[str, Any]
|
||||
) -> ChatResult:
|
||||
generations = [
|
||||
ChatGeneration(
|
||||
message=_convert_dict_to_message(choice["message"]),
|
||||
generation_info=choice.get("usage", {}),
|
||||
)
|
||||
for choice in response["choices"]
|
||||
]
|
||||
usage = response.get("usage", {})
|
||||
return ChatResult(generations=generations, llm_output=usage)
|
||||
|
||||
def _stream(
|
||||
self,
|
||||
messages: List[BaseMessage],
|
||||
stop: Optional[List[str]] = None,
|
||||
run_manager: Optional[CallbackManagerForLLMRun] = None,
|
||||
**kwargs: Any,
|
||||
) -> Iterator[ChatGenerationChunk]:
|
||||
data = self._prepare_inputs(messages, stop, **kwargs)
|
||||
first_chunk_role = None
|
||||
for chunk in self._client.predict_stream(endpoint=self.endpoint, inputs=data):
|
||||
if chunk["choices"]:
|
||||
choice = chunk["choices"][0]
|
||||
|
||||
chunk_delta = choice["delta"]
|
||||
if first_chunk_role is None:
|
||||
first_chunk_role = chunk_delta.get("role")
|
||||
|
||||
chunk_message = _convert_dict_to_message_chunk(
|
||||
chunk_delta, first_chunk_role
|
||||
)
|
||||
|
||||
generation_info = {}
|
||||
if finish_reason := choice.get("finish_reason"):
|
||||
generation_info["finish_reason"] = finish_reason
|
||||
if logprobs := choice.get("logprobs"):
|
||||
generation_info["logprobs"] = logprobs
|
||||
|
||||
chunk = ChatGenerationChunk(
|
||||
message=chunk_message, generation_info=generation_info or None
|
||||
)
|
||||
|
||||
if run_manager:
|
||||
run_manager.on_llm_new_token(
|
||||
chunk.text, chunk=chunk, logprobs=logprobs
|
||||
)
|
||||
|
||||
yield chunk
|
||||
else:
|
||||
# Handle the case where choices are empty if needed
|
||||
continue
|
||||
|
||||
def bind_tools(
|
||||
self,
|
||||
tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
|
||||
*,
|
||||
tool_choice: Optional[
|
||||
Union[dict, str, Literal["auto", "none", "required", "any"], bool]
|
||||
] = None,
|
||||
**kwargs: Any,
|
||||
) -> Runnable[LanguageModelInput, BaseMessage]:
|
||||
"""Bind tool-like objects to this chat model.
|
||||
|
||||
Assumes model is compatible with OpenAI tool-calling API.
|
||||
|
||||
Args:
|
||||
tools: A list of tool definitions to bind to this chat model.
|
||||
Can be a dictionary, pydantic model, callable, or BaseTool. Pydantic
|
||||
models, callables, and BaseTools will be automatically converted to
|
||||
their schema dictionary representation.
|
||||
tool_choice: Which tool to require the model to call.
|
||||
Options are:
|
||||
name of the tool (str): calls corresponding tool;
|
||||
"auto": automatically selects a tool (including no tool);
|
||||
"none": model does not generate any tool calls and instead must
|
||||
generate a standard assistant message;
|
||||
"required": the model picks the most relevant tool in tools and
|
||||
must generate a tool call;
|
||||
|
||||
or a dict of the form:
|
||||
{"type": "function", "function": {"name": <<tool_name>>}}.
|
||||
**kwargs: Any additional parameters to pass to the
|
||||
:class:`~langchain.runnable.Runnable` constructor.
|
||||
"""
|
||||
formatted_tools = [convert_to_openai_tool(tool) for tool in tools]
|
||||
if tool_choice:
|
||||
if isinstance(tool_choice, str):
|
||||
# tool_choice is a tool/function name
|
||||
if tool_choice not in ("auto", "none", "required"):
|
||||
tool_choice = {
|
||||
"type": "function",
|
||||
"function": {"name": tool_choice},
|
||||
}
|
||||
elif isinstance(tool_choice, dict):
|
||||
tool_names = [
|
||||
formatted_tool["function"]["name"]
|
||||
for formatted_tool in formatted_tools
|
||||
]
|
||||
if not any(
|
||||
tool_name == tool_choice["function"]["name"]
|
||||
for tool_name in tool_names
|
||||
):
|
||||
raise ValueError(
|
||||
f"Tool choice {tool_choice} was specified, but the only "
|
||||
f"provided tools were {tool_names}."
|
||||
)
|
||||
else:
|
||||
raise ValueError(
|
||||
f"Unrecognized tool_choice type. Expected str, bool or dict. "
|
||||
f"Received: {tool_choice}"
|
||||
)
|
||||
kwargs["tool_choice"] = tool_choice
|
||||
return super().bind(tools=formatted_tools, **kwargs)
|
||||
|
||||
@property
|
||||
def _llm_type(self) -> str:
|
||||
"""Return type of chat model."""
|
||||
return "chat-databricks"
|
||||
|
||||
|
||||
### Conversion function to convert Pydantic models to dictionaries and vice versa. ###
|
||||
|
||||
|
||||
def _convert_message_to_dict(message: BaseMessage) -> dict:
|
||||
message_dict = {"content": message.content}
|
||||
|
||||
# OpenAI supports "name" field in messages.
|
||||
if (name := message.name or message.additional_kwargs.get("name")) is not None:
|
||||
message_dict["name"] = name
|
||||
|
||||
if id := message.id:
|
||||
message_dict["id"] = id
|
||||
|
||||
if isinstance(message, ChatMessage):
|
||||
return {"role": message.role, **message_dict}
|
||||
elif isinstance(message, HumanMessage):
|
||||
return {"role": "user", **message_dict}
|
||||
elif isinstance(message, AIMessage):
|
||||
if tool_calls := _get_tool_calls_from_ai_message(message):
|
||||
message_dict["tool_calls"] = tool_calls # type: ignore[assignment]
|
||||
# If tool calls present, content null value should be None not empty string.
|
||||
message_dict["content"] = message_dict["content"] or None # type: ignore[assignment]
|
||||
return {"role": "assistant", **message_dict}
|
||||
elif isinstance(message, SystemMessage):
|
||||
return {"role": "system", **message_dict}
|
||||
elif isinstance(message, ToolMessage):
|
||||
return {
|
||||
"role": "tool",
|
||||
"tool_call_id": message.tool_call_id,
|
||||
**message_dict,
|
||||
}
|
||||
elif (
|
||||
isinstance(message, FunctionMessage)
|
||||
or "function_call" in message.additional_kwargs
|
||||
):
|
||||
raise ValueError(
|
||||
"Function messages are not supported by Databricks. Please"
|
||||
" create a feature request at https://github.com/mlflow/mlflow/issues."
|
||||
)
|
||||
else:
|
||||
raise ValueError(f"Got unknown message type: {type(message)}")
|
||||
|
||||
|
||||
def _get_tool_calls_from_ai_message(message: AIMessage) -> List[Dict]:
|
||||
tool_calls = [
|
||||
{
|
||||
"type": "function",
|
||||
"id": tc["id"],
|
||||
"function": {
|
||||
"name": tc["name"],
|
||||
"arguments": json.dumps(tc["args"]),
|
||||
},
|
||||
}
|
||||
for tc in message.tool_calls
|
||||
]
|
||||
|
||||
invalid_tool_calls = [
|
||||
{
|
||||
"type": "function",
|
||||
"id": tc["id"],
|
||||
"function": {
|
||||
"name": tc["name"],
|
||||
"arguments": tc["args"],
|
||||
},
|
||||
}
|
||||
for tc in message.invalid_tool_calls
|
||||
]
|
||||
|
||||
if tool_calls or invalid_tool_calls:
|
||||
return tool_calls + invalid_tool_calls
|
||||
|
||||
# Get tool calls from additional kwargs if present.
|
||||
return [
|
||||
{
|
||||
k: v
|
||||
for k, v in tool_call.items() # type: ignore[union-attr]
|
||||
if k in {"id", "type", "function"}
|
||||
}
|
||||
for tool_call in message.additional_kwargs.get("tool_calls", [])
|
||||
]
|
||||
|
||||
|
||||
def _convert_dict_to_message(_dict: Dict) -> BaseMessage:
|
||||
role = _dict["role"]
|
||||
content = _dict.get("content")
|
||||
content = content if content is not None else ""
|
||||
|
||||
if role == "user":
|
||||
return HumanMessage(content=content)
|
||||
elif role == "system":
|
||||
return SystemMessage(content=content)
|
||||
elif role == "assistant":
|
||||
additional_kwargs: Dict = {}
|
||||
tool_calls = []
|
||||
invalid_tool_calls = []
|
||||
if raw_tool_calls := _dict.get("tool_calls"):
|
||||
additional_kwargs["tool_calls"] = raw_tool_calls
|
||||
for raw_tool_call in raw_tool_calls:
|
||||
try:
|
||||
tool_calls.append(parse_tool_call(raw_tool_call, return_id=True))
|
||||
except Exception as e:
|
||||
invalid_tool_calls.append(
|
||||
make_invalid_tool_call(raw_tool_call, str(e))
|
||||
)
|
||||
return AIMessage(
|
||||
content=content,
|
||||
additional_kwargs=additional_kwargs,
|
||||
id=_dict.get("id"),
|
||||
tool_calls=tool_calls,
|
||||
invalid_tool_calls=invalid_tool_calls,
|
||||
)
|
||||
else:
|
||||
return ChatMessage(content=content, role=role)
|
||||
|
||||
|
||||
def _convert_dict_to_message_chunk(
|
||||
_dict: Mapping[str, Any], default_role: str
|
||||
) -> BaseMessageChunk:
|
||||
role = _dict.get("role", default_role)
|
||||
content = _dict.get("content")
|
||||
content = content if content is not None else ""
|
||||
|
||||
if role == "user":
|
||||
return HumanMessageChunk(content=content)
|
||||
elif role == "system":
|
||||
return SystemMessageChunk(content=content)
|
||||
elif role == "tool":
|
||||
return ToolMessageChunk(
|
||||
content=content, tool_call_id=_dict["tool_call_id"], id=_dict.get("id")
|
||||
)
|
||||
elif role == "assistant":
|
||||
additional_kwargs: Dict = {}
|
||||
tool_call_chunks = []
|
||||
if raw_tool_calls := _dict.get("tool_calls"):
|
||||
additional_kwargs["tool_calls"] = raw_tool_calls
|
||||
try:
|
||||
tool_call_chunks = [
|
||||
tool_call_chunk(
|
||||
name=tc["function"].get("name"),
|
||||
args=tc["function"].get("arguments"),
|
||||
id=tc.get("id"),
|
||||
index=tc["index"],
|
||||
)
|
||||
for tc in raw_tool_calls
|
||||
]
|
||||
except KeyError:
|
||||
pass
|
||||
return AIMessageChunk(
|
||||
content=content,
|
||||
additional_kwargs=additional_kwargs,
|
||||
id=_dict.get("id"),
|
||||
tool_call_chunks=tool_call_chunks,
|
||||
)
|
||||
else:
|
||||
return ChatMessageChunk(content=content, role=role)
|
||||
2495  libs/partners/databricks/poetry.lock  generated  Normal file
File diff suppressed because it is too large.
99  libs/partners/databricks/pyproject.toml  Normal file
@@ -0,0 +1,99 @@
|
||||
[tool.poetry]
|
||||
name = "langchain-databricks"
|
||||
version = "0.1.0"
|
||||
description = "An integration package connecting Databricks and LangChain"
|
||||
authors = []
|
||||
readme = "README.md"
|
||||
repository = "https://github.com/langchain-ai/langchain"
|
||||
license = "MIT"
|
||||
|
||||
[tool.poetry.urls]
|
||||
"Source Code" = "https://github.com/langchain-ai/langchain/tree/master/libs/partners/databricks"
|
||||
"Release Notes" = "https://github.com/langchain-ai/langchain/releases?q=tag%3A%22databricks%3D%3D0%22&expanded=true"
|
||||
|
||||
[tool.poetry.dependencies]
|
||||
# TODO: Replace <3.12 to <4.0 once https://github.com/mlflow/mlflow/commit/04370119fcc1b2ccdbcd9a50198ab00566d58cd2 is released
|
||||
python = ">=3.8.1,<3.12"
|
||||
langchain-core = "^0.2.0"
|
||||
mlflow = ">=2.9"
|
||||
|
||||
# MLflow depends on the following libraries, which require different versions for Python 3.8 vs 3.12
|
||||
numpy = [
|
||||
{version = ">=1.26.0", python = ">=3.12"},
|
||||
{version = ">=1.24.0", python = "<3.12"},
|
||||
]
|
||||
scipy = [
|
||||
{version = ">=1.11", python = ">=3.12"},
|
||||
{version = "<2", python = "<3.12"}
|
||||
]
|
||||
|
||||
[tool.poetry.group.test]
|
||||
optional = true
|
||||
|
||||
[tool.poetry.group.test.dependencies]
|
||||
pytest = "^7.4.3"
|
||||
pytest-asyncio = "^0.23.2"
|
||||
pytest-socket = "^0.7.0"
|
||||
langchain-core = { path = "../../core", develop = true }
|
||||
|
||||
[tool.poetry.group.codespell]
|
||||
optional = true
|
||||
|
||||
[tool.poetry.group.codespell.dependencies]
|
||||
codespell = "^2.2.6"
|
||||
|
||||
[tool.poetry.group.test_integration]
|
||||
optional = true
|
||||
|
||||
[tool.poetry.group.test_integration.dependencies]
|
||||
|
||||
[tool.poetry.group.lint]
|
||||
optional = true
|
||||
|
||||
[tool.poetry.group.lint.dependencies]
|
||||
ruff = "^0.5"
|
||||
|
||||
[tool.poetry.group.typing.dependencies]
|
||||
mypy = "^1.10"
|
||||
langchain-core = { path = "../../core", develop = true }
|
||||
|
||||
[tool.poetry.group.dev]
|
||||
optional = true
|
||||
|
||||
[tool.poetry.group.dev.dependencies]
|
||||
langchain-core = { path = "../../core", develop = true }
|
||||
|
||||
[tool.ruff.lint]
|
||||
select = [
|
||||
"E", # pycodestyle
|
||||
"F", # pyflakes
|
||||
"I", # isort
|
||||
"T201", # print
|
||||
]
|
||||
|
||||
[tool.mypy]
|
||||
disallow_untyped_defs = "True"
|
||||
|
||||
[tool.coverage.run]
|
||||
omit = ["tests/*"]
|
||||
|
||||
[build-system]
|
||||
requires = ["poetry-core>=1.0.0"]
|
||||
build-backend = "poetry.core.masonry.api"
|
||||
|
||||
[tool.pytest.ini_options]
|
||||
# --strict-markers will raise errors on unknown marks.
|
||||
# https://docs.pytest.org/en/7.1.x/how-to/mark.html#raising-errors-on-unknown-marks
|
||||
#
|
||||
# https://docs.pytest.org/en/7.1.x/reference/reference.html
|
||||
# --strict-config any warnings encountered while parsing the `pytest`
|
||||
# section of the configuration file raise errors.
|
||||
#
|
||||
# https://github.com/tophat/syrupy
|
||||
addopts = "--strict-markers --strict-config --durations=5"
|
||||
# Registering custom markers.
|
||||
# https://docs.pytest.org/en/7.1.x/example/markers.html#registering-markers
|
||||
markers = [
|
||||
"compile: mark placeholder test used to compile integration tests without running them",
|
||||
]
|
||||
asyncio_mode = "auto"
|
||||
17  libs/partners/databricks/scripts/check_imports.py  Normal file
@@ -0,0 +1,17 @@
import sys
import traceback
from importlib.machinery import SourceFileLoader

if __name__ == "__main__":
    files = sys.argv[1:]
    has_failure = False
    for file in files:
        try:
            SourceFileLoader("x", file).load_module()
        except Exception:
            has_failure = True
            print(file)  # noqa: T201
            traceback.print_exc()
            print()  # noqa: T201

    sys.exit(1 if has_failure else 0)
27  libs/partners/databricks/scripts/check_pydantic.sh  Executable file
@@ -0,0 +1,27 @@
#!/bin/bash
#
# This script searches for lines starting with "import pydantic" or "from pydantic"
# in tracked files within a Git repository.
#
# Usage: ./scripts/check_pydantic.sh /path/to/repository

# Check if a path argument is provided
if [ $# -ne 1 ]; then
  echo "Usage: $0 /path/to/repository"
  exit 1
fi

repository_path="$1"

# Search for lines matching the pattern within the specified repository
result=$(git -C "$repository_path" grep -E '^import pydantic|^from pydantic')

# Check if any matching lines were found
if [ -n "$result" ]; then
  echo "ERROR: The following lines need to be updated:"
  echo "$result"
  echo "Please replace the code with an import from langchain_core.pydantic_v1."
  echo "For example, replace 'from pydantic import BaseModel'"
  echo "with 'from langchain_core.pydantic_v1 import BaseModel'"
  exit 1
fi
18  libs/partners/databricks/scripts/lint_imports.sh  Executable file
@@ -0,0 +1,18 @@
#!/bin/bash

set -eu

# Initialize a variable to keep track of errors
errors=0

# make sure not importing from langchain, langchain_experimental, or langchain_community
git --no-pager grep '^from langchain\.' . && errors=$((errors+1))
git --no-pager grep '^from langchain_experimental\.' . && errors=$((errors+1))
git --no-pager grep '^from langchain_community\.' . && errors=$((errors+1))

# Decide on an exit status based on the errors
if [ "$errors" -gt 0 ]; then
  exit 1
else
  exit 0
fi
0  libs/partners/databricks/tests/__init__.py  Normal file
@@ -0,0 +1,7 @@
import pytest


@pytest.mark.compile
def test_placeholder() -> None:
    """Used for compiling integration tests without running any real tests."""
    pass
321  libs/partners/databricks/tests/unit_tests/test_chat_models.py  Normal file
@@ -0,0 +1,321 @@
|
||||
"""Test chat model integration."""
|
||||
|
||||
import json
|
||||
from typing import Generator
|
||||
from unittest import mock
|
||||
|
||||
import mlflow # type: ignore # noqa: F401
|
||||
import pytest
|
||||
from langchain_core.messages import (
|
||||
AIMessage,
|
||||
AIMessageChunk,
|
||||
BaseMessage,
|
||||
ChatMessage,
|
||||
ChatMessageChunk,
|
||||
FunctionMessage,
|
||||
HumanMessage,
|
||||
HumanMessageChunk,
|
||||
SystemMessage,
|
||||
SystemMessageChunk,
|
||||
ToolMessageChunk,
|
||||
)
|
||||
from langchain_core.messages.tool import ToolCallChunk
|
||||
from langchain_core.pydantic_v1 import BaseModel, Field
|
||||
|
||||
from langchain_databricks.chat_models import (
|
||||
ChatDatabricks,
|
||||
_convert_dict_to_message,
|
||||
_convert_dict_to_message_chunk,
|
||||
_convert_message_to_dict,
|
||||
)
|
||||
|
||||
_MOCK_CHAT_RESPONSE = {
|
||||
"id": "chatcmpl_id",
|
||||
"object": "chat.completion",
|
||||
"created": 1721875529,
|
||||
"model": "meta-llama-3.1-70b-instruct-072424",
|
||||
"choices": [
|
||||
{
|
||||
"index": 0,
|
||||
"message": {
|
||||
"role": "assistant",
|
||||
"content": "To calculate the result of 36939 multiplied by 8922.4, "
|
||||
"I get:\n\n36939 x 8922.4 = 329,511,111.6",
|
||||
},
|
||||
"finish_reason": "stop",
|
||||
"logprobs": None,
|
||||
}
|
||||
],
|
||||
"usage": {"prompt_tokens": 30, "completion_tokens": 36, "total_tokens": 66},
|
||||
}
|
||||
|
||||
_MOCK_STREAM_RESPONSE = [
|
||||
{
|
||||
"id": "chatcmpl_bb1fce87-f14e-4ae1-ac22-89facc74898a",
|
||||
"object": "chat.completion.chunk",
|
||||
"created": 1721877054,
|
||||
"model": "meta-llama-3.1-70b-instruct-072424",
|
||||
"choices": [
|
||||
{
|
||||
"index": 0,
|
||||
"delta": {"role": "assistant", "content": "36939"},
|
||||
"finish_reason": None,
|
||||
"logprobs": None,
|
||||
}
|
||||
],
|
||||
"usage": {"prompt_tokens": 30, "completion_tokens": 20, "total_tokens": 50},
|
||||
},
|
||||
{
|
||||
"id": "chatcmpl_bb1fce87-f14e-4ae1-ac22-89facc74898a",
|
||||
"object": "chat.completion.chunk",
|
||||
"created": 1721877054,
|
||||
"model": "meta-llama-3.1-70b-instruct-072424",
|
||||
"choices": [
|
||||
{
|
||||
"index": 0,
|
||||
"delta": {"role": "assistant", "content": "x"},
|
||||
"finish_reason": None,
|
||||
"logprobs": None,
|
||||
}
|
||||
],
|
||||
"usage": {"prompt_tokens": 30, "completion_tokens": 22, "total_tokens": 52},
|
||||
},
|
||||
{
|
||||
"id": "chatcmpl_bb1fce87-f14e-4ae1-ac22-89facc74898a",
|
||||
"object": "chat.completion.chunk",
|
||||
"created": 1721877054,
|
||||
"model": "meta-llama-3.1-70b-instruct-072424",
|
||||
"choices": [
|
||||
{
|
||||
"index": 0,
|
||||
"delta": {"role": "assistant", "content": "8922.4"},
|
||||
"finish_reason": None,
|
||||
"logprobs": None,
|
||||
}
|
||||
],
|
||||
"usage": {"prompt_tokens": 30, "completion_tokens": 24, "total_tokens": 54},
|
||||
},
|
||||
{
|
||||
"id": "chatcmpl_bb1fce87-f14e-4ae1-ac22-89facc74898a",
|
||||
"object": "chat.completion.chunk",
|
||||
"created": 1721877054,
|
||||
"model": "meta-llama-3.1-70b-instruct-072424",
|
||||
"choices": [
|
||||
{
|
||||
"index": 0,
|
||||
"delta": {"role": "assistant", "content": " = "},
|
||||
"finish_reason": None,
|
||||
"logprobs": None,
|
||||
}
|
||||
],
|
||||
"usage": {"prompt_tokens": 30, "completion_tokens": 28, "total_tokens": 58},
|
||||
},
|
||||
{
|
||||
"id": "chatcmpl_bb1fce87-f14e-4ae1-ac22-89facc74898a",
|
||||
"object": "chat.completion.chunk",
|
||||
"created": 1721877054,
|
||||
"model": "meta-llama-3.1-70b-instruct-072424",
|
||||
"choices": [
|
||||
{
|
||||
"index": 0,
|
||||
"delta": {"role": "assistant", "content": "329,511,111.6"},
|
||||
"finish_reason": None,
|
||||
"logprobs": None,
|
||||
}
|
||||
],
|
||||
"usage": {"prompt_tokens": 30, "completion_tokens": 30, "total_tokens": 60},
|
||||
},
|
||||
{
|
||||
"id": "chatcmpl_bb1fce87-f14e-4ae1-ac22-89facc74898a",
|
||||
"object": "chat.completion.chunk",
|
||||
"created": 1721877054,
|
||||
"model": "meta-llama-3.1-70b-instruct-072424",
|
||||
"choices": [
|
||||
{
|
||||
"index": 0,
|
||||
"delta": {"role": "assistant", "content": ""},
|
||||
"finish_reason": "stop",
|
||||
"logprobs": None,
|
||||
}
|
||||
],
|
||||
"usage": {"prompt_tokens": 30, "completion_tokens": 36, "total_tokens": 66},
|
||||
},
|
||||
]
|
||||
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def mock_client() -> Generator:
|
||||
client = mock.MagicMock()
|
||||
client.predict.return_value = _MOCK_CHAT_RESPONSE
|
||||
client.predict_stream.return_value = _MOCK_STREAM_RESPONSE
|
||||
with mock.patch("mlflow.deployments.get_deploy_client", return_value=client):
|
||||
yield
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def llm() -> ChatDatabricks:
|
||||
return ChatDatabricks(
|
||||
endpoint="databricks-meta-llama-3-70b-instruct", target_uri="databricks"
|
||||
)
|
||||
|
||||
|
||||
def test_chat_mlflow_predict(llm: ChatDatabricks) -> None:
|
||||
res = llm.invoke(
|
||||
[
|
||||
{"role": "system", "content": "You are a helpful assistant."},
|
||||
{"role": "user", "content": "36939 * 8922.4"},
|
||||
]
|
||||
)
|
||||
assert res.content == _MOCK_CHAT_RESPONSE["choices"][0]["message"]["content"] # type: ignore[index]
|
||||
|
||||
|
||||
def test_chat_mlflow_stream(llm: ChatDatabricks) -> None:
|
||||
res = llm.stream(
|
||||
[
|
||||
{"role": "system", "content": "You are a helpful assistant."},
|
||||
{"role": "user", "content": "36939 * 8922.4"},
|
||||
]
|
||||
)
|
||||
for chunk, expected in zip(res, _MOCK_STREAM_RESPONSE):
|
||||
assert chunk.content == expected["choices"][0]["delta"]["content"] # type: ignore[index]
|
||||
|
||||
|
||||
def test_chat_mlflow_bind_tools(llm: ChatDatabricks) -> None:
|
||||
class GetWeather(BaseModel):
|
||||
"""Get the current weather in a given location"""
|
||||
|
||||
location: str = Field(
|
||||
..., description="The city and state, e.g. San Francisco, CA"
|
||||
)
|
||||
|
||||
class GetPopulation(BaseModel):
|
||||
"""Get the current population in a given location"""
|
||||
|
||||
location: str = Field(
|
||||
..., description="The city and state, e.g. San Francisco, CA"
|
||||
)
|
||||
|
||||
llm_with_tools = llm.bind_tools([GetWeather, GetPopulation])
|
||||
response = llm_with_tools.invoke(
|
||||
"Which city is hotter today and which is bigger: LA or NY?"
|
||||
)
|
||||
assert isinstance(response, AIMessage)
|
||||
|
||||
|
||||
### Test data conversion functions ###
|
||||
|
||||
|
||||
@pytest.mark.parametrize(
|
||||
("role", "expected_output"),
|
||||
[
|
||||
("user", HumanMessage("foo")),
|
||||
("system", SystemMessage("foo")),
|
||||
("assistant", AIMessage("foo")),
|
||||
("any_role", ChatMessage(content="foo", role="any_role")),
|
||||
],
|
||||
)
|
||||
def test_convert_message(role: str, expected_output: BaseMessage) -> None:
|
||||
message = {"role": role, "content": "foo"}
|
||||
result = _convert_dict_to_message(message)
|
||||
assert result == expected_output
|
||||
|
||||
# convert back
|
||||
dict_result = _convert_message_to_dict(result)
|
||||
assert dict_result == message
|
||||
|
||||
|
||||
def test_convert_message_with_tool_calls() -> None:
|
||||
ID = "call_fb5f5e1a-bac0-4422-95e9-d06e6022ad12"
|
||||
tool_calls = [
|
||||
{
|
||||
"id": ID,
|
||||
"type": "function",
|
||||
"function": {
|
||||
"name": "main__test__python_exec",
|
||||
"arguments": '{"code": "result = 36939 * 8922.4"}',
|
||||
},
|
||||
}
|
||||
]
|
||||
message_with_tools = {
|
||||
"role": "assistant",
|
||||
"content": None,
|
||||
"tool_calls": tool_calls,
|
||||
"id": ID,
|
||||
}
|
||||
result = _convert_dict_to_message(message_with_tools)
|
||||
expected_output = AIMessage(
|
||||
content="",
|
||||
additional_kwargs={"tool_calls": tool_calls},
|
||||
id=ID,
|
||||
tool_calls=[
|
||||
{
|
||||
"name": tool_calls[0]["function"]["name"], # type: ignore[index]
|
||||
"args": json.loads(tool_calls[0]["function"]["arguments"]), # type: ignore[index]
|
||||
"id": ID,
|
||||
"type": "tool_call",
|
||||
}
|
||||
],
|
||||
)
|
||||
assert result == expected_output
|
||||
|
||||
# convert back
|
||||
dict_result = _convert_message_to_dict(result)
|
||||
assert dict_result == message_with_tools
|
||||
|
||||
|
||||
@pytest.mark.parametrize(
|
||||
("role", "expected_output"),
|
||||
[
|
||||
("user", HumanMessageChunk(content="foo")),
|
||||
("system", SystemMessageChunk(content="foo")),
|
||||
("assistant", AIMessageChunk(content="foo")),
|
||||
("any_role", ChatMessageChunk(content="foo", role="any_role")),
|
||||
],
|
||||
)
|
||||
def test_convert_message_chunk(role: str, expected_output: BaseMessage) -> None:
|
||||
delta = {"role": role, "content": "foo"}
|
||||
result = _convert_dict_to_message_chunk(delta, "default_role")
|
||||
assert result == expected_output
|
||||
|
||||
# convert back
|
||||
dict_result = _convert_message_to_dict(result)
|
||||
assert dict_result == delta
|
||||
|
||||
|
||||
def test_convert_message_chunk_with_tool_calls() -> None:
|
||||
delta_with_tools = {
|
||||
"role": "assistant",
|
||||
"content": None,
|
||||
"tool_calls": [{"index": 0, "function": {"arguments": " }"}}],
|
||||
}
|
||||
result = _convert_dict_to_message_chunk(delta_with_tools, "role")
|
||||
expected_output = AIMessageChunk(
|
||||
content="",
|
||||
additional_kwargs={"tool_calls": delta_with_tools["tool_calls"]},
|
||||
id=None,
|
||||
tool_call_chunks=[ToolCallChunk(name=None, args=" }", id=None, index=0)],
|
||||
)
|
||||
assert result == expected_output
|
||||
|
||||
|
||||
def test_convert_tool_message_chunk() -> None:
|
||||
delta = {
|
||||
"role": "tool",
|
||||
"content": "foo",
|
||||
"tool_call_id": "tool_call_id",
|
||||
"id": "some_id",
|
||||
}
|
||||
result = _convert_dict_to_message_chunk(delta, "default_role")
|
||||
expected_output = ToolMessageChunk(
|
||||
content="foo", id="some_id", tool_call_id="tool_call_id"
|
||||
)
|
||||
assert result == expected_output
|
||||
|
||||
# convert back
|
||||
dict_result = _convert_message_to_dict(result)
|
||||
assert dict_result == delta
|
||||
|
||||
|
||||
def test_convert_message_to_dict_function() -> None:
|
||||
with pytest.raises(ValueError, match="Function messages are not supported"):
|
||||
_convert_message_to_dict(FunctionMessage(content="", name="name"))
|
||||
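For orientation, here is a minimal, hedged sketch of the un-mocked usage these fixtures and tests simulate. The endpoint name and `target_uri` mirror the `llm` fixture above; an actual Databricks workspace with that serving endpoint and configured credentials is assumed, and real output will differ from the mocked responses.

```python
# Hedged sketch: real (un-mocked) use of ChatDatabricks, mirroring the test fixture above.
# Assumes Databricks credentials are configured and the named serving endpoint exists.
from langchain_databricks import ChatDatabricks

llm = ChatDatabricks(
    endpoint="databricks-meta-llama-3-70b-instruct", target_uri="databricks"
)

# Streaming, as exercised by test_chat_mlflow_stream.
for chunk in llm.stream(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "36939 * 8922.4"},
    ]
):
    print(chunk.content, end="", flush=True)
```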
10  libs/partners/databricks/tests/unit_tests/test_imports.py  Normal file

@@ -0,0 +1,10 @@
from langchain_databricks import __all__

EXPECTED_ALL = [
    "ChatDatabricks",
    "__version__",
]


def test_all_imports() -> None:
    assert sorted(EXPECTED_ALL) == sorted(__all__)
@@ -68,6 +68,7 @@ class MongoDBChatMessageHistory(BaseChatMessageHistory):
        session_id_key: str = DEFAULT_SESSION_ID_KEY,
        history_key: str = DEFAULT_HISTORY_KEY,
        create_index: bool = True,
        history_size: Optional[int] = None,
        index_kwargs: Optional[Dict] = None,
    ):
        """Initialize with a MongoDBChatMessageHistory instance.
@@ -88,6 +89,8 @@ class MongoDBChatMessageHistory(BaseChatMessageHistory):
                name of the field that stores the chat history.
            create_index: Optional[bool]
                whether to create an index on the session id field.
            history_size: Optional[int]
                count of (most recent) messages to fetch from MongoDB.
            index_kwargs: Optional[Dict]
                additional keyword arguments to pass to the index creation.
        """
@@ -97,6 +100,7 @@ class MongoDBChatMessageHistory(BaseChatMessageHistory):
        self.collection_name = collection_name
        self.session_id_key = session_id_key
        self.history_key = history_key
        self.history_size = history_size

        try:
            self.client: MongoClient = MongoClient(connection_string)
@@ -114,7 +118,15 @@ class MongoDBChatMessageHistory(BaseChatMessageHistory):
    def messages(self) -> List[BaseMessage]:  # type: ignore
        """Retrieve the messages from MongoDB"""
        try:
            cursor = self.collection.find({self.session_id_key: self.session_id})
            if self.history_size is None:
                cursor = self.collection.find({self.session_id_key: self.session_id})
            else:
                skip_count = max(
                    0, self.collection.count_documents({}) - self.history_size
                )
                cursor = self.collection.find(
                    {self.session_id_key: self.session_id}, skip=skip_count
                )
        except errors.OperationFailure as error:
            logger.error(error)

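As an aside, a minimal sketch of how the `history_size` option documented above might be used. The import path assumes the `langchain-mongodb` partner package that provides `MongoDBChatMessageHistory`; the connection string, database/collection names, and session id are placeholder values, not taken from the diff.

```python
# Hedged sketch: limiting retrieved history with the new history_size parameter.
# Connection details and names below are placeholders.
from langchain_mongodb import MongoDBChatMessageHistory

history = MongoDBChatMessageHistory(
    connection_string="mongodb://localhost:27017",
    session_id="example-session",
    database_name="chat_db",
    collection_name="chat_histories",
    history_size=10,  # fetch only the 10 most recent messages
)

history.add_user_message("hello")
history.add_ai_message("hi there")
print(history.messages)  # at most the 10 most recent messages for this session
```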
@@ -16,6 +16,7 @@ class PatchedMongoDBChatMessageHistory(MongoDBChatMessageHistory):
        self.collection = MockCollection()
        self.session_id_key = "SessionId"
        self.history_key = "History"
        self.history_size = None


def test_memory_with_message_store() -> None:

@@ -1120,7 +1120,6 @@ class BaseChatOpenAI(BaseChatModel):
        Args:
            schema:
                The output schema. Can be passed in as:

                - an OpenAI function/tool schema,
                - a JSON Schema,
                - a TypedDict class (support added in 0.1.20),
@@ -1138,7 +1137,6 @@ class BaseChatOpenAI(BaseChatModel):

            method:
                The method for steering model generation, one of:

                - "function_calling":
                    Uses OpenAI's tool-calling (formerly called function calling)
                    API: https://platform.openai.com/docs/guides/function-calling
@@ -1156,8 +1154,8 @@ class BaseChatOpenAI(BaseChatModel):
                Learn more about the differences between the methods and which models
                support which methods here:

                - https://platform.openai.com/docs/guides/structured-outputs/structured-outputs-vs-json-mode
                - https://platform.openai.com/docs/guides/structured-outputs/function-calling-vs-response-format
                - https://platform.openai.com/docs/guides/structured-outputs/structured-outputs-vs-json-mode
                - https://platform.openai.com/docs/guides/structured-outputs/function-calling-vs-response-format

        .. versionchanged:: 0.1.21

@@ -1200,26 +1198,22 @@ class BaseChatOpenAI(BaseChatModel):
        Returns:
            A Runnable that takes same inputs as a :class:`langchain_core.language_models.chat.BaseChatModel`.

            If ``include_raw`` is False and ``schema`` is a Pydantic class, Runnable outputs
            an instance of ``schema`` (i.e., a Pydantic object).
            | If ``include_raw`` is False and ``schema`` is a Pydantic class, Runnable outputs an instance of ``schema`` (i.e., a Pydantic object). Otherwise, if ``include_raw`` is False then Runnable outputs a dict.

            Otherwise, if ``include_raw`` is False then Runnable outputs a dict.
            | If ``include_raw`` is True, then Runnable outputs a dict with keys:

            If ``include_raw`` is True, then Runnable outputs a dict with keys:
            - "raw": BaseMessage
            - "parsed": None if there was a parsing error, otherwise the type depends on the ``schema`` as described above.
            - "parsing_error": Optional[BaseException]

            - "raw": BaseMessage
            - "parsed": None if there was a parsing error, otherwise the type depends on the ``schema`` as described above.
            - "parsing_error": Optional[BaseException]

        .. dropdown:: Example: schema=Pydantic class, method="function_calling", include_raw=False, strict=True

        Example: schema=Pydantic class, method="function_calling", include_raw=False, strict=True:
            .. note:: Valid schemas when using ``strict`` = True
            Note, OpenAI has a number of restrictions on what types of schemas can be
            provided if ``strict`` = True. When using Pydantic, our model cannot
            specify any Field metadata (like min/max constraints) and fields cannot
            have default values.

                OpenAI has a number of restrictions on what types of schemas can be
                provided if ``strict`` = True. When using Pydantic, our model cannot
                specify any Field metadata (like min/max constraints) and fields cannot
                have default values.

            See all constraints here: https://platform.openai.com/docs/guides/structured-outputs/supported-schemas
                See all constraints here: https://platform.openai.com/docs/guides/structured-outputs/supported-schemas

            .. code-block:: python

@@ -1252,7 +1246,8 @@ class BaseChatOpenAI(BaseChatModel):
                # justification='Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ.'
                # )

        Example: schema=Pydantic class, method="function_calling", include_raw=True:
        .. dropdown:: Example: schema=Pydantic class, method="function_calling", include_raw=True

            .. code-block:: python

                from langchain_openai import ChatOpenAI
@@ -1280,7 +1275,8 @@ class BaseChatOpenAI(BaseChatModel):
                # 'parsing_error': None
                # }

        Example: schema=TypedDict class, method="function_calling", include_raw=False:
        .. dropdown:: Example: schema=TypedDict class, method="function_calling", include_raw=False

            .. code-block:: python

                # IMPORTANT: If you are using Python <=3.8, you need to import Annotated
@@ -1310,7 +1306,8 @@ class BaseChatOpenAI(BaseChatModel):
                # 'justification': 'Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume and density of the two substances differ.'
                # }

        Example: schema=OpenAI function schema, method="function_calling", include_raw=False:
        .. dropdown:: Example: schema=OpenAI function schema, method="function_calling", include_raw=False

            .. code-block:: python

                from langchain_openai import ChatOpenAI
@@ -1339,7 +1336,8 @@ class BaseChatOpenAI(BaseChatModel):
                # 'justification': 'Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume and density of the two substances differ.'
                # }

        Example: schema=Pydantic class, method="json_mode", include_raw=True:
        .. dropdown:: Example: schema=Pydantic class, method="json_mode", include_raw=True

            .. code-block::

                from langchain_openai import ChatOpenAI
@@ -1367,7 +1365,8 @@ class BaseChatOpenAI(BaseChatModel):
                # 'parsing_error': None
                # }

        Example: schema=None, method="json_mode", include_raw=True:
        .. dropdown:: Example: schema=None, method="json_mode", include_raw=True

            .. code-block::

                structured_llm = llm.with_structured_output(method="json_mode", include_raw=True)

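To tie together the ``include_raw`` behaviour described in the docstring above, here is a short, hedged sketch. The schema and question mirror the ``AnswerWithJustification`` example used in the docstring; the model name is illustrative, and depending on your langchain-openai version a pydantic v1 model (via ``langchain_core.pydantic_v1``) may be required instead of plain pydantic.

```python
# Hedged sketch of include_raw=True, following the docstring above.
# Model name is illustrative; any tool-calling-capable OpenAI chat model works.
from langchain_openai import ChatOpenAI
from pydantic import BaseModel


class AnswerWithJustification(BaseModel):
    """An answer to the user question along with justification for the answer."""

    answer: str
    justification: str


llm = ChatOpenAI(model="gpt-4o", temperature=0)
structured_llm = llm.with_structured_output(
    AnswerWithJustification, method="function_calling", include_raw=True
)

result = structured_llm.invoke(
    "What weighs more a pound of bricks or a pound of feathers"
)
# result["raw"] is the underlying AIMessage,
# result["parsed"] is an AnswerWithJustification instance (or None on a parsing error),
# result["parsing_error"] is None unless parsing failed.
```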
1792  libs/partners/unstructured/poetry.lock  generated
File diff suppressed because it is too large
@@ -1,6 +1,6 @@
[tool.poetry]
name = "langchain-unstructured"
version = "0.1.1"
version = "0.1.2"
description = "An integration package connecting Unstructured and LangChain"
authors = []
readme = "README.md"
@@ -15,7 +15,7 @@ license = "MIT"
python = ">=3.9,<4.0"
langchain-core = "^0.2.23"
unstructured-client = { version = "^0.24.1" }
unstructured = { version = "^0.15.0", optional = true, python = "<3.13", extras = [
unstructured = { version = "^0.15.7", optional = true, python = "<3.13", extras = [
    "all-docs",
] }

@@ -50,7 +50,7 @@ ruff = "^0.1.8"

[tool.poetry.group.typing.dependencies]
mypy = "^1.7.1"
unstructured = { version = "^0.15.0", python = "<3.13", extras = ["all-docs"] }
unstructured = { version = "^0.15.7", python = "<3.13", extras = ["all-docs"] }
langchain-core = { path = "../../core", develop = true }

[tool.poetry.group.dev]