Compare commits


35 Commits

Author SHA1 Message Date
Tat Dat Duong
adb2889c80 Again trigger 2023-11-18 01:04:23 +01:00
Tat Dat Duong
3940676346 Remove old patch 2023-11-18 01:02:27 +01:00
Tat Dat Duong
760178a6b9 Revert "Trigger a rebuild"
This reverts commit 0647e4248b.
2023-11-18 00:57:19 +01:00
Tat Dat Duong
6b1dc45d8f Add other breakpoint patches 2023-11-18 00:57:01 +01:00
Tat Dat Duong
0647e4248b Trigger a rebuild 2023-11-17 15:34:03 +01:00
Tat Dat Duong
e7d552769b Expand the desktop threshold width 2023-11-17 15:29:13 +01:00
Bagatur
b4312aac5c TEMPLATES: Add multi-index templates (#13490)
One that routes and one that fuses

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-17 02:00:11 -08:00
Hugues Chocart
35e04f204b [LLMonitorCallbackHandler] Various improvements (#13151)
Small improvements for the llmonitor callback handler, such as better
support for non-OpenAI models.


---------

Co-authored-by: vincelwt <vince@lyser.io>
2023-11-16 23:39:36 -08:00
Noah Stapp
c1b041c188 Add Wrapping Library Metadata to MongoDB vector store (#13084)
**Description**
MongoDB drivers are used in various flavors and languages. Exercising
due diligence in identifying the "origin" of library calls helps us
understand how our Atlas servers get accessed.
2023-11-16 22:20:04 -08:00
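For reference, PyMongo's mechanism for this kind of origin tagging looks roughly like the sketch below; how the PR wires it into the vector store is not shown in this message, so treat the details as illustrative.

```python
from pymongo import MongoClient
from pymongo.driver_info import DriverInfo

# Tagging the client with wrapping-library metadata lets Atlas attribute
# incoming calls to LangChain rather than to bare PyMongo. The version
# string here is a placeholder, not taken from the PR.
client = MongoClient(
    "mongodb+srv://user:pass@cluster.example.mongodb.net",
    driver=DriverInfo(name="Langchain", version="0.0.x"),
)
```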
Leonid Ganeline
21552628c8 DOCS updated data_connection index page (#13426)
- the `Index` section was missing. Created it.
- text simplification

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-16 18:16:50 -08:00
Guy Korland
7f8fd70ac4 Add optional arguments to FalkorDBGraph constructor (#13459)
**Description:** Add optional arguments to FalkorDBGraph constructor
**Tag maintainer:** baskaryan 
**Twitter handle:** @g_korland
2023-11-16 18:15:40 -08:00
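A sketch of the kind of call this enables; the specific keyword names below are assumptions for illustration, not taken from the diff.

```python
from langchain.graphs import FalkorDBGraph

# Optional connection arguments; names such as host, port, username, and
# password are illustrative. Check the FalkorDBGraph signature for the
# exact set added by this PR.
graph = FalkorDBGraph(
    database="movies",
    host="localhost",
    port=6379,
    username=None,
    password=None,
)
```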
Leonid Ganeline
e3a5cd7969 docs integrations/vectorstores/ cleanup (#13487)
- updated titles to consistent format
- added/updated descriptions and links
- format heading
2023-11-16 17:51:49 -08:00
Leonid Ganeline
1d2981114f DOCS updated async-faiss example (#13434)
The original notebook has the `faiss` title, which is duplicated in
the `faiss.ipynb`. As a result, we have two `faiss` items in the
vectorstore ToC, and the first item breaks the sorting order (it is
placed between `A...` items).
- I updated the title to `Asynchronous Faiss`.
2023-11-16 17:41:26 -08:00
Erick Friis
9dfad613c2 IMPROVEMENT Allow openai v1 in all templates that require it (#13489)
- pyproject change
- lockfiles
2023-11-16 17:10:08 -08:00
chris stucchio
d7f014cd89 Bug: OpenAIFunctionsAgentOutputParser doesn't handle functions with no args (#13467)
**Description/Issue:** 
When OpenAI calls a function with no args, the args are `""` rather than
`"{}"`. Then `json.loads("")` blows up. This PR handles it correctly.

**Dependencies:** None
2023-11-16 16:47:05 -08:00
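A minimal sketch of the guard this fix implies (a hypothetical helper, not the actual parser code):

```python
import json

def parse_function_args(raw_args: str) -> dict:
    # OpenAI may return "" instead of "{}" for a function call with no
    # arguments; json.loads("") raises JSONDecodeError, so treat empty
    # input as an empty argument dict.
    if not raw_args.strip():
        return {}
    return json.loads(raw_args)

assert parse_function_args("") == {}
assert parse_function_args('{"query": "x"}') == {"query": "x"}
```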
Yujie Qian
41a433fa33 IMPROVEMENT: add input_type to VoyageEmbeddings (#13488)
- **Description:** add input_type to VoyageEmbeddings
2023-11-16 16:35:36 -08:00
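A sketch of how the new parameter might be used, assuming it is exposed on the constructor as the title suggests; the accepted values are an assumption based on Voyage AI's API.

```python
from langchain.embeddings import VoyageEmbeddings

# input_type is the newly added parameter; the "query" vs. "document"
# values are an assumption about what the Voyage API accepts.
embeddings = VoyageEmbeddings(voyage_api_key="...", input_type="query")
vector = embeddings.embed_query("What is LangChain?")
```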
David Duong
ea6e017b85 Add serialisation arguments to Bedrock and ChatBedrock (#13465) 2023-11-17 01:33:24 +01:00
Erick Friis
427331d621 IMPROVEMENT Lock pydantic v1 in app template, cli 0.0.18 (#13485) 2023-11-16 15:22:11 -08:00
Erick Friis
75363f048f BUG Fix app_name in cli app new (#13482) 2023-11-16 14:19:35 -08:00
Leonid Ganeline
9ff8f69e75 DOCS updated memory Titles (#13435)
- Fixed titles for two notebooks. They were inconsistent with other
titles and clogged the ToC.
- Added `Upstash` description and link
- Moved the authentication text up in the `Elasticsearch` nb, right
after package installation. It was at the end of the page, which was the
wrong place.
2023-11-16 13:24:05 -08:00
ifduyue
324ab382ad Use List instead of list (#13443)
Unify `List` usage in libs/langchain/langchain/text_splitter.py; only one
place used `list`, all other occurrences are `List`
2023-11-16 13:15:58 -08:00
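For context, a minimal illustration of the convention being unified (illustrative, not the actual splitter code):

```python
from typing import List

# Consistent with the rest of text_splitter.py: annotate with typing.List
# rather than the builtin list.
def split_text(text: str) -> List[str]:
    return text.split()
```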
Stefano Lottini
b029d9f4e6 Astra DB: minor improvements to docstrings and demo notebook (#13449)
This PR brings a few minor improvements to the docs, namely class/method
docstrings and the demo notebook.

- A note on how to control concurrency levels to tune performance in
bulk inserts, both in the class docstring and the demo notebook;
- Slightly increased concurrency defaults after careful experimentation
(still on the conservative side even for clients running on
less-than-typical network/hardware specs)
- renamed the DB token variable to the standardized
`ASTRA_DB_APPLICATION_TOKEN` name (used elsewhere, e.g. in the Astra DB
docs)
- added a note and a reference (add_text docstring, demo notebook) on
allowed metadata field names.

Thank you!
2023-11-16 12:48:32 -08:00
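A hedged sketch of the tuning described above, using the standardized token variable. The concurrency keyword shown follows the `*_concurrency` naming mentioned in the notebook, but check the class constructor for the exact parameters.

```python
import os

from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import AstraDB

vstore = AstraDB(
    embedding=OpenAIEmbeddings(),
    collection_name="astra_vector_demo",
    api_endpoint=os.environ["ASTRA_DB_API_ENDPOINT"],
    # The standardized token variable name introduced by this PR.
    token=os.environ["ASTRA_DB_APPLICATION_TOKEN"],
    # One of the *_concurrency knobs for bulk inserts; the name and value
    # here are illustrative.
    bulk_insert_batch_concurrency=10,
)
```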
Eugene Yurtsev
1e43fd6afe Add ahandle_event to __all__ (#13469)
Add ahandle_event for backwards compatibility as it is used by langserve
2023-11-16 12:46:20 -08:00
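In practice this preserves an import of the following shape (module path assumed from where `handle_event` lives):

```python
# langserve imports ahandle_event directly; exporting it via __all__
# keeps this import working after the callback refactor.
from langchain.callbacks.manager import ahandle_event
```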
Leonid Ganeline
283ef1f66d DOCS fix for integratons/document_loaders sidebar (#13471)
The current `integrations/document_loaders/` sidebar has the
`example_data` item, which is a menu with a single item: "Notebook".
This happens because the `integrations/document_loaders/` folder has
the `example_data/notebook.md` file that is used to autogenerate the
above menu item.
- Removed the `example_data/notebook.md` file. Docusaurus doesn't have
a simple way to fix this problem (to exclude folders/files from an
autogenerated sidebar). Removing this file didn't break any existing
examples, so this fix is safe.
2023-11-16 12:02:30 -08:00
Leonid Ganeline
b1fcf5b481 DOCS: integrations/text_embeddings/ cleanup (#13476)
Updated several notebooks:
- fixed titles that were inconsistent or broke the ToC sorting order
- added missing source descriptions and links
- fixed formatting
2023-11-16 11:56:53 -08:00
Bagatur
6030ab9779 Update chain of note README.md (#13473) 2023-11-16 10:47:27 -08:00
Lance Martin
cf66a4737d Update multi-modal RAG cookbook (#13429)
Use example
[blog](https://cloudedjudgement.substack.com/p/clouded-judgement-111023)
w/ tables, charts as images.
2023-11-16 10:34:13 -08:00
Bagatur
10fddac4b5 Bagatur/chain of note template (#13470) 2023-11-16 10:34:04 -08:00
Leonid Ganeline
d5b1a21ae4 DOCS updated semadb example (#13431)
- the `SemaDB` notebook was placed in an additional subfolder, which breaks
the vectorstore ToC. I moved the file up, removed the unnecessary
subfolder, and updated `vercel.json` with rerouting for the new URL
- Added SemaDB description and link
- improved text consistency
2023-11-16 09:57:22 -08:00
Leonid Ganeline
17c2007e0c DOCS updated Activeloop DeepMemory notebook (#13428)
- Fixed the title of the notebook. It created an ugly ToC element as
`Activeloop DeepLake's DeepMemory + LangChain + ragas or how to get +27%
on RAG recall.`
- Added Activeloop description
- improved consistency in text
- fixed the ToC (it was using HTML tags that broke the left-side in-page
ToC). Now the in-page ToC works
2023-11-16 09:56:28 -08:00
Harrison Chase
f90249305a callback refactor (#13372)
Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-11-16 08:25:09 -08:00
Bagatur
9e6748e198 DOCS: rag nit (#13436) 2023-11-15 18:06:52 -08:00
Leonid Ganeline
8a52c1456b updated clickup example (#13424)
- Fixed headers (there was more than one title)
- Removed the security token value. It was OK to have it, because it is a
temporary token, but automatic security sweepers raise warnings about
it.
- Added `ClickUp` service description and link.
2023-11-15 15:11:24 -08:00
Brace Sproul
79fa9a81f4 Fix a link in docs (#13423) 2023-11-15 15:02:26 -08:00
Nuno Campos
a632f61f3d IMPROVEMENT pirate-speak-configurable alternatives env vars (#13395)
…rnative LLMs until used

2023-11-15 14:38:03 -08:00
210 changed files with 19920 additions and 6432 deletions

File diff suppressed because one or more lines are too long

View File

@@ -50,4 +50,4 @@ Here’s where our team hangs out, talks shop, spotlights cool work, and shares
- **[Twitter](https://twitter.com/LangChainAI):** We post about what we’re working on and what cool things we’re seeing in the space. If you tag @langchainai in your post, we’ll almost certainly see it, and can show you some love!
- **[Discord](https://discord.gg/6adMQxSpJS):** connect with over 30,000 developers who are building with LangChain.
- **[GitHub](https://github.com/langchain-ai/langchain):** Open pull requests, contribute to a discussion, and/or contribute
- **[Subscribe to our bi-weekly Release Notes](https://6w1pwbss0py.typeform.com/to/KjZB1auB):** a twice/month email roundup of the coolest things going on in our orbit

View File

@@ -49,7 +49,7 @@ LCEL is a declarative way to compose chains. LCEL was designed from day 1 to sup
- **[Overview](/docs/expression_language/)**: LCEL and its benefits
- **[Interface](/docs/expression_language/interface)**: The standard interface for LCEL objects
- **[How-to](/docs/expression_language/interface)**: Key features of LCEL
- **[How-to](/docs/expression_language/how_to)**: Key features of LCEL
- **[Cookbook](/docs/expression_language/cookbook)**: Example code for accomplishing common tasks

View File

@@ -1,29 +0,0 @@
# Notebook
This notebook covers how to load data from an .ipynb notebook into a format suitable for LangChain.
```python
from langchain.document_loaders import NotebookLoader
```
```python
loader = NotebookLoader("example_data/notebook.ipynb")
```
`NotebookLoader.load()` loads the `.ipynb` notebook file into a `Document` object.
**Parameters**:
* `include_outputs` (bool): whether to include cell outputs in the resulting document (default is False).
* `max_output_length` (int): the maximum number of characters to include from each cell output (default is 10).
* `remove_newline` (bool): whether to remove newline characters from the cell sources and outputs (default is False).
* `traceback` (bool): whether to include full traceback (default is False).
```python
loader.load(include_outputs=True, max_output_length=20, remove_newline=True)
```

View File

@@ -101,8 +101,8 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain.chains import LLMChain"
"from langchain.chains import LLMChain\n",
"from langchain.prompts import PromptTemplate"
]
},
{

View File

@@ -7,11 +7,11 @@
"id": "683953b3"
},
"source": [
"# Elasticsearch Chat Message History\n",
"# Elasticsearch\n",
"\n",
">[Elasticsearch](https://www.elastic.co/elasticsearch/) is a distributed, RESTful search and analytics engine, capable of performing both vector and lexical search. It is built on top of the Apache Lucene library.\n",
"\n",
"This notebook shows how to use chat message history functionality with Elasticsearch."
"This notebook shows how to use chat message history functionality with `Elasticsearch`."
]
},
{
@@ -46,6 +46,59 @@
"%pip install elasticsearch langchain"
]
},
{
"cell_type": "markdown",
"id": "c46c216c",
"metadata": {},
"source": [
"## Authentication\n",
"\n",
"### How to obtain a password for the default \"elastic\" user\n",
"\n",
"To obtain your Elastic Cloud password for the default \"elastic\" user:\n",
"1. Log in to the [Elastic Cloud console](https://cloud.elastic.co)\n",
"2. Go to \"Security\" > \"Users\"\n",
"3. Locate the \"elastic\" user and click \"Edit\"\n",
"4. Click \"Reset password\"\n",
"5. Follow the prompts to reset the password\n",
"\n",
"\n",
"### Use the Username/password\n",
"\n",
"```python\n",
"es_username = os.environ.get(\"ES_USERNAME\", \"elastic\")\n",
"es_password = os.environ.get(\"ES_PASSWORD\", \"change me...\")\n",
"\n",
"history = ElasticsearchChatMessageHistory(\n",
" es_url=es_url,\n",
" es_user=es_username,\n",
" es_password=es_password,\n",
" index=\"test-history\",\n",
" session_id=\"test-session\"\n",
")\n",
"```\n",
"\n",
"### How to obtain an API key\n",
"\n",
"To obtain an API key:\n",
"1. Log in to the [Elastic Cloud console](https://cloud.elastic.co)\n",
"2. Open `Kibana` and go to Stack Management > API Keys\n",
"3. Click \"Create API key\"\n",
"4. Enter a name for the API key and click \"Create\"\n",
"\n",
"### Use the API key\n",
"\n",
"```python\n",
"es_api_key = os.environ.get(\"ES_API_KEY\")\n",
"\n",
"history = ElasticsearchChatMessageHistory(\n",
" es_api_key=es_api_key,\n",
" index=\"test-history\",\n",
" session_id=\"test-session\"\n",
")\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "8be8fcc3",
@@ -104,58 +157,6 @@
"history.add_user_message(\"hi!\")\n",
"history.add_ai_message(\"whats up?\")"
]
},
{
"cell_type": "markdown",
"id": "c46c216c",
"metadata": {},
"source": [
"# Authentication\n",
"\n",
"## Username/password\n",
"\n",
"```python\n",
"es_username = os.environ.get(\"ES_USERNAME\", \"elastic\")\n",
"es_password = os.environ.get(\"ES_PASSWORD\", \"changeme\")\n",
"\n",
"history = ElasticsearchChatMessageHistory(\n",
" es_url=es_url,\n",
" es_user=es_username,\n",
" es_password=es_password,\n",
" index=\"test-history\",\n",
" session_id=\"test-session\"\n",
")\n",
"```\n",
"\n",
"### How to obtain a password for the default \"elastic\" user\n",
"\n",
"To obtain your Elastic Cloud password for the default \"elastic\" user:\n",
"1. Log in to the Elastic Cloud console at https://cloud.elastic.co\n",
"2. Go to \"Security\" > \"Users\"\n",
"3. Locate the \"elastic\" user and click \"Edit\"\n",
"4. Click \"Reset password\"\n",
"5. Follow the prompts to reset the password\n",
"\n",
"## API key\n",
"\n",
"```python\n",
"es_api_key = os.environ.get(\"ES_API_KEY\")\n",
"\n",
"history = ElasticsearchChatMessageHistory(\n",
" es_api_key=es_api_key,\n",
" index=\"test-history\",\n",
" session_id=\"test-session\"\n",
")\n",
"```\n",
"\n",
"### How to obtain an API key\n",
"\n",
"To obtain an API key:\n",
"1. Log in to the Elastic Cloud console at https://cloud.elastic.co\n",
"2. Open Kibana and go to Stack Management > API Keys\n",
"3. Click \"Create API key\"\n",
"4. Enter a name for the API key and click \"Create\""
]
}
],
"metadata": {
@@ -177,7 +178,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@@ -4,9 +4,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Upstash Redis Chat Message History\n",
"# Upstash Redis\n",
"\n",
"This notebook goes over how to use Upstash Redis to store chat message history."
">[Upstash](https://upstash.com/docs/introduction) is a provider of the serverless `Redis`, `Kafka`, and `QStash` APIs.\n",
"\n",
"This notebook goes over how to use `Upstash Redis` to store chat message history."
]
},
{
@@ -42,7 +44,7 @@
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -56,10 +58,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
},
"orig_nbformat": 4
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -4,14 +4,16 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Activeloop DeepLake's DeepMemory + LangChain + ragas or how to get +27% on RAG recall."
"# Activeloop Deep Memory"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Retrieval-Augmented Generators (RAGs) have recently gained significant attention. As advanced RAG techniques and agents emerge, they expand the potential of what RAGs can accomplish. However, several challenges may limit the integration of RAGs into production. The primary factors to consider when implementing RAGs in production settings are accuracy (recall), cost, and latency. For basic use cases, OpenAI's Ada model paired with a naive similarity search can produce satisfactory results. Yet, for higher accuracy or recall during searches, one might need to employ advanced retrieval techniques. These methods might involve varying data chunk sizes, rewriting queries multiple times, and more, potentially increasing latency and costs. [Activeloop's](https://activeloop.ai/) [Deep Memory](https://www.activeloop.ai/resources/use-deep-memory-to-boost-rag-apps-accuracy-by-up-to-22/) a feature available to Activeloop Deep Lake users, addresses these issuea by introducing a tiny neural network layer trained to match user queries with relevant data from a corpus. While this addition incurs minimal latency during search, it can boost retrieval accuracy by up to 27\n",
">[Activeloop Deep Memory](https://docs.activeloop.ai/performance-features/deep-memory) is a suite of tools that enables you to optimize your Vector Store for your use-case and achieve higher accuracy in your LLM apps.\n",
"\n",
"`Retrieval-Augmented Generatation` (`RAG`) has recently gained significant attention. As advanced RAG techniques and agents emerge, they expand the potential of what RAGs can accomplish. However, several challenges may limit the integration of RAGs into production. The primary factors to consider when implementing RAGs in production settings are accuracy (recall), cost, and latency. For basic use cases, OpenAI's Ada model paired with a naive similarity search can produce satisfactory results. Yet, for higher accuracy or recall during searches, one might need to employ advanced retrieval techniques. These methods might involve varying data chunk sizes, rewriting queries multiple times, and more, potentially increasing latency and costs. [Activeloop's](https://activeloop.ai/) [Deep Memory](https://www.activeloop.ai/resources/use-deep-memory-to-boost-rag-apps-accuracy-by-up-to-22/) a feature available to `Activeloop Deep Lake` users, addresses these issuea by introducing a tiny neural network layer trained to match user queries with relevant data from a corpus. While this addition incurs minimal latency during search, it can boost retrieval accuracy by up to 27\n",
"% and remains cost-effective and simple to use, without requiring any additional advanced rag techniques.\n"
]
},
@@ -19,23 +21,13 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"For this tutorial we will parse deeplake documentation, and create a RAG system that could answer the question from the docs. \n",
"\n",
"The tutorial can be divided into several parts:\n",
"1. [Dataset creation and uploading](#1-dataset-creation)\n",
"2. [Generating synthetic queries and training deep_memory](#2-generating-synthetic-queries-and-training-deep_memory)\n",
"3. [Evaluating deep memory performance](#3-evaluating-deep-memory-performance)\n",
" - 3.1 [using deepmemory recall@10 metric](#31-using-deepmemory-recall10-metric)\n",
" - 3.2 [using ragas](#32-deepmemory--ragas)\n",
" - 3.3 [deep_memory inference](#33-deepmemory-inference)\n",
" - 3.4 [deep_memory cost savings](#34-cost-savings)"
"For this tutorial we will parse `DeepLake` documentation, and create a RAG system that could answer the question from the docs. \n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"dataset-creation\"></a>\n",
"## 1. Dataset Creation"
]
},
@@ -227,10 +219,11 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"jp-MarkdownHeadingCollapsed": true
},
"source": [
"<a name=\"training\"></a>\n",
"## 2. Generating synthetic queries and training deep_memory "
"## 2. Generating synthetic queries and training Deep Memory "
]
},
{
@@ -422,8 +415,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"evaluation\"></a>\n",
"## 3. Evaluating deep memory performance"
"## 3. Evaluating Deep Memory performance"
]
},
{
@@ -437,15 +429,16 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"recall@10\"></a>\n",
"### 3.1 using deepmemory recall@10 metric"
"### 3.1 Deep Memory evaluation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For the beginning we can use deep_memory's builtin evaluation method. it can be done easily in a few lines of code:"
"For the beginning we can use deep_memory's builtin evaluation method. \n",
"It calculates several `recall` metrics.\n",
"It can be done easily in a few lines of code."
]
},
{
@@ -495,8 +488,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"ragas\"></a>\n",
"### 3.2 DeepMemory + ragas"
"### 3.2 Deep Memory + RAGas"
]
},
{
@@ -596,10 +588,11 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"jp-MarkdownHeadingCollapsed": true
},
"source": [
"<a name=\"inference\"></a>\n",
"### 3.3 DeepMemory Inference"
"### 3.3 Deep Memory Inference"
]
},
{
@@ -677,8 +670,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"cost\"></a>\n",
"### 3.4 Cost savings"
"### 3.4 Deep Memory cost savings"
]
},
{
@@ -691,7 +683,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -705,10 +697,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
},
"orig_nbformat": 4
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -4,9 +4,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# ERNIE Embedding-V1\n",
"# ERNIE\n",
"\n",
"[ERNIE Embedding-V1](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/alj562vvu) is a text representation model based on Baidu Wenxin's large-scale model technology, \n",
"[ERNIE Embedding-V1](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/alj562vvu) is a text representation model based on `Baidu Wenxin` large-scale model technology, \n",
"which converts text into a vector form represented by numerical values, and is used in text retrieval, information recommendation, knowledge mining and other scenarios."
]
},
@@ -53,8 +53,19 @@
"language": "python",
"name": "python3"
},
"orig_nbformat": 4
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -5,14 +5,14 @@
"id": "900fbd04-f6aa-4813-868f-1c54e3265385",
"metadata": {},
"source": [
"# Qdrant FastEmbed\n",
"# FastEmbed by Qdrant\n",
"\n",
"[FastEmbed](https://qdrant.github.io/fastembed/) is a lightweight, fast, Python library built for embedding generation. \n",
"\n",
"- Quantized model weights\n",
"- ONNX Runtime, no PyTorch dependency\n",
"- CPU-first design\n",
"- Data-parallelism for encoding of large datasets."
">[FastEmbed](https://qdrant.github.io/fastembed/) from [Qdrant](https://qdrant.tech) is a lightweight, fast, Python library built for embedding generation. \n",
">\n",
">- Quantized model weights\n",
">- ONNX Runtime, no PyTorch dependency\n",
">- CPU-first design\n",
">- Data-parallelism for encoding of large datasets."
]
},
{
@@ -154,7 +154,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@@ -5,8 +5,10 @@
"id": "59428e05",
"metadata": {},
"source": [
"# InstructEmbeddings\n",
"Let's load the HuggingFace instruct Embeddings class."
"# Instruct Embeddings on Hugging Face\n",
"\n",
">[Hugging Face sentence-transformers](https://huggingface.co/sentence-transformers) is a Python framework for state-of-the-art sentence, text and image embeddings.\n",
">One of the instruct embedding models is used in the `HuggingFaceInstructEmbeddings` class.\n"
]
},
{
@@ -85,7 +87,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
},
"vscode": {
"interpreter": {

View File

@@ -2,183 +2,207 @@
"cells": [
{
"cell_type": "markdown",
"source": [
"# Johnsnowlabs Embedding\n",
"\n",
"### Loading the Johnsnowlabs embedding class to generate and query embeddings\n",
"\n",
"Models are loaded with [nlp.load](https://nlp.johnsnowlabs.com/docs/en/jsl/load_api) and spark session is started with [nlp.start()](https://nlp.johnsnowlabs.com/docs/en/jsl/start-a-sparksession) under the hood.\n",
"For all 24.000+ models, see the [John Snow Labs Model Models Hub](https://nlp.johnsnowlabs.com/models)\n"
],
"metadata": {
"collapsed": false
}
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"# John Snow Labs\n",
"\n",
">[John Snow Labs](https://nlp.johnsnowlabs.com/) NLP & LLM ecosystem includes software libraries for state-of-the-art AI at scale, Responsible AI, No-Code AI, and access to over 20,000 models for Healthcare, Legal, Finance, etc.\n",
">\n",
">Models are loaded with [nlp.load](https://nlp.johnsnowlabs.com/docs/en/jsl/load_api) and spark session is started >with [nlp.start()](https://nlp.johnsnowlabs.com/docs/en/jsl/start-a-sparksession) under the hood.\n",
">For all 24.000+ models, see the [John Snow Labs Model Models Hub](https://nlp.johnsnowlabs.com/models)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"! pip install johnsnowlabs\n"
],
"metadata": {
"collapsed": false
}
"## Setting up"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"! pip install johnsnowlabs"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"# If you have a enterprise license, you can run this to install enterprise features\n",
"# from johnsnowlabs import nlp\n",
"# nlp.install()"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"source": [
"#### Import the necessary classes"
],
"metadata": {
"collapsed": false
},
"execution_count": 1,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Found existing installation: langchain 0.0.189\n",
"Uninstalling langchain-0.0.189:\n",
" Successfully uninstalled langchain-0.0.189\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [],
"metadata": {
"collapsed": false
}
"metadata": {},
"source": [
"## Example"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"from langchain.embeddings.johnsnowlabs import JohnSnowLabsEmbeddings"
],
"metadata": {
"collapsed": false
}
]
},
{
"cell_type": "markdown",
"source": [
"#### Initialize Johnsnowlabs Embeddings and Spark Session"
],
"metadata": {
"collapsed": false
}
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"Initialize Johnsnowlabs Embeddings and Spark Session"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"embedder = JohnSnowLabsEmbeddings(\"en.embed_sentence.biobert.clinical_base_cased\")"
],
"metadata": {
"collapsed": false
}
]
},
{
"cell_type": "markdown",
"source": [
"#### Define some example texts . These could be any documents that you want to analyze - for example, news articles, social media posts, or product reviews."
],
"metadata": {
"collapsed": false
}
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"Define some example texts . These could be any documents that you want to analyze - for example, news articles, social media posts, or product reviews."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"texts = [\"Cancer is caused by smoking\", \"Antibiotics aren't painkiller\"]"
],
"metadata": {
"collapsed": false
}
]
},
{
"cell_type": "markdown",
"source": [
"#### Generate and print embeddings for the texts . The JohnSnowLabsEmbeddings class generates an embedding for each document, which is a numerical representation of the document's content. These embeddings can be used for various natural language processing tasks, such as document similarity comparison or text classification."
],
"metadata": {
"collapsed": false
}
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"Generate and print embeddings for the texts . The JohnSnowLabsEmbeddings class generates an embedding for each document, which is a numerical representation of the document's content. These embeddings can be used for various natural language processing tasks, such as document similarity comparison or text classification."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"embeddings = embedder.embed_documents(texts)\n",
"for i, embedding in enumerate(embeddings):\n",
" print(f\"Embedding for document {i+1}: {embedding}\")"
],
"metadata": {
"collapsed": false
}
]
},
{
"cell_type": "markdown",
"source": [
"#### Generate and print an embedding for a single piece of text. You can also generate an embedding for a single piece of text, such as a search query. This can be useful for tasks like information retrieval, where you want to find documents that are similar to a given query."
],
"metadata": {
"collapsed": false
}
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"Generate and print an embedding for a single piece of text. You can also generate an embedding for a single piece of text, such as a search query. This can be useful for tasks like information retrieval, where you want to find documents that are similar to a given query."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
"query = \"Cancer is caused by smoking\"\n",
"query_embedding = embedder.embed_query(query)\n",
"print(f\"Embedding for query: {query_embedding}\")"
],
"metadata": {
"collapsed": false
}
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 0
"nbformat_minor": 4
}

View File

@@ -5,11 +5,13 @@
"id": "ed47bb62",
"metadata": {},
"source": [
"# Sentence Transformers\n",
"# Sentence Transformers on Hugging Face\n",
"\n",
">[SentenceTransformers](https://www.sbert.net/) embeddings are called using the `HuggingFaceEmbeddings` integration. We have also added an alias for `SentenceTransformerEmbeddings` for users who are more familiar with directly using that package.\n",
">[Hugging Face sentence-transformers](https://huggingface.co/sentence-transformers) is a Python framework for state-of-the-art sentence, text and image embeddings.\n",
">One of the embedding models is used in the `HuggingFaceEmbeddings` class.\n",
">We have also added an alias for `SentenceTransformerEmbeddings` for users who are more familiar with directly using that package.\n",
"\n",
"`SentenceTransformers` is a python package that can generate text and image embeddings, originating from [Sentence-BERT](https://arxiv.org/abs/1908.10084)"
"`sentence_transformers` package models are originating from [Sentence-BERT](https://arxiv.org/abs/1908.10084)"
]
},
{

View File

@@ -5,7 +5,11 @@
"id": "fff4734f",
"metadata": {},
"source": [
"# TensorflowHub\n",
"# TensorFlow Hub\n",
"\n",
">[TensorFlow Hub](https://www.tensorflow.org/hub) is a repository of trained machine learning models ready for fine-tuning and deployable anywhere. Reuse trained models like `BERT` and `Faster R-CNN` with just a few lines of code.\n",
">\n",
">\n",
"Let's load the TensorflowHub Embedding class."
]
},
@@ -105,7 +109,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
},
"vscode": {
"interpreter": {

View File

@@ -7,6 +7,8 @@
"source": [
"# Voyage AI\n",
"\n",
">[Voyage AI](https://www.voyageai.com/) provides cutting-edge embedding/vectorizations models.\n",
"\n",
"Let's load the Voyage Embedding class."
]
},
@@ -215,7 +217,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
"version": "3.10.12"
},
"vscode": {
"interpreter": {

View File

@@ -4,7 +4,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# ClickUp Langchain Toolkit"
"# ClickUp\n",
"\n",
">[ClickUp](https://clickup.com/) is an all-in-one productivity platform that provides small and large teams across industries with flexible and customizable work management solutions, tools, and functions. \n",
"\n",
">It is a cloud-based project management solution for businesses of all sizes featuring communication and collaboration tools to help achieve organizational goals."
]
},
{
@@ -27,14 +31,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Init"
"## Initializing"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get Authenticated\n",
"### Get Authenticated\n",
"1. Create a [ClickUp App](https://help.clickup.com/hc/en-us/articles/6303422883095-Create-your-own-app-with-the-ClickUp-API)\n",
"2. Follow [these steps](https://clickup.com/api/developer-portal/authentication/) to get your `client_id` and `client_secret`.\n",
" - *Suggestion: use `https://google.com` as the redirect_uri. This is what we assume in the defaults for this toolkit.*\n",
@@ -112,18 +116,7 @@
"source": [
"access_token = ClickupAPIWrapper.get_access_token(\n",
" oauth_client_id, oauth_client_secret, code\n",
")\n",
"\n",
"if access_token is not None:\n",
" print(\"Copy/paste this code, into the next cell so you can reuse it!\")\n",
" print(access_token)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create Toolkit"
")"
]
},
{
@@ -142,12 +135,6 @@
}
],
"source": [
"# Set your access token here\n",
"access_token = \"12345678_myaccesstokengoeshere123\"\n",
"access_token = (\n",
" \"81928627_c009bf122ccf36ec3ba3e0ef748b07042c5e4217260042004a5934540cb61527\"\n",
")\n",
"\n",
"# Init toolkit\n",
"clickup_api_wrapper = ClickupAPIWrapper(access_token=access_token)\n",
"toolkit = ClickupToolkit.from_clickup_api_wrapper(clickup_api_wrapper)\n",
@@ -160,7 +147,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create Agent"
"### Create Agent"
]
},
{
@@ -180,7 +167,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Run"
"## Use an Agent"
]
},
{
@@ -203,7 +190,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Navigation\n",
"### Navigation\n",
"You can get the teams, folder and spaces your user has access to"
]
},
@@ -287,7 +274,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Task Operations\n",
"### Task Operations\n",
"You can get, ask question about tasks and update them"
]
},
@@ -594,7 +581,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Creation\n",
"### Creation\n",
"You can create tasks, lists and folders"
]
},
@@ -778,7 +765,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Multi-Step Tasks"
"## Multi-Step Tasks"
]
},
{
@@ -848,7 +835,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "clickup-copilot",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -862,10 +849,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
},
"orig_nbformat": 4
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -44,7 +44,7 @@
"metadata": {},
"source": [
"_Note: depending on your LangChain setup, you may need to install/upgrade other dependencies needed for this demo_\n",
"_(specifically, recent versions of `datasets` `openai` `pypdf` and `tiktoken` are required)._"
"_(specifically, recent versions of `datasets`, `openai`, `pypdf` and `tiktoken` are required)._"
]
},
{
@@ -65,7 +65,6 @@
"from langchain.embeddings import OpenAIEmbeddings\n",
"from langchain.prompts import ChatPromptTemplate\n",
"\n",
"# if not present yet, run: pip install \"datasets==2.14.6\"\n",
"from langchain.schema import Document\n",
"from langchain.schema.output_parser import StrOutputParser\n",
"from langchain.schema.runnable import RunnablePassthrough\n",
@@ -145,7 +144,7 @@
"outputs": [],
"source": [
"ASTRA_DB_API_ENDPOINT = input(\"ASTRA_DB_API_ENDPOINT = \")\n",
"ASTRA_DB_TOKEN = getpass(\"ASTRA_DB_TOKEN = \")"
"ASTRA_DB_APPLICATION_TOKEN = getpass(\"ASTRA_DB_APPLICATION_TOKEN = \")"
]
},
{
@@ -159,7 +158,7 @@
" embedding=embe,\n",
" collection_name=\"astra_vector_demo\",\n",
" api_endpoint=ASTRA_DB_API_ENDPOINT,\n",
" token=ASTRA_DB_TOKEN,\n",
" token=ASTRA_DB_APPLICATION_TOKEN,\n",
")"
]
},
@@ -171,6 +170,14 @@
"### Load a dataset"
]
},
{
"cell_type": "markdown",
"id": "552e56b0-301a-4b06-99c7-57ba6faa966f",
"metadata": {},
"source": [
"Convert each entry in the source dataset into a `Document`, then write them into the vector store:"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -190,6 +197,16 @@
"print(f\"\\nInserted {len(inserted_ids)} documents.\")"
]
},
{
"cell_type": "markdown",
"id": "79d4f436-ef04-4288-8f79-97c9abb983ed",
"metadata": {},
"source": [
"In the above, `metadata` dictionaries are created from the source data and are part of the `Document`.\n",
"\n",
"_Note: check the [Astra DB API Docs](https://docs.datastax.com/en/astra-serverless/docs/develop/dev-with-json.html#_json_api_limits) for the valid metadata field names: some characters are reserved and cannot be used._"
]
},
{
"cell_type": "markdown",
"id": "084d8802-ab39-4262-9a87-42eafb746f92",
@@ -213,6 +230,16 @@
"print(f\"\\nInserted {len(inserted_ids_2)} documents.\")"
]
},
{
"cell_type": "markdown",
"id": "63840eb3-8b29-4017-bc2f-301bf5001f28",
"metadata": {},
"source": [
"_Note: you may want to speed up the execution of `add_texts` and `add_documents` by increasing the concurrency level for_\n",
"_these bulk operations - check out the `*_concurrency` parameters in the class constructor and the `add_texts` docstrings_\n",
"_for more details. Depending on the network and the client machine specifications, your best-performing choice of parameters may vary._"
]
},
{
"cell_type": "markdown",
"id": "c031760a-1fc5-4855-adf2-02ed52fe2181",
@@ -625,7 +652,7 @@
"outputs": [],
"source": [
"ASTRA_DB_ID = input(\"ASTRA_DB_ID = \")\n",
"ASTRA_DB_TOKEN = getpass(\"ASTRA_DB_TOKEN = \")\n",
"ASTRA_DB_APPLICATION_TOKEN = getpass(\"ASTRA_DB_APPLICATION_TOKEN = \")\n",
"\n",
"desired_keyspace = input(\"ASTRA_DB_KEYSPACE (optional, can be left empty) = \")\n",
"if desired_keyspace:\n",
@@ -645,7 +672,7 @@
"\n",
"cassio.init(\n",
" database_id=ASTRA_DB_ID,\n",
" token=ASTRA_DB_TOKEN,\n",
" token=ASTRA_DB_APPLICATION_TOKEN,\n",
" keyspace=ASTRA_DB_KEYSPACE,\n",
")"
]

View File

@@ -38,8 +38,8 @@
},
{
"cell_type": "code",
"execution_count": 2,
"id": "47f9b495-88f1-4286-8d5d-1416103931a7",
"execution_count": null,
"id": "dc37144c-208d-4ab3-9f3a-0407a69fe052",
"metadata": {
"tags": []
},
@@ -51,33 +51,13 @@
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")\n",
"\n",
"# Uncomment the following line if you need to initialize FAISS with no AVX2 optimization\n",
"# os.environ['FAISS_NO_AVX2'] = '1'"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "aac9563e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# os.environ['FAISS_NO_AVX2'] = '1'\n",
"\n",
"from langchain.document_loaders import TextLoader\n",
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import FAISS"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "a3c3999a",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.vectorstores import FAISS\n",
"\n",
"from langchain.document_loaders import TextLoader\n",
"\n",
"loader = TextLoader(\"../../../extras/modules/state_of_the_union.txt\")\n",
@@ -200,31 +180,15 @@
},
{
"cell_type": "code",
"execution_count": 16,
"id": "428a6816",
"metadata": {},
"outputs": [],
"source": [
"db.save_local(\"faiss_index\")"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "56d1841c",
"metadata": {},
"outputs": [],
"source": [
"new_db = FAISS.load_local(\"faiss_index\", embeddings)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "39055525",
"execution_count": null,
"id": "1b31fe27-e0b3-42c6-b17c-8270b517ee1f",
"metadata": {},
"outputs": [],
"source": [
"db.save_local(\"faiss_index\")\n",
"\n",
"new_db = FAISS.load_local(\"faiss_index\", embeddings)\n",
"\n",
"docs = new_db.similarity_search(query)"
]
},
@@ -266,30 +230,11 @@
"metadata": {},
"outputs": [],
"source": [
"pkl = db.serialize_to_bytes() # serializes the faiss index"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "eb083247",
"metadata": {
"vscode": {
"languageId": "r"
}
},
"outputs": [],
"source": [
"embeddings = HuggingFaceEmbeddings(model_name=\"all-MiniLM-L6-v2\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e36e220b",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.huggingface import HuggingFaceEmbeddings\n",
"\n",
"pkl = db.serialize_to_bytes() # serializes the faiss\n",
"embeddings = HuggingFaceEmbeddings(model_name=\"all-MiniLM-L6-v2\")\n",
"\n",
"db = FAISS.deserialize_from_bytes(\n",
" embeddings=embeddings, serialized=pkl\n",
") # Load the index"
@@ -306,33 +251,14 @@
},
{
"cell_type": "code",
"execution_count": 20,
"id": "6dfd2b78",
"execution_count": null,
"id": "9b8f5e31-3f40-4e94-8d97-5883125efba7",
"metadata": {},
"outputs": [],
"source": [
"db1 = FAISS.from_texts([\"foo\"], embeddings)\n",
"db2 = FAISS.from_texts([\"bar\"], embeddings)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "29960da7",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'068c473b-d420-487a-806b-fb0ccea7f711': Document(page_content='foo', metadata={})}"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"db2 = FAISS.from_texts([\"bar\"], embeddings)\n",
"\n",
"db1.docstore._dict"
]
},

View File

@@ -5,15 +5,16 @@
"id": "683953b3",
"metadata": {},
"source": [
"# Faiss\n",
"# Faiss (Async)\n",
"\n",
">[Facebook AI Similarity Search (Faiss)](https://engineering.fb.com/2017/03/29/data-infrastructure/faiss-a-library-for-efficient-similarity-search/) is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning.\n",
"\n",
"[Faiss documentation](https://faiss.ai/).\n",
"\n",
"This notebook shows how to use functionality related to the `FAISS` vector database using asyncio.\n",
"This notebook shows how to use functionality related to the `FAISS` vector database using `asyncio`.\n",
"LangChain implemented the synchronous and asynchronous vector store functions.\n",
"\n",
"See synchronous version [here](https://python.langchain.com/docs/integrations/vectorstores/faiss)."
"See `synchronous` version [here](https://python.langchain.com/docs/integrations/vectorstores/faiss)."
]
},
{
@@ -40,8 +41,8 @@
},
{
"cell_type": "code",
"execution_count": 1,
"id": "47f9b495-88f1-4286-8d5d-1416103931a7",
"execution_count": null,
"id": "971a172a-2d87-4eec-be92-87aa174fec30",
"metadata": {
"tags": []
},
@@ -53,33 +54,13 @@
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")\n",
"\n",
"# Uncomment the following line if you need to initialize FAISS with no AVX2 optimization\n",
"# os.environ['FAISS_NO_AVX2'] = '1'"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "aac9563e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# os.environ['FAISS_NO_AVX2'] = '1'\n",
"\n",
"from langchain.document_loaders import TextLoader\n",
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import FAISS"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "a3c3999a",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.vectorstores import FAISS\n",
"\n",
"from langchain.document_loaders import TextLoader\n",
"\n",
"loader = TextLoader(\"../../../extras/modules/state_of_the_union.txt\")\n",
@@ -87,47 +68,13 @@
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "5eabdb75",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"embeddings = OpenAIEmbeddings()\n",
"\n",
"db = await FAISS.afrom_documents(docs, embeddings)\n",
"\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = await db.asimilarity_search(query)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "4b172de8",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n"
]
}
],
"source": [
"docs = await db.asimilarity_search(query)\n",
"\n",
"print(docs[0].page_content)"
]
},
@@ -142,33 +89,13 @@
},
{
"cell_type": "code",
"execution_count": 8,
"id": "186ee1d8",
"execution_count": null,
"id": "30bf7c85-a273-45dc-ae9e-f138e330b42e",
"metadata": {},
"outputs": [],
"source": [
"docs_and_scores = await db.asimilarity_search_with_score(query)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "284e04b5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': './state_of_the_union.txt'}),\n",
" 0.36871302)"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs_and_scores = await db.asimilarity_search_with_score(query)\n",
"\n",
"docs_and_scores[0]"
]
},
@@ -202,52 +129,17 @@
},
{
"cell_type": "code",
"execution_count": 11,
"id": "428a6816",
"execution_count": null,
"id": "88e11f08-1ac8-45aa-8bc0-56439ef87256",
"metadata": {},
"outputs": [],
"source": [
"db.save_local(\"faiss_index\")"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "56d1841c",
"metadata": {},
"outputs": [],
"source": [
"new_db = FAISS.load_local(\"faiss_index\", embeddings, asynchronous=True)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "39055525",
"metadata": {},
"outputs": [],
"source": [
"docs = await new_db.asimilarity_search(query)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "98378c4e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': './state_of_the_union.txt'})"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"db.save_local(\"faiss_index\")\n",
"\n",
"new_db = FAISS.load_local(\"faiss_index\", embeddings, asynchronous=True)\n",
"\n",
"docs = await new_db.asimilarity_search(query)\n",
"\n",
"docs[0]"
]
},
@@ -261,26 +153,6 @@
"you can pickle the FAISS Index by these functions. If you use embeddings model which is of 90 mb (sentence-transformers/all-MiniLM-L6-v2 or any other model), the resultant pickle size would be more than 90 mb. the size of the model is also included in the overall size. To overcome this, use the below functions. These functions only serializes FAISS index and size would be much lesser. this can be helpful if you wish to store the index in database like sql."
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "d8faead5",
"metadata": {},
"outputs": [],
"source": [
"pkl = db.serialize_to_bytes() # serializes the faiss index"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "eb083247",
"metadata": {},
"outputs": [],
"source": [
"embeddings = HuggingFaceEmbeddings(model_name=\"all-MiniLM-L6-v2\")"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -288,6 +160,10 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.huggingface import HuggingFaceEmbeddings\n",
"\n",
"pkl = db.serialize_to_bytes() # serializes the faiss index\n",
"embeddings = HuggingFaceEmbeddings(model_name=\"all-MiniLM-L6-v2\")\n",
"db = FAISS.deserialize_from_bytes(\n",
" embeddings=embeddings, serialized=pkl, asynchronous=True\n",
") # Load the index"
@@ -596,7 +472,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@@ -4,13 +4,15 @@
"cell_type": "markdown",
"id": "357f24224a8e818f",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"## Hippo\n",
"# Hippo\n",
"\n",
">[Hippo](https://www.transwarp.cn/starwarp) Please visit our official website for how to run a Hippo instance and\n",
"how to use functionality related to the Hippo vector database\n",
">[Transwarp Hippo](https://www.transwarp.cn/en/subproduct/hippo) is an enterprise-level cloud-native distributed vector database that supports storage, retrieval, and management of massive vector-based datasets. It efficiently solves problems such as vector similarity search and high-density vector clustering. `Hippo` features high availability, high performance, and easy scalability. It has many functions, such as multiple vector search indexes, data partitioning and sharding, data persistence, incremental data ingestion, vector scalar field filtering, and mixed queries. It can effectively meet the high real-time search demands of enterprises for massive vector data\n",
"\n",
"## Getting Started\n",
"\n",
@@ -21,12 +23,15 @@
"cell_type": "markdown",
"id": "a92d2ce26df7ac4c",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"## Installing Dependencies\n",
"\n",
"Initially, we require the installation of certain dependencies, such as OpenAI, Langchain, and Hippo-API. Please note, you should install the appropriate versions tailored to your environment."
"Initially, we require the installation of certain dependencies, such as OpenAI, Langchain, and Hippo-API. Please note, that you should install the appropriate versions tailored to your environment."
]
},
{
@@ -38,7 +43,10 @@
"end_time": "2023-10-30T06:47:54.718488Z",
"start_time": "2023-10-30T06:47:53.563129Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
@@ -59,12 +67,15 @@
"cell_type": "markdown",
"id": "554081137df2c252",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"Note: Python version needs to be >=3.8.\n",
"\n",
"## Best Practice\n",
"## Best Practices\n",
"### Importing Dependency Packages"
]
},
@@ -77,7 +88,10 @@
"end_time": "2023-10-30T06:47:56.003409Z",
"start_time": "2023-10-30T06:47:55.998839Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
@@ -94,7 +108,10 @@
"cell_type": "markdown",
"id": "dad255dae8aea755",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"### Loading Knowledge Documents"
@@ -109,7 +126,10 @@
"end_time": "2023-10-30T06:47:59.027869Z",
"start_time": "2023-10-30T06:47:59.023934Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
@@ -122,7 +142,10 @@
"cell_type": "markdown",
"id": "e9b93c330f1c6160",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"### Segmenting the Knowledge Document\n",
@@ -139,7 +162,10 @@
"end_time": "2023-10-30T06:48:00.279351Z",
"start_time": "2023-10-30T06:48:00.275763Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
@@ -151,7 +177,10 @@
"cell_type": "markdown",
"id": "eefe28c7c993ffdf",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"### Declaring the Embedding Model\n",
@@ -167,7 +196,10 @@
"end_time": "2023-10-30T06:48:11.686166Z",
"start_time": "2023-10-30T06:48:11.664355Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
@@ -188,7 +220,10 @@
"cell_type": "markdown",
"id": "e60235602ed91d3c",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"### Declaring Hippo Client"
@@ -203,7 +238,10 @@
"end_time": "2023-10-30T06:48:48.594298Z",
"start_time": "2023-10-30T06:48:48.585267Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
@@ -214,7 +252,10 @@
"cell_type": "markdown",
"id": "43ee6dbd765c3172",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"### Storing the Document"
@@ -229,7 +270,10 @@
"end_time": "2023-10-30T06:51:12.661741Z",
"start_time": "2023-10-30T06:51:06.257156Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
@@ -257,7 +301,10 @@
"cell_type": "markdown",
"id": "89077cc9763d5dd0",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"### Conducting Knowledge-based Question and Answer\n",
@@ -274,7 +321,10 @@
"end_time": "2023-10-30T06:51:28.329351Z",
"start_time": "2023-10-30T06:51:28.318713Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
@@ -293,7 +343,10 @@
"cell_type": "markdown",
"id": "a4c5d73016a9db0c",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"### Acquiring Related Knowledge Based on the Question"
@@ -308,7 +361,10 @@
"end_time": "2023-10-30T06:51:33.195634Z",
"start_time": "2023-10-30T06:51:32.196493Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
@@ -328,7 +384,10 @@
"cell_type": "markdown",
"id": "e5adbaaa7086d1ae",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"### Constructing a Prompt Template"
@@ -343,7 +402,10 @@
"end_time": "2023-10-30T06:51:35.649376Z",
"start_time": "2023-10-30T06:51:35.645763Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
@@ -358,7 +420,10 @@
"cell_type": "markdown",
"id": "b36b6a9adbec8a82",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"### Waiting for the Large Language Model to Generate an Answer"
@@ -373,7 +438,10 @@
"end_time": "2023-10-30T06:52:17.967885Z",
"start_time": "2023-10-30T06:51:37.692819Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
@@ -402,7 +470,10 @@
"ExecuteTime": {
"start_time": "2023-10-30T06:42:42.172639Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": []
@@ -410,21 +481,21 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,


@@ -7,11 +7,11 @@
"source": [
"# SemaDB\n",
"\n",
"> SemaDB is a no fuss vector similarity database for building AI applications. The hosted SemaDB Cloud offers a no fuss developer experience to get started.\n",
"> [SemaDB](https://www.semafind.com/products/semadb) from [SemaFind](https://www.semafind.com) is a no fuss vector similarity database for building AI applications. The hosted `SemaDB Cloud` offers a no fuss developer experience to get started.\n",
"\n",
"The full documentation of the API along with examples and an interactive playground is available on [RapidAPI](https://rapidapi.com/semafind-semadb/api/semadb).\n",
"\n",
"This notebook demonstrates how the `langchain` wrapper can be used with SemaDB Cloud."
"This notebook demonstrates usage of the `SemaDB Cloud` vector store."
]
},
{
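
For orientation, here is a minimal sketch of using the wrapper this notebook documents, assuming `SemaDB` is importable from `langchain.vectorstores`, reads its key from the `SEMADB_API_KEY` environment variable, and is constructed from a collection name, vector size, and embedding function (all of these are assumptions, not confirmed API):

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import SemaDB  # assumed import path

embeddings = HuggingFaceEmbeddings()  # default all-mpnet-base-v2 model, 768-dim vectors
db = SemaDB("mycollection", 768, embeddings)  # assumed constructor: name, vector size, embedding
db.add_texts(["SemaDB keeps vector search simple."])
print(db.similarity_search("simple vector search", k=1))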


@@ -217,7 +217,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.10.12"
}
},
"nbformat": 4,


@@ -3,12 +3,15 @@
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"# sqlite-vss\n",
"# SQLite-VSS\n",
"\n",
">[sqlite-vss](https://alexgarcia.xyz/sqlite-vss/) is an SQLite extension designed for vector search, emphasizing local-first operations and easy integration into applications without external servers. Leveraging the Faiss library, it offers efficient similarity search and clustering capabilities.\n",
">[SQLite-VSS](https://alexgarcia.xyz/sqlite-vss/) is an `SQLite` extension designed for vector search, emphasizing local-first operations and easy integration into applications without external servers. Leveraging the `Faiss` library, it offers efficient similarity search and clustering capabilities.\n",
"\n",
"This notebook shows how to use the `SQLiteVSS` vector database."
]
@@ -17,7 +20,10 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
@@ -28,10 +34,13 @@
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"### Quickstart"
"## Quickstart"
]
},
{
@@ -42,7 +51,10 @@
"end_time": "2023-09-06T14:55:55.370351Z",
"start_time": "2023-09-06T14:55:53.547755Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
@@ -97,10 +109,13 @@
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"### Using existing sqlite connection"
"## Using existing SQLite connection"
]
},
{
@@ -111,7 +126,10 @@
"end_time": "2023-09-06T14:59:22.086252Z",
"start_time": "2023-09-06T14:59:21.693237Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
@@ -166,7 +184,10 @@
"end_time": "2023-09-06T15:01:15.550318Z",
"start_time": "2023-09-06T15:01:15.546428Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
@@ -180,7 +201,10 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": []
@@ -188,23 +212,23 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 0
"nbformat_minor": 4
}
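
The two headings above, "Quickstart" and "Using existing SQLite connection", correspond to two construction paths. A short sketch of both, following the calls the notebook itself makes; the table and db file names are illustrative:

from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain.vectorstores import SQLiteVSS

embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

# Quickstart path: create the table and load texts in one call.
db = SQLiteVSS.from_texts(
    texts=["Ketanji Brown Jackson was nominated to the Supreme Court."],
    embedding=embedding_function,
    table="state_union",
    db_file="/tmp/vss.db",
)

# Existing-connection path: reuse a connection you manage yourself.
connection = SQLiteVSS.create_connection(db_file="/tmp/vss.db")
db2 = SQLiteVSS(table="state_union", embedding=embedding_function, connection=connection)
print(db2.similarity_search("Ketanji Brown Jackson", k=1))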


@@ -7,28 +7,30 @@
"source": [
"# Timescale Vector (Postgres)\n",
"\n",
">[Timescale Vector](https://www.timescale.com/ai?utm_campaign=vectorlaunch&utm_source=langchain&utm_medium=referral) is `PostgreSQL++` vector database for AI applications.\n",
"\n",
"This notebook shows how to use the Postgres vector database `Timescale Vector`. You'll learn how to use TimescaleVector for (1) semantic search, (2) time-based vector search, (3) self-querying, and (4) how to create indexes to speed up queries.\n",
"\n",
"## What is Timescale Vector?\n",
"**[Timescale Vector](https://www.timescale.com/ai?utm_campaign=vectorlaunch&utm_source=langchain&utm_medium=referral) is PostgreSQL++ for AI applications.**\n",
"\n",
"Timescale Vector enables you to efficiently store and query millions of vector embeddings in `PostgreSQL`.\n",
"`Timescale Vector` enables you to efficiently store and query millions of vector embeddings in `PostgreSQL`.\n",
"- Enhances `pgvector` with faster and more accurate similarity search on 100M+ vectors via `DiskANN` inspired indexing algorithm.\n",
"- Enables fast time-based vector search via automatic time-based partitioning and indexing.\n",
"- Provides a familiar SQL interface for querying vector embeddings and relational data.\n",
"\n",
"Timescale Vector is cloud PostgreSQL for AI that scales with you from POC to production:\n",
"`Timescale Vector` is cloud `PostgreSQL` for AI that scales with you from POC to production:\n",
"- Simplifies operations by enabling you to store relational metadata, vector embeddings, and time-series data in a single database.\n",
"- Benefits from rock-solid PostgreSQL foundation with enterprise-grade feature liked streaming backups and replication, high-availability and row-level security.\n",
"- Benefits from rock-solid PostgreSQL foundation with enterprise-grade features like streaming backups and replication, high availability and row-level security.\n",
"- Enables a worry-free experience with enterprise-grade security and compliance.\n",
"\n",
"## How to access Timescale Vector\n",
"Timescale Vector is available on [Timescale](https://www.timescale.com/ai?utm_campaign=vectorlaunch&utm_source=langchain&utm_medium=referral), the cloud PostgreSQL platform. (There is no self-hosted version at this time.)\n",
"\n",
"`Timescale Vector` is available on [Timescale](https://www.timescale.com/ai?utm_campaign=vectorlaunch&utm_source=langchain&utm_medium=referral), the cloud PostgreSQL platform. (There is no self-hosted version at this time.)\n",
"\n",
"LangChain users get a 90-day free trial for Timescale Vector.\n",
"- To get started, [signup](https://console.cloud.timescale.com/signup?utm_campaign=vectorlaunch&utm_source=langchain&utm_medium=referral) to Timescale, create a new database and follow this notebook!\n",
"- See the [Timescale Vector explainer blog](https://www.timescale.com/blog/how-we-made-postgresql-the-best-vector-database/?utm_campaign=vectorlaunch&utm_source=langchain&utm_medium=referral) for more details and performance benchmarks.\n",
"- See the [installation instructions](https://github.com/timescale/python-vector) for more details on using Timescale Vector in python."
"- See the [installation instructions](https://github.com/timescale/python-vector) for more details on using Timescale Vector in Python."
]
},
{
@@ -1726,7 +1728,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.16"
"version": "3.10.12"
}
},
"nbformat": 4,


@@ -1,5 +1,43 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Vearch\n",
"\n",
">[Vearch](https://vearch.readthedocs.io) is the vector search infrastructure for deeping learning and AI applications.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setting up\n",
"\n",
"Follow [instructions](https://vearch.readthedocs.io/en/latest/quick-start-guide.html#)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install vearch\n",
"\n",
"# OR\n",
"\n",
"!pip install vearch_cluster"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example"
]
},
{
"cell_type": "code",
"execution_count": 2,
@@ -16,10 +54,11 @@
}
],
"source": [
"from langchain.vectorstores.vearch import Vearch\n",
"\n",
"from langchain.document_loaders import TextLoader\n",
"from langchain.embeddings.huggingface import HuggingFaceEmbeddings\n",
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain.vectorstores.vearch import Vearch\n",
"from transformers import AutoModel, AutoTokenizer\n",
"\n",
"# repalce to your local model path\n",
@@ -464,7 +503,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.10.13 ('vearch_cluster_langchain')",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -478,9 +517,8 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
"version": "3.10.12"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "f1da10a89896267ed34b497c9568817f36cc7ea79826b5cfca4d96376f5b4835"
@@ -488,5 +526,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}
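
The imports in the cell above set up a load, split, embed, store flow. A rough sketch of how they fit together; the Vearch-specific kwargs (`path_or_url`, `table_name`) are assumptions about standalone mode, not confirmed API:

from langchain.document_loaders import TextLoader
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores.vearch import Vearch

docs = TextLoader("knowledge.txt").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(docs)
embeddings = HuggingFaceEmbeddings(model_name="/path/to/local/model")  # local path, per the comment above

store = Vearch.from_documents(
    chunks,
    embeddings,
    path_or_url="/tmp/vearch_db",  # assumed: standalone mode takes a local path
    table_name="langchain_demo",   # hypothetical table name
)
print(store.similarity_search("query", k=3)[0].page_content)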


@@ -4,27 +4,21 @@
"cell_type": "markdown",
"id": "9eb8dfa6fdb71ef5",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"# Zep\n",
"## VectorStore Example for [Zep](https://docs.getzep.com/) - Fast, scalable building blocks for LLM Apps\n",
"\n",
"### More on Zep:\n",
">[Zep](https://docs.getzep.com/) is an open-source platform for LLM apps. Go from a prototype\n",
">built in LangChain or LlamaIndex, or a custom app, to production in minutes without rewriting code.\n",
"\n",
"Zep is an open source platform for productionizing LLM apps. Go from a prototype\n",
"built in LangChain or LlamaIndex, or a custom app, to production in minutes without\n",
"rewriting code.\n",
"## Key Features:\n",
"\n",
"## Fast, Scalable Building Blocks for LLM Apps\n",
"Zep is an open source platform for productionizing LLM apps. Go from a prototype\n",
"built in LangChain or LlamaIndex, or a custom app, to production in minutes without\n",
"rewriting code.\n",
"\n",
"Key Features:\n",
"\n",
"- **Fast!** Zep operates independently of the your chat loop, ensuring a snappy user experience.\n",
"- **Chat History Memory, Archival, and Enrichment**, populate your prompts with relevant chat history, sumamries, named entities, intent data, and more.\n",
"- **Fast!** `Zep` operates independently of your chat loop, ensuring a snappy user experience.\n",
"- **Chat History Memory, Archival, and Enrichment**, populate your prompts with relevant chat history, summaries, named entities, intent data, and more.\n",
"- **Vector Search over Chat History and Documents** Automatic embedding of documents, chat histories, and summaries. Use Zep's similarity or native MMR Re-ranked search to find the most relevant.\n",
"- **Manage Users and their Chat Sessions** Users and their Chat Sessions are first-class citizens in Zep, allowing you to manage user interactions with your bots or agents easily.\n",
"- **Records Retention and Privacy Compliance** Comply with corporate and regulatory mandates for records retention while ensuring compliance with privacy regulations such as CCPA and GDPR. Fulfill *Right To Be Forgotten* requests with a single API call\n",
@@ -34,14 +28,15 @@
"and searching your user's chat history.\n",
"\n",
"## Installation\n",
"Follow the [Zep Quickstart Guide](https://docs.getzep.com/deployment/quickstart/) to install and get started with Zep.\n",
"\n",
"## Usage\n",
"Follow the [Zep Quickstart Guide](https://docs.getzep.com/deployment/quickstart/) to install and get started with Zep.\n",
"\n",
"You'll need your Zep API URL and optionally an API key to use the Zep VectorStore. \n",
"See the [Zep docs](https://docs.getzep.com) for more information.\n",
"\n",
"In the examples below, we're using Zep's auto-embedding feature which automatically embed documents on the Zep server \n",
"## Usage\n",
"\n",
"In the examples below, we're using Zep's auto-embedding feature which automatically embeds documents on the Zep server \n",
"using low-latency embedding models.\n",
"\n",
"## Note\n",
@@ -55,7 +50,10 @@
"cell_type": "markdown",
"id": "9a3a11aab1412d98",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"## Load or create a Collection from documents"
@@ -70,7 +68,10 @@
"end_time": "2023-08-13T01:07:50.672390Z",
"start_time": "2023-08-13T01:07:48.777799Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
@@ -124,7 +125,10 @@
"end_time": "2023-08-13T01:07:53.807663Z",
"start_time": "2023-08-13T01:07:50.671241Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
@@ -170,7 +174,10 @@
"cell_type": "markdown",
"id": "94ca9dfa7d0ecaa5",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"## Simarility Search Query over the Collection"
@@ -185,7 +192,10 @@
"end_time": "2023-08-13T01:07:54.195988Z",
"start_time": "2023-08-13T01:07:53.808550Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
@@ -237,7 +247,10 @@
"cell_type": "markdown",
"id": "e02b61a9af0b2c80",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"## Search over Collection Re-ranked by MMR\n",
@@ -254,7 +267,10 @@
"end_time": "2023-08-13T01:07:54.394873Z",
"start_time": "2023-08-13T01:07:54.180901Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
@@ -304,7 +320,10 @@
"cell_type": "markdown",
"id": "42455e31d4ab0d68",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"# Filter by Metadata\n",
@@ -321,7 +340,10 @@
"end_time": "2023-08-13T01:08:06.323569Z",
"start_time": "2023-08-13T01:07:54.381822Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
@@ -367,10 +389,13 @@
"cell_type": "markdown",
"id": "5b225f3ae1e61de8",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"### We see results from both books. Note the `source` metadata"
"We see results from both books. Note the `source` metadata"
]
},
{
@@ -382,7 +407,10 @@
"end_time": "2023-08-13T01:08:06.504769Z",
"start_time": "2023-08-13T01:08:06.325435Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
@@ -431,10 +459,13 @@
"cell_type": "markdown",
"id": "7b81d7cae351a1ec",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"### Let's try again using a filter for only the Sherlock Holmes document."
"Now, we set up a filter"
]
},
{
@@ -446,7 +477,10 @@
"end_time": "2023-08-13T01:08:06.672836Z",
"start_time": "2023-08-13T01:08:06.505944Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
@@ -515,7 +549,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -529,7 +563,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
"version": "3.10.12"
}
},
"nbformat": 4,


@@ -18,23 +18,23 @@ This encompasses several key modules.
**[Document loaders](/docs/modules/data_connection/document_loaders/)**
Load documents from many different sources.
**Document loaders** load documents from many different sources.
LangChain provides over 100 different document loaders as well as integrations with other major providers in the space,
like AirByte and Unstructured.
We provide integrations to load all types of documents (HTML, PDF, code) from all types of locations (private s3 buckets, public websites).
LangChain provides integrations to load all types of documents (HTML, PDF, code) from all types of locations (private S3 buckets, public websites).
**[Document transformers](/docs/modules/data_connection/document_transformers/)**
A key part of retrieval is fetching only the relevant parts of documents.
This involves several transformation steps in order to best prepare the documents for retrieval.
This involves several transformation steps to prepare the documents for retrieval.
One of the primary ones here is splitting (or chunking) a large document into smaller chunks.
LangChain provides several different algorithms for doing this, as well as logic optimized for specific document types (code, markdown, etc).
LangChain provides several transformation algorithms for doing this, as well as logic optimized for specific document types (code, markdown, etc).
**[Text embedding models](/docs/modules/data_connection/text_embedding/)**
Another key part of retrieval has become creating embeddings for documents.
Another key part of retrieval is creating embeddings for documents.
Embeddings capture the semantic meaning of the text, allowing you to quickly and
efficiently find other pieces of text that are similar.
efficiently find other pieces of a text that are similar.
LangChain provides integrations with over 25 different embedding providers and methods,
from open-source to proprietary API,
allowing you to choose the one best suited for your needs.
@@ -51,7 +51,7 @@ LangChain exposes a standard interface, allowing you to easily swap between vect
Once the data is in the database, you still need to retrieve it.
LangChain supports many different retrieval algorithms and is one of the places where we add the most value.
We support basic methods that are easy to get started - namely simple semantic search.
LangChain supports basic methods that are easy to get started - namely simple semantic search.
However, we have also added a collection of algorithms on top of this to increase performance.
These include:
@@ -60,3 +60,13 @@ These include:
- [Ensemble Retriever](/docs/modules/data_connection/retrievers/ensemble): Sometimes you may want to retrieve documents from multiple different sources, or using multiple different algorithms. The ensemble retriever allows you to easily do this.
- And more!
**[Indexing](/docs/modules/data_connection/indexing)**
The LangChain **Indexing API** syncs your data from any source into a vector store,
helping you:
- Avoid writing duplicated content into the vector store
- Avoid re-writing unchanged content
- Avoid re-computing embeddings over unchanged content
All of which should save you time and money, as well as improve your vector search results.
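
A short sketch of driving the Indexing API, assuming the `index` helper and `SQLRecordManager` exposed by `langchain.indexes`, and a vector store that supports deletion by id (FAISS here is just an example):

from langchain.embeddings import OpenAIEmbeddings
from langchain.indexes import SQLRecordManager, index
from langchain.schema import Document
from langchain.vectorstores import FAISS

# The record manager tracks what was already written, keyed by a namespace.
record_manager = SQLRecordManager("faiss/demo", db_url="sqlite:///record_manager.sql")
record_manager.create_schema()

vectorstore = FAISS.from_texts(["seed"], OpenAIEmbeddings())
docs = [Document(page_content="hello", metadata={"source": "a.txt"})]

# Re-running this call with unchanged docs skips re-embedding and re-writing them.
index(docs, record_manager, vectorstore, cleanup="incremental", source_id_key="source")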


@@ -1051,14 +1051,6 @@
":::"
]
},
{
"cell_type": "markdown",
"id": "e6e5191f-43e6-4fa0-9ba5-db002fcaacf3",
"metadata": {},
"source": [
"Of course, we've written here the logic for using chat history when it's provided, but we haven't actually added functionality for storing chat history for each user session. This is something that's fairly application specific and is usually best handled outside of LangChain."
]
},
{
"cell_type": "markdown",
"id": "580e18de-132d-4009-ba67-4aaf2c7717a2",

docs/package-lock.json (generated)

@@ -7,6 +7,7 @@
"": {
"name": "docs",
"version": "0.0.0",
"hasInstallScript": true,
"dependencies": {
"@docusaurus/core": "2.4.0",
"@docusaurus/preset-classic": "2.4.0",
@@ -32,6 +33,7 @@
"eslint-plugin-jsx-a11y": "^6.6.0",
"eslint-plugin-react": "^7.30.1",
"eslint-plugin-react-hooks": "^4.6.0",
"patch-package": "^8.0.0",
"prettier": "^2.7.1",
"typedoc": "^0.24.4",
"typedoc-plugin-markdown": "next"
@@ -4029,6 +4031,12 @@
"resolved": "https://registry.npmjs.org/@xtuc/long/-/long-4.2.2.tgz",
"integrity": "sha512-NuHqBY1PB/D8xU6s/thBgOAiAP7HOYDQ32+BFZILJ8ivkUkAHQnWfn6WhL79Owj1qmUnoN/YPhktdIoucipkAQ=="
},
"node_modules/@yarnpkg/lockfile": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/@yarnpkg/lockfile/-/lockfile-1.1.0.tgz",
"integrity": "sha512-GpSwvyXOcOOlV70vbnzjj4fW5xW/FdUF6nQEt1ENy7m4ZCczi1+/buVUPAqmGfqznsORNFzUMjctTIp8a9tuCQ==",
"dev": true
},
"node_modules/accepts": {
"version": "1.3.8",
"resolved": "https://registry.npmjs.org/accepts/-/accepts-1.3.8.tgz",
@@ -4900,12 +4908,13 @@
}
},
"node_modules/call-bind": {
"version": "1.0.2",
"resolved": "https://registry.npmjs.org/call-bind/-/call-bind-1.0.2.tgz",
"integrity": "sha512-7O+FbCihrB5WGbFYesctwmTKae6rOiIzmz1icreWJ+0aA7LJfuqhEso2T9ncpcFtzMQtzXf2QGGueWJGTYsqrA==",
"version": "1.0.5",
"resolved": "https://registry.npmjs.org/call-bind/-/call-bind-1.0.5.tgz",
"integrity": "sha512-C3nQxfFZxFRVoJoGKKI8y3MOEo129NQ+FgQ08iye+Mk4zNZZGdjfs06bVTr+DBSlA66Q2VEcMki/cUCP4SercQ==",
"dependencies": {
"function-bind": "^1.1.1",
"get-intrinsic": "^1.0.2"
"function-bind": "^1.1.2",
"get-intrinsic": "^1.2.1",
"set-function-length": "^1.1.1"
},
"funding": {
"url": "https://github.com/sponsors/ljharb"
@@ -5992,6 +6001,19 @@
"resolved": "https://registry.npmjs.org/defer-to-connect/-/defer-to-connect-1.1.3.tgz",
"integrity": "sha512-0ISdNousHvZT2EiFlZeZAHBUvSxmKswVCEf8hW7KWgG4a8MVEu/3Vb6uWYozkjylyCxe0JBIiRB1jV45S70WVQ=="
},
"node_modules/define-data-property": {
"version": "1.1.1",
"resolved": "https://registry.npmjs.org/define-data-property/-/define-data-property-1.1.1.tgz",
"integrity": "sha512-E7uGkTzkk1d0ByLeSc6ZsFS79Axg+m1P/VsgYsxHgiuc3tFSj+MjMIwe90FC4lOAZzNBdY7kkO2P2wKdsQ1vgQ==",
"dependencies": {
"get-intrinsic": "^1.2.1",
"gopd": "^1.0.1",
"has-property-descriptors": "^1.0.0"
},
"engines": {
"node": ">= 0.4"
}
},
"node_modules/define-lazy-prop": {
"version": "2.0.0",
"resolved": "https://registry.npmjs.org/define-lazy-prop/-/define-lazy-prop-2.0.0.tgz",
@@ -7445,6 +7467,15 @@
"url": "https://github.com/sponsors/sindresorhus"
}
},
"node_modules/find-yarn-workspace-root": {
"version": "2.0.0",
"resolved": "https://registry.npmjs.org/find-yarn-workspace-root/-/find-yarn-workspace-root-2.0.0.tgz",
"integrity": "sha512-1IMnbjt4KzsQfnhnzNd8wUEgXZ44IzZaZmnLYx7D5FZlaHt2gW20Cri8Q+E/t5tIj4+epTBub+2Zxu/vNILzqQ==",
"dev": true,
"dependencies": {
"micromatch": "^4.0.2"
}
},
"node_modules/flat-cache": {
"version": "3.0.4",
"resolved": "https://registry.npmjs.org/flat-cache/-/flat-cache-3.0.4.tgz",
@@ -7755,9 +7786,12 @@
}
},
"node_modules/function-bind": {
"version": "1.1.1",
"resolved": "https://registry.npmjs.org/function-bind/-/function-bind-1.1.1.tgz",
"integrity": "sha512-yIovAzMX49sF8Yl58fSCWJ5svSLuaibPxXQJFLmBObTuCr0Mf1KiPopGM9NiFjiYBCbfaa2Fh6breQ6ANVTI0A=="
"version": "1.1.2",
"resolved": "https://registry.npmjs.org/function-bind/-/function-bind-1.1.2.tgz",
"integrity": "sha512-7XHNxH7qX9xG5mIwxkhumTox/MIRNcOgDrxWsMt2pAr23WHp6MrRlN7FBSFpCpr+oVO0F744iUgR82nJMfG2SA==",
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/function.prototype.name": {
"version": "1.1.5",
@@ -7983,7 +8017,6 @@
"version": "1.0.1",
"resolved": "https://registry.npmjs.org/gopd/-/gopd-1.0.1.tgz",
"integrity": "sha512-d65bNlIadxvpb/A2abVdlqKqV563juRnZ1Wtk6s1sIR8uNsXR70xqIzVqxVf1eTqDunwT2MkczEeaezCKTZhwA==",
"dev": true,
"dependencies": {
"get-intrinsic": "^1.1.3"
},
@@ -9408,12 +9441,36 @@
"resolved": "https://registry.npmjs.org/json-schema-traverse/-/json-schema-traverse-0.4.1.tgz",
"integrity": "sha512-xbbCH5dCYU5T8LcEhhuh7HJ88HXuW3qsI3Y0zOZFKfZEHcpWiHU/Jxzk629Brsab/mMiHQti9wMP+845RPe3Vg=="
},
"node_modules/json-stable-stringify": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/json-stable-stringify/-/json-stable-stringify-1.1.0.tgz",
"integrity": "sha512-zfA+5SuwYN2VWqN1/5HZaDzQKLJHaBVMZIIM+wuYjdptkaQsqzDdqjqf+lZZJUuJq1aanHiY8LhH8LmH+qBYJA==",
"dev": true,
"dependencies": {
"call-bind": "^1.0.5",
"isarray": "^2.0.5",
"jsonify": "^0.0.1",
"object-keys": "^1.1.1"
},
"engines": {
"node": ">= 0.4"
},
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/json-stable-stringify-without-jsonify": {
"version": "1.0.1",
"resolved": "https://registry.npmjs.org/json-stable-stringify-without-jsonify/-/json-stable-stringify-without-jsonify-1.0.1.tgz",
"integrity": "sha512-Bdboy+l7tA3OGW6FjyFHWkP5LuByj1Tk33Ljyq0axyzdk9//JSi2u3fP1QSmd1KNwq6VOKYGlAu87CisVir6Pw==",
"devOptional": true
},
"node_modules/json-stable-stringify/node_modules/isarray": {
"version": "2.0.5",
"resolved": "https://registry.npmjs.org/isarray/-/isarray-2.0.5.tgz",
"integrity": "sha512-xHjhDr3cNBK0BzdUJSPXZntQUx/mwMS5Rw4A7lPJ90XGAO6ISP/ePDNuo0vhqOZU+UD5JoodwCAAoZQd3FeAKw==",
"dev": true
},
"node_modules/json5": {
"version": "2.2.3",
"resolved": "https://registry.npmjs.org/json5/-/json5-2.2.3.tgz",
@@ -9442,6 +9499,15 @@
"graceful-fs": "^4.1.6"
}
},
"node_modules/jsonify": {
"version": "0.0.1",
"resolved": "https://registry.npmjs.org/jsonify/-/jsonify-0.0.1.tgz",
"integrity": "sha512-2/Ki0GcmuqSrgFyelQq9M05y7PS0mEwuIzrf3f1fPqkVDVRvZrPZtVSMHxdgo8Aq0sxAOb/cr2aqqA3LeWHVPg==",
"dev": true,
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/jsx-ast-utils": {
"version": "3.3.4",
"resolved": "https://registry.npmjs.org/jsx-ast-utils/-/jsx-ast-utils-3.3.4.tgz",
@@ -9473,6 +9539,15 @@
"node": ">=0.10.0"
}
},
"node_modules/klaw-sync": {
"version": "6.0.0",
"resolved": "https://registry.npmjs.org/klaw-sync/-/klaw-sync-6.0.0.tgz",
"integrity": "sha512-nIeuVSzdCCs6TDPTqI8w1Yre34sSq7AkZ4B3sfOBbI2CgVSB4Du4aLQijFU2+lhAFCwt9+42Hel6lQNIv6AntQ==",
"dev": true,
"dependencies": {
"graceful-fs": "^4.1.11"
}
},
"node_modules/kleur": {
"version": "3.0.3",
"resolved": "https://registry.npmjs.org/kleur/-/kleur-3.0.3.tgz",
@@ -10317,6 +10392,15 @@
"node": ">= 0.8.0"
}
},
"node_modules/os-tmpdir": {
"version": "1.0.2",
"resolved": "https://registry.npmjs.org/os-tmpdir/-/os-tmpdir-1.0.2.tgz",
"integrity": "sha512-D2FR03Vir7FIu45XBY20mTb+/ZSWB00sjU9jdQXt83gDrI4Ztz5Fs7/yy74g2N5SVQY4xY1qDr4rNddwYRVX0g==",
"dev": true,
"engines": {
"node": ">=0.10.0"
}
},
"node_modules/p-cancelable": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/p-cancelable/-/p-cancelable-1.1.0.tgz",
@@ -10500,6 +10584,200 @@
"tslib": "^2.0.3"
}
},
"node_modules/patch-package": {
"version": "8.0.0",
"resolved": "https://registry.npmjs.org/patch-package/-/patch-package-8.0.0.tgz",
"integrity": "sha512-da8BVIhzjtgScwDJ2TtKsfT5JFWz1hYoBl9rUQ1f38MC2HwnEIkK8VN3dKMKcP7P7bvvgzNDbfNHtx3MsQb5vA==",
"dev": true,
"dependencies": {
"@yarnpkg/lockfile": "^1.1.0",
"chalk": "^4.1.2",
"ci-info": "^3.7.0",
"cross-spawn": "^7.0.3",
"find-yarn-workspace-root": "^2.0.0",
"fs-extra": "^9.0.0",
"json-stable-stringify": "^1.0.2",
"klaw-sync": "^6.0.0",
"minimist": "^1.2.6",
"open": "^7.4.2",
"rimraf": "^2.6.3",
"semver": "^7.5.3",
"slash": "^2.0.0",
"tmp": "^0.0.33",
"yaml": "^2.2.2"
},
"bin": {
"patch-package": "index.js"
},
"engines": {
"node": ">=14",
"npm": ">5"
}
},
"node_modules/patch-package/node_modules/ansi-styles": {
"version": "4.3.0",
"resolved": "https://registry.npmjs.org/ansi-styles/-/ansi-styles-4.3.0.tgz",
"integrity": "sha512-zbB9rCJAT1rbjiVDb2hqKFHNYLxgtk8NURxZ3IZwD3F6NtxbXZQCnnSi1Lkx+IDohdPlFp222wVALIheZJQSEg==",
"dev": true,
"dependencies": {
"color-convert": "^2.0.1"
},
"engines": {
"node": ">=8"
},
"funding": {
"url": "https://github.com/chalk/ansi-styles?sponsor=1"
}
},
"node_modules/patch-package/node_modules/chalk": {
"version": "4.1.2",
"resolved": "https://registry.npmjs.org/chalk/-/chalk-4.1.2.tgz",
"integrity": "sha512-oKnbhFyRIXpUuez8iBMmyEa4nbj4IOQyuhc/wy9kY7/WVPcwIO9VA668Pu8RkO7+0G76SLROeyw9CpQ061i4mA==",
"dev": true,
"dependencies": {
"ansi-styles": "^4.1.0",
"supports-color": "^7.1.0"
},
"engines": {
"node": ">=10"
},
"funding": {
"url": "https://github.com/chalk/chalk?sponsor=1"
}
},
"node_modules/patch-package/node_modules/color-convert": {
"version": "2.0.1",
"resolved": "https://registry.npmjs.org/color-convert/-/color-convert-2.0.1.tgz",
"integrity": "sha512-RRECPsj7iu/xb5oKYcsFHSppFNnsj/52OVTRKb4zP5onXwVF3zVmmToNcOfGC+CRDpfK/U584fMg38ZHCaElKQ==",
"dev": true,
"dependencies": {
"color-name": "~1.1.4"
},
"engines": {
"node": ">=7.0.0"
}
},
"node_modules/patch-package/node_modules/color-name": {
"version": "1.1.4",
"resolved": "https://registry.npmjs.org/color-name/-/color-name-1.1.4.tgz",
"integrity": "sha512-dOy+3AuW3a2wNbZHIuMZpTcgjGuLU/uBL/ubcZF9OXbDo8ff4O8yVp5Bf0efS8uEoYo5q4Fx7dY9OgQGXgAsQA==",
"dev": true
},
"node_modules/patch-package/node_modules/fs-extra": {
"version": "9.1.0",
"resolved": "https://registry.npmjs.org/fs-extra/-/fs-extra-9.1.0.tgz",
"integrity": "sha512-hcg3ZmepS30/7BSFqRvoo3DOMQu7IjqxO5nCDt+zM9XWjb33Wg7ziNT+Qvqbuc3+gWpzO02JubVyk2G4Zvo1OQ==",
"dev": true,
"dependencies": {
"at-least-node": "^1.0.0",
"graceful-fs": "^4.2.0",
"jsonfile": "^6.0.1",
"universalify": "^2.0.0"
},
"engines": {
"node": ">=10"
}
},
"node_modules/patch-package/node_modules/has-flag": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/has-flag/-/has-flag-4.0.0.tgz",
"integrity": "sha512-EykJT/Q1KjTWctppgIAgfSO0tKVuZUjhgMr17kqTumMl6Afv3EISleU7qZUzoXDFTAHTDC4NOoG/ZxU3EvlMPQ==",
"dev": true,
"engines": {
"node": ">=8"
}
},
"node_modules/patch-package/node_modules/lru-cache": {
"version": "6.0.0",
"resolved": "https://registry.npmjs.org/lru-cache/-/lru-cache-6.0.0.tgz",
"integrity": "sha512-Jo6dJ04CmSjuznwJSS3pUeWmd/H0ffTlkXXgwZi+eq1UCmqQwCh+eLsYOYCwY991i2Fah4h1BEMCx4qThGbsiA==",
"dev": true,
"dependencies": {
"yallist": "^4.0.0"
},
"engines": {
"node": ">=10"
}
},
"node_modules/patch-package/node_modules/open": {
"version": "7.4.2",
"resolved": "https://registry.npmjs.org/open/-/open-7.4.2.tgz",
"integrity": "sha512-MVHddDVweXZF3awtlAS+6pgKLlm/JgxZ90+/NBurBoQctVOOB/zDdVjcyPzQ+0laDGbsWgrRkflI65sQeOgT9Q==",
"dev": true,
"dependencies": {
"is-docker": "^2.0.0",
"is-wsl": "^2.1.1"
},
"engines": {
"node": ">=8"
},
"funding": {
"url": "https://github.com/sponsors/sindresorhus"
}
},
"node_modules/patch-package/node_modules/rimraf": {
"version": "2.7.1",
"resolved": "https://registry.npmjs.org/rimraf/-/rimraf-2.7.1.tgz",
"integrity": "sha512-uWjbaKIK3T1OSVptzX7Nl6PvQ3qAGtKEtVRjRuazjfL3Bx5eI409VZSqgND+4UNnmzLVdPj9FqFJNPqBZFve4w==",
"dev": true,
"dependencies": {
"glob": "^7.1.3"
},
"bin": {
"rimraf": "bin.js"
}
},
"node_modules/patch-package/node_modules/semver": {
"version": "7.5.4",
"resolved": "https://registry.npmjs.org/semver/-/semver-7.5.4.tgz",
"integrity": "sha512-1bCSESV6Pv+i21Hvpxp3Dx+pSD8lIPt8uVjRrxAUt/nbswYc+tK6Y2btiULjd4+fnq15PX+nqQDC7Oft7WkwcA==",
"dev": true,
"dependencies": {
"lru-cache": "^6.0.0"
},
"bin": {
"semver": "bin/semver.js"
},
"engines": {
"node": ">=10"
}
},
"node_modules/patch-package/node_modules/slash": {
"version": "2.0.0",
"resolved": "https://registry.npmjs.org/slash/-/slash-2.0.0.tgz",
"integrity": "sha512-ZYKh3Wh2z1PpEXWr0MpSBZ0V6mZHAQfYevttO11c51CaWjGTaadiKZ+wVt1PbMlDV5qhMFslpZCemhwOK7C89A==",
"dev": true,
"engines": {
"node": ">=6"
}
},
"node_modules/patch-package/node_modules/supports-color": {
"version": "7.2.0",
"resolved": "https://registry.npmjs.org/supports-color/-/supports-color-7.2.0.tgz",
"integrity": "sha512-qpCAvRl9stuOHveKsn7HncJRvv501qIacKzQlO/+Lwxc9+0q2wLyv4Dfvt80/DPn2pqOBsJdDiogXGR9+OvwRw==",
"dev": true,
"dependencies": {
"has-flag": "^4.0.0"
},
"engines": {
"node": ">=8"
}
},
"node_modules/patch-package/node_modules/yallist": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/yallist/-/yallist-4.0.0.tgz",
"integrity": "sha512-3wdGidZyq5PB084XLES5TpOSRA3wjXAlIWMhum2kRcv/41Sn2emQ0dycQW4uZXLejwKvg6EsvbdlVL+FYEct7A==",
"dev": true
},
"node_modules/patch-package/node_modules/yaml": {
"version": "2.3.4",
"resolved": "https://registry.npmjs.org/yaml/-/yaml-2.3.4.tgz",
"integrity": "sha512-8aAvwVUSHpfEqTQ4w/KMlf3HcRdt50E5ODIQJBw1fQ5RL34xabzxtUlzTXVqc4rkZsPbvrXKWnABCD7kWSmocA==",
"dev": true,
"engines": {
"node": ">= 14"
}
},
"node_modules/path-exists": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/path-exists/-/path-exists-4.0.0.tgz",
@@ -12741,6 +13019,20 @@
"node": ">= 0.8.0"
}
},
"node_modules/set-function-length": {
"version": "1.1.1",
"resolved": "https://registry.npmjs.org/set-function-length/-/set-function-length-1.1.1.tgz",
"integrity": "sha512-VoaqjbBJKiWtg4yRcKBQ7g7wnGnLV3M8oLvVWwOk2PdYY6PEFegR1vezXR0tw6fZGF9csVakIRjrJiy2veSBFQ==",
"dependencies": {
"define-data-property": "^1.1.1",
"get-intrinsic": "^1.2.1",
"gopd": "^1.0.1",
"has-property-descriptors": "^1.0.0"
},
"engines": {
"node": ">= 0.4"
}
},
"node_modules/setimmediate": {
"version": "1.0.5",
"resolved": "https://registry.npmjs.org/setimmediate/-/setimmediate-1.0.5.tgz",
@@ -13467,6 +13759,18 @@
"resolved": "https://registry.npmjs.org/tiny-warning/-/tiny-warning-1.0.3.tgz",
"integrity": "sha512-lBN9zLN/oAf68o3zNXYrdCt1kP8WsiGW8Oo2ka41b2IM5JL/S1CTyX1rW0mb/zSuJun0ZUrDxx4sqvYS2FWzPA=="
},
"node_modules/tmp": {
"version": "0.0.33",
"resolved": "https://registry.npmjs.org/tmp/-/tmp-0.0.33.tgz",
"integrity": "sha512-jRCJlojKnZ3addtTOjdIqoRuPEKBvNXcGYqzO6zWZX8KfKEpnGY5jfggJQ3EjKuu8D4bJRr0y+cYJFmYbImXGw==",
"dev": true,
"dependencies": {
"os-tmpdir": "~1.0.2"
},
"engines": {
"node": ">=0.6.0"
}
},
"node_modules/to-fast-properties": {
"version": "2.0.0",
"resolved": "https://registry.npmjs.org/to-fast-properties/-/to-fast-properties-2.0.0.tgz",


@@ -16,7 +16,8 @@
"lint:fix": "yarn lint --fix",
"precommit": "lint-staged",
"format": "prettier --write \"**/*.{js,jsx,ts,tsx,md,mdx}\"",
"format:check": "prettier --check \"**/*.{js,jsx,ts,tsx,md,mdx}\""
"format:check": "prettier --check \"**/*.{js,jsx,ts,tsx,md,mdx}\"",
"postinstall": "patch-package"
},
"dependencies": {
"@docusaurus/core": "2.4.0",
@@ -43,6 +44,7 @@
"eslint-plugin-jsx-a11y": "^6.6.0",
"eslint-plugin-react": "^7.30.1",
"eslint-plugin-react-hooks": "^4.6.0",
"patch-package": "^8.0.0",
"prettier": "^2.7.1",
"typedoc": "^0.24.4",
"typedoc-plugin-markdown": "next"


@@ -0,0 +1,432 @@
diff --git a/node_modules/@docusaurus/theme-classic/lib/theme/AnnouncementBar/styles.module.css b/node_modules/@docusaurus/theme-classic/lib/theme/AnnouncementBar/styles.module.css
index b09a217..930ede8 100644
--- a/node_modules/@docusaurus/theme-classic/lib/theme/AnnouncementBar/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/lib/theme/AnnouncementBar/styles.module.css
@@ -50,7 +50,7 @@ html[data-announcement-bar-initially-dismissed='true'] .announcementBar {
}
}
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
:root {
--docusaurus-announcement-bar-height: 30px;
}
diff --git a/node_modules/@docusaurus/theme-classic/lib/theme/BlogSidebar/Desktop/styles.module.css b/node_modules/@docusaurus/theme-classic/lib/theme/BlogSidebar/Desktop/styles.module.css
index 73a4bd8..008a8f8 100644
--- a/node_modules/@docusaurus/theme-classic/lib/theme/BlogSidebar/Desktop/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/lib/theme/BlogSidebar/Desktop/styles.module.css
@@ -38,7 +38,7 @@
color: var(--ifm-color-primary) !important;
}
-@media (max-width: 996px) {
+@media (max-width: 1280px) {
.sidebar {
display: none;
}
diff --git a/node_modules/@docusaurus/theme-classic/lib/theme/DocCategoryGeneratedIndexPage/styles.module.css b/node_modules/@docusaurus/theme-classic/lib/theme/DocCategoryGeneratedIndexPage/styles.module.css
index 94f6951..17ac45f 100644
--- a/node_modules/@docusaurus/theme-classic/lib/theme/DocCategoryGeneratedIndexPage/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/lib/theme/DocCategoryGeneratedIndexPage/styles.module.css
@@ -5,7 +5,7 @@
* LICENSE file in the root directory of this source tree.
*/
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.generatedIndexPage {
max-width: 75% !important;
}
diff --git a/node_modules/@docusaurus/theme-classic/lib/theme/DocItem/Footer/styles.module.css b/node_modules/@docusaurus/theme-classic/lib/theme/DocItem/Footer/styles.module.css
index 7d05961..1f94f96 100644
--- a/node_modules/@docusaurus/theme-classic/lib/theme/DocItem/Footer/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/lib/theme/DocItem/Footer/styles.module.css
@@ -11,7 +11,7 @@
font-size: smaller;
}
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.lastUpdated {
text-align: right;
}
diff --git a/node_modules/@docusaurus/theme-classic/lib/theme/DocItem/Layout/styles.module.css b/node_modules/@docusaurus/theme-classic/lib/theme/DocItem/Layout/styles.module.css
index 784498f..e7c6043 100644
--- a/node_modules/@docusaurus/theme-classic/lib/theme/DocItem/Layout/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/lib/theme/DocItem/Layout/styles.module.css
@@ -10,7 +10,7 @@
margin-top: 0;
}
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.docItemCol {
max-width: 75% !important;
}
diff --git a/node_modules/@docusaurus/theme-classic/lib/theme/DocItem/TOC/Mobile/styles.module.css b/node_modules/@docusaurus/theme-classic/lib/theme/DocItem/TOC/Mobile/styles.module.css
index a298b5e..e110e10 100644
--- a/node_modules/@docusaurus/theme-classic/lib/theme/DocItem/TOC/Mobile/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/lib/theme/DocItem/TOC/Mobile/styles.module.css
@@ -5,7 +5,7 @@
* LICENSE file in the root directory of this source tree.
*/
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
/* Prevent hydration FOUC, as the mobile TOC needs to be server-rendered */
.tocMobile {
display: none;
diff --git a/node_modules/@docusaurus/theme-classic/lib/theme/DocPage/Layout/Main/styles.module.css b/node_modules/@docusaurus/theme-classic/lib/theme/DocPage/Layout/Main/styles.module.css
index 0754fce..aef0565 100644
--- a/node_modules/@docusaurus/theme-classic/lib/theme/DocPage/Layout/Main/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/lib/theme/DocPage/Layout/Main/styles.module.css
@@ -10,7 +10,7 @@
width: 100%;
}
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.docMainContainer {
flex-grow: 1;
max-width: calc(100% - var(--doc-sidebar-width));
diff --git a/node_modules/@docusaurus/theme-classic/lib/theme/DocPage/Layout/Sidebar/ExpandButton/styles.module.css b/node_modules/@docusaurus/theme-classic/lib/theme/DocPage/Layout/Sidebar/ExpandButton/styles.module.css
index f5e5878..23390f9 100644
--- a/node_modules/@docusaurus/theme-classic/lib/theme/DocPage/Layout/Sidebar/ExpandButton/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/lib/theme/DocPage/Layout/Sidebar/ExpandButton/styles.module.css
@@ -5,7 +5,7 @@
* LICENSE file in the root directory of this source tree.
*/
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.expandButton {
position: absolute;
top: 0;
diff --git a/node_modules/@docusaurus/theme-classic/lib/theme/DocPage/Layout/Sidebar/styles.module.css b/node_modules/@docusaurus/theme-classic/lib/theme/DocPage/Layout/Sidebar/styles.module.css
index f821f3b..3321785 100644
--- a/node_modules/@docusaurus/theme-classic/lib/theme/DocPage/Layout/Sidebar/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/lib/theme/DocPage/Layout/Sidebar/styles.module.css
@@ -14,7 +14,7 @@
display: none;
}
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.docSidebarContainer {
display: block;
width: var(--doc-sidebar-width);
diff --git a/node_modules/@docusaurus/theme-classic/lib/theme/DocSidebar/Desktop/CollapseButton/styles.module.css b/node_modules/@docusaurus/theme-classic/lib/theme/DocSidebar/Desktop/CollapseButton/styles.module.css
index c5026e8..901e80d 100644
--- a/node_modules/@docusaurus/theme-classic/lib/theme/DocSidebar/Desktop/CollapseButton/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/lib/theme/DocSidebar/Desktop/CollapseButton/styles.module.css
@@ -15,7 +15,7 @@
--docusaurus-collapse-button-bg-hover: rgb(255 255 255 / 10%);
}
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.collapseSidebarButton {
display: block !important;
background-color: var(--docusaurus-collapse-button-bg);
diff --git a/node_modules/@docusaurus/theme-classic/lib/theme/DocSidebar/Desktop/Content/styles.module.css b/node_modules/@docusaurus/theme-classic/lib/theme/DocSidebar/Desktop/Content/styles.module.css
index 8d8c7e8..7cadb05 100644
--- a/node_modules/@docusaurus/theme-classic/lib/theme/DocSidebar/Desktop/Content/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/lib/theme/DocSidebar/Desktop/Content/styles.module.css
@@ -5,7 +5,7 @@
* LICENSE file in the root directory of this source tree.
*/
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.menu {
flex-grow: 1;
padding: 0.5rem;
diff --git a/node_modules/@docusaurus/theme-classic/lib/theme/DocSidebar/Desktop/styles.module.css b/node_modules/@docusaurus/theme-classic/lib/theme/DocSidebar/Desktop/styles.module.css
index e4d4724..8c16f4a 100644
--- a/node_modules/@docusaurus/theme-classic/lib/theme/DocSidebar/Desktop/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/lib/theme/DocSidebar/Desktop/styles.module.css
@@ -5,7 +5,7 @@
* LICENSE file in the root directory of this source tree.
*/
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.sidebar {
display: flex;
flex-direction: column;
diff --git a/node_modules/@docusaurus/theme-classic/lib/theme/DocSidebarItem/Html/styles.module.css b/node_modules/@docusaurus/theme-classic/lib/theme/DocSidebarItem/Html/styles.module.css
index 2bb6934..af33ba1 100644
--- a/node_modules/@docusaurus/theme-classic/lib/theme/DocSidebarItem/Html/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/lib/theme/DocSidebarItem/Html/styles.module.css
@@ -5,7 +5,7 @@
* LICENSE file in the root directory of this source tree.
*/
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.menuHtmlItem {
padding: var(--ifm-menu-link-padding-vertical)
var(--ifm-menu-link-padding-horizontal);
diff --git a/node_modules/@docusaurus/theme-classic/lib/theme/Navbar/Content/styles.module.css b/node_modules/@docusaurus/theme-classic/lib/theme/Navbar/Content/styles.module.css
index 3cbe701..b504f5c 100644
--- a/node_modules/@docusaurus/theme-classic/lib/theme/Navbar/Content/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/lib/theme/Navbar/Content/styles.module.css
@@ -8,7 +8,7 @@
/*
Hide color mode toggle in small viewports
*/
-@media (max-width: 996px) {
+@media (max-width: 1280px) {
.colorModeToggle {
display: none;
}
diff --git a/node_modules/@docusaurus/theme-classic/lib/theme/Navbar/Search/styles.module.css b/node_modules/@docusaurus/theme-classic/lib/theme/Navbar/Search/styles.module.css
index ca386bb..d062be9 100644
--- a/node_modules/@docusaurus/theme-classic/lib/theme/Navbar/Search/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/lib/theme/Navbar/Search/styles.module.css
@@ -5,14 +5,14 @@
* LICENSE file in the root directory of this source tree.
*/
-@media (max-width: 996px) {
+@media (max-width: 1280px) {
.searchBox {
position: absolute;
right: var(--ifm-navbar-padding-horizontal);
}
}
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.searchBox {
padding: var(--ifm-navbar-item-padding-vertical)
var(--ifm-navbar-item-padding-horizontal);
diff --git a/node_modules/@docusaurus/theme-classic/lib/theme/TOC/styles.module.css b/node_modules/@docusaurus/theme-classic/lib/theme/TOC/styles.module.css
index 4b6d2bc..397a75f 100644
--- a/node_modules/@docusaurus/theme-classic/lib/theme/TOC/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/lib/theme/TOC/styles.module.css
@@ -12,7 +12,7 @@
top: calc(var(--ifm-navbar-height) + 1rem);
}
-@media (max-width: 996px) {
+@media (max-width: 1280px) {
.tableOfContents {
display: none;
}
diff --git a/node_modules/@docusaurus/theme-classic/src/theme/AnnouncementBar/styles.module.css b/node_modules/@docusaurus/theme-classic/src/theme/AnnouncementBar/styles.module.css
index b09a217..930ede8 100644
--- a/node_modules/@docusaurus/theme-classic/src/theme/AnnouncementBar/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/src/theme/AnnouncementBar/styles.module.css
@@ -50,7 +50,7 @@ html[data-announcement-bar-initially-dismissed='true'] .announcementBar {
}
}
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
:root {
--docusaurus-announcement-bar-height: 30px;
}
diff --git a/node_modules/@docusaurus/theme-classic/src/theme/BlogSidebar/Desktop/styles.module.css b/node_modules/@docusaurus/theme-classic/src/theme/BlogSidebar/Desktop/styles.module.css
index 73a4bd8..008a8f8 100644
--- a/node_modules/@docusaurus/theme-classic/src/theme/BlogSidebar/Desktop/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/src/theme/BlogSidebar/Desktop/styles.module.css
@@ -38,7 +38,7 @@
color: var(--ifm-color-primary) !important;
}
-@media (max-width: 996px) {
+@media (max-width: 1280px) {
.sidebar {
display: none;
}
diff --git a/node_modules/@docusaurus/theme-classic/src/theme/DocCategoryGeneratedIndexPage/styles.module.css b/node_modules/@docusaurus/theme-classic/src/theme/DocCategoryGeneratedIndexPage/styles.module.css
index 94f6951..17ac45f 100644
--- a/node_modules/@docusaurus/theme-classic/src/theme/DocCategoryGeneratedIndexPage/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/src/theme/DocCategoryGeneratedIndexPage/styles.module.css
@@ -5,7 +5,7 @@
* LICENSE file in the root directory of this source tree.
*/
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.generatedIndexPage {
max-width: 75% !important;
}
diff --git a/node_modules/@docusaurus/theme-classic/src/theme/DocItem/Footer/styles.module.css b/node_modules/@docusaurus/theme-classic/src/theme/DocItem/Footer/styles.module.css
index 7d05961..1f94f96 100644
--- a/node_modules/@docusaurus/theme-classic/src/theme/DocItem/Footer/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/src/theme/DocItem/Footer/styles.module.css
@@ -11,7 +11,7 @@
font-size: smaller;
}
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.lastUpdated {
text-align: right;
}
diff --git a/node_modules/@docusaurus/theme-classic/src/theme/DocItem/Layout/styles.module.css b/node_modules/@docusaurus/theme-classic/src/theme/DocItem/Layout/styles.module.css
index 784498f..e7c6043 100644
--- a/node_modules/@docusaurus/theme-classic/src/theme/DocItem/Layout/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/src/theme/DocItem/Layout/styles.module.css
@@ -10,7 +10,7 @@
margin-top: 0;
}
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.docItemCol {
max-width: 75% !important;
}
diff --git a/node_modules/@docusaurus/theme-classic/src/theme/DocItem/TOC/Mobile/styles.module.css b/node_modules/@docusaurus/theme-classic/src/theme/DocItem/TOC/Mobile/styles.module.css
index a298b5e..e110e10 100644
--- a/node_modules/@docusaurus/theme-classic/src/theme/DocItem/TOC/Mobile/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/src/theme/DocItem/TOC/Mobile/styles.module.css
@@ -5,7 +5,7 @@
* LICENSE file in the root directory of this source tree.
*/
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
/* Prevent hydration FOUC, as the mobile TOC needs to be server-rendered */
.tocMobile {
display: none;
diff --git a/node_modules/@docusaurus/theme-classic/src/theme/DocPage/Layout/Main/styles.module.css b/node_modules/@docusaurus/theme-classic/src/theme/DocPage/Layout/Main/styles.module.css
index 0754fce..aef0565 100644
--- a/node_modules/@docusaurus/theme-classic/src/theme/DocPage/Layout/Main/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/src/theme/DocPage/Layout/Main/styles.module.css
@@ -10,7 +10,7 @@
width: 100%;
}
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.docMainContainer {
flex-grow: 1;
max-width: calc(100% - var(--doc-sidebar-width));
diff --git a/node_modules/@docusaurus/theme-classic/src/theme/DocPage/Layout/Sidebar/ExpandButton/styles.module.css b/node_modules/@docusaurus/theme-classic/src/theme/DocPage/Layout/Sidebar/ExpandButton/styles.module.css
index f5e5878..23390f9 100644
--- a/node_modules/@docusaurus/theme-classic/src/theme/DocPage/Layout/Sidebar/ExpandButton/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/src/theme/DocPage/Layout/Sidebar/ExpandButton/styles.module.css
@@ -5,7 +5,7 @@
* LICENSE file in the root directory of this source tree.
*/
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.expandButton {
position: absolute;
top: 0;
diff --git a/node_modules/@docusaurus/theme-classic/src/theme/DocPage/Layout/Sidebar/styles.module.css b/node_modules/@docusaurus/theme-classic/src/theme/DocPage/Layout/Sidebar/styles.module.css
index f821f3b..3321785 100644
--- a/node_modules/@docusaurus/theme-classic/src/theme/DocPage/Layout/Sidebar/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/src/theme/DocPage/Layout/Sidebar/styles.module.css
@@ -14,7 +14,7 @@
display: none;
}
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.docSidebarContainer {
display: block;
width: var(--doc-sidebar-width);
diff --git a/node_modules/@docusaurus/theme-classic/src/theme/DocSidebar/Desktop/CollapseButton/styles.module.css b/node_modules/@docusaurus/theme-classic/src/theme/DocSidebar/Desktop/CollapseButton/styles.module.css
index c5026e8..901e80d 100644
--- a/node_modules/@docusaurus/theme-classic/src/theme/DocSidebar/Desktop/CollapseButton/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/src/theme/DocSidebar/Desktop/CollapseButton/styles.module.css
@@ -15,7 +15,7 @@
--docusaurus-collapse-button-bg-hover: rgb(255 255 255 / 10%);
}
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.collapseSidebarButton {
display: block !important;
background-color: var(--docusaurus-collapse-button-bg);
diff --git a/node_modules/@docusaurus/theme-classic/src/theme/DocSidebar/Desktop/Content/styles.module.css b/node_modules/@docusaurus/theme-classic/src/theme/DocSidebar/Desktop/Content/styles.module.css
index 8d8c7e8..7cadb05 100644
--- a/node_modules/@docusaurus/theme-classic/src/theme/DocSidebar/Desktop/Content/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/src/theme/DocSidebar/Desktop/Content/styles.module.css
@@ -5,7 +5,7 @@
* LICENSE file in the root directory of this source tree.
*/
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.menu {
flex-grow: 1;
padding: 0.5rem;
diff --git a/node_modules/@docusaurus/theme-classic/src/theme/DocSidebar/Desktop/styles.module.css b/node_modules/@docusaurus/theme-classic/src/theme/DocSidebar/Desktop/styles.module.css
index e4d4724..8c16f4a 100644
--- a/node_modules/@docusaurus/theme-classic/src/theme/DocSidebar/Desktop/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/src/theme/DocSidebar/Desktop/styles.module.css
@@ -5,7 +5,7 @@
* LICENSE file in the root directory of this source tree.
*/
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.sidebar {
display: flex;
flex-direction: column;
diff --git a/node_modules/@docusaurus/theme-classic/src/theme/DocSidebarItem/Html/styles.module.css b/node_modules/@docusaurus/theme-classic/src/theme/DocSidebarItem/Html/styles.module.css
index 2bb6934..af33ba1 100644
--- a/node_modules/@docusaurus/theme-classic/src/theme/DocSidebarItem/Html/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/src/theme/DocSidebarItem/Html/styles.module.css
@@ -5,7 +5,7 @@
* LICENSE file in the root directory of this source tree.
*/
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.menuHtmlItem {
padding: var(--ifm-menu-link-padding-vertical)
var(--ifm-menu-link-padding-horizontal);
diff --git a/node_modules/@docusaurus/theme-classic/src/theme/Navbar/Content/styles.module.css b/node_modules/@docusaurus/theme-classic/src/theme/Navbar/Content/styles.module.css
index 3cbe701..b504f5c 100644
--- a/node_modules/@docusaurus/theme-classic/src/theme/Navbar/Content/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/src/theme/Navbar/Content/styles.module.css
@@ -8,7 +8,7 @@
/*
Hide color mode toggle in small viewports
*/
-@media (max-width: 996px) {
+@media (max-width: 1280px) {
.colorModeToggle {
display: none;
}
diff --git a/node_modules/@docusaurus/theme-classic/src/theme/Navbar/Search/styles.module.css b/node_modules/@docusaurus/theme-classic/src/theme/Navbar/Search/styles.module.css
index ca386bb..d062be9 100644
--- a/node_modules/@docusaurus/theme-classic/src/theme/Navbar/Search/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/src/theme/Navbar/Search/styles.module.css
@@ -5,14 +5,14 @@
* LICENSE file in the root directory of this source tree.
*/
-@media (max-width: 996px) {
+@media (max-width: 1280px) {
.searchBox {
position: absolute;
right: var(--ifm-navbar-padding-horizontal);
}
}
-@media (min-width: 997px) {
+@media (min-width: 1281px) {
.searchBox {
padding: var(--ifm-navbar-item-padding-vertical)
var(--ifm-navbar-item-padding-horizontal);
diff --git a/node_modules/@docusaurus/theme-classic/src/theme/TOC/styles.module.css b/node_modules/@docusaurus/theme-classic/src/theme/TOC/styles.module.css
index 4b6d2bc..397a75f 100644
--- a/node_modules/@docusaurus/theme-classic/src/theme/TOC/styles.module.css
+++ b/node_modules/@docusaurus/theme-classic/src/theme/TOC/styles.module.css
@@ -12,7 +12,7 @@
top: calc(var(--ifm-navbar-height) + 1rem);
}
-@media (max-width: 996px) {
+@media (max-width: 1280px) {
.tableOfContents {
display: none;
}


@@ -0,0 +1,13 @@
diff --git a/node_modules/@docusaurus/theme-common/lib/hooks/useWindowSize.js b/node_modules/@docusaurus/theme-common/lib/hooks/useWindowSize.js
index 2c9dcb0..13eda59 100644
--- a/node_modules/@docusaurus/theme-common/lib/hooks/useWindowSize.js
+++ b/node_modules/@docusaurus/theme-common/lib/hooks/useWindowSize.js
@@ -11,7 +11,7 @@ const windowSizes = {
mobile: 'mobile',
ssr: 'ssr',
};
-const DesktopThresholdWidth = 996;
+const DesktopThresholdWidth = 1280;
function getWindowSize() {
if (!ExecutionEnvironment.canUseDOM) {
return windowSizes.ssr;


@@ -0,0 +1,13 @@
diff --git a/node_modules/infima/dist/css/default/default.css b/node_modules/infima/dist/css/default/default.css
index 3ffe402..515944f 100644
--- a/node_modules/infima/dist/css/default/default.css
+++ b/node_modules/infima/dist/css/default/default.css
@@ -2930,7 +2930,7 @@ html[data-theme='dark'] {
}
}
-@media (max-width: 996px) {
+@media (max-width: 1280px) {
.col {
--ifm-col-width: 100%;
flex-basis: var(--ifm-col-width);


@@ -492,10 +492,18 @@
"source": "/docs/integrations/providers/cassandra",
"destination": "/docs/integrations/providers/astradb"
},
{
"source": "/docs/integrations/vectorstores/vectorstores/semadb",
"destination": "/docs/integrations/vectorstores/semadb"
},
{
"source": "/docs/integrations/vectorstores/cassandra",
"destination": "/docs/integrations/vectorstores/astradb"
},
{
"source": "/docs/integrations/vectorstores/async_faiss",
"destination": "/docs/integrations/vectorstores/faiss_async"
},
{
"source": "/docs/integrations/cerebriumai",
"destination": "/docs/integrations/providers/cerebriumai"


@@ -7,7 +7,7 @@ from langchain_cli.namespaces import app as app_namespace
from langchain_cli.namespaces import template as template_namespace
from langchain_cli.utils.packages import get_langserve_export, get_package_root
__version__ = "0.0.17"
__version__ = "0.0.18"
app = typer.Typer(no_args_is_help=True, add_completion=False)
app.add_typer(


@@ -110,6 +110,10 @@ def new(
     readme_contents = readme.read_text()
     readme.write_text(readme_contents.replace("__app_name__", app_name))

+    pyproject = destination_dir / "pyproject.toml"
+    pyproject_contents = pyproject.read_text()
+    pyproject.write_text(pyproject_contents.replace("__app_name__", app_name))
+
     # add packages if specified
     if has_packages:
         add(package, project_dir=destination_dir, pip=pip_bool)
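
This hunk extends the placeholder stamping `langchain app new` already performed on README.md to pyproject.toml as well. In isolation the substitution amounts to something like this (a sketch, not the CLI's exact code):

    from pathlib import Path

    def fill_app_name(destination_dir: Path, app_name: str) -> None:
        # Replace the __app_name__ placeholder in every templated file,
        # mirroring the readme and pyproject handling in the hunk above.
        for filename in ("README.md", "pyproject.toml"):
            path = destination_dir / filename
            path.write_text(path.read_text().replace("__app_name__", app_name))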


@@ -4,11 +4,15 @@ version = "0.1.0"
description = ""
authors = ["Your Name <you@example.com>"]
readme = "README.md"
packages = [
{ include = "app" },
]
[tool.poetry.dependencies]
python = "^3.11"
uvicorn = "^0.23.2"
langserve = {extras = ["server"], version = ">=0.0.22"}
pydantic = "<2"
[tool.poetry.group.dev.dependencies]


@@ -1,6 +1,6 @@
[tool.poetry]
name = "langchain-cli"
version = "0.0.17"
version = "0.0.18"
description = "CLI for interacting with LangChain"
authors = ["Erick Friis <erick@langchain.dev>"]
readme = "README.md"


@@ -44,7 +44,12 @@ class OpenAIFunctionsAgentOutputParser(AgentOutputParser):
         if function_call:
             function_name = function_call["name"]
             try:
-                _tool_input = json.loads(function_call["arguments"])
+                if len(function_call["arguments"].strip()) == 0:
+                    # OpenAI returns an empty string for functions containing no args
+                    _tool_input = {}
+                else:
+                    # otherwise it returns a json object
+                    _tool_input = json.loads(function_call["arguments"])
             except JSONDecodeError:
                 raise OutputParserException(
                     f"Could not parse tool input: {function_call} because "


@@ -1,599 +1,28 @@
"""Base callback handler that can be used to handle callbacks in langchain."""
from __future__ import annotations
from typing import TYPE_CHECKING, Any, Dict, List, Optional, Sequence, TypeVar, Union
from uuid import UUID
from tenacity import RetryCallState
if TYPE_CHECKING:
from langchain.schema.agent import AgentAction, AgentFinish
from langchain.schema.document import Document
from langchain.schema.messages import BaseMessage
from langchain.schema.output import ChatGenerationChunk, GenerationChunk, LLMResult
class RetrieverManagerMixin:
"""Mixin for Retriever callbacks."""
def on_retriever_error(
self,
error: BaseException,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run when Retriever errors."""
def on_retriever_end(
self,
documents: Sequence[Document],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run when Retriever ends running."""
class LLMManagerMixin:
"""Mixin for LLM callbacks."""
def on_llm_new_token(
self,
token: str,
*,
chunk: Optional[Union[GenerationChunk, ChatGenerationChunk]] = None,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run on new LLM token. Only available when streaming is enabled.
Args:
token (str): The new token.
chunk (GenerationChunk | ChatGenerationChunk): The new generated chunk,
containing content and other information.
"""
def on_llm_end(
self,
response: LLMResult,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run when LLM ends running."""
def on_llm_error(
self,
error: BaseException,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run when LLM errors."""
class ChainManagerMixin:
"""Mixin for chain callbacks."""
def on_chain_end(
self,
outputs: Dict[str, Any],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run when chain ends running."""
def on_chain_error(
self,
error: BaseException,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run when chain errors."""
def on_agent_action(
self,
action: AgentAction,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run on agent action."""
def on_agent_finish(
self,
finish: AgentFinish,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run on agent end."""
class ToolManagerMixin:
"""Mixin for tool callbacks."""
def on_tool_end(
self,
output: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run when tool ends running."""
def on_tool_error(
self,
error: BaseException,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run when tool errors."""
class CallbackManagerMixin:
"""Mixin for callback manager."""
def on_llm_start(
self,
serialized: Dict[str, Any],
prompts: List[str],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> Any:
"""Run when LLM starts running."""
def on_chat_model_start(
self,
serialized: Dict[str, Any],
messages: List[List[BaseMessage]],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> Any:
"""Run when a chat model starts running."""
raise NotImplementedError(
f"{self.__class__.__name__} does not implement `on_chat_model_start`"
)
def on_retriever_start(
self,
serialized: Dict[str, Any],
query: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> Any:
"""Run when Retriever starts running."""
def on_chain_start(
self,
serialized: Dict[str, Any],
inputs: Dict[str, Any],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> Any:
"""Run when chain starts running."""
def on_tool_start(
self,
serialized: Dict[str, Any],
input_str: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> Any:
"""Run when tool starts running."""
class RunManagerMixin:
"""Mixin for run manager."""
def on_text(
self,
text: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run on arbitrary text."""
def on_retry(
self,
retry_state: RetryCallState,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run on a retry event."""
-class BaseCallbackHandler(
-    LLMManagerMixin,
-    ChainManagerMixin,
-    ToolManagerMixin,
-    RetrieverManagerMixin,
+from langchain.schema.callbacks.base import (
+    AsyncCallbackHandler,
+    BaseCallbackHandler,
+    BaseCallbackManager,
     CallbackManagerMixin,
+    Callbacks,
+    ChainManagerMixin,
+    LLMManagerMixin,
+    RetrieverManagerMixin,
     RunManagerMixin,
-):
-    """Base callback handler that handles callbacks from LangChain."""
+    ToolManagerMixin,
+)
raise_error: bool = False
run_inline: bool = False
@property
def ignore_llm(self) -> bool:
"""Whether to ignore LLM callbacks."""
return False
@property
def ignore_retry(self) -> bool:
"""Whether to ignore retry callbacks."""
return False
@property
def ignore_chain(self) -> bool:
"""Whether to ignore chain callbacks."""
return False
@property
def ignore_agent(self) -> bool:
"""Whether to ignore agent callbacks."""
return False
@property
def ignore_retriever(self) -> bool:
"""Whether to ignore retriever callbacks."""
return False
@property
def ignore_chat_model(self) -> bool:
"""Whether to ignore chat model callbacks."""
return False
class AsyncCallbackHandler(BaseCallbackHandler):
"""Async callback handler that handles callbacks from LangChain."""
async def on_llm_start(
self,
serialized: Dict[str, Any],
prompts: List[str],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> None:
"""Run when LLM starts running."""
async def on_chat_model_start(
self,
serialized: Dict[str, Any],
messages: List[List[BaseMessage]],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> Any:
"""Run when a chat model starts running."""
raise NotImplementedError(
f"{self.__class__.__name__} does not implement `on_chat_model_start`"
)
async def on_llm_new_token(
self,
token: str,
*,
chunk: Optional[Union[GenerationChunk, ChatGenerationChunk]] = None,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run on new LLM token. Only available when streaming is enabled."""
async def on_llm_end(
self,
response: LLMResult,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run when LLM ends running."""
async def on_llm_error(
self,
error: BaseException,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run when LLM errors."""
async def on_chain_start(
self,
serialized: Dict[str, Any],
inputs: Dict[str, Any],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> None:
"""Run when chain starts running."""
async def on_chain_end(
self,
outputs: Dict[str, Any],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run when chain ends running."""
async def on_chain_error(
self,
error: BaseException,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run when chain errors."""
async def on_tool_start(
self,
serialized: Dict[str, Any],
input_str: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> None:
"""Run when tool starts running."""
async def on_tool_end(
self,
output: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run when tool ends running."""
async def on_tool_error(
self,
error: BaseException,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run when tool errors."""
async def on_text(
self,
text: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run on arbitrary text."""
async def on_retry(
self,
retry_state: RetryCallState,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run on a retry event."""
async def on_agent_action(
self,
action: AgentAction,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run on agent action."""
async def on_agent_finish(
self,
finish: AgentFinish,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run on agent end."""
async def on_retriever_start(
self,
serialized: Dict[str, Any],
query: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> None:
"""Run on retriever start."""
async def on_retriever_end(
self,
documents: Sequence[Document],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run on retriever end."""
async def on_retriever_error(
self,
error: BaseException,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run on retriever error."""
T = TypeVar("T", bound="BaseCallbackManager")
class BaseCallbackManager(CallbackManagerMixin):
"""Base callback manager that handles callbacks from LangChain."""
def __init__(
self,
handlers: List[BaseCallbackHandler],
inheritable_handlers: Optional[List[BaseCallbackHandler]] = None,
parent_run_id: Optional[UUID] = None,
*,
tags: Optional[List[str]] = None,
inheritable_tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
inheritable_metadata: Optional[Dict[str, Any]] = None,
) -> None:
"""Initialize callback manager."""
self.handlers: List[BaseCallbackHandler] = handlers
self.inheritable_handlers: List[BaseCallbackHandler] = (
inheritable_handlers or []
)
self.parent_run_id: Optional[UUID] = parent_run_id
self.tags = tags or []
self.inheritable_tags = inheritable_tags or []
self.metadata = metadata or {}
self.inheritable_metadata = inheritable_metadata or {}
def copy(self: T) -> T:
"""Copy the callback manager."""
return self.__class__(
handlers=self.handlers,
inheritable_handlers=self.inheritable_handlers,
parent_run_id=self.parent_run_id,
tags=self.tags,
inheritable_tags=self.inheritable_tags,
metadata=self.metadata,
inheritable_metadata=self.inheritable_metadata,
)
@property
def is_async(self) -> bool:
"""Whether the callback manager is async."""
return False
def add_handler(self, handler: BaseCallbackHandler, inherit: bool = True) -> None:
"""Add a handler to the callback manager."""
if handler not in self.handlers:
self.handlers.append(handler)
if inherit and handler not in self.inheritable_handlers:
self.inheritable_handlers.append(handler)
def remove_handler(self, handler: BaseCallbackHandler) -> None:
"""Remove a handler from the callback manager."""
self.handlers.remove(handler)
self.inheritable_handlers.remove(handler)
def set_handlers(
self, handlers: List[BaseCallbackHandler], inherit: bool = True
) -> None:
"""Set handlers as the only handlers on the callback manager."""
self.handlers = []
self.inheritable_handlers = []
for handler in handlers:
self.add_handler(handler, inherit=inherit)
def set_handler(self, handler: BaseCallbackHandler, inherit: bool = True) -> None:
"""Set handler as the only handler on the callback manager."""
self.set_handlers([handler], inherit=inherit)
def add_tags(self, tags: List[str], inherit: bool = True) -> None:
for tag in tags:
if tag in self.tags:
self.remove_tags([tag])
self.tags.extend(tags)
if inherit:
self.inheritable_tags.extend(tags)
def remove_tags(self, tags: List[str]) -> None:
for tag in tags:
self.tags.remove(tag)
self.inheritable_tags.remove(tag)
def add_metadata(self, metadata: Dict[str, Any], inherit: bool = True) -> None:
self.metadata.update(metadata)
if inherit:
self.inheritable_metadata.update(metadata)
def remove_metadata(self, keys: List[str]) -> None:
for key in keys:
self.metadata.pop(key)
self.inheritable_metadata.pop(key)
Callbacks = Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]]
__all__ = [
"RetrieverManagerMixin",
"LLMManagerMixin",
"ChainManagerMixin",
"ToolManagerMixin",
"CallbackManagerMixin",
"RunManagerMixin",
"BaseCallbackHandler",
"AsyncCallbackHandler",
"BaseCallbackManager",
"Callbacks",
]
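
After this change the module is a pure re-export, so the old `langchain.callbacks.base` path and the new `langchain.schema.callbacks.base` path resolve to the same classes. An illustrative handler written against the re-exported base (a sketch, not code from these commits):

    from typing import Any

    from langchain.callbacks.base import BaseCallbackHandler  # still valid post-move

    class TokenPrinter(BaseCallbackHandler):
        """Print each streamed LLM token as it arrives."""

        def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
            print(token, end="", flush=True)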


@@ -4,7 +4,7 @@ import os
import traceback
import warnings
from contextvars import ContextVar
-from typing import Any, Dict, List, Literal, Union
+from typing import Any, Dict, List, Union, cast
from uuid import UUID
import requests
@@ -15,11 +15,30 @@ from langchain.schema.agent import AgentAction, AgentFinish
from langchain.schema.messages import BaseMessage
from langchain.schema.output import LLMResult
+logger = logging.getLogger(__name__)
+
 DEFAULT_API_URL = "https://app.llmonitor.com"

 user_ctx = ContextVar[Union[str, None]]("user_ctx", default=None)
 user_props_ctx = ContextVar[Union[str, None]]("user_props_ctx", default=None)

+PARAMS_TO_CAPTURE = [
+    "temperature",
+    "top_p",
+    "top_k",
+    "stop",
+    "presence_penalty",
+    "frequence_penalty",
+    "seed",
+    "function_call",
+    "functions",
+    "tools",
+    "tool_choice",
+    "response_format",
+    "max_tokens",
+    "logit_bias",
+]
class UserContextManager:
"""Context manager for LLMonitor user context."""
@@ -66,6 +85,10 @@ def _parse_input(raw_input: Any) -> Any:
     if not raw_input:
         return None

+    # if it's an array of 1, just parse the first element
+    if isinstance(raw_input, list) and len(raw_input) == 1:
+        return _parse_input(raw_input[0])
+
     if not isinstance(raw_input, dict):
         return _serialize(raw_input)
@@ -115,17 +138,11 @@ def _parse_output(raw_output: dict) -> Any:
 def _parse_lc_role(
     role: str,
-) -> Union[Literal["user", "ai", "system", "function"], None]:
+) -> str:
     if role == "human":
         return "user"
-    elif role == "ai":
-        return "ai"
-    elif role == "system":
-        return "system"
-    elif role == "function":
-        return "function"
-    else:
-        return None
+    return role
def _get_user_id(metadata: Any) -> Any:
@@ -148,13 +165,15 @@ def _get_user_props(metadata: Any) -> Any:
 def _parse_lc_message(message: BaseMessage) -> Dict[str, Any]:
+    keys = ["function_call", "tool_calls", "tool_call_id", "name"]
     parsed = {"text": message.content, "role": _parse_lc_role(message.type)}

-    function_call = (message.additional_kwargs or {}).get("function_call")
-    if function_call is not None:
-        parsed["functionCall"] = function_call
+    parsed.update(
+        {
+            key: cast(Any, message.additional_kwargs.get(key))
+            for key in keys
+            if message.additional_kwargs.get(key) is not None
+        }
+    )
     return parsed
@@ -213,19 +232,20 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
self.__track_event = llmonitor.track_event
except ImportError:
-            warnings.warn(
+            logger.warning(
"""[LLMonitor] To use the LLMonitor callback handler you need to
have the `llmonitor` Python package installed. Please install it
with `pip install llmonitor`"""
)
self.__has_valid_config = False
return
-        if parse(self.__llmonitor_version) < parse("0.0.20"):
-            warnings.warn(
+        if parse(self.__llmonitor_version) < parse("0.0.32"):
+            logger.warning(
                 f"""[LLMonitor] The installed `llmonitor` version is
-                {self.__llmonitor_version} but `LLMonitorCallbackHandler` requires
-                at least version 0.0.20 upgrade `llmonitor` with `pip install
-                --upgrade llmonitor`"""
+                {self.__llmonitor_version}
+                but `LLMonitorCallbackHandler` requires at least version 0.0.32
+                upgrade `llmonitor` with `pip install --upgrade llmonitor`"""
)
self.__has_valid_config = False
@@ -236,9 +256,9 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
_app_id = app_id or os.getenv("LLMONITOR_APP_ID")
if _app_id is None:
-            warnings.warn(
-                """[LLMonitor] app_id must be provided either as an argument or as
-                an environment variable"""
+            logger.warning(
+                """[LLMonitor] app_id must be provided either as an argument or
+                as an environment variable"""
)
self.__has_valid_config = False
else:
@@ -252,7 +272,7 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
if not res.ok:
raise ConnectionError()
except Exception:
-            warnings.warn(
+            logger.warning(
f"""[LLMonitor] Could not connect to the LLMonitor API at
{self.__api_url}"""
)
@@ -273,7 +293,27 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
         try:
             user_id = _get_user_id(metadata)
             user_props = _get_user_props(metadata)

-            name = kwargs.get("invocation_params", {}).get("model_name")
+            params = kwargs.get("invocation_params", {})
+            params.update(
+                serialized.get("kwargs", {})
+            )  # Sometimes, for example with ChatAnthropic, `invocation_params` is empty
+
+            name = (
+                params.get("model")
+                or params.get("model_name")
+                or params.get("model_id")
+            )
+
+            if not name and "anthropic" in params.get("_type"):
+                name = "claude-2"
+
+            extra = {
+                param: params.get(param)
+                for param in PARAMS_TO_CAPTURE
+                if params.get(param) is not None
+            }
+
             input = _parse_input(prompts)
self.__track_event(
@@ -285,8 +325,10 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
                 name=name,
                 input=input,
                 tags=tags,
+                extra=extra,
                 metadata=metadata,
                 user_props=user_props,
+                app_id=self.__app_id,
             )
         except Exception as e:
             warnings.warn(f"[LLMonitor] An error occurred in on_llm_start: {e}")
@@ -304,10 +346,31 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
) -> Any:
if self.__has_valid_config is False:
return
         try:
             user_id = _get_user_id(metadata)
             user_props = _get_user_props(metadata)

-            name = kwargs.get("invocation_params", {}).get("model_name")
+            params = kwargs.get("invocation_params", {})
+            params.update(
+                serialized.get("kwargs", {})
+            )  # Sometimes, for example with ChatAnthropic, `invocation_params` is empty
+
+            name = (
+                params.get("model")
+                or params.get("model_name")
+                or params.get("model_id")
+            )
+
+            if not name and "anthropic" in params.get("_type"):
+                name = "claude-2"
+
+            extra = {
+                param: params.get(param)
+                for param in PARAMS_TO_CAPTURE
+                if params.get(param) is not None
+            }
+
             input = _parse_lc_messages(messages[0])
self.__track_event(
@@ -319,13 +382,13 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
                 name=name,
                 input=input,
                 tags=tags,
+                extra=extra,
                 metadata=metadata,
                 user_props=user_props,
+                app_id=self.__app_id,
             )
         except Exception as e:
-            logging.warning(
-                f"[LLMonitor] An error occurred in on_chat_model_start: {e}"
-            )
+            logger.error(f"[LLMonitor] An error occurred in on_chat_model_start: {e}")
def on_llm_end(
self,
@@ -340,25 +403,18 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
try:
token_usage = (response.llm_output or {}).get("token_usage", {})
-            parsed_output = [
-                {
-                    "text": generation.text,
-                    "role": "ai",
-                    **(
-                        {
-                            "functionCall": generation.message.additional_kwargs[
-                                "function_call"
-                            ]
-                        }
-                        if hasattr(generation, "message")
-                        and hasattr(generation.message, "additional_kwargs")
-                        and "function_call" in generation.message.additional_kwargs
-                        else {}
-                    ),
-                }
+            parsed_output: Any = [
+                _parse_lc_message(generation.message)
+                if hasattr(generation, "message")
+                else generation.text
                 for generation in response.generations[0]
             ]
+
+            # if it's an array of 1, just parse the first element
+            if len(parsed_output) == 1:
+                parsed_output = parsed_output[0]
self.__track_event(
"llm",
"end",
@@ -369,9 +425,10 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
"prompt": token_usage.get("prompt_tokens"),
"completion": token_usage.get("completion_tokens"),
},
+                app_id=self.__app_id,
             )
         except Exception as e:
-            warnings.warn(f"[LLMonitor] An error occurred in on_llm_end: {e}")
+            logger.error(f"[LLMonitor] An error occurred in on_llm_end: {e}")
def on_tool_start(
self,
@@ -402,9 +459,10 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
tags=tags,
metadata=metadata,
user_props=user_props,
+                app_id=self.__app_id,
             )
         except Exception as e:
-            warnings.warn(f"[LLMonitor] An error occurred in on_tool_start: {e}")
+            logger.error(f"[LLMonitor] An error occurred in on_tool_start: {e}")
def on_tool_end(
self,
@@ -424,9 +482,10 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
run_id=str(run_id),
parent_run_id=str(parent_run_id) if parent_run_id else None,
output=output,
+                app_id=self.__app_id,
             )
         except Exception as e:
-            warnings.warn(f"[LLMonitor] An error occurred in on_tool_end: {e}")
+            logger.error(f"[LLMonitor] An error occurred in on_tool_end: {e}")
def on_chain_start(
self,
@@ -473,9 +532,10 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
tags=tags,
metadata=metadata,
user_props=user_props,
+                app_id=self.__app_id,
             )
         except Exception as e:
-            warnings.warn(f"[LLMonitor] An error occurred in on_chain_start: {e}")
+            logger.error(f"[LLMonitor] An error occurred in on_chain_start: {e}")
def on_chain_end(
self,
@@ -496,9 +556,10 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
run_id=str(run_id),
parent_run_id=str(parent_run_id) if parent_run_id else None,
output=output,
+                app_id=self.__app_id,
             )
         except Exception as e:
-            logging.warning(f"[LLMonitor] An error occurred in on_chain_end: {e}")
+            logger.error(f"[LLMonitor] An error occurred in on_chain_end: {e}")
def on_agent_action(
self,
@@ -521,9 +582,10 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
parent_run_id=str(parent_run_id) if parent_run_id else None,
name=name,
input=input,
+                app_id=self.__app_id,
             )
         except Exception as e:
-            logging.warning(f"[LLMonitor] An error occurred in on_agent_action: {e}")
+            logger.error(f"[LLMonitor] An error occurred in on_agent_action: {e}")
def on_agent_finish(
self,
@@ -544,9 +606,10 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
run_id=str(run_id),
parent_run_id=str(parent_run_id) if parent_run_id else None,
output=output,
+                app_id=self.__app_id,
             )
         except Exception as e:
-            logging.warning(f"[LLMonitor] An error occurred in on_agent_finish: {e}")
+            logger.error(f"[LLMonitor] An error occurred in on_agent_finish: {e}")
def on_chain_error(
self,
@@ -565,9 +628,10 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
run_id=str(run_id),
parent_run_id=str(parent_run_id) if parent_run_id else None,
error={"message": str(error), "stack": traceback.format_exc()},
+                app_id=self.__app_id,
             )
         except Exception as e:
-            logging.warning(f"[LLMonitor] An error occurred in on_chain_error: {e}")
+            logger.error(f"[LLMonitor] An error occurred in on_chain_error: {e}")
def on_tool_error(
self,
@@ -586,9 +650,10 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
run_id=str(run_id),
parent_run_id=str(parent_run_id) if parent_run_id else None,
error={"message": str(error), "stack": traceback.format_exc()},
+                app_id=self.__app_id,
             )
         except Exception as e:
-            logging.warning(f"[LLMonitor] An error occurred in on_tool_error: {e}")
+            logger.error(f"[LLMonitor] An error occurred in on_tool_error: {e}")
def on_llm_error(
self,
@@ -607,9 +672,10 @@ class LLMonitorCallbackHandler(BaseCallbackHandler):
run_id=str(run_id),
parent_run_id=str(parent_run_id) if parent_run_id else None,
error={"message": str(error), "stack": traceback.format_exc()},
+                app_id=self.__app_id,
             )
         except Exception as e:
-            logging.warning(f"[LLMonitor] An error occurred in on_llm_error: {e}")
+            logger.error(f"[LLMonitor] An error occurred in on_llm_error: {e}")
__all__ = ["LLMonitorCallbackHandler", "identify"]

(File diff suppressed because it is too large.)


@@ -1,97 +1,3 @@
"""Callback Handler that prints to std out."""
from typing import Any, Dict, List, Optional
from langchain.schema.callbacks.stdout import StdOutCallbackHandler
from langchain.callbacks.base import BaseCallbackHandler
from langchain.schema import AgentAction, AgentFinish, LLMResult
from langchain.utils.input import print_text
class StdOutCallbackHandler(BaseCallbackHandler):
"""Callback Handler that prints to std out."""
def __init__(self, color: Optional[str] = None) -> None:
"""Initialize callback handler."""
self.color = color
def on_llm_start(
self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
) -> None:
"""Print out the prompts."""
pass
def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
"""Do nothing."""
pass
def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
"""Do nothing."""
pass
def on_llm_error(self, error: BaseException, **kwargs: Any) -> None:
"""Do nothing."""
pass
def on_chain_start(
self, serialized: Dict[str, Any], inputs: Dict[str, Any], **kwargs: Any
) -> None:
"""Print out that we are entering a chain."""
class_name = serialized.get("name", serialized.get("id", ["<unknown>"])[-1])
print(f"\n\n\033[1m> Entering new {class_name} chain...\033[0m")
def on_chain_end(self, outputs: Dict[str, Any], **kwargs: Any) -> None:
"""Print out that we finished a chain."""
print("\n\033[1m> Finished chain.\033[0m")
def on_chain_error(self, error: BaseException, **kwargs: Any) -> None:
"""Do nothing."""
pass
def on_tool_start(
self,
serialized: Dict[str, Any],
input_str: str,
**kwargs: Any,
) -> None:
"""Do nothing."""
pass
def on_agent_action(
self, action: AgentAction, color: Optional[str] = None, **kwargs: Any
) -> Any:
"""Run on agent action."""
print_text(action.log, color=color or self.color)
def on_tool_end(
self,
output: str,
color: Optional[str] = None,
observation_prefix: Optional[str] = None,
llm_prefix: Optional[str] = None,
**kwargs: Any,
) -> None:
"""If not the final action, print out observation."""
if observation_prefix is not None:
print_text(f"\n{observation_prefix}")
print_text(output, color=color or self.color)
if llm_prefix is not None:
print_text(f"\n{llm_prefix}")
def on_tool_error(self, error: BaseException, **kwargs: Any) -> None:
"""Do nothing."""
pass
def on_text(
self,
text: str,
color: Optional[str] = None,
end: str = "",
**kwargs: Any,
) -> None:
"""Run when agent ends."""
print_text(text, color=color or self.color, end=end)
def on_agent_finish(
self, finish: AgentFinish, color: Optional[str] = None, **kwargs: Any
) -> None:
"""Run on agent end."""
print_text(finish.log, color=color or self.color, end="\n")
__all__ = ["StdOutCallbackHandler"]


@@ -1,12 +1,12 @@
"""Tracers that record execution of LangChain runs."""
from langchain.callbacks.tracers.langchain import LangChainTracer
from langchain.callbacks.tracers.langchain_v1 import LangChainTracerV1
from langchain.callbacks.tracers.stdout import (
from langchain.callbacks.tracers.wandb import WandbTracer
from langchain.schema.callbacks.tracers.langchain import LangChainTracer
from langchain.schema.callbacks.tracers.langchain_v1 import LangChainTracerV1
from langchain.schema.callbacks.tracers.stdout import (
ConsoleCallbackHandler,
FunctionCallbackHandler,
)
from langchain.callbacks.tracers.wandb import WandbTracer
__all__ = [
"LangChainTracer",


@@ -1,537 +1,5 @@
"""Base interfaces for tracing runs."""
from __future__ import annotations
import logging
from abc import ABC, abstractmethod
from datetime import datetime
from typing import Any, Dict, List, Optional, Sequence, Union, cast
from uuid import UUID
+from langchain.schema.callbacks.tracers.base import BaseTracer, TracerException
from tenacity import RetryCallState
from langchain.callbacks.base import BaseCallbackHandler
from langchain.callbacks.tracers.schemas import Run
from langchain.load.dump import dumpd
from langchain.schema.document import Document
from langchain.schema.output import (
ChatGeneration,
ChatGenerationChunk,
GenerationChunk,
LLMResult,
)
logger = logging.getLogger(__name__)
class TracerException(Exception):
"""Base class for exceptions in tracers module."""
class BaseTracer(BaseCallbackHandler, ABC):
"""Base interface for tracers."""
def __init__(self, **kwargs: Any) -> None:
super().__init__(**kwargs)
self.run_map: Dict[str, Run] = {}
@staticmethod
def _add_child_run(
parent_run: Run,
child_run: Run,
) -> None:
"""Add child run to a chain run or tool run."""
parent_run.child_runs.append(child_run)
@abstractmethod
def _persist_run(self, run: Run) -> None:
"""Persist a run."""
def _start_trace(self, run: Run) -> None:
"""Start a trace for a run."""
if run.parent_run_id:
parent_run = self.run_map.get(str(run.parent_run_id))
if parent_run:
self._add_child_run(parent_run, run)
parent_run.child_execution_order = max(
parent_run.child_execution_order, run.child_execution_order
)
else:
logger.debug(f"Parent run with UUID {run.parent_run_id} not found.")
self.run_map[str(run.id)] = run
self._on_run_create(run)
def _end_trace(self, run: Run) -> None:
"""End a trace for a run."""
if not run.parent_run_id:
self._persist_run(run)
else:
parent_run = self.run_map.get(str(run.parent_run_id))
if parent_run is None:
logger.debug(f"Parent run with UUID {run.parent_run_id} not found.")
elif (
run.child_execution_order is not None
and parent_run.child_execution_order is not None
and run.child_execution_order > parent_run.child_execution_order
):
parent_run.child_execution_order = run.child_execution_order
self.run_map.pop(str(run.id))
self._on_run_update(run)
def _get_execution_order(self, parent_run_id: Optional[str] = None) -> int:
"""Get the execution order for a run."""
if parent_run_id is None:
return 1
parent_run = self.run_map.get(parent_run_id)
if parent_run is None:
logger.debug(f"Parent run with UUID {parent_run_id} not found.")
return 1
if parent_run.child_execution_order is None:
raise TracerException(
f"Parent run with UUID {parent_run_id} has no child execution order."
)
return parent_run.child_execution_order + 1
def on_llm_start(
self,
serialized: Dict[str, Any],
prompts: List[str],
*,
run_id: UUID,
tags: Optional[List[str]] = None,
parent_run_id: Optional[UUID] = None,
metadata: Optional[Dict[str, Any]] = None,
name: Optional[str] = None,
**kwargs: Any,
) -> Run:
"""Start a trace for an LLM run."""
parent_run_id_ = str(parent_run_id) if parent_run_id else None
execution_order = self._get_execution_order(parent_run_id_)
start_time = datetime.utcnow()
if metadata:
kwargs.update({"metadata": metadata})
llm_run = Run(
id=run_id,
parent_run_id=parent_run_id,
serialized=serialized,
inputs={"prompts": prompts},
extra=kwargs,
events=[{"name": "start", "time": start_time}],
start_time=start_time,
execution_order=execution_order,
child_execution_order=execution_order,
run_type="llm",
tags=tags or [],
name=name,
)
self._start_trace(llm_run)
self._on_llm_start(llm_run)
return llm_run
def on_llm_new_token(
self,
token: str,
*,
chunk: Optional[Union[GenerationChunk, ChatGenerationChunk]] = None,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Run:
"""Run on new LLM token. Only available when streaming is enabled."""
if not run_id:
raise TracerException("No run_id provided for on_llm_new_token callback.")
run_id_ = str(run_id)
llm_run = self.run_map.get(run_id_)
if llm_run is None or llm_run.run_type != "llm":
raise TracerException(f"No LLM Run found to be traced for {run_id}")
event_kwargs: Dict[str, Any] = {"token": token}
if chunk:
event_kwargs["chunk"] = chunk
llm_run.events.append(
{
"name": "new_token",
"time": datetime.utcnow(),
"kwargs": event_kwargs,
},
)
self._on_llm_new_token(llm_run, token, chunk)
return llm_run
def on_retry(
self,
retry_state: RetryCallState,
*,
run_id: UUID,
**kwargs: Any,
) -> Run:
if not run_id:
raise TracerException("No run_id provided for on_retry callback.")
run_id_ = str(run_id)
llm_run = self.run_map.get(run_id_)
if llm_run is None:
raise TracerException("No Run found to be traced for on_retry")
retry_d: Dict[str, Any] = {
"slept": retry_state.idle_for,
"attempt": retry_state.attempt_number,
}
if retry_state.outcome is None:
retry_d["outcome"] = "N/A"
elif retry_state.outcome.failed:
retry_d["outcome"] = "failed"
exception = retry_state.outcome.exception()
retry_d["exception"] = str(exception)
retry_d["exception_type"] = exception.__class__.__name__
else:
retry_d["outcome"] = "success"
retry_d["result"] = str(retry_state.outcome.result())
llm_run.events.append(
{
"name": "retry",
"time": datetime.utcnow(),
"kwargs": retry_d,
},
)
return llm_run
def on_llm_end(self, response: LLMResult, *, run_id: UUID, **kwargs: Any) -> Run:
"""End a trace for an LLM run."""
if not run_id:
raise TracerException("No run_id provided for on_llm_end callback.")
run_id_ = str(run_id)
llm_run = self.run_map.get(run_id_)
if llm_run is None or llm_run.run_type != "llm":
raise TracerException(f"No LLM Run found to be traced for {run_id}")
llm_run.outputs = response.dict()
for i, generations in enumerate(response.generations):
for j, generation in enumerate(generations):
output_generation = llm_run.outputs["generations"][i][j]
if "message" in output_generation:
output_generation["message"] = dumpd(
cast(ChatGeneration, generation).message
)
llm_run.end_time = datetime.utcnow()
llm_run.events.append({"name": "end", "time": llm_run.end_time})
self._end_trace(llm_run)
self._on_llm_end(llm_run)
return llm_run
def on_llm_error(
self,
error: BaseException,
*,
run_id: UUID,
**kwargs: Any,
) -> Run:
"""Handle an error for an LLM run."""
if not run_id:
raise TracerException("No run_id provided for on_llm_error callback.")
run_id_ = str(run_id)
llm_run = self.run_map.get(run_id_)
if llm_run is None or llm_run.run_type != "llm":
raise TracerException(f"No LLM Run found to be traced for {run_id}")
llm_run.error = repr(error)
llm_run.end_time = datetime.utcnow()
llm_run.events.append({"name": "error", "time": llm_run.end_time})
self._end_trace(llm_run)
self._on_chain_error(llm_run)
return llm_run
def on_chain_start(
self,
serialized: Dict[str, Any],
inputs: Dict[str, Any],
*,
run_id: UUID,
tags: Optional[List[str]] = None,
parent_run_id: Optional[UUID] = None,
metadata: Optional[Dict[str, Any]] = None,
run_type: Optional[str] = None,
name: Optional[str] = None,
**kwargs: Any,
) -> Run:
"""Start a trace for a chain run."""
parent_run_id_ = str(parent_run_id) if parent_run_id else None
execution_order = self._get_execution_order(parent_run_id_)
start_time = datetime.utcnow()
if metadata:
kwargs.update({"metadata": metadata})
chain_run = Run(
id=run_id,
parent_run_id=parent_run_id,
serialized=serialized,
inputs=inputs if isinstance(inputs, dict) else {"input": inputs},
extra=kwargs,
events=[{"name": "start", "time": start_time}],
start_time=start_time,
execution_order=execution_order,
child_execution_order=execution_order,
child_runs=[],
run_type=run_type or "chain",
name=name,
tags=tags or [],
)
self._start_trace(chain_run)
self._on_chain_start(chain_run)
return chain_run
def on_chain_end(
self,
outputs: Dict[str, Any],
*,
run_id: UUID,
inputs: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> Run:
"""End a trace for a chain run."""
if not run_id:
raise TracerException("No run_id provided for on_chain_end callback.")
chain_run = self.run_map.get(str(run_id))
if chain_run is None:
raise TracerException(f"No chain Run found to be traced for {run_id}")
chain_run.outputs = (
outputs if isinstance(outputs, dict) else {"output": outputs}
)
chain_run.end_time = datetime.utcnow()
chain_run.events.append({"name": "end", "time": chain_run.end_time})
if inputs is not None:
chain_run.inputs = inputs if isinstance(inputs, dict) else {"input": inputs}
self._end_trace(chain_run)
self._on_chain_end(chain_run)
return chain_run
def on_chain_error(
self,
error: BaseException,
*,
inputs: Optional[Dict[str, Any]] = None,
run_id: UUID,
**kwargs: Any,
) -> Run:
"""Handle an error for a chain run."""
if not run_id:
raise TracerException("No run_id provided for on_chain_error callback.")
chain_run = self.run_map.get(str(run_id))
if chain_run is None:
raise TracerException(f"No chain Run found to be traced for {run_id}")
chain_run.error = repr(error)
chain_run.end_time = datetime.utcnow()
chain_run.events.append({"name": "error", "time": chain_run.end_time})
if inputs is not None:
chain_run.inputs = inputs if isinstance(inputs, dict) else {"input": inputs}
self._end_trace(chain_run)
self._on_chain_error(chain_run)
return chain_run
def on_tool_start(
self,
serialized: Dict[str, Any],
input_str: str,
*,
run_id: UUID,
tags: Optional[List[str]] = None,
parent_run_id: Optional[UUID] = None,
metadata: Optional[Dict[str, Any]] = None,
name: Optional[str] = None,
**kwargs: Any,
) -> Run:
"""Start a trace for a tool run."""
parent_run_id_ = str(parent_run_id) if parent_run_id else None
execution_order = self._get_execution_order(parent_run_id_)
start_time = datetime.utcnow()
if metadata:
kwargs.update({"metadata": metadata})
tool_run = Run(
id=run_id,
parent_run_id=parent_run_id,
serialized=serialized,
inputs={"input": input_str},
extra=kwargs,
events=[{"name": "start", "time": start_time}],
start_time=start_time,
execution_order=execution_order,
child_execution_order=execution_order,
child_runs=[],
run_type="tool",
tags=tags or [],
name=name,
)
self._start_trace(tool_run)
self._on_tool_start(tool_run)
return tool_run
def on_tool_end(self, output: str, *, run_id: UUID, **kwargs: Any) -> Run:
"""End a trace for a tool run."""
if not run_id:
raise TracerException("No run_id provided for on_tool_end callback.")
tool_run = self.run_map.get(str(run_id))
if tool_run is None or tool_run.run_type != "tool":
raise TracerException(f"No tool Run found to be traced for {run_id}")
tool_run.outputs = {"output": output}
tool_run.end_time = datetime.utcnow()
tool_run.events.append({"name": "end", "time": tool_run.end_time})
self._end_trace(tool_run)
self._on_tool_end(tool_run)
return tool_run
def on_tool_error(
self,
error: BaseException,
*,
run_id: UUID,
**kwargs: Any,
) -> Run:
"""Handle an error for a tool run."""
if not run_id:
raise TracerException("No run_id provided for on_tool_error callback.")
tool_run = self.run_map.get(str(run_id))
if tool_run is None or tool_run.run_type != "tool":
raise TracerException(f"No tool Run found to be traced for {run_id}")
tool_run.error = repr(error)
tool_run.end_time = datetime.utcnow()
tool_run.events.append({"name": "error", "time": tool_run.end_time})
self._end_trace(tool_run)
self._on_tool_error(tool_run)
return tool_run
def on_retriever_start(
self,
serialized: Dict[str, Any],
query: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
name: Optional[str] = None,
**kwargs: Any,
) -> Run:
"""Run when Retriever starts running."""
parent_run_id_ = str(parent_run_id) if parent_run_id else None
execution_order = self._get_execution_order(parent_run_id_)
start_time = datetime.utcnow()
if metadata:
kwargs.update({"metadata": metadata})
retrieval_run = Run(
id=run_id,
name=name or "Retriever",
parent_run_id=parent_run_id,
serialized=serialized,
inputs={"query": query},
extra=kwargs,
events=[{"name": "start", "time": start_time}],
start_time=start_time,
execution_order=execution_order,
child_execution_order=execution_order,
tags=tags,
child_runs=[],
run_type="retriever",
)
self._start_trace(retrieval_run)
self._on_retriever_start(retrieval_run)
return retrieval_run
def on_retriever_error(
self,
error: BaseException,
*,
run_id: UUID,
**kwargs: Any,
) -> Run:
"""Run when Retriever errors."""
if not run_id:
raise TracerException("No run_id provided for on_retriever_error callback.")
retrieval_run = self.run_map.get(str(run_id))
if retrieval_run is None or retrieval_run.run_type != "retriever":
raise TracerException(f"No retriever Run found to be traced for {run_id}")
retrieval_run.error = repr(error)
retrieval_run.end_time = datetime.utcnow()
retrieval_run.events.append({"name": "error", "time": retrieval_run.end_time})
self._end_trace(retrieval_run)
self._on_retriever_error(retrieval_run)
return retrieval_run
def on_retriever_end(
self, documents: Sequence[Document], *, run_id: UUID, **kwargs: Any
) -> Run:
"""Run when Retriever ends running."""
if not run_id:
raise TracerException("No run_id provided for on_retriever_end callback.")
retrieval_run = self.run_map.get(str(run_id))
if retrieval_run is None or retrieval_run.run_type != "retriever":
raise TracerException(f"No retriever Run found to be traced for {run_id}")
retrieval_run.outputs = {"documents": documents}
retrieval_run.end_time = datetime.utcnow()
retrieval_run.events.append({"name": "end", "time": retrieval_run.end_time})
self._end_trace(retrieval_run)
self._on_retriever_end(retrieval_run)
return retrieval_run
def __deepcopy__(self, memo: dict) -> BaseTracer:
"""Deepcopy the tracer."""
return self
def __copy__(self) -> BaseTracer:
"""Copy the tracer."""
return self
def _on_run_create(self, run: Run) -> None:
"""Process a run upon creation."""
def _on_run_update(self, run: Run) -> None:
"""Process a run upon update."""
def _on_llm_start(self, run: Run) -> None:
"""Process the LLM Run upon start."""
def _on_llm_new_token(
self,
run: Run,
token: str,
chunk: Optional[Union[GenerationChunk, ChatGenerationChunk]],
) -> None:
"""Process new LLM token."""
def _on_llm_end(self, run: Run) -> None:
"""Process the LLM Run."""
def _on_llm_error(self, run: Run) -> None:
"""Process the LLM Run upon error."""
def _on_chain_start(self, run: Run) -> None:
"""Process the Chain Run upon start."""
def _on_chain_end(self, run: Run) -> None:
"""Process the Chain Run."""
def _on_chain_error(self, run: Run) -> None:
"""Process the Chain Run upon error."""
def _on_tool_start(self, run: Run) -> None:
"""Process the Tool Run upon start."""
def _on_tool_end(self, run: Run) -> None:
"""Process the Tool Run."""
def _on_tool_error(self, run: Run) -> None:
"""Process the Tool Run upon error."""
def _on_chat_model_start(self, run: Run) -> None:
"""Process the Chat Model Run upon start."""
def _on_retriever_start(self, run: Run) -> None:
"""Process the Retriever Run upon start."""
def _on_retriever_end(self, run: Run) -> None:
"""Process the Retriever Run."""
def _on_retriever_error(self, run: Run) -> None:
"""Process the Retriever Run upon error."""
__all__ = ["BaseTracer", "TracerException"]


@@ -1,222 +1,7 @@
"""A tracer that runs evaluators over completed runs."""
from __future__ import annotations
+from langchain.schema.callbacks.tracers.evaluation import (
+    EvaluatorCallbackHandler,
+    wait_for_all_evaluators,
+)
import logging
import threading
import weakref
from concurrent.futures import Future, ThreadPoolExecutor, wait
from typing import Any, Dict, List, Optional, Sequence, Tuple, Union, cast
from uuid import UUID
import langsmith
from langsmith.evaluation.evaluator import EvaluationResult, EvaluationResults
from langchain.callbacks import manager
from langchain.callbacks.tracers import langchain as langchain_tracer
from langchain.callbacks.tracers.base import BaseTracer
from langchain.callbacks.tracers.langchain import _get_executor
from langchain.callbacks.tracers.schemas import Run
logger = logging.getLogger(__name__)
_TRACERS: weakref.WeakSet[EvaluatorCallbackHandler] = weakref.WeakSet()
def wait_for_all_evaluators() -> None:
"""Wait for all tracers to finish."""
global _TRACERS
for tracer in list(_TRACERS):
if tracer is not None:
tracer.wait_for_futures()
class EvaluatorCallbackHandler(BaseTracer):
"""A tracer that runs a run evaluator whenever a run is persisted.
Parameters
----------
evaluators : Sequence[RunEvaluator]
The run evaluators to apply to all top level runs.
client : LangSmith Client, optional
The LangSmith client instance to use for evaluating the runs.
If not specified, a new instance will be created.
example_id : Union[UUID, str], optional
The example ID to be associated with the runs.
project_name : str, optional
The LangSmith project name to organize eval chain runs under.
Attributes
----------
example_id : Union[UUID, None]
The example ID associated with the runs.
client : Client
The LangSmith client instance used for evaluating the runs.
evaluators : Sequence[RunEvaluator]
The sequence of run evaluators to be executed.
executor : ThreadPoolExecutor
The thread pool executor used for running the evaluators.
futures : Set[Future]
The set of futures representing the running evaluators.
skip_unfinished : bool
Whether to skip runs that are not finished or raised
an error.
project_name : Optional[str]
The LangSmith project name to organize eval chain runs under.
"""
name = "evaluator_callback_handler"
def __init__(
self,
evaluators: Sequence[langsmith.RunEvaluator],
client: Optional[langsmith.Client] = None,
example_id: Optional[Union[UUID, str]] = None,
skip_unfinished: bool = True,
project_name: Optional[str] = "evaluators",
max_concurrency: Optional[int] = None,
**kwargs: Any,
) -> None:
super().__init__(**kwargs)
self.example_id = (
UUID(example_id) if isinstance(example_id, str) else example_id
)
self.client = client or langchain_tracer.get_client()
self.evaluators = evaluators
if max_concurrency is None:
self.executor: Optional[ThreadPoolExecutor] = _get_executor()
elif max_concurrency > 0:
self.executor = ThreadPoolExecutor(max_workers=max_concurrency)
weakref.finalize(
self,
lambda: cast(ThreadPoolExecutor, self.executor).shutdown(wait=True),
)
else:
self.executor = None
self.futures: weakref.WeakSet[Future] = weakref.WeakSet()
self.skip_unfinished = skip_unfinished
self.project_name = project_name
self.logged_eval_results: Dict[Tuple[str, str], List[EvaluationResult]] = {}
self.lock = threading.Lock()
global _TRACERS
_TRACERS.add(self)
def _evaluate_in_project(self, run: Run, evaluator: langsmith.RunEvaluator) -> None:
"""Evaluate the run in the project.
Parameters
----------
run : Run
The run to be evaluated.
evaluator : RunEvaluator
The evaluator to use for evaluating the run.
"""
try:
if self.project_name is None:
eval_result = self.client.evaluate_run(run, evaluator)
eval_results = [eval_result]
with manager.tracing_v2_enabled(
project_name=self.project_name, tags=["eval"], client=self.client
) as cb:
reference_example = (
self.client.read_example(run.reference_example_id)
if run.reference_example_id
else None
)
evaluation_result = evaluator.evaluate_run(
run,
example=reference_example,
)
eval_results = self._log_evaluation_feedback(
evaluation_result,
run,
source_run_id=cb.latest_run.id if cb.latest_run else None,
)
except Exception as e:
logger.error(
f"Error evaluating run {run.id} with "
f"{evaluator.__class__.__name__}: {repr(e)}",
exc_info=True,
)
raise e
example_id = str(run.reference_example_id)
with self.lock:
for res in eval_results:
run_id = (
str(getattr(res, "target_run_id"))
if hasattr(res, "target_run_id")
else str(run.id)
)
self.logged_eval_results.setdefault((run_id, example_id), []).append(
res
)
def _select_eval_results(
self,
results: Union[EvaluationResult, EvaluationResults],
) -> List[EvaluationResult]:
if isinstance(results, EvaluationResult):
results_ = [results]
elif isinstance(results, dict) and "results" in results:
results_ = cast(List[EvaluationResult], results["results"])
else:
raise TypeError(
f"Invalid evaluation result type {type(results)}."
" Expected EvaluationResult or EvaluationResults."
)
return results_
def _log_evaluation_feedback(
self,
evaluator_response: Union[EvaluationResult, EvaluationResults],
run: Run,
source_run_id: Optional[UUID] = None,
) -> List[EvaluationResult]:
results = self._select_eval_results(evaluator_response)
for res in results:
source_info_: Dict[str, Any] = {}
if res.evaluator_info:
source_info_ = {**res.evaluator_info, **source_info_}
run_id_ = (
getattr(res, "target_run_id")
if hasattr(res, "target_run_id") and res.target_run_id is not None
else run.id
)
self.client.create_feedback(
run_id_,
res.key,
score=res.score,
value=res.value,
comment=res.comment,
correction=res.correction,
source_info=source_info_,
source_run_id=res.source_run_id or source_run_id,
feedback_source_type=langsmith.schemas.FeedbackSourceType.MODEL,
)
return results
def _persist_run(self, run: Run) -> None:
"""Run the evaluator on the run.
Parameters
----------
run : Run
The run to be evaluated.
"""
if self.skip_unfinished and not run.outputs:
logger.debug(f"Skipping unfinished run {run.id}")
return
run_ = run.copy()
run_.reference_example_id = self.example_id
for evaluator in self.evaluators:
if self.executor is None:
self._evaluate_in_project(run_, evaluator)
else:
self.futures.add(
self.executor.submit(self._evaluate_in_project, run_, evaluator)
)
def wait_for_futures(self) -> None:
"""Wait for all futures to complete."""
wait(self.futures)
__all__ = ["wait_for_all_evaluators", "EvaluatorCallbackHandler"]


@@ -1,262 +1,8 @@
"""A Tracer implementation that records to LangChain endpoint."""
from __future__ import annotations
import logging
import weakref
from concurrent.futures import Future, ThreadPoolExecutor, wait
from datetime import datetime
from typing import Any, Callable, Dict, List, Optional, Union
from uuid import UUID
from langsmith import Client
from langsmith import utils as ls_utils
-from tenacity import (
-    Retrying,
-    retry_if_exception_type,
-    stop_after_attempt,
-    wait_exponential_jitter,
-)
+from langchain.schema.callbacks.tracers.langchain import (
+    LangChainTracer,
+    wait_for_all_tracers,
+)
from langchain.callbacks.tracers.base import BaseTracer
from langchain.callbacks.tracers.schemas import Run
from langchain.env import get_runtime_environment
from langchain.load.dump import dumpd
from langchain.schema.messages import BaseMessage
logger = logging.getLogger(__name__)
_LOGGED = set()
_TRACERS: weakref.WeakSet[LangChainTracer] = weakref.WeakSet()
_CLIENT: Optional[Client] = None
_EXECUTOR: Optional[ThreadPoolExecutor] = None
def log_error_once(method: str, exception: Exception) -> None:
"""Log an error once."""
global _LOGGED
if (method, type(exception)) in _LOGGED:
return
_LOGGED.add((method, type(exception)))
logger.error(exception)
def wait_for_all_tracers() -> None:
"""Wait for all tracers to finish."""
global _TRACERS
for tracer in list(_TRACERS):
if tracer is not None:
tracer.wait_for_futures()
def get_client() -> Client:
"""Get the client."""
global _CLIENT
if _CLIENT is None:
_CLIENT = Client()
return _CLIENT
def _get_executor() -> ThreadPoolExecutor:
"""Get the executor."""
global _EXECUTOR
if _EXECUTOR is None:
_EXECUTOR = ThreadPoolExecutor()
return _EXECUTOR
def _copy(run: Run) -> Run:
"""Copy a run."""
try:
return run.copy(deep=True)
except TypeError:
# Fallback in case the object contains a lock or other
# non-pickleable object
return run.copy()
class LangChainTracer(BaseTracer):
"""An implementation of the SharedTracer that POSTS to the langchain endpoint."""
def __init__(
self,
example_id: Optional[Union[UUID, str]] = None,
project_name: Optional[str] = None,
client: Optional[Client] = None,
tags: Optional[List[str]] = None,
use_threading: bool = True,
**kwargs: Any,
) -> None:
"""Initialize the LangChain tracer."""
super().__init__(**kwargs)
self.example_id = (
UUID(example_id) if isinstance(example_id, str) else example_id
)
self.project_name = project_name or ls_utils.get_tracer_project()
self.client = client or get_client()
self._futures: weakref.WeakSet[Future] = weakref.WeakSet()
self.tags = tags or []
self.executor = _get_executor() if use_threading else None
self.latest_run: Optional[Run] = None
global _TRACERS
_TRACERS.add(self)
def on_chat_model_start(
self,
serialized: Dict[str, Any],
messages: List[List[BaseMessage]],
*,
run_id: UUID,
tags: Optional[List[str]] = None,
parent_run_id: Optional[UUID] = None,
metadata: Optional[Dict[str, Any]] = None,
name: Optional[str] = None,
**kwargs: Any,
) -> None:
"""Start a trace for an LLM run."""
parent_run_id_ = str(parent_run_id) if parent_run_id else None
execution_order = self._get_execution_order(parent_run_id_)
start_time = datetime.utcnow()
if metadata:
kwargs.update({"metadata": metadata})
chat_model_run = Run(
id=run_id,
parent_run_id=parent_run_id,
serialized=serialized,
inputs={"messages": [[dumpd(msg) for msg in batch] for batch in messages]},
extra=kwargs,
events=[{"name": "start", "time": start_time}],
start_time=start_time,
execution_order=execution_order,
child_execution_order=execution_order,
run_type="llm",
tags=tags,
name=name,
)
self._start_trace(chat_model_run)
self._on_chat_model_start(chat_model_run)
def _persist_run(self, run: Run) -> None:
run_ = run.copy()
run_.reference_example_id = self.example_id
self.latest_run = run_
def get_run_url(self) -> str:
"""Get the LangSmith root run URL"""
if not self.latest_run:
raise ValueError("No traced run found.")
# If this is the first run in a project, the project may not yet be created.
# This method is only really useful for debugging flows, so we will assume
# there is some tolerance for latency.
for attempt in Retrying(
stop=stop_after_attempt(5),
wait=wait_exponential_jitter(),
retry=retry_if_exception_type(ls_utils.LangSmithError),
):
with attempt:
return self.client.get_run_url(
run=self.latest_run, project_name=self.project_name
)
raise ValueError("Failed to get run URL.")
def _get_tags(self, run: Run) -> List[str]:
"""Get combined tags for a run."""
tags = set(run.tags or [])
tags.update(self.tags or [])
return list(tags)
def _persist_run_single(self, run: Run) -> None:
"""Persist a run."""
run_dict = run.dict(exclude={"child_runs"})
run_dict["tags"] = self._get_tags(run)
extra = run_dict.get("extra", {})
extra["runtime"] = get_runtime_environment()
run_dict["extra"] = extra
try:
self.client.create_run(**run_dict, project_name=self.project_name)
except Exception as e:
# Errors are swallowed by the thread executor so we need to log them here
log_error_once("post", e)
raise
def _update_run_single(self, run: Run) -> None:
"""Update a run."""
try:
run_dict = run.dict()
run_dict["tags"] = self._get_tags(run)
self.client.update_run(run.id, **run_dict)
except Exception as e:
# Errors are swallowed by the thread executor so we need to log them here
log_error_once("patch", e)
raise
def _submit(self, function: Callable[[Run], None], run: Run) -> None:
"""Submit a function to the executor."""
if self.executor is None:
function(run)
else:
self._futures.add(self.executor.submit(function, run))
def _on_llm_start(self, run: Run) -> None:
"""Persist an LLM run."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
self._submit(self._persist_run_single, _copy(run))
def _on_chat_model_start(self, run: Run) -> None:
"""Persist an LLM run."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
self._submit(self._persist_run_single, _copy(run))
def _on_llm_end(self, run: Run) -> None:
"""Process the LLM Run."""
self._submit(self._update_run_single, _copy(run))
def _on_llm_error(self, run: Run) -> None:
"""Process the LLM Run upon error."""
self._submit(self._update_run_single, _copy(run))
def _on_chain_start(self, run: Run) -> None:
"""Process the Chain Run upon start."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
self._submit(self._persist_run_single, _copy(run))
def _on_chain_end(self, run: Run) -> None:
"""Process the Chain Run."""
self._submit(self._update_run_single, _copy(run))
def _on_chain_error(self, run: Run) -> None:
"""Process the Chain Run upon error."""
self._submit(self._update_run_single, _copy(run))
def _on_tool_start(self, run: Run) -> None:
"""Process the Tool Run upon start."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
self._submit(self._persist_run_single, _copy(run))
def _on_tool_end(self, run: Run) -> None:
"""Process the Tool Run."""
self._submit(self._update_run_single, _copy(run))
def _on_tool_error(self, run: Run) -> None:
"""Process the Tool Run upon error."""
self._submit(self._update_run_single, _copy(run))
def _on_retriever_start(self, run: Run) -> None:
"""Process the Retriever Run upon start."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
self._submit(self._persist_run_single, _copy(run))
def _on_retriever_end(self, run: Run) -> None:
"""Process the Retriever Run."""
self._submit(self._update_run_single, _copy(run))
def _on_retriever_error(self, run: Run) -> None:
"""Process the Retriever Run upon error."""
self._submit(self._update_run_single, _copy(run))
def wait_for_futures(self) -> None:
"""Wait for the given futures to complete."""
wait(self._futures)
__all__ = ["LangChainTracer", "wait_for_all_tracers"]

View File

@@ -1,185 +1,3 @@
from __future__ import annotations
from langchain.schema.callbacks.tracers.langchain_v1 import LangChainTracerV1
import logging
import os
from typing import Any, Dict, Optional, Union
import requests
from langchain.callbacks.tracers.base import BaseTracer
from langchain.callbacks.tracers.schemas import (
ChainRun,
LLMRun,
Run,
ToolRun,
TracerSession,
TracerSessionV1,
TracerSessionV1Base,
)
from langchain.schema.messages import get_buffer_string
from langchain.utils import raise_for_status_with_text
logger = logging.getLogger(__name__)
def get_headers() -> Dict[str, Any]:
"""Get the headers for the LangChain API."""
headers: Dict[str, Any] = {"Content-Type": "application/json"}
if os.getenv("LANGCHAIN_API_KEY"):
headers["x-api-key"] = os.getenv("LANGCHAIN_API_KEY")
return headers
def _get_endpoint() -> str:
return os.getenv("LANGCHAIN_ENDPOINT", "http://localhost:8000")
class LangChainTracerV1(BaseTracer):
"""An implementation of the SharedTracer that POSTS to the langchain endpoint."""
def __init__(self, **kwargs: Any) -> None:
"""Initialize the LangChain tracer."""
super().__init__(**kwargs)
self.session: Optional[TracerSessionV1] = None
self._endpoint = _get_endpoint()
self._headers = get_headers()
def _convert_to_v1_run(self, run: Run) -> Union[LLMRun, ChainRun, ToolRun]:
session = self.session or self.load_default_session()
if not isinstance(session, TracerSessionV1):
raise ValueError(
"LangChainTracerV1 is not compatible with"
f" session of type {type(session)}"
)
if run.run_type == "llm":
if "prompts" in run.inputs:
prompts = run.inputs["prompts"]
elif "messages" in run.inputs:
prompts = [get_buffer_string(batch) for batch in run.inputs["messages"]]
else:
raise ValueError("No prompts found in LLM run inputs")
return LLMRun(
uuid=str(run.id) if run.id else None,
parent_uuid=str(run.parent_run_id) if run.parent_run_id else None,
start_time=run.start_time,
end_time=run.end_time,
extra=run.extra,
execution_order=run.execution_order,
child_execution_order=run.child_execution_order,
serialized=run.serialized,
session_id=session.id,
error=run.error,
prompts=prompts,
response=run.outputs if run.outputs else None,
)
if run.run_type == "chain":
child_runs = [self._convert_to_v1_run(run) for run in run.child_runs]
return ChainRun(
uuid=str(run.id) if run.id else None,
parent_uuid=str(run.parent_run_id) if run.parent_run_id else None,
start_time=run.start_time,
end_time=run.end_time,
execution_order=run.execution_order,
child_execution_order=run.child_execution_order,
serialized=run.serialized,
session_id=session.id,
inputs=run.inputs,
outputs=run.outputs,
error=run.error,
extra=run.extra,
child_llm_runs=[run for run in child_runs if isinstance(run, LLMRun)],
child_chain_runs=[
run for run in child_runs if isinstance(run, ChainRun)
],
child_tool_runs=[run for run in child_runs if isinstance(run, ToolRun)],
)
if run.run_type == "tool":
child_runs = [self._convert_to_v1_run(run) for run in run.child_runs]
return ToolRun(
uuid=str(run.id) if run.id else None,
parent_uuid=str(run.parent_run_id) if run.parent_run_id else None,
start_time=run.start_time,
end_time=run.end_time,
execution_order=run.execution_order,
child_execution_order=run.child_execution_order,
serialized=run.serialized,
session_id=session.id,
action=str(run.serialized),
tool_input=run.inputs.get("input", ""),
output=None if run.outputs is None else run.outputs.get("output"),
error=run.error,
extra=run.extra,
child_chain_runs=[
run for run in child_runs if isinstance(run, ChainRun)
],
child_tool_runs=[run for run in child_runs if isinstance(run, ToolRun)],
child_llm_runs=[run for run in child_runs if isinstance(run, LLMRun)],
)
raise ValueError(f"Unknown run type: {run.run_type}")
def _persist_run(self, run: Union[Run, LLMRun, ChainRun, ToolRun]) -> None:
"""Persist a run."""
if isinstance(run, Run):
v1_run = self._convert_to_v1_run(run)
else:
v1_run = run
if isinstance(v1_run, LLMRun):
endpoint = f"{self._endpoint}/llm-runs"
elif isinstance(v1_run, ChainRun):
endpoint = f"{self._endpoint}/chain-runs"
else:
endpoint = f"{self._endpoint}/tool-runs"
try:
response = requests.post(
endpoint,
data=v1_run.json(),
headers=self._headers,
)
raise_for_status_with_text(response)
except Exception as e:
logger.warning(f"Failed to persist run: {e}")
def _persist_session(
self, session_create: TracerSessionV1Base
) -> Union[TracerSessionV1, TracerSession]:
"""Persist a session."""
try:
r = requests.post(
f"{self._endpoint}/sessions",
data=session_create.json(),
headers=self._headers,
)
session = TracerSessionV1(id=r.json()["id"], **session_create.dict())
except Exception as e:
logger.warning(f"Failed to create session, using default session: {e}")
session = TracerSessionV1(id=1, **session_create.dict())
return session
def _load_session(self, session_name: Optional[str] = None) -> TracerSessionV1:
"""Load a session from the tracer."""
try:
url = f"{self._endpoint}/sessions"
if session_name:
url += f"?name={session_name}"
r = requests.get(url, headers=self._headers)
tracer_session = TracerSessionV1(**r.json()[0])
except Exception as e:
session_type = "default" if not session_name else session_name
logger.warning(
f"Failed to load {session_type} session, using empty session: {e}"
)
tracer_session = TracerSessionV1(id=1)
self.session = tracer_session
return tracer_session
def load_session(self, session_name: str) -> Union[TracerSessionV1, TracerSession]:
"""Load a session with the given name from the tracer."""
return self._load_session(session_name)
def load_default_session(self) -> Union[TracerSessionV1, TracerSession]:
"""Load the default tracing session and set it as the Tracer's session."""
return self._load_session("default")
__all__ = ["LangChainTracerV1"]

View File

@@ -1,311 +1,9 @@
from __future__ import annotations
import math
import threading
from collections import defaultdict
from typing import (
Any,
AsyncIterator,
Dict,
List,
Optional,
Sequence,
TypedDict,
Union,
)
from langchain.schema.callbacks.tracers.log_stream import (
LogEntry,
LogStreamCallbackHandler,
RunLog,
RunLogPatch,
RunState,
)
from uuid import UUID
import jsonpatch
from anyio import create_memory_object_stream
from langchain.callbacks.tracers.base import BaseTracer
from langchain.callbacks.tracers.schemas import Run
from langchain.load.load import load
from langchain.schema.output import ChatGenerationChunk, GenerationChunk
class LogEntry(TypedDict):
"""A single entry in the run log."""
id: str
"""ID of the sub-run."""
name: str
"""Name of the object being run."""
type: str
"""Type of the object being run, eg. prompt, chain, llm, etc."""
tags: List[str]
"""List of tags for the run."""
metadata: Dict[str, Any]
"""Key-value pairs of metadata for the run."""
start_time: str
"""ISO-8601 timestamp of when the run started."""
streamed_output_str: List[str]
"""List of LLM tokens streamed by this run, if applicable."""
final_output: Optional[Any]
"""Final output of this run.
Only available after the run has finished successfully."""
end_time: Optional[str]
"""ISO-8601 timestamp of when the run ended.
Only available after the run has finished."""
class RunState(TypedDict):
"""State of the run."""
id: str
"""ID of the run."""
streamed_output: List[Any]
"""List of output chunks streamed by Runnable.stream()"""
final_output: Optional[Any]
"""Final output of the run, usually the result of aggregating (`+`) streamed_output.
Only available after the run has finished successfully."""
logs: Dict[str, LogEntry]
"""Map of run names to sub-runs. If filters were supplied, this list will
contain only the runs that matched the filters."""
class RunLogPatch:
"""A patch to the run log."""
ops: List[Dict[str, Any]]
"""List of jsonpatch operations, which describe how to create the run state
from an empty dict. This is the minimal representation of the log, designed to
be serialized as JSON and sent over the wire to reconstruct the log on the other
side. Reconstruction of the state can be done with any jsonpatch-compliant library,
see https://jsonpatch.com for more information."""
def __init__(self, *ops: Dict[str, Any]) -> None:
self.ops = list(ops)
def __add__(self, other: Union[RunLogPatch, Any]) -> RunLog:
if type(other) == RunLogPatch:
ops = self.ops + other.ops
state = jsonpatch.apply_patch(None, ops)
return RunLog(*ops, state=state)
raise TypeError(
f"unsupported operand type(s) for +: '{type(self)}' and '{type(other)}'"
)
def __repr__(self) -> str:
from pprint import pformat
# 1:-1 to get rid of the [] around the list
return f"RunLogPatch({pformat(self.ops)[1:-1]})"
def __eq__(self, other: object) -> bool:
return isinstance(other, RunLogPatch) and self.ops == other.ops
class RunLog(RunLogPatch):
"""A run log."""
state: RunState
"""Current state of the log, obtained from applying all ops in sequence."""
def __init__(self, *ops: Dict[str, Any], state: RunState) -> None:
super().__init__(*ops)
self.state = state
def __add__(self, other: Union[RunLogPatch, Any]) -> RunLog:
if type(other) == RunLogPatch:
ops = self.ops + other.ops
state = jsonpatch.apply_patch(self.state, other.ops)
return RunLog(*ops, state=state)
raise TypeError(
f"unsupported operand type(s) for +: '{type(self)}' and '{type(other)}'"
)
def __repr__(self) -> str:
from pprint import pformat
return f"RunLog({pformat(self.state)})"
class LogStreamCallbackHandler(BaseTracer):
"""A tracer that streams run logs to a stream."""
def __init__(
self,
*,
auto_close: bool = True,
include_names: Optional[Sequence[str]] = None,
include_types: Optional[Sequence[str]] = None,
include_tags: Optional[Sequence[str]] = None,
exclude_names: Optional[Sequence[str]] = None,
exclude_types: Optional[Sequence[str]] = None,
exclude_tags: Optional[Sequence[str]] = None,
) -> None:
super().__init__()
self.auto_close = auto_close
self.include_names = include_names
self.include_types = include_types
self.include_tags = include_tags
self.exclude_names = exclude_names
self.exclude_types = exclude_types
self.exclude_tags = exclude_tags
send_stream, receive_stream = create_memory_object_stream(
math.inf, item_type=RunLogPatch
)
self.lock = threading.Lock()
self.send_stream = send_stream
self.receive_stream = receive_stream
self._key_map_by_run_id: Dict[UUID, str] = {}
self._counter_map_by_name: Dict[str, int] = defaultdict(int)
self.root_id: Optional[UUID] = None
def __aiter__(self) -> AsyncIterator[RunLogPatch]:
return self.receive_stream.__aiter__()
def include_run(self, run: Run) -> bool:
if run.id == self.root_id:
return False
run_tags = run.tags or []
if (
self.include_names is None
and self.include_types is None
and self.include_tags is None
):
include = True
else:
include = False
if self.include_names is not None:
include = include or run.name in self.include_names
if self.include_types is not None:
include = include or run.run_type in self.include_types
if self.include_tags is not None:
include = include or any(tag in self.include_tags for tag in run_tags)
if self.exclude_names is not None:
include = include and run.name not in self.exclude_names
if self.exclude_types is not None:
include = include and run.run_type not in self.exclude_types
if self.exclude_tags is not None:
include = include and all(tag not in self.exclude_tags for tag in run_tags)
return include
def _persist_run(self, run: Run) -> None:
# This is a legacy method only called once for an entire run tree
# therefore not useful here
pass
def _on_run_create(self, run: Run) -> None:
"""Start a run."""
if self.root_id is None:
self.root_id = run.id
self.send_stream.send_nowait(
RunLogPatch(
{
"op": "replace",
"path": "",
"value": RunState(
id=str(run.id),
streamed_output=[],
final_output=None,
logs={},
),
}
)
)
if not self.include_run(run):
return
# Determine previous index, increment by 1
with self.lock:
self._counter_map_by_name[run.name] += 1
count = self._counter_map_by_name[run.name]
self._key_map_by_run_id[run.id] = (
run.name if count == 1 else f"{run.name}:{count}"
)
# Add the run to the stream
self.send_stream.send_nowait(
RunLogPatch(
{
"op": "add",
"path": f"/logs/{self._key_map_by_run_id[run.id]}",
"value": LogEntry(
id=str(run.id),
name=run.name,
type=run.run_type,
tags=run.tags or [],
metadata=(run.extra or {}).get("metadata", {}),
start_time=run.start_time.isoformat(timespec="milliseconds"),
streamed_output_str=[],
final_output=None,
end_time=None,
),
}
)
)
def _on_run_update(self, run: Run) -> None:
"""Finish a run."""
try:
index = self._key_map_by_run_id.get(run.id)
if index is None:
return
self.send_stream.send_nowait(
RunLogPatch(
{
"op": "add",
"path": f"/logs/{index}/final_output",
# to undo the dumpd done by some runnables / tracer / etc
"value": load(run.outputs),
},
{
"op": "add",
"path": f"/logs/{index}/end_time",
"value": run.end_time.isoformat(timespec="milliseconds")
if run.end_time is not None
else None,
},
)
)
finally:
if run.id == self.root_id:
self.send_stream.send_nowait(
RunLogPatch(
{
"op": "replace",
"path": "/final_output",
"value": load(run.outputs),
}
)
)
if self.auto_close:
self.send_stream.close()
def _on_llm_new_token(
self,
run: Run,
token: str,
chunk: Optional[Union[GenerationChunk, ChatGenerationChunk]],
) -> None:
"""Process new LLM token."""
index = self._key_map_by_run_id.get(run.id)
if index is None:
return
self.send_stream.send_nowait(
RunLogPatch(
{
"op": "add",
"path": f"/logs/{index}/streamed_output_str/-",
"value": token,
}
)
)
__all__ = ["LogEntry", "RunState", "RunLog", "RunLogPatch", "LogStreamCallbackHandler"]

View File

@@ -1,54 +1,3 @@
from typing import Callable, Optional, Union
from uuid import UUID
from langchain.schema.callbacks.tracers.root_listeners import RootListenersTracer
from langchain.callbacks.tracers.base import BaseTracer
from langchain.callbacks.tracers.schemas import Run
from langchain.schema.runnable.config import (
RunnableConfig,
call_func_with_variable_args,
)
Listener = Union[Callable[[Run], None], Callable[[Run, RunnableConfig], None]]
class RootListenersTracer(BaseTracer):
def __init__(
self,
*,
config: RunnableConfig,
on_start: Optional[Listener],
on_end: Optional[Listener],
on_error: Optional[Listener],
) -> None:
super().__init__()
self.config = config
self._arg_on_start = on_start
self._arg_on_end = on_end
self._arg_on_error = on_error
self.root_id: Optional[UUID] = None
def _persist_run(self, run: Run) -> None:
# This is a legacy method only called once for an entire run tree
# therefore not useful here
pass
def _on_run_create(self, run: Run) -> None:
if self.root_id is not None:
return
self.root_id = run.id
if self._arg_on_start is not None:
call_func_with_variable_args(self._arg_on_start, run, self.config)
def _on_run_update(self, run: Run) -> None:
if run.id != self.root_id:
return
if run.error is None:
if self._arg_on_end is not None:
call_func_with_variable_args(self._arg_on_end, run, self.config)
else:
if self._arg_on_error is not None:
call_func_with_variable_args(self._arg_on_error, run, self.config)
__all__ = ["RootListenersTracer"]

View File

@@ -1,52 +1,3 @@
"""A tracer that collects all nested runs in a list."""
from langchain.schema.callbacks.tracers.run_collector import RunCollectorCallbackHandler
from typing import Any, List, Optional, Union
from uuid import UUID
from langchain.callbacks.tracers.base import BaseTracer
from langchain.callbacks.tracers.schemas import Run
class RunCollectorCallbackHandler(BaseTracer):
"""
A tracer that collects all nested runs in a list.
This tracer is useful for inspection and evaluation purposes.
Parameters
----------
example_id : Optional[Union[UUID, str]], default=None
The ID of the example being traced. It can be either a UUID or a string.
"""
name: str = "run-collector_callback_handler"
def __init__(
self, example_id: Optional[Union[UUID, str]] = None, **kwargs: Any
) -> None:
"""
Initialize the RunCollectorCallbackHandler.
Parameters
----------
example_id : Optional[Union[UUID, str]], default=None
The ID of the example being traced. It can be either a UUID or a string.
"""
super().__init__(**kwargs)
self.example_id = (
UUID(example_id) if isinstance(example_id, str) else example_id
)
self.traced_runs: List[Run] = []
def _persist_run(self, run: Run) -> None:
"""
Persist a run by adding it to the traced_runs list.
Parameters
----------
run : Run
The run to be persisted.
"""
run_ = run.copy()
run_.reference_example_id = self.example_id
self.traced_runs.append(run_)
__all__ = ["RunCollectorCallbackHandler"]

View File

@@ -1,129 +1,16 @@
"""Schemas for tracers."""
from __future__ import annotations
import datetime
import warnings
from typing import Any, Dict, List, Optional, Type
from uuid import UUID
from langsmith.schemas import RunBase as BaseRunV2
from langsmith.schemas import RunTypeEnum as RunTypeEnumDep
from langchain.pydantic_v1 import BaseModel, Field, root_validator
from langchain.schema import LLMResult
def RunTypeEnum() -> Type[RunTypeEnumDep]:
"""RunTypeEnum."""
warnings.warn(
"RunTypeEnum is deprecated. Please directly use a string instead"
" (e.g. 'llm', 'chain', 'tool').",
DeprecationWarning,
)
return RunTypeEnumDep
class TracerSessionV1Base(BaseModel):
"""Base class for TracerSessionV1."""
start_time: datetime.datetime = Field(default_factory=datetime.datetime.utcnow)
name: Optional[str] = None
extra: Optional[Dict[str, Any]] = None
class TracerSessionV1Create(TracerSessionV1Base):
"""Create class for TracerSessionV1."""
class TracerSessionV1(TracerSessionV1Base):
"""TracerSessionV1 schema."""
id: int
class TracerSessionBase(TracerSessionV1Base):
"""Base class for TracerSession."""
tenant_id: UUID
class TracerSession(TracerSessionBase):
"""TracerSessionV1 schema for the V2 API."""
id: UUID
class BaseRun(BaseModel):
"""Base class for Run."""
uuid: str
parent_uuid: Optional[str] = None
start_time: datetime.datetime = Field(default_factory=datetime.datetime.utcnow)
end_time: datetime.datetime = Field(default_factory=datetime.datetime.utcnow)
extra: Optional[Dict[str, Any]] = None
execution_order: int
child_execution_order: int
serialized: Dict[str, Any]
session_id: int
error: Optional[str] = None
class LLMRun(BaseRun):
"""Class for LLMRun."""
prompts: List[str]
response: Optional[LLMResult] = None
class ChainRun(BaseRun):
"""Class for ChainRun."""
inputs: Dict[str, Any]
outputs: Optional[Dict[str, Any]] = None
child_llm_runs: List[LLMRun] = Field(default_factory=list)
child_chain_runs: List[ChainRun] = Field(default_factory=list)
child_tool_runs: List[ToolRun] = Field(default_factory=list)
class ToolRun(BaseRun):
"""Class for ToolRun."""
tool_input: str
output: Optional[str] = None
action: str
child_llm_runs: List[LLMRun] = Field(default_factory=list)
child_chain_runs: List[ChainRun] = Field(default_factory=list)
child_tool_runs: List[ToolRun] = Field(default_factory=list)
# Begin V2 API Schemas
class Run(BaseRunV2):
"""Run schema for the V2 API in the Tracer."""
execution_order: int
child_execution_order: int
child_runs: List[Run] = Field(default_factory=list)
tags: Optional[List[str]] = Field(default_factory=list)
events: List[Dict[str, Any]] = Field(default_factory=list)
@root_validator(pre=True)
def assign_name(cls, values: dict) -> dict:
"""Assign name to the run."""
if values.get("name") is None:
if "name" in values["serialized"]:
values["name"] = values["serialized"]["name"]
elif "id" in values["serialized"]:
values["name"] = values["serialized"]["id"][-1]
if values.get("events") is None:
values["events"] = []
return values
ChainRun.update_forward_refs()
ToolRun.update_forward_refs()
Run.update_forward_refs()
from langchain.schema.callbacks.tracers.schemas import (
BaseRun,
ChainRun,
LLMRun,
Run,
RunTypeEnum,
ToolRun,
TracerSession,
TracerSessionBase,
TracerSessionV1,
TracerSessionV1Base,
TracerSessionV1Create,
)
__all__ = [
"BaseRun",
"ChainRun",
"LLMRun",
"Run",
"RunTypeEnum",
"ToolRun",
"TracerSession",
"TracerSessionBase",
"TracerSessionV1",
"TracerSessionV1Base",
"TracerSessionV1Create",
]

View File

@@ -1,178 +1,6 @@
import json
from typing import Any, Callable, List
from langchain.schema.callbacks.tracers.stdout import (
ConsoleCallbackHandler,
FunctionCallbackHandler,
)
from langchain.callbacks.tracers.base import BaseTracer
from langchain.callbacks.tracers.schemas import Run
from langchain.utils.input import get_bolded_text, get_colored_text
def try_json_stringify(obj: Any, fallback: str) -> str:
"""
Try to stringify an object to JSON.
Args:
obj: Object to stringify.
fallback: Fallback string to return if the object cannot be stringified.
Returns:
A JSON string if the object can be stringified, otherwise the fallback string.
"""
try:
return json.dumps(obj, indent=2, ensure_ascii=False)
except Exception:
return fallback
def elapsed(run: Any) -> str:
"""Get the elapsed time of a run.
Args:
run: any object with a start_time and end_time attribute.
Returns:
A string with the elapsed time in seconds or
milliseconds if time is less than a second.
"""
elapsed_time = run.end_time - run.start_time
milliseconds = elapsed_time.total_seconds() * 1000
if milliseconds < 1000:
return f"{milliseconds:.0f}ms"
return f"{(milliseconds / 1000):.2f}s"
class FunctionCallbackHandler(BaseTracer):
"""Tracer that calls a function with a single str parameter."""
name: str = "function_callback_handler"
def __init__(self, function: Callable[[str], None], **kwargs: Any) -> None:
super().__init__(**kwargs)
self.function_callback = function
def _persist_run(self, run: Run) -> None:
pass
def get_parents(self, run: Run) -> List[Run]:
parents = []
current_run = run
while current_run.parent_run_id:
parent = self.run_map.get(str(current_run.parent_run_id))
if parent:
parents.append(parent)
current_run = parent
else:
break
return parents
def get_breadcrumbs(self, run: Run) -> str:
parents = self.get_parents(run)[::-1]
string = " > ".join(
f"{parent.execution_order}:{parent.run_type}:{parent.name}"
if i != len(parents) - 1
else f"{parent.execution_order}:{parent.run_type}:{parent.name}"
for i, parent in enumerate(parents + [run])
)
return string
# logging methods
def _on_chain_start(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
run_type = run.run_type.capitalize()
self.function_callback(
f"{get_colored_text('[chain/start]', color='green')} "
+ get_bolded_text(f"[{crumbs}] Entering {run_type} run with input:\n")
+ f"{try_json_stringify(run.inputs, '[inputs]')}"
)
def _on_chain_end(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
run_type = run.run_type.capitalize()
self.function_callback(
f"{get_colored_text('[chain/end]', color='blue')} "
+ get_bolded_text(
f"[{crumbs}] [{elapsed(run)}] Exiting {run_type} run with output:\n"
)
+ f"{try_json_stringify(run.outputs, '[outputs]')}"
)
def _on_chain_error(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
run_type = run.run_type.capitalize()
self.function_callback(
f"{get_colored_text('[chain/error]', color='red')} "
+ get_bolded_text(
f"[{crumbs}] [{elapsed(run)}] {run_type} run errored with error:\n"
)
+ f"{try_json_stringify(run.error, '[error]')}"
)
def _on_llm_start(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
inputs = (
{"prompts": [p.strip() for p in run.inputs["prompts"]]}
if "prompts" in run.inputs
else run.inputs
)
self.function_callback(
f"{get_colored_text('[llm/start]', color='green')} "
+ get_bolded_text(f"[{crumbs}] Entering LLM run with input:\n")
+ f"{try_json_stringify(inputs, '[inputs]')}"
)
def _on_llm_end(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
self.function_callback(
f"{get_colored_text('[llm/end]', color='blue')} "
+ get_bolded_text(
f"[{crumbs}] [{elapsed(run)}] Exiting LLM run with output:\n"
)
+ f"{try_json_stringify(run.outputs, '[response]')}"
)
def _on_llm_error(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
self.function_callback(
f"{get_colored_text('[llm/error]', color='red')} "
+ get_bolded_text(
f"[{crumbs}] [{elapsed(run)}] LLM run errored with error:\n"
)
+ f"{try_json_stringify(run.error, '[error]')}"
)
def _on_tool_start(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
self.function_callback(
f'{get_colored_text("[tool/start]", color="green")} '
+ get_bolded_text(f"[{crumbs}] Entering Tool run with input:\n")
+ f'"{run.inputs["input"].strip()}"'
)
def _on_tool_end(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
if run.outputs:
self.function_callback(
f'{get_colored_text("[tool/end]", color="blue")} '
+ get_bolded_text(
f"[{crumbs}] [{elapsed(run)}] Exiting Tool run with output:\n"
)
+ f'"{run.outputs["output"].strip()}"'
)
def _on_tool_error(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
self.function_callback(
f"{get_colored_text('[tool/error]', color='red')} "
+ get_bolded_text(f"[{crumbs}] [{elapsed(run)}] ")
+ f"Tool run errored with error:\n"
f"{run.error}"
)
class ConsoleCallbackHandler(FunctionCallbackHandler):
"""Tracer that prints to the console."""
name: str = "console_callback_handler"
def __init__(self, **kwargs: Any) -> None:
super().__init__(function=print, **kwargs)
__all__ = ["FunctionCallbackHandler", "ConsoleCallbackHandler"]

View File

@@ -41,6 +41,22 @@ class BedrockChat(BaseChatModel, BedrockBase):
"""Return type of chat model."""
return "amazon_bedrock_chat"
@classmethod
def is_lc_serializable(cls) -> bool:
"""Return whether this model can be serialized by Langchain."""
return True
@property
def lc_attributes(self) -> Dict[str, Any]:
attributes: Dict[str, Any] = {}
if self.region_name:
attributes["region_name"] = self.region_name
return attributes
class Config:
"""Configuration for this pydantic object."""

View File

@@ -101,17 +101,21 @@ class VoyageEmbeddings(BaseModel, Embeddings):
)
return values
def _invocation_params(
self, input: List[str], input_type: Optional[str] = None
) -> Dict:
api_key = cast(SecretStr, self.voyage_api_key).get_secret_value()
params = {
"url": self.voyage_api_base,
"headers": {"Authorization": f"Bearer {api_key}"},
"json": {"model": self.model, "input": input},
"json": {"model": self.model, "input": input, "input_type": input_type},
"timeout": self.request_timeout,
}
return params
def _get_embeddings(
self, texts: List[str], batch_size: int, input_type: Optional[str] = None
) -> List[List[float]]:
embeddings: List[List[float]] = []
if self.show_progress_bar:
@@ -127,9 +131,18 @@ class VoyageEmbeddings(BaseModel, Embeddings):
else:
_iter = range(0, len(texts), batch_size)
if input_type and input_type not in ["query", "document"]:
raise ValueError(
f"input_type {input_type} is invalid. Options: None, 'query', "
"'document'."
)
for i in _iter:
response = embed_with_retry(
self,
**self._invocation_params(
input=texts[i : i + batch_size], input_type=input_type
),
)
embeddings.extend(r["embedding"] for r in response["data"])
@@ -144,7 +157,9 @@ class VoyageEmbeddings(BaseModel, Embeddings):
Returns:
List of embeddings, one for each text.
"""
return self._get_embeddings(
texts, batch_size=self.batch_size, input_type="document"
)
def embed_query(self, text: str) -> List[float]:
"""Call out to Voyage Embedding endpoint for embedding query text.
@@ -155,4 +170,6 @@ class VoyageEmbeddings(BaseModel, Embeddings):
Returns:
Embedding for the text.
"""
return self._get_embeddings(
[text], batch_size=self.batch_size, input_type="query"
)[0]
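
A hedged usage sketch of the asymmetric embedding behavior added above (assumes a valid VOYAGE_API_KEY and network access; the model name is illustrative):

from langchain.embeddings import VoyageEmbeddings

emb = VoyageEmbeddings(model="voyage-01")
docs = emb.embed_documents(["LangChain is a framework."])  # sent with input_type="document"
query = emb.embed_query("What is LangChain?")              # sent with input_type="query"
print(len(docs[0]), len(query))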

View File

@@ -1,4 +1,4 @@
from typing import Any, Dict, List, Optional
from langchain.graphs.graph_document import GraphDocument
from langchain.graphs.graph_store import GraphStore
@@ -48,7 +48,13 @@ class FalkorDBGraph(GraphStore):
"""
def __init__(
self, database: str, host: str = "localhost", port: int = 6379
self,
database: str,
host: str = "localhost",
port: int = 6379,
username: Optional[str] = None,
password: Optional[str] = None,
ssl: bool = False,
) -> None:
"""Create a new FalkorDB graph wrapper instance."""
try:
@@ -60,7 +66,9 @@ class FalkorDBGraph(GraphStore):
"Please install it with `pip install redis`."
)
self._driver = redis.Redis(
host=host, port=port, username=username, password=password, ssl=ssl
)
self._graph = Graph(self._driver, database)
self.schema: str = ""
self.structured_schema: Dict[str, Any] = {}
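
A hedged sketch of the widened constructor (assumes a reachable FalkorDB/Redis instance and the redis package; credentials are illustrative):

from langchain.graphs import FalkorDBGraph

graph = FalkorDBGraph(
    database="demo",
    host="localhost",
    port=6379,
    username="falkor",   # illustrative
    password="s3cret",   # illustrative
    ssl=False,
)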

View File

@@ -12,6 +12,7 @@ from langchain.utilities.anthropic import (
get_num_tokens_anthropic,
get_token_ids_anthropic,
)
from langchain.utils import get_from_dict_or_env
HUMAN_PROMPT = "\n\nHuman:"
ASSISTANT_PROMPT = "\n\nAssistant:"
@@ -195,6 +196,13 @@ class BedrockBase(BaseModel, ABC):
# use default credentials
session = boto3.Session()
values["region_name"] = get_from_dict_or_env(
values,
"region_name",
"AWS_DEFAULT_REGION",
default=None,
)
client_params = {}
if values["region_name"]:
client_params["region_name"] = values["region_name"]
@@ -340,6 +348,20 @@ class Bedrock(LLM, BedrockBase):
"""Return type of llm."""
return "amazon_bedrock"
@classmethod
def is_lc_serializable(cls) -> bool:
"""Return whether this model can be serialized by Langchain."""
return True
@property
def lc_attributes(self) -> Dict[str, Any]:
attributes: Dict[str, Any] = {}
if self.region_name:
attributes["region_name"] = self.region_name
return attributes
class Config:
"""Configuration for this pydantic object."""

View File

@@ -0,0 +1,598 @@
"""Base callback handler that can be used to handle callbacks in langchain."""
from __future__ import annotations
from typing import Any, Dict, List, Optional, Sequence, TypeVar, Union
from uuid import UUID
from tenacity import RetryCallState
from langchain.schema.agent import AgentAction, AgentFinish
from langchain.schema.document import Document
from langchain.schema.messages import BaseMessage
from langchain.schema.output import ChatGenerationChunk, GenerationChunk, LLMResult
class RetrieverManagerMixin:
"""Mixin for Retriever callbacks."""
def on_retriever_error(
self,
error: BaseException,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run when Retriever errors."""
def on_retriever_end(
self,
documents: Sequence[Document],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run when Retriever ends running."""
class LLMManagerMixin:
"""Mixin for LLM callbacks."""
def on_llm_new_token(
self,
token: str,
*,
chunk: Optional[Union[GenerationChunk, ChatGenerationChunk]] = None,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run on new LLM token. Only available when streaming is enabled.
Args:
token (str): The new token.
chunk (GenerationChunk | ChatGenerationChunk): The new generated chunk,
containing content and other information.
"""
def on_llm_end(
self,
response: LLMResult,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run when LLM ends running."""
def on_llm_error(
self,
error: BaseException,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run when LLM errors."""
class ChainManagerMixin:
"""Mixin for chain callbacks."""
def on_chain_end(
self,
outputs: Dict[str, Any],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run when chain ends running."""
def on_chain_error(
self,
error: BaseException,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run when chain errors."""
def on_agent_action(
self,
action: AgentAction,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run on agent action."""
def on_agent_finish(
self,
finish: AgentFinish,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run on agent end."""
class ToolManagerMixin:
"""Mixin for tool callbacks."""
def on_tool_end(
self,
output: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run when tool ends running."""
def on_tool_error(
self,
error: BaseException,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run when tool errors."""
class CallbackManagerMixin:
"""Mixin for callback manager."""
def on_llm_start(
self,
serialized: Dict[str, Any],
prompts: List[str],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> Any:
"""Run when LLM starts running."""
def on_chat_model_start(
self,
serialized: Dict[str, Any],
messages: List[List[BaseMessage]],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> Any:
"""Run when a chat model starts running."""
raise NotImplementedError(
f"{self.__class__.__name__} does not implement `on_chat_model_start`"
)
def on_retriever_start(
self,
serialized: Dict[str, Any],
query: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> Any:
"""Run when Retriever starts running."""
def on_chain_start(
self,
serialized: Dict[str, Any],
inputs: Dict[str, Any],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> Any:
"""Run when chain starts running."""
def on_tool_start(
self,
serialized: Dict[str, Any],
input_str: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> Any:
"""Run when tool starts running."""
class RunManagerMixin:
"""Mixin for run manager."""
def on_text(
self,
text: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run on arbitrary text."""
def on_retry(
self,
retry_state: RetryCallState,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run on a retry event."""
class BaseCallbackHandler(
LLMManagerMixin,
ChainManagerMixin,
ToolManagerMixin,
RetrieverManagerMixin,
CallbackManagerMixin,
RunManagerMixin,
):
"""Base callback handler that handles callbacks from LangChain."""
raise_error: bool = False
run_inline: bool = False
@property
def ignore_llm(self) -> bool:
"""Whether to ignore LLM callbacks."""
return False
@property
def ignore_retry(self) -> bool:
"""Whether to ignore retry callbacks."""
return False
@property
def ignore_chain(self) -> bool:
"""Whether to ignore chain callbacks."""
return False
@property
def ignore_agent(self) -> bool:
"""Whether to ignore agent callbacks."""
return False
@property
def ignore_retriever(self) -> bool:
"""Whether to ignore retriever callbacks."""
return False
@property
def ignore_chat_model(self) -> bool:
"""Whether to ignore chat model callbacks."""
return False
class AsyncCallbackHandler(BaseCallbackHandler):
"""Async callback handler that handles callbacks from LangChain."""
async def on_llm_start(
self,
serialized: Dict[str, Any],
prompts: List[str],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> None:
"""Run when LLM starts running."""
async def on_chat_model_start(
self,
serialized: Dict[str, Any],
messages: List[List[BaseMessage]],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> Any:
"""Run when a chat model starts running."""
raise NotImplementedError(
f"{self.__class__.__name__} does not implement `on_chat_model_start`"
)
async def on_llm_new_token(
self,
token: str,
*,
chunk: Optional[Union[GenerationChunk, ChatGenerationChunk]] = None,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run on new LLM token. Only available when streaming is enabled."""
async def on_llm_end(
self,
response: LLMResult,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run when LLM ends running."""
async def on_llm_error(
self,
error: BaseException,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run when LLM errors."""
async def on_chain_start(
self,
serialized: Dict[str, Any],
inputs: Dict[str, Any],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> None:
"""Run when chain starts running."""
async def on_chain_end(
self,
outputs: Dict[str, Any],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run when chain ends running."""
async def on_chain_error(
self,
error: BaseException,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run when chain errors."""
async def on_tool_start(
self,
serialized: Dict[str, Any],
input_str: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> None:
"""Run when tool starts running."""
async def on_tool_end(
self,
output: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run when tool ends running."""
async def on_tool_error(
self,
error: BaseException,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run when tool errors."""
async def on_text(
self,
text: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run on arbitrary text."""
async def on_retry(
self,
retry_state: RetryCallState,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Any:
"""Run on a retry event."""
async def on_agent_action(
self,
action: AgentAction,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run on agent action."""
async def on_agent_finish(
self,
finish: AgentFinish,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run on agent end."""
async def on_retriever_start(
self,
serialized: Dict[str, Any],
query: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> None:
"""Run on retriever start."""
async def on_retriever_end(
self,
documents: Sequence[Document],
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run on retriever end."""
async def on_retriever_error(
self,
error: BaseException,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
**kwargs: Any,
) -> None:
"""Run on retriever error."""
T = TypeVar("T", bound="BaseCallbackManager")
class BaseCallbackManager(CallbackManagerMixin):
"""Base callback manager that handles callbacks from LangChain."""
def __init__(
self,
handlers: List[BaseCallbackHandler],
inheritable_handlers: Optional[List[BaseCallbackHandler]] = None,
parent_run_id: Optional[UUID] = None,
*,
tags: Optional[List[str]] = None,
inheritable_tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
inheritable_metadata: Optional[Dict[str, Any]] = None,
) -> None:
"""Initialize callback manager."""
self.handlers: List[BaseCallbackHandler] = handlers
self.inheritable_handlers: List[BaseCallbackHandler] = (
inheritable_handlers or []
)
self.parent_run_id: Optional[UUID] = parent_run_id
self.tags = tags or []
self.inheritable_tags = inheritable_tags or []
self.metadata = metadata or {}
self.inheritable_metadata = inheritable_metadata or {}
def copy(self: T) -> T:
"""Copy the callback manager."""
return self.__class__(
handlers=self.handlers,
inheritable_handlers=self.inheritable_handlers,
parent_run_id=self.parent_run_id,
tags=self.tags,
inheritable_tags=self.inheritable_tags,
metadata=self.metadata,
inheritable_metadata=self.inheritable_metadata,
)
@property
def is_async(self) -> bool:
"""Whether the callback manager is async."""
return False
def add_handler(self, handler: BaseCallbackHandler, inherit: bool = True) -> None:
"""Add a handler to the callback manager."""
if handler not in self.handlers:
self.handlers.append(handler)
if inherit and handler not in self.inheritable_handlers:
self.inheritable_handlers.append(handler)
def remove_handler(self, handler: BaseCallbackHandler) -> None:
"""Remove a handler from the callback manager."""
self.handlers.remove(handler)
self.inheritable_handlers.remove(handler)
def set_handlers(
self, handlers: List[BaseCallbackHandler], inherit: bool = True
) -> None:
"""Set handlers as the only handlers on the callback manager."""
self.handlers = []
self.inheritable_handlers = []
for handler in handlers:
self.add_handler(handler, inherit=inherit)
def set_handler(self, handler: BaseCallbackHandler, inherit: bool = True) -> None:
"""Set handler as the only handler on the callback manager."""
self.set_handlers([handler], inherit=inherit)
def add_tags(self, tags: List[str], inherit: bool = True) -> None:
for tag in tags:
if tag in self.tags:
self.remove_tags([tag])
self.tags.extend(tags)
if inherit:
self.inheritable_tags.extend(tags)
def remove_tags(self, tags: List[str]) -> None:
for tag in tags:
self.tags.remove(tag)
self.inheritable_tags.remove(tag)
def add_metadata(self, metadata: Dict[str, Any], inherit: bool = True) -> None:
self.metadata.update(metadata)
if inherit:
self.inheritable_metadata.update(metadata)
def remove_metadata(self, keys: List[str]) -> None:
for key in keys:
self.metadata.pop(key)
self.inheritable_metadata.pop(key)
Callbacks = Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]]
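
Because every mixin method above defaults to a no-op, a custom handler only overrides the hooks it needs. A minimal sketch, assuming the schema import path from this file:

from typing import Any
from uuid import UUID

from langchain.schema.callbacks.base import BaseCallbackHandler
from langchain.schema.output import LLMResult

class TokenCounter(BaseCallbackHandler):
    """Counts streamed tokens; every other callback stays a no-op."""

    def __init__(self) -> None:
        self.tokens = 0

    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        self.tokens += 1

    def on_llm_end(self, response: LLMResult, *, run_id: UUID, **kwargs: Any) -> None:
        print(f"run {run_id}: streamed {self.tokens} tokens")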

File diff suppressed because it is too large

View File

@@ -0,0 +1,97 @@
"""Callback Handler that prints to std out."""
from typing import Any, Dict, List, Optional
from langchain.schema import AgentAction, AgentFinish, LLMResult
from langchain.schema.callbacks.base import BaseCallbackHandler
from langchain.utils.input import print_text
class StdOutCallbackHandler(BaseCallbackHandler):
"""Callback Handler that prints to std out."""
def __init__(self, color: Optional[str] = None) -> None:
"""Initialize callback handler."""
self.color = color
def on_llm_start(
self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
) -> None:
"""Print out the prompts."""
pass
def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
"""Do nothing."""
pass
def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
"""Do nothing."""
pass
def on_llm_error(self, error: BaseException, **kwargs: Any) -> None:
"""Do nothing."""
pass
def on_chain_start(
self, serialized: Dict[str, Any], inputs: Dict[str, Any], **kwargs: Any
) -> None:
"""Print out that we are entering a chain."""
class_name = serialized.get("name", serialized.get("id", ["<unknown>"])[-1])
print(f"\n\n\033[1m> Entering new {class_name} chain...\033[0m")
def on_chain_end(self, outputs: Dict[str, Any], **kwargs: Any) -> None:
"""Print out that we finished a chain."""
print("\n\033[1m> Finished chain.\033[0m")
def on_chain_error(self, error: BaseException, **kwargs: Any) -> None:
"""Do nothing."""
pass
def on_tool_start(
self,
serialized: Dict[str, Any],
input_str: str,
**kwargs: Any,
) -> None:
"""Do nothing."""
pass
def on_agent_action(
self, action: AgentAction, color: Optional[str] = None, **kwargs: Any
) -> Any:
"""Run on agent action."""
print_text(action.log, color=color or self.color)
def on_tool_end(
self,
output: str,
color: Optional[str] = None,
observation_prefix: Optional[str] = None,
llm_prefix: Optional[str] = None,
**kwargs: Any,
) -> None:
"""If not the final action, print out observation."""
if observation_prefix is not None:
print_text(f"\n{observation_prefix}")
print_text(output, color=color or self.color)
if llm_prefix is not None:
print_text(f"\n{llm_prefix}")
def on_tool_error(self, error: BaseException, **kwargs: Any) -> None:
"""Do nothing."""
pass
def on_text(
self,
text: str,
color: Optional[str] = None,
end: str = "",
**kwargs: Any,
) -> None:
"""Run when agent ends."""
print_text(text, color=color or self.color, end=end)
def on_agent_finish(
self, finish: AgentFinish, color: Optional[str] = None, **kwargs: Any
) -> None:
"""Run on agent end."""
print_text(finish.log, color=color or self.color, end="\n")
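
The handler is self-contained enough to exercise directly; a hedged sketch (assuming this module lands at langchain.schema.callbacks.stdout, per the diff):

from langchain.schema.callbacks.stdout import StdOutCallbackHandler

handler = StdOutCallbackHandler(color="green")
handler.on_chain_start({"name": "DemoChain"}, {"question": "hi"})  # prints the entry banner
handler.on_text("thinking...", color="blue")
handler.on_chain_end({"answer": "hello"})                          # prints the exit banner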

View File

@@ -0,0 +1,537 @@
"""Base interfaces for tracing runs."""
from __future__ import annotations
import logging
from abc import ABC, abstractmethod
from datetime import datetime
from typing import Any, Dict, List, Optional, Sequence, Union, cast
from uuid import UUID
from tenacity import RetryCallState
from langchain.load.dump import dumpd
from langchain.schema.callbacks.base import BaseCallbackHandler
from langchain.schema.callbacks.tracers.schemas import Run
from langchain.schema.document import Document
from langchain.schema.output import (
ChatGeneration,
ChatGenerationChunk,
GenerationChunk,
LLMResult,
)
logger = logging.getLogger(__name__)
class TracerException(Exception):
"""Base class for exceptions in tracers module."""
class BaseTracer(BaseCallbackHandler, ABC):
"""Base interface for tracers."""
def __init__(self, **kwargs: Any) -> None:
super().__init__(**kwargs)
self.run_map: Dict[str, Run] = {}
@staticmethod
def _add_child_run(
parent_run: Run,
child_run: Run,
) -> None:
"""Add child run to a chain run or tool run."""
parent_run.child_runs.append(child_run)
@abstractmethod
def _persist_run(self, run: Run) -> None:
"""Persist a run."""
def _start_trace(self, run: Run) -> None:
"""Start a trace for a run."""
if run.parent_run_id:
parent_run = self.run_map.get(str(run.parent_run_id))
if parent_run:
self._add_child_run(parent_run, run)
parent_run.child_execution_order = max(
parent_run.child_execution_order, run.child_execution_order
)
else:
logger.debug(f"Parent run with UUID {run.parent_run_id} not found.")
self.run_map[str(run.id)] = run
self._on_run_create(run)
def _end_trace(self, run: Run) -> None:
"""End a trace for a run."""
if not run.parent_run_id:
self._persist_run(run)
else:
parent_run = self.run_map.get(str(run.parent_run_id))
if parent_run is None:
logger.debug(f"Parent run with UUID {run.parent_run_id} not found.")
elif (
run.child_execution_order is not None
and parent_run.child_execution_order is not None
and run.child_execution_order > parent_run.child_execution_order
):
parent_run.child_execution_order = run.child_execution_order
self.run_map.pop(str(run.id))
self._on_run_update(run)
def _get_execution_order(self, parent_run_id: Optional[str] = None) -> int:
"""Get the execution order for a run."""
if parent_run_id is None:
return 1
parent_run = self.run_map.get(parent_run_id)
if parent_run is None:
logger.debug(f"Parent run with UUID {parent_run_id} not found.")
return 1
if parent_run.child_execution_order is None:
raise TracerException(
f"Parent run with UUID {parent_run_id} has no child execution order."
)
return parent_run.child_execution_order + 1
def on_llm_start(
self,
serialized: Dict[str, Any],
prompts: List[str],
*,
run_id: UUID,
tags: Optional[List[str]] = None,
parent_run_id: Optional[UUID] = None,
metadata: Optional[Dict[str, Any]] = None,
name: Optional[str] = None,
**kwargs: Any,
) -> Run:
"""Start a trace for an LLM run."""
parent_run_id_ = str(parent_run_id) if parent_run_id else None
execution_order = self._get_execution_order(parent_run_id_)
start_time = datetime.utcnow()
if metadata:
kwargs.update({"metadata": metadata})
llm_run = Run(
id=run_id,
parent_run_id=parent_run_id,
serialized=serialized,
inputs={"prompts": prompts},
extra=kwargs,
events=[{"name": "start", "time": start_time}],
start_time=start_time,
execution_order=execution_order,
child_execution_order=execution_order,
run_type="llm",
tags=tags or [],
name=name,
)
self._start_trace(llm_run)
self._on_llm_start(llm_run)
return llm_run
def on_llm_new_token(
self,
token: str,
*,
chunk: Optional[Union[GenerationChunk, ChatGenerationChunk]] = None,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
**kwargs: Any,
) -> Run:
"""Run on new LLM token. Only available when streaming is enabled."""
if not run_id:
raise TracerException("No run_id provided for on_llm_new_token callback.")
run_id_ = str(run_id)
llm_run = self.run_map.get(run_id_)
if llm_run is None or llm_run.run_type != "llm":
raise TracerException(f"No LLM Run found to be traced for {run_id}")
event_kwargs: Dict[str, Any] = {"token": token}
if chunk:
event_kwargs["chunk"] = chunk
llm_run.events.append(
{
"name": "new_token",
"time": datetime.utcnow(),
"kwargs": event_kwargs,
},
)
self._on_llm_new_token(llm_run, token, chunk)
return llm_run
def on_retry(
self,
retry_state: RetryCallState,
*,
run_id: UUID,
**kwargs: Any,
) -> Run:
if not run_id:
raise TracerException("No run_id provided for on_retry callback.")
run_id_ = str(run_id)
llm_run = self.run_map.get(run_id_)
if llm_run is None:
raise TracerException("No Run found to be traced for on_retry")
retry_d: Dict[str, Any] = {
"slept": retry_state.idle_for,
"attempt": retry_state.attempt_number,
}
if retry_state.outcome is None:
retry_d["outcome"] = "N/A"
elif retry_state.outcome.failed:
retry_d["outcome"] = "failed"
exception = retry_state.outcome.exception()
retry_d["exception"] = str(exception)
retry_d["exception_type"] = exception.__class__.__name__
else:
retry_d["outcome"] = "success"
retry_d["result"] = str(retry_state.outcome.result())
llm_run.events.append(
{
"name": "retry",
"time": datetime.utcnow(),
"kwargs": retry_d,
},
)
return llm_run
def on_llm_end(self, response: LLMResult, *, run_id: UUID, **kwargs: Any) -> Run:
"""End a trace for an LLM run."""
if not run_id:
raise TracerException("No run_id provided for on_llm_end callback.")
run_id_ = str(run_id)
llm_run = self.run_map.get(run_id_)
if llm_run is None or llm_run.run_type != "llm":
raise TracerException(f"No LLM Run found to be traced for {run_id}")
llm_run.outputs = response.dict()
for i, generations in enumerate(response.generations):
for j, generation in enumerate(generations):
output_generation = llm_run.outputs["generations"][i][j]
if "message" in output_generation:
output_generation["message"] = dumpd(
cast(ChatGeneration, generation).message
)
llm_run.end_time = datetime.utcnow()
llm_run.events.append({"name": "end", "time": llm_run.end_time})
self._end_trace(llm_run)
self._on_llm_end(llm_run)
return llm_run
def on_llm_error(
self,
error: BaseException,
*,
run_id: UUID,
**kwargs: Any,
) -> Run:
"""Handle an error for an LLM run."""
if not run_id:
raise TracerException("No run_id provided for on_llm_error callback.")
run_id_ = str(run_id)
llm_run = self.run_map.get(run_id_)
if llm_run is None or llm_run.run_type != "llm":
raise TracerException(f"No LLM Run found to be traced for {run_id}")
llm_run.error = repr(error)
llm_run.end_time = datetime.utcnow()
llm_run.events.append({"name": "error", "time": llm_run.end_time})
self._end_trace(llm_run)
self._on_llm_error(llm_run)
return llm_run
def on_chain_start(
self,
serialized: Dict[str, Any],
inputs: Dict[str, Any],
*,
run_id: UUID,
tags: Optional[List[str]] = None,
parent_run_id: Optional[UUID] = None,
metadata: Optional[Dict[str, Any]] = None,
run_type: Optional[str] = None,
name: Optional[str] = None,
**kwargs: Any,
) -> Run:
"""Start a trace for a chain run."""
parent_run_id_ = str(parent_run_id) if parent_run_id else None
execution_order = self._get_execution_order(parent_run_id_)
start_time = datetime.utcnow()
if metadata:
kwargs.update({"metadata": metadata})
chain_run = Run(
id=run_id,
parent_run_id=parent_run_id,
serialized=serialized,
inputs=inputs if isinstance(inputs, dict) else {"input": inputs},
extra=kwargs,
events=[{"name": "start", "time": start_time}],
start_time=start_time,
execution_order=execution_order,
child_execution_order=execution_order,
child_runs=[],
run_type=run_type or "chain",
name=name,
tags=tags or [],
)
self._start_trace(chain_run)
self._on_chain_start(chain_run)
return chain_run
def on_chain_end(
self,
outputs: Dict[str, Any],
*,
run_id: UUID,
inputs: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> Run:
"""End a trace for a chain run."""
if not run_id:
raise TracerException("No run_id provided for on_chain_end callback.")
chain_run = self.run_map.get(str(run_id))
if chain_run is None:
raise TracerException(f"No chain Run found to be traced for {run_id}")
chain_run.outputs = (
outputs if isinstance(outputs, dict) else {"output": outputs}
)
chain_run.end_time = datetime.utcnow()
chain_run.events.append({"name": "end", "time": chain_run.end_time})
if inputs is not None:
chain_run.inputs = inputs if isinstance(inputs, dict) else {"input": inputs}
self._end_trace(chain_run)
self._on_chain_end(chain_run)
return chain_run
def on_chain_error(
self,
error: BaseException,
*,
inputs: Optional[Dict[str, Any]] = None,
run_id: UUID,
**kwargs: Any,
) -> Run:
"""Handle an error for a chain run."""
if not run_id:
raise TracerException("No run_id provided for on_chain_error callback.")
chain_run = self.run_map.get(str(run_id))
if chain_run is None:
raise TracerException(f"No chain Run found to be traced for {run_id}")
chain_run.error = repr(error)
chain_run.end_time = datetime.utcnow()
chain_run.events.append({"name": "error", "time": chain_run.end_time})
if inputs is not None:
chain_run.inputs = inputs if isinstance(inputs, dict) else {"input": inputs}
self._end_trace(chain_run)
self._on_chain_error(chain_run)
return chain_run
def on_tool_start(
self,
serialized: Dict[str, Any],
input_str: str,
*,
run_id: UUID,
tags: Optional[List[str]] = None,
parent_run_id: Optional[UUID] = None,
metadata: Optional[Dict[str, Any]] = None,
name: Optional[str] = None,
**kwargs: Any,
) -> Run:
"""Start a trace for a tool run."""
parent_run_id_ = str(parent_run_id) if parent_run_id else None
execution_order = self._get_execution_order(parent_run_id_)
start_time = datetime.utcnow()
if metadata:
kwargs.update({"metadata": metadata})
tool_run = Run(
id=run_id,
parent_run_id=parent_run_id,
serialized=serialized,
inputs={"input": input_str},
extra=kwargs,
events=[{"name": "start", "time": start_time}],
start_time=start_time,
execution_order=execution_order,
child_execution_order=execution_order,
child_runs=[],
run_type="tool",
tags=tags or [],
name=name,
)
self._start_trace(tool_run)
self._on_tool_start(tool_run)
return tool_run
def on_tool_end(self, output: str, *, run_id: UUID, **kwargs: Any) -> Run:
"""End a trace for a tool run."""
if not run_id:
raise TracerException("No run_id provided for on_tool_end callback.")
tool_run = self.run_map.get(str(run_id))
if tool_run is None or tool_run.run_type != "tool":
raise TracerException(f"No tool Run found to be traced for {run_id}")
tool_run.outputs = {"output": output}
tool_run.end_time = datetime.utcnow()
tool_run.events.append({"name": "end", "time": tool_run.end_time})
self._end_trace(tool_run)
self._on_tool_end(tool_run)
return tool_run
def on_tool_error(
self,
error: BaseException,
*,
run_id: UUID,
**kwargs: Any,
) -> Run:
"""Handle an error for a tool run."""
if not run_id:
raise TracerException("No run_id provided for on_tool_error callback.")
tool_run = self.run_map.get(str(run_id))
if tool_run is None or tool_run.run_type != "tool":
raise TracerException(f"No tool Run found to be traced for {run_id}")
tool_run.error = repr(error)
tool_run.end_time = datetime.utcnow()
tool_run.events.append({"name": "error", "time": tool_run.end_time})
self._end_trace(tool_run)
self._on_tool_error(tool_run)
return tool_run
def on_retriever_start(
self,
serialized: Dict[str, Any],
query: str,
*,
run_id: UUID,
parent_run_id: Optional[UUID] = None,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
name: Optional[str] = None,
**kwargs: Any,
) -> Run:
"""Run when Retriever starts running."""
parent_run_id_ = str(parent_run_id) if parent_run_id else None
execution_order = self._get_execution_order(parent_run_id_)
start_time = datetime.utcnow()
if metadata:
kwargs.update({"metadata": metadata})
retrieval_run = Run(
id=run_id,
name=name or "Retriever",
parent_run_id=parent_run_id,
serialized=serialized,
inputs={"query": query},
extra=kwargs,
events=[{"name": "start", "time": start_time}],
start_time=start_time,
execution_order=execution_order,
child_execution_order=execution_order,
tags=tags,
child_runs=[],
run_type="retriever",
)
self._start_trace(retrieval_run)
self._on_retriever_start(retrieval_run)
return retrieval_run
def on_retriever_error(
self,
error: BaseException,
*,
run_id: UUID,
**kwargs: Any,
) -> Run:
"""Run when Retriever errors."""
if not run_id:
raise TracerException("No run_id provided for on_retriever_error callback.")
retrieval_run = self.run_map.get(str(run_id))
if retrieval_run is None or retrieval_run.run_type != "retriever":
raise TracerException(f"No retriever Run found to be traced for {run_id}")
retrieval_run.error = repr(error)
retrieval_run.end_time = datetime.utcnow()
retrieval_run.events.append({"name": "error", "time": retrieval_run.end_time})
self._end_trace(retrieval_run)
self._on_retriever_error(retrieval_run)
return retrieval_run
def on_retriever_end(
self, documents: Sequence[Document], *, run_id: UUID, **kwargs: Any
) -> Run:
"""Run when Retriever ends running."""
if not run_id:
raise TracerException("No run_id provided for on_retriever_end callback.")
retrieval_run = self.run_map.get(str(run_id))
if retrieval_run is None or retrieval_run.run_type != "retriever":
raise TracerException(f"No retriever Run found to be traced for {run_id}")
retrieval_run.outputs = {"documents": documents}
retrieval_run.end_time = datetime.utcnow()
retrieval_run.events.append({"name": "end", "time": retrieval_run.end_time})
self._end_trace(retrieval_run)
self._on_retriever_end(retrieval_run)
return retrieval_run
def __deepcopy__(self, memo: dict) -> BaseTracer:
"""Deepcopy the tracer."""
return self
def __copy__(self) -> BaseTracer:
"""Copy the tracer."""
return self
def _on_run_create(self, run: Run) -> None:
"""Process a run upon creation."""
def _on_run_update(self, run: Run) -> None:
"""Process a run upon update."""
def _on_llm_start(self, run: Run) -> None:
"""Process the LLM Run upon start."""
def _on_llm_new_token(
self,
run: Run,
token: str,
chunk: Optional[Union[GenerationChunk, ChatGenerationChunk]],
) -> None:
"""Process new LLM token."""
def _on_llm_end(self, run: Run) -> None:
"""Process the LLM Run."""
def _on_llm_error(self, run: Run) -> None:
"""Process the LLM Run upon error."""
def _on_chain_start(self, run: Run) -> None:
"""Process the Chain Run upon start."""
def _on_chain_end(self, run: Run) -> None:
"""Process the Chain Run."""
def _on_chain_error(self, run: Run) -> None:
"""Process the Chain Run upon error."""
def _on_tool_start(self, run: Run) -> None:
"""Process the Tool Run upon start."""
def _on_tool_end(self, run: Run) -> None:
"""Process the Tool Run."""
def _on_tool_error(self, run: Run) -> None:
"""Process the Tool Run upon error."""
def _on_chat_model_start(self, run: Run) -> None:
"""Process the Chat Model Run upon start."""
def _on_retriever_start(self, run: Run) -> None:
"""Process the Retriever Run upon start."""
def _on_retriever_end(self, run: Run) -> None:
"""Process the Retriever Run."""
def _on_retriever_error(self, run: Run) -> None:
"""Process the Retriever Run upon error."""

View File

@@ -0,0 +1,222 @@
"""A tracer that runs evaluators over completed runs."""
from __future__ import annotations
import logging
import threading
import weakref
from concurrent.futures import Future, ThreadPoolExecutor, wait
from typing import Any, Dict, List, Optional, Sequence, Tuple, Union, cast
from uuid import UUID
import langsmith
from langsmith.evaluation.evaluator import EvaluationResult, EvaluationResults
from langchain.schema.callbacks import manager
from langchain.schema.callbacks.tracers import langchain as langchain_tracer
from langchain.schema.callbacks.tracers.base import BaseTracer
from langchain.schema.callbacks.tracers.langchain import _get_executor
from langchain.schema.callbacks.tracers.schemas import Run
logger = logging.getLogger(__name__)
_TRACERS: weakref.WeakSet[EvaluatorCallbackHandler] = weakref.WeakSet()
def wait_for_all_evaluators() -> None:
"""Wait for all tracers to finish."""
global _TRACERS
for tracer in list(_TRACERS):
if tracer is not None:
tracer.wait_for_futures()
class EvaluatorCallbackHandler(BaseTracer):
"""A tracer that runs a run evaluator whenever a run is persisted.
Parameters
----------
evaluators : Sequence[RunEvaluator]
The run evaluators to apply to all top level runs.
client : LangSmith Client, optional
The LangSmith client instance to use for evaluating the runs.
If not specified, a new instance will be created.
example_id : Union[UUID, str], optional
The example ID to be associated with the runs.
project_name : str, optional
The LangSmith project name under which eval chain runs are organized.
Attributes
----------
example_id : Union[UUID, None]
The example ID associated with the runs.
client : Client
The LangSmith client instance used for evaluating the runs.
evaluators : Sequence[RunEvaluator]
The sequence of run evaluators to be executed.
executor : ThreadPoolExecutor
The thread pool executor used for running the evaluators.
futures : Set[Future]
The set of futures representing the running evaluators.
skip_unfinished : bool
Whether to skip runs that are not finished or raised
an error.
project_name : Optional[str]
The LangSmith project name under which eval chain runs are organized.
"""
name = "evaluator_callback_handler"
def __init__(
self,
evaluators: Sequence[langsmith.RunEvaluator],
client: Optional[langsmith.Client] = None,
example_id: Optional[Union[UUID, str]] = None,
skip_unfinished: bool = True,
project_name: Optional[str] = "evaluators",
max_concurrency: Optional[int] = None,
**kwargs: Any,
) -> None:
super().__init__(**kwargs)
self.example_id = (
UUID(example_id) if isinstance(example_id, str) else example_id
)
self.client = client or langchain_tracer.get_client()
self.evaluators = evaluators
if max_concurrency is None:
self.executor: Optional[ThreadPoolExecutor] = _get_executor()
elif max_concurrency > 0:
self.executor = ThreadPoolExecutor(max_workers=max_concurrency)
weakref.finalize(
self,
lambda: cast(ThreadPoolExecutor, self.executor).shutdown(wait=True),
)
else:
self.executor = None
self.futures: weakref.WeakSet[Future] = weakref.WeakSet()
self.skip_unfinished = skip_unfinished
self.project_name = project_name
self.logged_eval_results: Dict[Tuple[str, str], List[EvaluationResult]] = {}
self.lock = threading.Lock()
global _TRACERS
_TRACERS.add(self)
def _evaluate_in_project(self, run: Run, evaluator: langsmith.RunEvaluator) -> None:
"""Evaluate the run in the project.
Parameters
----------
run : Run
The run to be evaluated.
evaluator : RunEvaluator
The evaluator to use for evaluating the run.
"""
try:
if self.project_name is None:
eval_result = self.client.evaluate_run(run, evaluator)
eval_results = [eval_result]
with manager.tracing_v2_enabled(
project_name=self.project_name, tags=["eval"], client=self.client
) as cb:
reference_example = (
self.client.read_example(run.reference_example_id)
if run.reference_example_id
else None
)
evaluation_result = evaluator.evaluate_run(
run,
example=reference_example,
)
eval_results = self._log_evaluation_feedback(
evaluation_result,
run,
source_run_id=cb.latest_run.id if cb.latest_run else None,
)
except Exception as e:
logger.error(
f"Error evaluating run {run.id} with "
f"{evaluator.__class__.__name__}: {repr(e)}",
exc_info=True,
)
raise e
example_id = str(run.reference_example_id)
with self.lock:
for res in eval_results:
run_id = (
str(getattr(res, "target_run_id"))
if hasattr(res, "target_run_id")
else str(run.id)
)
self.logged_eval_results.setdefault((run_id, example_id), []).append(
res
)
def _select_eval_results(
self,
results: Union[EvaluationResult, EvaluationResults],
) -> List[EvaluationResult]:
if isinstance(results, EvaluationResult):
results_ = [results]
elif isinstance(results, dict) and "results" in results:
results_ = cast(List[EvaluationResult], results["results"])
else:
raise TypeError(
f"Invalid evaluation result type {type(results)}."
" Expected EvaluationResult or EvaluationResults."
)
return results_
def _log_evaluation_feedback(
self,
evaluator_response: Union[EvaluationResult, EvaluationResults],
run: Run,
source_run_id: Optional[UUID] = None,
) -> List[EvaluationResult]:
results = self._select_eval_results(evaluator_response)
for res in results:
source_info_: Dict[str, Any] = {}
if res.evaluator_info:
source_info_ = {**res.evaluator_info, **source_info_}
run_id_ = (
getattr(res, "target_run_id")
if hasattr(res, "target_run_id") and res.target_run_id is not None
else run.id
)
self.client.create_feedback(
run_id_,
res.key,
score=res.score,
value=res.value,
comment=res.comment,
correction=res.correction,
source_info=source_info_,
source_run_id=res.source_run_id or source_run_id,
feedback_source_type=langsmith.schemas.FeedbackSourceType.MODEL,
)
return results
def _persist_run(self, run: Run) -> None:
"""Run the evaluator on the run.
Parameters
----------
run : Run
The run to be evaluated.
"""
if self.skip_unfinished and not run.outputs:
logger.debug(f"Skipping unfinished run {run.id}")
return
run_ = run.copy()
run_.reference_example_id = self.example_id
for evaluator in self.evaluators:
if self.executor is None:
self._evaluate_in_project(run_, evaluator)
else:
self.futures.add(
self.executor.submit(self._evaluate_in_project, run_, evaluator)
)
def wait_for_futures(self) -> None:
"""Wait for all futures to complete."""
wait(self.futures)
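A hedged usage sketch for the handler above (the module path is assumed, since this view omits file names; `my_evaluator` and `chain` are hypothetical): attach it as a callback, then block on the evaluator futures before reading results.

```python
from langchain.schema.callbacks.tracers.evaluation import (
    EvaluatorCallbackHandler,
    wait_for_all_evaluators,
)

handler = EvaluatorCallbackHandler(evaluators=[my_evaluator], max_concurrency=2)
chain.invoke({"input": "..."}, config={"callbacks": [handler]})
wait_for_all_evaluators()  # wait for every submitted evaluation future
# Results are keyed by (run_id, example_id), as built in _evaluate_in_project
print(handler.logged_eval_results)
```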

View File

@@ -0,0 +1,262 @@
"""A Tracer implementation that records to LangChain endpoint."""
from __future__ import annotations
import logging
import weakref
from concurrent.futures import Future, ThreadPoolExecutor, wait
from datetime import datetime
from typing import Any, Callable, Dict, List, Optional, Union
from uuid import UUID
from langsmith import Client
from langsmith import utils as ls_utils
from tenacity import (
Retrying,
retry_if_exception_type,
stop_after_attempt,
wait_exponential_jitter,
)
from langchain.env import get_runtime_environment
from langchain.load.dump import dumpd
from langchain.schema.callbacks.tracers.base import BaseTracer
from langchain.schema.callbacks.tracers.schemas import Run
from langchain.schema.messages import BaseMessage
logger = logging.getLogger(__name__)
_LOGGED = set()
_TRACERS: weakref.WeakSet[LangChainTracer] = weakref.WeakSet()
_CLIENT: Optional[Client] = None
_EXECUTOR: Optional[ThreadPoolExecutor] = None
def log_error_once(method: str, exception: Exception) -> None:
"""Log an error once."""
global _LOGGED
if (method, type(exception)) in _LOGGED:
return
_LOGGED.add((method, type(exception)))
logger.error(exception)
def wait_for_all_tracers() -> None:
"""Wait for all tracers to finish."""
global _TRACERS
for tracer in list(_TRACERS):
if tracer is not None:
tracer.wait_for_futures()
def get_client() -> Client:
"""Get the client."""
global _CLIENT
if _CLIENT is None:
_CLIENT = Client()
return _CLIENT
def _get_executor() -> ThreadPoolExecutor:
"""Get the executor."""
global _EXECUTOR
if _EXECUTOR is None:
_EXECUTOR = ThreadPoolExecutor()
return _EXECUTOR
def _copy(run: Run) -> Run:
"""Copy a run."""
try:
return run.copy(deep=True)
except TypeError:
# Fallback in case the object contains a lock or other
# non-pickleable object
return run.copy()
class LangChainTracer(BaseTracer):
"""An implementation of the SharedTracer that POSTS to the langchain endpoint."""
def __init__(
self,
example_id: Optional[Union[UUID, str]] = None,
project_name: Optional[str] = None,
client: Optional[Client] = None,
tags: Optional[List[str]] = None,
use_threading: bool = True,
**kwargs: Any,
) -> None:
"""Initialize the LangChain tracer."""
super().__init__(**kwargs)
self.example_id = (
UUID(example_id) if isinstance(example_id, str) else example_id
)
self.project_name = project_name or ls_utils.get_tracer_project()
self.client = client or get_client()
self._futures: weakref.WeakSet[Future] = weakref.WeakSet()
self.tags = tags or []
self.executor = _get_executor() if use_threading else None
self.latest_run: Optional[Run] = None
global _TRACERS
_TRACERS.add(self)
def on_chat_model_start(
self,
serialized: Dict[str, Any],
messages: List[List[BaseMessage]],
*,
run_id: UUID,
tags: Optional[List[str]] = None,
parent_run_id: Optional[UUID] = None,
metadata: Optional[Dict[str, Any]] = None,
name: Optional[str] = None,
**kwargs: Any,
) -> None:
"""Start a trace for an LLM run."""
parent_run_id_ = str(parent_run_id) if parent_run_id else None
execution_order = self._get_execution_order(parent_run_id_)
start_time = datetime.utcnow()
if metadata:
kwargs.update({"metadata": metadata})
chat_model_run = Run(
id=run_id,
parent_run_id=parent_run_id,
serialized=serialized,
inputs={"messages": [[dumpd(msg) for msg in batch] for batch in messages]},
extra=kwargs,
events=[{"name": "start", "time": start_time}],
start_time=start_time,
execution_order=execution_order,
child_execution_order=execution_order,
run_type="llm",
tags=tags,
name=name,
)
self._start_trace(chat_model_run)
self._on_chat_model_start(chat_model_run)
def _persist_run(self, run: Run) -> None:
run_ = run.copy()
run_.reference_example_id = self.example_id
self.latest_run = run_
def get_run_url(self) -> str:
"""Get the LangSmith root run URL"""
if not self.latest_run:
raise ValueError("No traced run found.")
# If this is the first run in a project, the project may not yet be created.
# This method is only really useful for debugging flows, so we will assume
# there is some tolerance for latency.
for attempt in Retrying(
stop=stop_after_attempt(5),
wait=wait_exponential_jitter(),
retry=retry_if_exception_type(ls_utils.LangSmithError),
):
with attempt:
return self.client.get_run_url(
run=self.latest_run, project_name=self.project_name
)
raise ValueError("Failed to get run URL.")
def _get_tags(self, run: Run) -> List[str]:
"""Get combined tags for a run."""
tags = set(run.tags or [])
tags.update(self.tags or [])
return list(tags)
def _persist_run_single(self, run: Run) -> None:
"""Persist a run."""
run_dict = run.dict(exclude={"child_runs"})
run_dict["tags"] = self._get_tags(run)
extra = run_dict.get("extra", {})
extra["runtime"] = get_runtime_environment()
run_dict["extra"] = extra
try:
self.client.create_run(**run_dict, project_name=self.project_name)
except Exception as e:
# Errors are swallowed by the thread executor so we need to log them here
log_error_once("post", e)
raise
def _update_run_single(self, run: Run) -> None:
"""Update a run."""
try:
run_dict = run.dict()
run_dict["tags"] = self._get_tags(run)
self.client.update_run(run.id, **run_dict)
except Exception as e:
# Errors are swallowed by the thread executor so we need to log them here
log_error_once("patch", e)
raise
def _submit(self, function: Callable[[Run], None], run: Run) -> None:
"""Submit a function to the executor."""
if self.executor is None:
function(run)
else:
self._futures.add(self.executor.submit(function, run))
def _on_llm_start(self, run: Run) -> None:
"""Persist an LLM run."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
self._submit(self._persist_run_single, _copy(run))
def _on_chat_model_start(self, run: Run) -> None:
"""Persist an LLM run."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
self._submit(self._persist_run_single, _copy(run))
def _on_llm_end(self, run: Run) -> None:
"""Process the LLM Run."""
self._submit(self._update_run_single, _copy(run))
def _on_llm_error(self, run: Run) -> None:
"""Process the LLM Run upon error."""
self._submit(self._update_run_single, _copy(run))
def _on_chain_start(self, run: Run) -> None:
"""Process the Chain Run upon start."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
self._submit(self._persist_run_single, _copy(run))
def _on_chain_end(self, run: Run) -> None:
"""Process the Chain Run."""
self._submit(self._update_run_single, _copy(run))
def _on_chain_error(self, run: Run) -> None:
"""Process the Chain Run upon error."""
self._submit(self._update_run_single, _copy(run))
def _on_tool_start(self, run: Run) -> None:
"""Process the Tool Run upon start."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
self._submit(self._persist_run_single, _copy(run))
def _on_tool_end(self, run: Run) -> None:
"""Process the Tool Run."""
self._submit(self._update_run_single, _copy(run))
def _on_tool_error(self, run: Run) -> None:
"""Process the Tool Run upon error."""
self._submit(self._update_run_single, _copy(run))
def _on_retriever_start(self, run: Run) -> None:
"""Process the Retriever Run upon start."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
self._submit(self._persist_run_single, _copy(run))
def _on_retriever_end(self, run: Run) -> None:
"""Process the Retriever Run."""
self._submit(self._update_run_single, _copy(run))
def _on_retriever_error(self, run: Run) -> None:
"""Process the Retriever Run upon error."""
self._submit(self._update_run_single, _copy(run))
def wait_for_futures(self) -> None:
"""Wait for the given futures to complete."""
wait(self._futures)
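A minimal sketch of explicit use (the import path matches this diff's own imports; `chain` is assumed): runs are uploaded on a background executor, so flush before the process exits.

```python
from langchain.schema.callbacks.tracers.langchain import (
    LangChainTracer,
    wait_for_all_tracers,
)

tracer = LangChainTracer(project_name="my-project")
chain.invoke({"input": "..."}, config={"callbacks": [tracer]})
wait_for_all_tracers()       # flush queued create_run/update_run calls
print(tracer.get_run_url())  # retried while the project may still be creating
```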

View File

@@ -0,0 +1,185 @@
from __future__ import annotations
import logging
import os
from typing import Any, Dict, Optional, Union
import requests
from langchain.schema.callbacks.tracers.base import BaseTracer
from langchain.schema.callbacks.tracers.schemas import (
ChainRun,
LLMRun,
Run,
ToolRun,
TracerSession,
TracerSessionV1,
TracerSessionV1Base,
)
from langchain.schema.messages import get_buffer_string
from langchain.utils import raise_for_status_with_text
logger = logging.getLogger(__name__)
def get_headers() -> Dict[str, Any]:
"""Get the headers for the LangChain API."""
headers: Dict[str, Any] = {"Content-Type": "application/json"}
if os.getenv("LANGCHAIN_API_KEY"):
headers["x-api-key"] = os.getenv("LANGCHAIN_API_KEY")
return headers
def _get_endpoint() -> str:
return os.getenv("LANGCHAIN_ENDPOINT", "http://localhost:8000")
class LangChainTracerV1(BaseTracer):
"""An implementation of the SharedTracer that POSTS to the langchain endpoint."""
def __init__(self, **kwargs: Any) -> None:
"""Initialize the LangChain tracer."""
super().__init__(**kwargs)
self.session: Optional[TracerSessionV1] = None
self._endpoint = _get_endpoint()
self._headers = get_headers()
def _convert_to_v1_run(self, run: Run) -> Union[LLMRun, ChainRun, ToolRun]:
session = self.session or self.load_default_session()
if not isinstance(session, TracerSessionV1):
raise ValueError(
"LangChainTracerV1 is not compatible with"
f" session of type {type(session)}"
)
if run.run_type == "llm":
if "prompts" in run.inputs:
prompts = run.inputs["prompts"]
elif "messages" in run.inputs:
prompts = [get_buffer_string(batch) for batch in run.inputs["messages"]]
else:
raise ValueError("No prompts found in LLM run inputs")
return LLMRun(
uuid=str(run.id) if run.id else None,
parent_uuid=str(run.parent_run_id) if run.parent_run_id else None,
start_time=run.start_time,
end_time=run.end_time,
extra=run.extra,
execution_order=run.execution_order,
child_execution_order=run.child_execution_order,
serialized=run.serialized,
session_id=session.id,
error=run.error,
prompts=prompts,
response=run.outputs if run.outputs else None,
)
if run.run_type == "chain":
child_runs = [self._convert_to_v1_run(run) for run in run.child_runs]
return ChainRun(
uuid=str(run.id) if run.id else None,
parent_uuid=str(run.parent_run_id) if run.parent_run_id else None,
start_time=run.start_time,
end_time=run.end_time,
execution_order=run.execution_order,
child_execution_order=run.child_execution_order,
serialized=run.serialized,
session_id=session.id,
inputs=run.inputs,
outputs=run.outputs,
error=run.error,
extra=run.extra,
child_llm_runs=[run for run in child_runs if isinstance(run, LLMRun)],
child_chain_runs=[
run for run in child_runs if isinstance(run, ChainRun)
],
child_tool_runs=[run for run in child_runs if isinstance(run, ToolRun)],
)
if run.run_type == "tool":
child_runs = [self._convert_to_v1_run(run) for run in run.child_runs]
return ToolRun(
uuid=str(run.id) if run.id else None,
parent_uuid=str(run.parent_run_id) if run.parent_run_id else None,
start_time=run.start_time,
end_time=run.end_time,
execution_order=run.execution_order,
child_execution_order=run.child_execution_order,
serialized=run.serialized,
session_id=session.id,
action=str(run.serialized),
tool_input=run.inputs.get("input", ""),
output=None if run.outputs is None else run.outputs.get("output"),
error=run.error,
extra=run.extra,
child_chain_runs=[
run for run in child_runs if isinstance(run, ChainRun)
],
child_tool_runs=[run for run in child_runs if isinstance(run, ToolRun)],
child_llm_runs=[run for run in child_runs if isinstance(run, LLMRun)],
)
raise ValueError(f"Unknown run type: {run.run_type}")
def _persist_run(self, run: Union[Run, LLMRun, ChainRun, ToolRun]) -> None:
"""Persist a run."""
if isinstance(run, Run):
v1_run = self._convert_to_v1_run(run)
else:
v1_run = run
if isinstance(v1_run, LLMRun):
endpoint = f"{self._endpoint}/llm-runs"
elif isinstance(v1_run, ChainRun):
endpoint = f"{self._endpoint}/chain-runs"
else:
endpoint = f"{self._endpoint}/tool-runs"
try:
response = requests.post(
endpoint,
data=v1_run.json(),
headers=self._headers,
)
raise_for_status_with_text(response)
except Exception as e:
logger.warning(f"Failed to persist run: {e}")
def _persist_session(
self, session_create: TracerSessionV1Base
) -> Union[TracerSessionV1, TracerSession]:
"""Persist a session."""
try:
r = requests.post(
f"{self._endpoint}/sessions",
data=session_create.json(),
headers=self._headers,
)
session = TracerSessionV1(id=r.json()["id"], **session_create.dict())
except Exception as e:
logger.warning(f"Failed to create session, using default session: {e}")
session = TracerSessionV1(id=1, **session_create.dict())
return session
def _load_session(self, session_name: Optional[str] = None) -> TracerSessionV1:
"""Load a session from the tracer."""
try:
url = f"{self._endpoint}/sessions"
if session_name:
url += f"?name={session_name}"
r = requests.get(url, headers=self._headers)
tracer_session = TracerSessionV1(**r.json()[0])
except Exception as e:
session_type = "default" if not session_name else session_name
logger.warning(
f"Failed to load {session_type} session, using empty session: {e}"
)
tracer_session = TracerSessionV1(id=1)
self.session = tracer_session
return tracer_session
def load_session(self, session_name: str) -> Union[TracerSessionV1, TracerSession]:
"""Load a session with the given name from the tracer."""
return self._load_session(session_name)
def load_default_session(self) -> Union[TracerSessionV1, TracerSession]:
"""Load the default tracing session and set it as the Tracer's session."""
return self._load_session("default")
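A short sketch grounded in `get_headers` and `_get_endpoint` above: the V1 tracer is configured entirely through environment variables.

```python
import os

from langchain.schema.callbacks.tracers.langchain_v1 import LangChainTracerV1

os.environ["LANGCHAIN_ENDPOINT"] = "http://localhost:8000"  # default shown above
os.environ["LANGCHAIN_API_KEY"] = "..."  # sent as the x-api-key header when set

tracer = LangChainTracerV1()
session = tracer.load_default_session()  # loads "default" and stores it on the tracer
```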

View File

@@ -0,0 +1,311 @@
from __future__ import annotations
import math
import threading
from collections import defaultdict
from typing import (
Any,
AsyncIterator,
Dict,
List,
Optional,
Sequence,
TypedDict,
Union,
)
from uuid import UUID
import jsonpatch
from anyio import create_memory_object_stream
from langchain.load.load import load
from langchain.schema.callbacks.tracers.base import BaseTracer
from langchain.schema.callbacks.tracers.schemas import Run
from langchain.schema.output import ChatGenerationChunk, GenerationChunk
class LogEntry(TypedDict):
"""A single entry in the run log."""
id: str
"""ID of the sub-run."""
name: str
"""Name of the object being run."""
type: str
"""Type of the object being run, eg. prompt, chain, llm, etc."""
tags: List[str]
"""List of tags for the run."""
metadata: Dict[str, Any]
"""Key-value pairs of metadata for the run."""
start_time: str
"""ISO-8601 timestamp of when the run started."""
streamed_output_str: List[str]
"""List of LLM tokens streamed by this run, if applicable."""
final_output: Optional[Any]
"""Final output of this run.
Only available after the run has finished successfully."""
end_time: Optional[str]
"""ISO-8601 timestamp of when the run ended.
Only available after the run has finished."""
class RunState(TypedDict):
"""State of the run."""
id: str
"""ID of the run."""
streamed_output: List[Any]
"""List of output chunks streamed by Runnable.stream()"""
final_output: Optional[Any]
"""Final output of the run, usually the result of aggregating (`+`) streamed_output.
Only available after the run has finished successfully."""
logs: Dict[str, LogEntry]
"""Map of run names to sub-runs. If filters were supplied, this list will
contain only the runs that matched the filters."""
class RunLogPatch:
"""A patch to the run log."""
ops: List[Dict[str, Any]]
"""List of jsonpatch operations, which describe how to create the run state
from an empty dict. This is the minimal representation of the log, designed to
be serialized as JSON and sent over the wire to reconstruct the log on the other
side. Reconstruction of the state can be done with any jsonpatch-compliant library,
see https://jsonpatch.com for more information."""
def __init__(self, *ops: Dict[str, Any]) -> None:
self.ops = list(ops)
def __add__(self, other: Union[RunLogPatch, Any]) -> RunLog:
if type(other) == RunLogPatch:
ops = self.ops + other.ops
state = jsonpatch.apply_patch(None, ops)
return RunLog(*ops, state=state)
raise TypeError(
f"unsupported operand type(s) for +: '{type(self)}' and '{type(other)}'"
)
def __repr__(self) -> str:
from pprint import pformat
# 1:-1 to get rid of the [] around the list
return f"RunLogPatch({pformat(self.ops)[1:-1]})"
def __eq__(self, other: object) -> bool:
return isinstance(other, RunLogPatch) and self.ops == other.ops
class RunLog(RunLogPatch):
"""A run log."""
state: RunState
"""Current state of the log, obtained from applying all ops in sequence."""
def __init__(self, *ops: Dict[str, Any], state: RunState) -> None:
super().__init__(*ops)
self.state = state
def __add__(self, other: Union[RunLogPatch, Any]) -> RunLog:
if type(other) == RunLogPatch:
ops = self.ops + other.ops
state = jsonpatch.apply_patch(self.state, other.ops)
return RunLog(*ops, state=state)
raise TypeError(
f"unsupported operand type(s) for +: '{type(self)}' and '{type(other)}'"
)
def __repr__(self) -> str:
from pprint import pformat
return f"RunLog({pformat(self.state)})"
class LogStreamCallbackHandler(BaseTracer):
"""A tracer that streams run logs to a stream."""
def __init__(
self,
*,
auto_close: bool = True,
include_names: Optional[Sequence[str]] = None,
include_types: Optional[Sequence[str]] = None,
include_tags: Optional[Sequence[str]] = None,
exclude_names: Optional[Sequence[str]] = None,
exclude_types: Optional[Sequence[str]] = None,
exclude_tags: Optional[Sequence[str]] = None,
) -> None:
super().__init__()
self.auto_close = auto_close
self.include_names = include_names
self.include_types = include_types
self.include_tags = include_tags
self.exclude_names = exclude_names
self.exclude_types = exclude_types
self.exclude_tags = exclude_tags
send_stream, receive_stream = create_memory_object_stream(
math.inf, item_type=RunLogPatch
)
self.lock = threading.Lock()
self.send_stream = send_stream
self.receive_stream = receive_stream
self._key_map_by_run_id: Dict[UUID, str] = {}
self._counter_map_by_name: Dict[str, int] = defaultdict(int)
self.root_id: Optional[UUID] = None
def __aiter__(self) -> AsyncIterator[RunLogPatch]:
return self.receive_stream.__aiter__()
def include_run(self, run: Run) -> bool:
if run.id == self.root_id:
return False
run_tags = run.tags or []
if (
self.include_names is None
and self.include_types is None
and self.include_tags is None
):
include = True
else:
include = False
if self.include_names is not None:
include = include or run.name in self.include_names
if self.include_types is not None:
include = include or run.run_type in self.include_types
if self.include_tags is not None:
include = include or any(tag in self.include_tags for tag in run_tags)
if self.exclude_names is not None:
include = include and run.name not in self.exclude_names
if self.exclude_types is not None:
include = include and run.run_type not in self.exclude_types
if self.exclude_tags is not None:
include = include and all(tag not in self.exclude_tags for tag in run_tags)
return include
def _persist_run(self, run: Run) -> None:
# This is a legacy method called only once for an entire run tree,
# and is therefore not useful here.
pass
def _on_run_create(self, run: Run) -> None:
"""Start a run."""
if self.root_id is None:
self.root_id = run.id
self.send_stream.send_nowait(
RunLogPatch(
{
"op": "replace",
"path": "",
"value": RunState(
id=str(run.id),
streamed_output=[],
final_output=None,
logs={},
),
}
)
)
if not self.include_run(run):
return
# Determine previous index, increment by 1
with self.lock:
self._counter_map_by_name[run.name] += 1
count = self._counter_map_by_name[run.name]
self._key_map_by_run_id[run.id] = (
run.name if count == 1 else f"{run.name}:{count}"
)
# Add the run to the stream
self.send_stream.send_nowait(
RunLogPatch(
{
"op": "add",
"path": f"/logs/{self._key_map_by_run_id[run.id]}",
"value": LogEntry(
id=str(run.id),
name=run.name,
type=run.run_type,
tags=run.tags or [],
metadata=(run.extra or {}).get("metadata", {}),
start_time=run.start_time.isoformat(timespec="milliseconds"),
streamed_output_str=[],
final_output=None,
end_time=None,
),
}
)
)
def _on_run_update(self, run: Run) -> None:
"""Finish a run."""
try:
index = self._key_map_by_run_id.get(run.id)
if index is None:
return
self.send_stream.send_nowait(
RunLogPatch(
{
"op": "add",
"path": f"/logs/{index}/final_output",
# to undo the dumpd done by some runnables / tracer / etc
"value": load(run.outputs),
},
{
"op": "add",
"path": f"/logs/{index}/end_time",
"value": run.end_time.isoformat(timespec="milliseconds")
if run.end_time is not None
else None,
},
)
)
finally:
if run.id == self.root_id:
self.send_stream.send_nowait(
RunLogPatch(
{
"op": "replace",
"path": "/final_output",
"value": load(run.outputs),
}
)
)
if self.auto_close:
self.send_stream.close()
def _on_llm_new_token(
self,
run: Run,
token: str,
chunk: Optional[Union[GenerationChunk, ChatGenerationChunk]],
) -> None:
"""Process new LLM token."""
index = self._key_map_by_run_id.get(run.id)
if index is None:
return
self.send_stream.send_nowait(
RunLogPatch(
{
"op": "add",
"path": f"/logs/{index}/streamed_output_str/-",
"value": token,
}
)
)
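As the `RunLogPatch` docstring notes, state is rebuilt by applying the jsonpatch ops in order; summing patches does exactly that. A small self-contained sketch:

```python
from langchain.schema.callbacks.tracers.log_stream import RunLogPatch

first = RunLogPatch(
    {
        "op": "replace",
        "path": "",
        "value": {"id": "1", "streamed_output": [], "final_output": None, "logs": {}},
    }
)
second = RunLogPatch({"op": "add", "path": "/streamed_output/-", "value": "Hello"})

log = first + second  # RunLogPatch + RunLogPatch -> RunLog
print(log.state["streamed_output"])  # ['Hello']
```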

View File

@@ -0,0 +1,54 @@
from typing import Callable, Optional, Union
from uuid import UUID
from langchain.schema.callbacks.tracers.base import BaseTracer
from langchain.schema.callbacks.tracers.schemas import Run
from langchain.schema.runnable.config import (
RunnableConfig,
call_func_with_variable_args,
)
Listener = Union[Callable[[Run], None], Callable[[Run, RunnableConfig], None]]
class RootListenersTracer(BaseTracer):
def __init__(
self,
*,
config: RunnableConfig,
on_start: Optional[Listener],
on_end: Optional[Listener],
on_error: Optional[Listener],
) -> None:
super().__init__()
self.config = config
self._arg_on_start = on_start
self._arg_on_end = on_end
self._arg_on_error = on_error
self.root_id: Optional[UUID] = None
def _persist_run(self, run: Run) -> None:
# This is a legacy method called only once for an entire run tree,
# and is therefore not useful here.
pass
def _on_run_create(self, run: Run) -> None:
if self.root_id is not None:
return
self.root_id = run.id
if self._arg_on_start is not None:
call_func_with_variable_args(self._arg_on_start, run, self.config)
def _on_run_update(self, run: Run) -> None:
if run.id != self.root_id:
return
if run.error is None:
if self._arg_on_end is not None:
call_func_with_variable_args(self._arg_on_end, run, self.config)
else:
if self._arg_on_error is not None:
call_func_with_variable_args(self._arg_on_error, run, self.config)
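This tracer backs `Runnable.with_listeners` (note the `Listener` import added to the runnable base in a later diff). A hedged sketch, with `my_runnable` assumed:

```python
from langchain.schema.callbacks.tracers.schemas import Run


def announce_end(run: Run) -> None:
    # Fires for the root run only, per _on_run_update above.
    print(f"root run {run.id} finished; error={run.error}")


observed = my_runnable.with_listeners(on_end=announce_end)
observed.invoke("hello")
```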

View File

@@ -0,0 +1,52 @@
"""A tracer that collects all nested runs in a list."""
from typing import Any, List, Optional, Union
from uuid import UUID
from langchain.schema.callbacks.tracers.base import BaseTracer
from langchain.schema.callbacks.tracers.schemas import Run
class RunCollectorCallbackHandler(BaseTracer):
"""
A tracer that collects all nested runs in a list.
This tracer is useful for inspection and evaluation purposes.
Parameters
----------
example_id : Optional[Union[UUID, str]], default=None
The ID of the example being traced. It can be either a UUID or a string.
"""
name: str = "run-collector_callback_handler"
def __init__(
self, example_id: Optional[Union[UUID, str]] = None, **kwargs: Any
) -> None:
"""
Initialize the RunCollectorCallbackHandler.
Parameters
----------
example_id : Optional[Union[UUID, str]], default=None
The ID of the example being traced. It can be either a UUID or a string.
"""
super().__init__(**kwargs)
self.example_id = (
UUID(example_id) if isinstance(example_id, str) else example_id
)
self.traced_runs: List[Run] = []
def _persist_run(self, run: Run) -> None:
"""
Persist a run by adding it to the traced_runs list.
Parameters
----------
run : Run
The run to be persisted.
"""
run_ = run.copy()
run_.reference_example_id = self.example_id
self.traced_runs.append(run_)
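A minimal usage sketch (module path assumed; `chain` hypothetical): collect the run trees produced by an invocation for later inspection.

```python
from langchain.schema.callbacks.tracers.run_collector import (
    RunCollectorCallbackHandler,
)

collector = RunCollectorCallbackHandler()
chain.invoke({"input": "..."}, config={"callbacks": [collector]})
for run in collector.traced_runs:
    print(run.run_type, run.name, run.error or "ok")
```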

View File

@@ -0,0 +1,140 @@
"""Schemas for tracers."""
from __future__ import annotations
import datetime
import warnings
from typing import Any, Dict, List, Optional, Type
from uuid import UUID
from langsmith.schemas import RunBase as BaseRunV2
from langsmith.schemas import RunTypeEnum as RunTypeEnumDep
from langchain.pydantic_v1 import BaseModel, Field, root_validator
from langchain.schema import LLMResult
def RunTypeEnum() -> Type[RunTypeEnumDep]:
"""RunTypeEnum."""
warnings.warn(
"RunTypeEnum is deprecated. Please directly use a string instead"
" (e.g. 'llm', 'chain', 'tool').",
DeprecationWarning,
)
return RunTypeEnumDep
class TracerSessionV1Base(BaseModel):
"""Base class for TracerSessionV1."""
start_time: datetime.datetime = Field(default_factory=datetime.datetime.utcnow)
name: Optional[str] = None
extra: Optional[Dict[str, Any]] = None
class TracerSessionV1Create(TracerSessionV1Base):
"""Create class for TracerSessionV1."""
class TracerSessionV1(TracerSessionV1Base):
"""TracerSessionV1 schema."""
id: int
class TracerSessionBase(TracerSessionV1Base):
"""Base class for TracerSession."""
tenant_id: UUID
class TracerSession(TracerSessionBase):
"""TracerSessionV1 schema for the V2 API."""
id: UUID
class BaseRun(BaseModel):
"""Base class for Run."""
uuid: str
parent_uuid: Optional[str] = None
start_time: datetime.datetime = Field(default_factory=datetime.datetime.utcnow)
end_time: datetime.datetime = Field(default_factory=datetime.datetime.utcnow)
extra: Optional[Dict[str, Any]] = None
execution_order: int
child_execution_order: int
serialized: Dict[str, Any]
session_id: int
error: Optional[str] = None
class LLMRun(BaseRun):
"""Class for LLMRun."""
prompts: List[str]
response: Optional[LLMResult] = None
class ChainRun(BaseRun):
"""Class for ChainRun."""
inputs: Dict[str, Any]
outputs: Optional[Dict[str, Any]] = None
child_llm_runs: List[LLMRun] = Field(default_factory=list)
child_chain_runs: List[ChainRun] = Field(default_factory=list)
child_tool_runs: List[ToolRun] = Field(default_factory=list)
class ToolRun(BaseRun):
"""Class for ToolRun."""
tool_input: str
output: Optional[str] = None
action: str
child_llm_runs: List[LLMRun] = Field(default_factory=list)
child_chain_runs: List[ChainRun] = Field(default_factory=list)
child_tool_runs: List[ToolRun] = Field(default_factory=list)
# Begin V2 API Schemas
class Run(BaseRunV2):
"""Run schema for the V2 API in the Tracer."""
execution_order: int
child_execution_order: int
child_runs: List[Run] = Field(default_factory=list)
tags: Optional[List[str]] = Field(default_factory=list)
events: List[Dict[str, Any]] = Field(default_factory=list)
@root_validator(pre=True)
def assign_name(cls, values: dict) -> dict:
"""Assign name to the run."""
if values.get("name") is None:
if "name" in values["serialized"]:
values["name"] = values["serialized"]["name"]
elif "id" in values["serialized"]:
values["name"] = values["serialized"]["id"][-1]
if values.get("events") is None:
values["events"] = []
return values
ChainRun.update_forward_refs()
ToolRun.update_forward_refs()
Run.update_forward_refs()
__all__ = [
"BaseRun",
"ChainRun",
"LLMRun",
"Run",
"RunTypeEnum",
"ToolRun",
"TracerSession",
"TracerSessionBase",
"TracerSessionV1",
"TracerSessionV1Base",
"TracerSessionV1Create",
]

View File

@@ -0,0 +1,178 @@
import json
from typing import Any, Callable, List
from langchain.schema.callbacks.tracers.base import BaseTracer
from langchain.schema.callbacks.tracers.schemas import Run
from langchain.utils.input import get_bolded_text, get_colored_text
def try_json_stringify(obj: Any, fallback: str) -> str:
"""
Try to stringify an object to JSON.
Args:
obj: Object to stringify.
fallback: Fallback string to return if the object cannot be stringified.
Returns:
A JSON string if the object can be stringified, otherwise the fallback string.
"""
try:
return json.dumps(obj, indent=2, ensure_ascii=False)
except Exception:
return fallback
def elapsed(run: Any) -> str:
"""Get the elapsed time of a run.
Args:
run: any object with a start_time and end_time attribute.
Returns:
A string with the elapsed time in seconds or
milliseconds if time is less than a second.
"""
elapsed_time = run.end_time - run.start_time
milliseconds = elapsed_time.total_seconds() * 1000
if milliseconds < 1000:
return f"{milliseconds:.0f}ms"
return f"{(milliseconds / 1000):.2f}s"
class FunctionCallbackHandler(BaseTracer):
"""Tracer that calls a function with a single str parameter."""
name: str = "function_callback_handler"
def __init__(self, function: Callable[[str], None], **kwargs: Any) -> None:
super().__init__(**kwargs)
self.function_callback = function
def _persist_run(self, run: Run) -> None:
pass
def get_parents(self, run: Run) -> List[Run]:
parents = []
current_run = run
while current_run.parent_run_id:
parent = self.run_map.get(str(current_run.parent_run_id))
if parent:
parents.append(parent)
current_run = parent
else:
break
return parents
def get_breadcrumbs(self, run: Run) -> str:
parents = self.get_parents(run)[::-1]
return " > ".join(
f"{parent.execution_order}:{parent.run_type}:{parent.name}"
for parent in parents + [run]
)
# logging methods
def _on_chain_start(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
run_type = run.run_type.capitalize()
self.function_callback(
f"{get_colored_text('[chain/start]', color='green')} "
+ get_bolded_text(f"[{crumbs}] Entering {run_type} run with input:\n")
+ f"{try_json_stringify(run.inputs, '[inputs]')}"
)
def _on_chain_end(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
run_type = run.run_type.capitalize()
self.function_callback(
f"{get_colored_text('[chain/end]', color='blue')} "
+ get_bolded_text(
f"[{crumbs}] [{elapsed(run)}] Exiting {run_type} run with output:\n"
)
+ f"{try_json_stringify(run.outputs, '[outputs]')}"
)
def _on_chain_error(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
run_type = run.run_type.capitalize()
self.function_callback(
f"{get_colored_text('[chain/error]', color='red')} "
+ get_bolded_text(
f"[{crumbs}] [{elapsed(run)}] {run_type} run errored with error:\n"
)
+ f"{try_json_stringify(run.error, '[error]')}"
)
def _on_llm_start(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
inputs = (
{"prompts": [p.strip() for p in run.inputs["prompts"]]}
if "prompts" in run.inputs
else run.inputs
)
self.function_callback(
f"{get_colored_text('[llm/start]', color='green')} "
+ get_bolded_text(f"[{crumbs}] Entering LLM run with input:\n")
+ f"{try_json_stringify(inputs, '[inputs]')}"
)
def _on_llm_end(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
self.function_callback(
f"{get_colored_text('[llm/end]', color='blue')} "
+ get_bolded_text(
f"[{crumbs}] [{elapsed(run)}] Exiting LLM run with output:\n"
)
+ f"{try_json_stringify(run.outputs, '[response]')}"
)
def _on_llm_error(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
self.function_callback(
f"{get_colored_text('[llm/error]', color='red')} "
+ get_bolded_text(
f"[{crumbs}] [{elapsed(run)}] LLM run errored with error:\n"
)
+ f"{try_json_stringify(run.error, '[error]')}"
)
def _on_tool_start(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
self.function_callback(
f'{get_colored_text("[tool/start]", color="green")} '
+ get_bolded_text(f"[{crumbs}] Entering Tool run with input:\n")
+ f'"{run.inputs["input"].strip()}"'
)
def _on_tool_end(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
if run.outputs:
self.function_callback(
f'{get_colored_text("[tool/end]", color="blue")} '
+ get_bolded_text(
f"[{crumbs}] [{elapsed(run)}] Exiting Tool run with output:\n"
)
+ f'"{run.outputs["output"].strip()}"'
)
def _on_tool_error(self, run: Run) -> None:
crumbs = self.get_breadcrumbs(run)
self.function_callback(
f"{get_colored_text('[tool/error]', color='red')} "
+ get_bolded_text(f"[{crumbs}] [{elapsed(run)}] ")
+ f"Tool run errored with error:\n"
f"{run.error}"
)
class ConsoleCallbackHandler(FunctionCallbackHandler):
"""Tracer that prints to the console."""
name: str = "console_callback_handler"
def __init__(self, **kwargs: Any) -> None:
super().__init__(function=print, **kwargs)
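A one-line sketch (module path assumed; `chain` hypothetical): pass the console tracer as a callback to get the colored `[chain/start]` / `[chain/end]` output produced by the methods above.

```python
from langchain.schema.callbacks.tracers.stdout import ConsoleCallbackHandler

chain.invoke({"input": "..."}, config={"callbacks": [ConsoleCallbackHandler()]})
```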

View File

@@ -32,12 +32,12 @@ from typing import (
from typing_extensions import Literal, get_args
if TYPE_CHECKING:
from langchain.callbacks.manager import (
from langchain.schema.callbacks.manager import (
AsyncCallbackManagerForChainRun,
CallbackManagerForChainRun,
)
from langchain.callbacks.tracers.log_stream import RunLog, RunLogPatch
from langchain.callbacks.tracers.root_listeners import Listener
from langchain.schema.callbacks.tracers.log_stream import RunLog, RunLogPatch
from langchain.schema.callbacks.tracers.root_listeners import Listener
from langchain.schema.runnable.fallbacks import (
RunnableWithFallbacks as RunnableWithFallbacksT,
)

View File

@@ -647,7 +647,7 @@ class Tokenizer:
"""Overlap in tokens between chunks"""
tokens_per_chunk: int
"""Maximum number of tokens per chunk"""
decode: Callable[[list[int]], str]
decode: Callable[[List[int]], str]
""" Function to decode a list of token ids to a string"""
encode: Callable[[str], List[int]]
""" Function to encode a string to a list of token ids"""

View File

@@ -34,7 +34,7 @@ DocDict = Dict[str, Any] # dicts expressing entries to insert
# (20 is the max batch size for the HTTP API at the time of writing)
DEFAULT_BATCH_SIZE = 20
# Number of threads to insert batches concurrently:
DEFAULT_BULK_INSERT_BATCH_CONCURRENCY = 5
DEFAULT_BULK_INSERT_BATCH_CONCURRENCY = 16
# Number of threads in a batch to insert pre-existing entries:
DEFAULT_BULK_INSERT_OVERWRITE_CONCURRENCY = 10
# Number of threads (for deleting multiple rows concurrently):
@@ -139,6 +139,20 @@ class AstraDB(VectorStore):
threads in a batch to insert pre-existing entries.
bulk_delete_concurrency (Optional[int]): Number of threads
(for deleting multiple rows concurrently).
A note on concurrency: as a rule of thumb, on a typical client machine
keep the product
bulk_insert_batch_concurrency * bulk_insert_overwrite_concurrency
well below 1000 to avoid exhausting the client's threading and networking
resources. The hardcoded defaults are somewhat conservative to suit
most machines' specs, but a sensible choice to test may be:
bulk_insert_batch_concurrency = 80
bulk_insert_overwrite_concurrency = 10
A bit of experimentation is required to nail the best results here,
depending on both the machine/network specs and the expected workload
(specifically, how often a write is an update of an existing id).
Remember you can pass concurrency settings to individual calls to
add_texts and add_documents as well.
"""
# Conflicting-arg checks:
@@ -330,6 +344,12 @@ class AstraDB(VectorStore):
pre-existing documents in each batch (which require individual
API calls). Defaults to instance-level setting if not provided.
A note on metadata: there are constraints on the allowed field names
in this dictionary, coming from the underlying Astra DB API.
For instance, the `$` (dollar sign) cannot be used in the dict keys.
See this document for details:
docs.datastax.com/en/astra-serverless/docs/develop/dev-with-json.html
Returns:
List[str]: List of ids of the added texts.
"""

View File

@@ -102,13 +102,18 @@ class MongoDBAtlasVectorSearch(VectorStore):
"""
try:
from pymongo import MongoClient
from importlib.metadata import version
from pymongo import DriverInfo, MongoClient
except ImportError:
raise ImportError(
"Could not import pymongo, please install it with "
"`pip install pymongo`."
)
client: MongoClient = MongoClient(connection_string)
client: MongoClient = MongoClient(
connection_string,
driver=DriverInfo(name="Langchain", version=version("langchain")),
)
db_name, collection_name = namespace.split(".")
collection = client[db_name][collection_name]
return cls(collection, embedding, **kwargs)

View File

@@ -10,7 +10,7 @@ git grep '^from langchain import' langchain | grep -vE 'from langchain import (_
git grep '^from langchain ' langchain/pydantic_v1 | grep -vE 'from langchain.(pydantic_v1)' && errors=$((errors+1))
git grep '^from langchain' langchain/load | grep -vE 'from langchain.(pydantic_v1|load)' && errors=$((errors+1))
git grep '^from langchain' langchain/utils | grep -vE 'from langchain.(pydantic_v1|utils)' && errors=$((errors+1))
git grep '^from langchain' langchain/schema | grep -vE 'from langchain.(pydantic_v1|utils|schema|load)' && errors=$((errors+1))
git grep '^from langchain' langchain/schema | grep -vE 'from langchain.(pydantic_v1|utils|schema|load|env)' && errors=$((errors+1))
git grep '^from langchain' langchain/adapters | grep -vE 'from langchain.(pydantic_v1|utils|schema|load)' && errors=$((errors+1))
git grep '^from langchain' langchain/callbacks | grep -vE 'from langchain.(pydantic_v1|utils|schema|load|callbacks|env)' && errors=$((errors+1))
# TODO: it's probably not amazing so that so many other modules depend on `langchain.utilities`, because there can be a lot of imports there

View File

@@ -46,6 +46,22 @@ def test_func_call() -> None:
assert result.message_log == [msg]
# Test: Model response with a function call for a function taking no arguments
def test_func_call_no_args() -> None:
parser = OpenAIFunctionsAgentOutputParser()
msg = AIMessage(
content="LLM thoughts.",
additional_kwargs={"function_call": {"name": "foo", "arguments": ""}},
)
result = parser.invoke(msg)
assert isinstance(result, AgentActionMessageLog)
assert result.tool == "foo"
assert result.tool_input == {}
assert result.log == ("\nInvoking: `foo` with `{}`\nresponded: LLM thoughts.\n\n")
assert result.message_log == [msg]
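The new test above pins the behavior for empty arguments. A minimal sketch of the parsing rule it implies (a hypothetical helper, not the parser's actual code):

```python
import json


def parse_function_arguments(raw: str) -> dict:
    # OpenAI may return "" rather than "{}" for a no-argument function call,
    # and json.loads("") raises, so treat an empty string as an empty object.
    return json.loads(raw) if raw.strip() else {}


assert parse_function_arguments("") == {}
assert parse_function_arguments('{"x": 1}') == {"x": 1}
```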
# Test: Model response with a function call (old style tools).
def test_func_call_oldstyle() -> None:
parser = OpenAIFunctionsAgentOutputParser()

View File

@@ -10,15 +10,15 @@ from freezegun import freeze_time
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.tracers.base import BaseTracer, TracerException
from langchain.callbacks.tracers.langchain_v1 import (
from langchain.callbacks.tracers.schemas import Run, TracerSessionV1Base
from langchain.schema import LLMResult
from langchain.schema.callbacks.tracers.langchain_v1 import (
ChainRun,
LangChainTracerV1,
LLMRun,
ToolRun,
TracerSessionV1,
)
from langchain.callbacks.tracers.schemas import Run, TracerSessionV1Base
from langchain.schema import LLMResult
from langchain.schema.messages import HumanMessage
TEST_SESSION_ID = 2023

View File

@@ -3,7 +3,7 @@ from langchain.globals import get_debug, get_verbose, set_debug, set_verbose
def test_debug_is_settable_directly() -> None:
import langchain
from langchain.callbacks.manager import _get_debug
from langchain.schema.callbacks.manager import _get_debug
previous_value = langchain.debug
previous_fn_reading = _get_debug()
@@ -33,7 +33,7 @@ def test_debug_is_settable_directly() -> None:
def test_debug_is_settable_via_setter() -> None:
from langchain import globals
from langchain.callbacks.manager import _get_debug
from langchain.schema.callbacks.manager import _get_debug
previous_value = globals._debug
previous_fn_reading = _get_debug()

View File

@@ -322,6 +322,17 @@ files = [
marshmallow = ">=3.18.0,<4.0.0"
typing-inspect = ">=0.4.0,<1"
[[package]]
name = "distro"
version = "1.8.0"
description = "Distro - an OS platform information API"
optional = false
python-versions = ">=3.6"
files = [
{file = "distro-1.8.0-py3-none-any.whl", hash = "sha256:99522ca3e365cac527b44bde033f64c6945d90eb9f769703caaec52b09bbd3ff"},
{file = "distro-1.8.0.tar.gz", hash = "sha256:02e111d1dc6a50abb8eed6bf31c3e48ed8b0830d1ea2a1b78c61765c2513fdd8"},
]
[[package]]
name = "exceptiongroup"
version = "1.1.3"
@@ -915,25 +926,25 @@ files = [
[[package]]
name = "openai"
version = "0.28.1"
description = "Python client library for the OpenAI API"
version = "1.3.2"
description = "The official Python library for the openai API"
optional = false
python-versions = ">=3.7.1"
files = [
{file = "openai-0.28.1-py3-none-any.whl", hash = "sha256:d18690f9e3d31eedb66b57b88c2165d760b24ea0a01f150dd3f068155088ce68"},
{file = "openai-0.28.1.tar.gz", hash = "sha256:4be1dad329a65b4ce1a660fe6d5431b438f429b5855c883435f0f7fcb6d2dcc8"},
{file = "openai-1.3.2-py3-none-any.whl", hash = "sha256:97e2febbedc5f1308444d961df63aafb649efebf900d59dd3676fdede9bcd7b6"},
{file = "openai-1.3.2.tar.gz", hash = "sha256:7904d8f029339931805a8962d88e955f9223a983a8fbd06e01ae40e14735362b"},
]
[package.dependencies]
aiohttp = "*"
requests = ">=2.20"
tqdm = "*"
anyio = ">=3.5.0,<4"
distro = ">=1.7.0,<2"
httpx = ">=0.23.0,<1"
pydantic = ">=1.9.0,<3"
tqdm = ">4"
typing-extensions = ">=4.5,<5"
[package.extras]
datalib = ["numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)"]
dev = ["black (>=21.6b0,<22.0)", "pytest (==6.*)", "pytest-asyncio", "pytest-mock"]
embeddings = ["matplotlib", "numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)", "plotly", "scikit-learn (>=1.0.2)", "scipy", "tenacity (>=8.0.1)"]
wandb = ["numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)", "wandb"]
datalib = ["numpy (>=1)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)"]
[[package]]
name = "packaging"
@@ -1470,4 +1481,4 @@ multidict = ">=4.0"
[metadata]
lock-version = "2.0"
python-versions = ">=3.8.1,<4.0"
content-hash = "7526c9a94ba03df0f0fb38920acdb8fc19cd41b203ddc1c5ca284f75bacb37db"
content-hash = "28e79f0cc56298eb0e74f6b95c549d7b3c43f8000b06a2394be7d21abdea943c"

View File

@@ -8,7 +8,7 @@ readme = "README.md"
[tool.poetry.dependencies]
python = ">=3.8.1,<4.0"
langchain = ">=0.0.313, <0.1"
openai = "^0.28.1"
openai = "<2"
[tool.poetry.group.dev.dependencies]
langchain-cli = ">=0.0.15"

View File

@@ -389,6 +389,17 @@ files = [
marshmallow = ">=3.18.0,<4.0.0"
typing-inspect = ">=0.4.0,<1"
[[package]]
name = "distro"
version = "1.8.0"
description = "Distro - an OS platform information API"
optional = false
python-versions = ">=3.6"
files = [
{file = "distro-1.8.0-py3-none-any.whl", hash = "sha256:99522ca3e365cac527b44bde033f64c6945d90eb9f769703caaec52b09bbd3ff"},
{file = "distro-1.8.0.tar.gz", hash = "sha256:02e111d1dc6a50abb8eed6bf31c3e48ed8b0830d1ea2a1b78c61765c2513fdd8"},
]
[[package]]
name = "exceptiongroup"
version = "1.1.3"
@@ -998,25 +1009,25 @@ files = [
[[package]]
name = "openai"
version = "0.28.1"
description = "Python client library for the OpenAI API"
version = "1.3.2"
description = "The official Python library for the openai API"
optional = false
python-versions = ">=3.7.1"
files = [
{file = "openai-0.28.1-py3-none-any.whl", hash = "sha256:d18690f9e3d31eedb66b57b88c2165d760b24ea0a01f150dd3f068155088ce68"},
{file = "openai-0.28.1.tar.gz", hash = "sha256:4be1dad329a65b4ce1a660fe6d5431b438f429b5855c883435f0f7fcb6d2dcc8"},
{file = "openai-1.3.2-py3-none-any.whl", hash = "sha256:97e2febbedc5f1308444d961df63aafb649efebf900d59dd3676fdede9bcd7b6"},
{file = "openai-1.3.2.tar.gz", hash = "sha256:7904d8f029339931805a8962d88e955f9223a983a8fbd06e01ae40e14735362b"},
]
[package.dependencies]
aiohttp = "*"
requests = ">=2.20"
tqdm = "*"
anyio = ">=3.5.0,<4"
distro = ">=1.7.0,<2"
httpx = ">=0.23.0,<1"
pydantic = ">=1.9.0,<3"
tqdm = ">4"
typing-extensions = ">=4.5,<5"
[package.extras]
datalib = ["numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)"]
dev = ["black (>=21.6b0,<22.0)", "pytest (==6.*)", "pytest-asyncio", "pytest-mock"]
embeddings = ["matplotlib", "numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)", "plotly", "scikit-learn (>=1.0.2)", "scipy", "tenacity (>=8.0.1)"]
wandb = ["numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)", "wandb"]
datalib = ["numpy (>=1)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)"]
[[package]]
name = "packaging"
@@ -1706,4 +1717,4 @@ multidict = ">=4.0"
[metadata]
lock-version = "2.0"
python-versions = ">=3.8.1,<4.0"
content-hash = "2bbb41d6a2b203cdef7902704f626c0b0006d12a660d4f85b91a48d44db7d5b9"
content-hash = "9be5d63593edd38a36c9be00a207d252b448c55a60cbfb42e445e5f0dbf467ec"

View File

@@ -10,7 +10,7 @@ readme = "README.md"
[tool.poetry.dependencies]
python = ">=3.8.1,<4.0"
langchain = ">=0.0.325"
openai = "^0.28.1"
openai = "<2"
tiktoken = "^0.5.1"
cassio = "^0.1.3"

View File

@@ -389,6 +389,17 @@ files = [
marshmallow = ">=3.18.0,<4.0.0"
typing-inspect = ">=0.4.0,<1"
[[package]]
name = "distro"
version = "1.8.0"
description = "Distro - an OS platform information API"
optional = false
python-versions = ">=3.6"
files = [
{file = "distro-1.8.0-py3-none-any.whl", hash = "sha256:99522ca3e365cac527b44bde033f64c6945d90eb9f769703caaec52b09bbd3ff"},
{file = "distro-1.8.0.tar.gz", hash = "sha256:02e111d1dc6a50abb8eed6bf31c3e48ed8b0830d1ea2a1b78c61765c2513fdd8"},
]
[[package]]
name = "exceptiongroup"
version = "1.1.3"
@@ -998,25 +1009,25 @@ files = [
[[package]]
name = "openai"
version = "0.28.1"
description = "Python client library for the OpenAI API"
version = "1.3.2"
description = "The official Python library for the openai API"
optional = false
python-versions = ">=3.7.1"
files = [
{file = "openai-0.28.1-py3-none-any.whl", hash = "sha256:d18690f9e3d31eedb66b57b88c2165d760b24ea0a01f150dd3f068155088ce68"},
{file = "openai-0.28.1.tar.gz", hash = "sha256:4be1dad329a65b4ce1a660fe6d5431b438f429b5855c883435f0f7fcb6d2dcc8"},
{file = "openai-1.3.2-py3-none-any.whl", hash = "sha256:97e2febbedc5f1308444d961df63aafb649efebf900d59dd3676fdede9bcd7b6"},
{file = "openai-1.3.2.tar.gz", hash = "sha256:7904d8f029339931805a8962d88e955f9223a983a8fbd06e01ae40e14735362b"},
]
[package.dependencies]
aiohttp = "*"
requests = ">=2.20"
tqdm = "*"
anyio = ">=3.5.0,<4"
distro = ">=1.7.0,<2"
httpx = ">=0.23.0,<1"
pydantic = ">=1.9.0,<3"
tqdm = ">4"
typing-extensions = ">=4.5,<5"
[package.extras]
datalib = ["numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)"]
dev = ["black (>=21.6b0,<22.0)", "pytest (==6.*)", "pytest-asyncio", "pytest-mock"]
embeddings = ["matplotlib", "numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)", "plotly", "scikit-learn (>=1.0.2)", "scipy", "tenacity (>=8.0.1)"]
wandb = ["numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)", "wandb"]
datalib = ["numpy (>=1)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)"]
[[package]]
name = "packaging"
@@ -1706,4 +1717,4 @@ multidict = ">=4.0"
[metadata]
lock-version = "2.0"
python-versions = ">=3.8.1,<4.0"
content-hash = "2bbb41d6a2b203cdef7902704f626c0b0006d12a660d4f85b91a48d44db7d5b9"
content-hash = "9be5d63593edd38a36c9be00a207d252b448c55a60cbfb42e445e5f0dbf467ec"

View File

@@ -10,7 +10,7 @@ readme = "README.md"
[tool.poetry.dependencies]
python = ">=3.8.1,<4.0"
langchain = ">=0.0.325"
openai = "^0.28.1"
openai = "<2"
tiktoken = "^0.5.1"
cassio = "^0.1.3"

View File

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2023 LangChain, Inc.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View File

@@ -0,0 +1,71 @@
# Chain-of-Note (Wikipedia)
Implements Chain-of-Note, as described in https://arxiv.org/pdf/2311.09210.pdf by Yu et al., using Wikipedia for retrieval.
Check out the prompt in use here: https://smith.langchain.com/hub/bagatur/chain-of-note-wiki.
## Environment Setup
This template uses the Anthropic claude-2 chat model. Set your Anthropic API key:
```bash
export ANTHROPIC_API_KEY="..."
```
## Usage
To use this package, you should first have the LangChain CLI installed:
```shell
pip install -U "langchain-cli[serve]"
```
To create a new LangChain project and install this as the only package, you can do:
```shell
langchain app new my-app --package chain-of-note-wiki
```
If you want to add this to an existing project, you can just run:
```shell
langchain app add chain-of-note-wiki
```
And add the following code to your `server.py` file:
```python
from langserve import add_routes

from chain_of_note_wiki import chain as chain_of_note_wiki_chain

# `app` is the FastAPI application defined in the generated server.py
add_routes(app, chain_of_note_wiki_chain, path="/chain-of-note-wiki")
```
(Optional) Let's now configure LangSmith.
LangSmith will help us trace, monitor and debug LangChain applications.
LangSmith is currently in private beta; you can sign up [here](https://smith.langchain.com/).
If you don't have access, you can skip this section.
```shell
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=<your-api-key>
export LANGCHAIN_PROJECT=<your-project> # if not specified, defaults to "default"
```
If you are inside this directory, then you can spin up a LangServe instance directly by running:
```shell
langchain serve
```
This will start the FastAPI app with a server running locally at
[http://localhost:8000](http://localhost:8000).
We can see all templates at [http://127.0.0.1:8000/docs](http://127.0.0.1:8000/docs)
We can access the playground at [http://127.0.0.1:8000/chain-of-note-wiki/playground](http://127.0.0.1:8000/chain-of-note-wiki/playground)
We can access the template from code with:
```python
from langserve.client import RemoteRunnable
runnable = RemoteRunnable("http://localhost:8000/chain-of-note-wiki")
```
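For example, a hypothetical invocation (the chain accepts a plain question string and returns the model's answer as a string):
```python
answer = runnable.invoke("When was the Eiffel Tower built?")
print(answer)
```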

View File

@@ -0,0 +1,3 @@
from chain_of_note_wiki.chain import chain
__all__ = ["chain"]

View File

@@ -0,0 +1,33 @@
from langchain import hub
from langchain.chat_models import ChatAnthropic
from langchain.pydantic_v1 import BaseModel
from langchain.schema import StrOutputParser
from langchain.schema.runnable import RunnableLambda, RunnablePassthrough
from langchain.utilities import WikipediaAPIWrapper


class Question(BaseModel):
    __root__: str


# Retrieve the top 5 Wikipedia results for the question.
wiki = WikipediaAPIWrapper(top_k_results=5)
# Chain-of-Note prompt, pulled from the LangChain Hub.
prompt = hub.pull("bagatur/chain-of-note-wiki")
llm = ChatAnthropic(model="claude-2")


def format_docs(docs):
    """Number the retrieved passages so the prompt can reference them."""
    return "\n\n".join(
        f"Wikipedia {i+1}:\n{doc.page_content}" for i, doc in enumerate(docs)
    )


chain = (
    {
        "passages": RunnableLambda(wiki.load) | format_docs,
        "question": RunnablePassthrough(),
    }
    | prompt
    | llm
    | StrOutputParser()
).with_types(input_type=Question)
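For local experimentation, a minimal sketch (assuming `ANTHROPIC_API_KEY` is set and the package is importable):

```python
from chain_of_note_wiki import chain

# The chain takes a plain question string (see the Question root model above).
print(chain.invoke("What instrument did Charlie Parker play?"))
```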

templates/chain-of-note-wiki/poetry.lock (generated, 1909 lines changed): diff suppressed because it is too large.

View File

@@ -0,0 +1,26 @@
[tool.poetry]
name = "chain-of-note-wiki"
version = "0.0.1"
description = ""
authors = []
readme = "README.md"
[tool.poetry.dependencies]
python = ">=3.8.1,<4.0"
langchain = ">=0.0.313, <0.1"
anthropic = "^0.7.0"
wikipedia = "^1.4.0"
langchainhub = "^0.1.14"
[tool.poetry.group.dev.dependencies]
langchain-cli = ">=0.0.4"
fastapi = "^0.104.0"
sse-starlette = "^1.6.5"
[tool.langserve]
export_module = "chain_of_note_wiki"
export_attr = "chain"
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

View File

@@ -322,6 +322,17 @@ files = [
marshmallow = ">=3.18.0,<4.0.0"
typing-inspect = ">=0.4.0,<1"
+[[package]]
+name = "distro"
+version = "1.8.0"
+description = "Distro - an OS platform information API"
+optional = false
+python-versions = ">=3.6"
+files = [
+{file = "distro-1.8.0-py3-none-any.whl", hash = "sha256:99522ca3e365cac527b44bde033f64c6945d90eb9f769703caaec52b09bbd3ff"},
+{file = "distro-1.8.0.tar.gz", hash = "sha256:02e111d1dc6a50abb8eed6bf31c3e48ed8b0830d1ea2a1b78c61765c2513fdd8"},
+]
[[package]]
name = "exceptiongroup"
version = "1.1.3"
@@ -931,25 +942,25 @@ files = [
[[package]]
name = "openai"
version = "0.28.1"
description = "Python client library for the OpenAI API"
version = "1.3.2"
description = "The official Python library for the openai API"
optional = false
python-versions = ">=3.7.1"
files = [
{file = "openai-0.28.1-py3-none-any.whl", hash = "sha256:d18690f9e3d31eedb66b57b88c2165d760b24ea0a01f150dd3f068155088ce68"},
{file = "openai-0.28.1.tar.gz", hash = "sha256:4be1dad329a65b4ce1a660fe6d5431b438f429b5855c883435f0f7fcb6d2dcc8"},
{file = "openai-1.3.2-py3-none-any.whl", hash = "sha256:97e2febbedc5f1308444d961df63aafb649efebf900d59dd3676fdede9bcd7b6"},
{file = "openai-1.3.2.tar.gz", hash = "sha256:7904d8f029339931805a8962d88e955f9223a983a8fbd06e01ae40e14735362b"},
]
[package.dependencies]
aiohttp = "*"
requests = ">=2.20"
tqdm = "*"
anyio = ">=3.5.0,<4"
distro = ">=1.7.0,<2"
httpx = ">=0.23.0,<1"
pydantic = ">=1.9.0,<3"
tqdm = ">4"
typing-extensions = ">=4.5,<5"
[package.extras]
datalib = ["numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)"]
dev = ["black (>=21.6b0,<22.0)", "pytest (==6.*)", "pytest-asyncio", "pytest-mock"]
embeddings = ["matplotlib", "numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)", "plotly", "scikit-learn (>=1.0.2)", "scipy", "tenacity (>=8.0.1)"]
wandb = ["numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)", "wandb"]
datalib = ["numpy (>=1)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)"]
[[package]]
name = "packaging"
@@ -1500,4 +1511,4 @@ multidict = ">=4.0"
[metadata]
lock-version = "2.0"
python-versions = ">=3.8.1,<4.0"
content-hash = "d5d19532c070b2e370977972821d73cc394705c71bbb0a18365714ef125fb96a"
content-hash = "89c7c1eae996cfbdb931b6406d43d0512f76e0ab65321c1b4d7dc1b365d393ba"

View File

@@ -8,7 +8,7 @@ readme = "README.md"
[tool.poetry.dependencies]
python = ">=3.8.1,<4.0"
langchain = ">=0.0.329"
openai = "^0.28.1"
openai = "<2"
langsmith = ">=0.0.54"
langchainhub = ">=0.1.13"

View File

@@ -322,6 +322,17 @@ files = [
marshmallow = ">=3.18.0,<4.0.0"
typing-inspect = ">=0.4.0,<1"
+[[package]]
+name = "distro"
+version = "1.8.0"
+description = "Distro - an OS platform information API"
+optional = false
+python-versions = ">=3.6"
+files = [
+{file = "distro-1.8.0-py3-none-any.whl", hash = "sha256:99522ca3e365cac527b44bde033f64c6945d90eb9f769703caaec52b09bbd3ff"},
+{file = "distro-1.8.0.tar.gz", hash = "sha256:02e111d1dc6a50abb8eed6bf31c3e48ed8b0830d1ea2a1b78c61765c2513fdd8"},
+]
[[package]]
name = "exceptiongroup"
version = "1.1.3"
@@ -971,25 +982,25 @@ files = [
[[package]]
name = "openai"
version = "0.28.1"
description = "Python client library for the OpenAI API"
version = "1.3.2"
description = "The official Python library for the openai API"
optional = false
python-versions = ">=3.7.1"
files = [
{file = "openai-0.28.1-py3-none-any.whl", hash = "sha256:d18690f9e3d31eedb66b57b88c2165d760b24ea0a01f150dd3f068155088ce68"},
{file = "openai-0.28.1.tar.gz", hash = "sha256:4be1dad329a65b4ce1a660fe6d5431b438f429b5855c883435f0f7fcb6d2dcc8"},
{file = "openai-1.3.2-py3-none-any.whl", hash = "sha256:97e2febbedc5f1308444d961df63aafb649efebf900d59dd3676fdede9bcd7b6"},
{file = "openai-1.3.2.tar.gz", hash = "sha256:7904d8f029339931805a8962d88e955f9223a983a8fbd06e01ae40e14735362b"},
]
[package.dependencies]
aiohttp = "*"
requests = ">=2.20"
tqdm = "*"
anyio = ">=3.5.0,<4"
distro = ">=1.7.0,<2"
httpx = ">=0.23.0,<1"
pydantic = ">=1.9.0,<3"
tqdm = ">4"
typing-extensions = ">=4.5,<5"
[package.extras]
datalib = ["numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)"]
dev = ["black (>=21.6b0,<22.0)", "pytest (==6.*)", "pytest-asyncio", "pytest-mock"]
embeddings = ["matplotlib", "numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)", "plotly", "scikit-learn (>=1.0.2)", "scipy", "tenacity (>=8.0.1)"]
wandb = ["numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)", "wandb"]
datalib = ["numpy (>=1)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)"]
[[package]]
name = "packaging"
@@ -1812,4 +1823,4 @@ multidict = ">=4.0"
[metadata]
lock-version = "2.0"
python-versions = ">=3.9,<3.13"
content-hash = "5109be7be4a1228d6d85e94ca16ebdb31b791fb97bb764a10d564b6ddbfb00c9"
content-hash = "b635ca551e6b9cec07c838b51b18c37531ed6718d9595098e6427449d11ceb93"

View File

@@ -8,7 +8,7 @@ readme = "README.md"
[tool.poetry.dependencies]
python = ">=3.9,<3.13"
langchain = ">=0.0.325"
openai = "^0.28.1"
openai = "<2"
tiktoken = "^0.5.1"
faiss-cpu = "^1.7.4"
pandas = "^2.1.1"

View File

@@ -322,6 +322,17 @@ files = [
marshmallow = ">=3.18.0,<4.0.0"
typing-inspect = ">=0.4.0,<1"
+[[package]]
+name = "distro"
+version = "1.8.0"
+description = "Distro - an OS platform information API"
+optional = false
+python-versions = ">=3.6"
+files = [
+{file = "distro-1.8.0-py3-none-any.whl", hash = "sha256:99522ca3e365cac527b44bde033f64c6945d90eb9f769703caaec52b09bbd3ff"},
+{file = "distro-1.8.0.tar.gz", hash = "sha256:02e111d1dc6a50abb8eed6bf31c3e48ed8b0830d1ea2a1b78c61765c2513fdd8"},
+]
[[package]]
name = "elastic-transport"
version = "8.10.0"
@@ -952,25 +963,25 @@ files = [
[[package]]
name = "openai"
version = "0.28.1"
description = "Python client library for the OpenAI API"
version = "1.3.2"
description = "The official Python library for the openai API"
optional = false
python-versions = ">=3.7.1"
files = [
{file = "openai-0.28.1-py3-none-any.whl", hash = "sha256:d18690f9e3d31eedb66b57b88c2165d760b24ea0a01f150dd3f068155088ce68"},
{file = "openai-0.28.1.tar.gz", hash = "sha256:4be1dad329a65b4ce1a660fe6d5431b438f429b5855c883435f0f7fcb6d2dcc8"},
{file = "openai-1.3.2-py3-none-any.whl", hash = "sha256:97e2febbedc5f1308444d961df63aafb649efebf900d59dd3676fdede9bcd7b6"},
{file = "openai-1.3.2.tar.gz", hash = "sha256:7904d8f029339931805a8962d88e955f9223a983a8fbd06e01ae40e14735362b"},
]
[package.dependencies]
aiohttp = "*"
requests = ">=2.20"
tqdm = "*"
anyio = ">=3.5.0,<4"
distro = ">=1.7.0,<2"
httpx = ">=0.23.0,<1"
pydantic = ">=1.9.0,<3"
tqdm = ">4"
typing-extensions = ">=4.5,<5"
[package.extras]
datalib = ["numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)"]
dev = ["black (>=21.6b0,<22.0)", "pytest (==6.*)", "pytest-asyncio", "pytest-mock"]
embeddings = ["matplotlib", "numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)", "plotly", "scikit-learn (>=1.0.2)", "scipy", "tenacity (>=8.0.1)"]
wandb = ["numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)", "wandb"]
datalib = ["numpy (>=1)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)"]
[[package]]
name = "packaging"
@@ -1507,4 +1518,4 @@ multidict = ">=4.0"
[metadata]
lock-version = "2.0"
python-versions = ">=3.8.1,<4.0"
content-hash = "33001bc7e12b08eddbb1a85ee9c55cc30e573e246def02deb3812e0fc073d032"
content-hash = "254ddcf941808df556f7666a6a244af276e8b182c7231a19b69ba46af29b9c2a"

View File

@@ -9,7 +9,7 @@ readme = "README.md"
python = ">=3.8.1,<4.0"
langchain = ">=0.0.325"
elasticsearch = "^8.10.1"
openai = "^0.28.1"
openai = "<2"
[tool.poetry.group.dev.dependencies]
langchain-cli = ">=0.0.15"

View File

@@ -322,6 +322,17 @@ files = [
marshmallow = ">=3.18.0,<4.0.0"
typing-inspect = ">=0.4.0,<1"
+[[package]]
+name = "distro"
+version = "1.8.0"
+description = "Distro - an OS platform information API"
+optional = false
+python-versions = ">=3.6"
+files = [
+{file = "distro-1.8.0-py3-none-any.whl", hash = "sha256:99522ca3e365cac527b44bde033f64c6945d90eb9f769703caaec52b09bbd3ff"},
+{file = "distro-1.8.0.tar.gz", hash = "sha256:02e111d1dc6a50abb8eed6bf31c3e48ed8b0830d1ea2a1b78c61765c2513fdd8"},
+]
[[package]]
name = "exceptiongroup"
version = "1.1.3"
@@ -916,25 +927,25 @@ files = [
[[package]]
name = "openai"
version = "0.28.1"
description = "Python client library for the OpenAI API"
version = "1.3.2"
description = "The official Python library for the openai API"
optional = false
python-versions = ">=3.7.1"
files = [
{file = "openai-0.28.1-py3-none-any.whl", hash = "sha256:d18690f9e3d31eedb66b57b88c2165d760b24ea0a01f150dd3f068155088ce68"},
{file = "openai-0.28.1.tar.gz", hash = "sha256:4be1dad329a65b4ce1a660fe6d5431b438f429b5855c883435f0f7fcb6d2dcc8"},
{file = "openai-1.3.2-py3-none-any.whl", hash = "sha256:97e2febbedc5f1308444d961df63aafb649efebf900d59dd3676fdede9bcd7b6"},
{file = "openai-1.3.2.tar.gz", hash = "sha256:7904d8f029339931805a8962d88e955f9223a983a8fbd06e01ae40e14735362b"},
]
[package.dependencies]
aiohttp = "*"
requests = ">=2.20"
tqdm = "*"
anyio = ">=3.5.0,<4"
distro = ">=1.7.0,<2"
httpx = ">=0.23.0,<1"
pydantic = ">=1.9.0,<3"
tqdm = ">4"
typing-extensions = ">=4.5,<5"
[package.extras]
datalib = ["numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)"]
dev = ["black (>=21.6b0,<22.0)", "pytest (==6.*)", "pytest-asyncio", "pytest-mock"]
embeddings = ["matplotlib", "numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)", "plotly", "scikit-learn (>=1.0.2)", "scipy", "tenacity (>=8.0.1)"]
wandb = ["numpy", "openpyxl (>=3.0.7)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)", "wandb"]
datalib = ["numpy (>=1)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)"]
[[package]]
name = "packaging"
@@ -1471,4 +1482,4 @@ multidict = ">=4.0"
[metadata]
lock-version = "2.0"
python-versions = ">=3.8.1,<4.0"
content-hash = "a0e41ba0ace2d9fc3e361d782463e4e2a10343475fd269a5ac71ef0a9c2c930b"
content-hash = "4738e6bfc147dbf18aafd4a6c45901d2cf85874ea41fce5783b7604eec372762"

View File

@@ -10,7 +10,7 @@ readme = "README.md"
[tool.poetry.dependencies]
python = ">=3.8.1,<4.0"
langchain = ">=0.0.325"
openai = "^0.28.1"
openai = "<2"
[tool.poetry.group.dev.dependencies]
langchain-cli = ">=0.0.15"

Some files were not shown because too many files have changed in this diff.