mirror of
https://github.com/hwchase17/langchain.git
synced 2026-04-03 10:55:08 +00:00
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "9eb8dfa6fdb71ef5",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "# Zep\n",
    "\n",
    ">[Zep](https://docs.getzep.com/) is an open-source platform for LLM apps. Go from a prototype\n",
    ">built in LangChain or LlamaIndex, or a custom app, to production in minutes without rewriting code.\n",
    "\n",
    "## Key Features\n",
    "\n",
    "- **Fast!** `Zep` operates independently of your chat loop, ensuring a snappy user experience.\n",
    "- **Chat History Memory, Archival, and Enrichment:** Populate your prompts with relevant chat history, summaries, named entities, intent data, and more.\n",
    "- **Vector Search over Chat History and Documents:** Automatic embedding of documents, chat histories, and summaries. Use Zep's similarity or native MMR re-ranked search to find the most relevant results.\n",
    "- **Manage Users and their Chat Sessions:** Users and their chat sessions are first-class citizens in Zep, allowing you to manage user interactions with your bots or agents easily.\n",
    "- **Records Retention and Privacy Compliance:** Comply with corporate and regulatory mandates for records retention while ensuring compliance with privacy regulations such as CCPA and GDPR. Fulfill *Right To Be Forgotten* requests with a single API call.\n",
    "\n",
    "**Note:** The `ZepVectorStore` works with `Documents` and is intended to be used as a `Retriever`.\n",
    "It offers functionality separate from Zep's `ZepMemory` class, which is designed for persisting, enriching,\n",
    "and searching your users' chat history.\n",
    "\n",
    "## Installation\n",
    "\n",
    "Follow the [Zep Quickstart Guide](https://docs.getzep.com/deployment/quickstart/) to install and get started with Zep.\n",
    "\n",
    "You'll need your Zep API URL and, optionally, an API key to use the Zep VectorStore.\n",
    "See the [Zep docs](https://docs.getzep.com) for more information.\n",
    "\n",
    "## Usage\n",
    "\n",
    "The examples below use Zep's auto-embedding feature, which automatically embeds documents on the Zep server\n",
    "using low-latency embedding models.\n",
    "\n",
    "## Note\n",
    "- These examples use Zep's async interfaces. Call sync interfaces by removing the `a` prefix from the method names.\n",
    "- If you pass in an `Embeddings` instance, Zep will use it to embed documents rather than auto-embedding them.\n",
    "  You must also set `is_auto_embedded=False` in your collection's `CollectionConfig`.\n",
    "- If you set your collection to `is_auto_embedded=False`, you must pass in an `Embeddings` instance."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9a3a11aab1412d98",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "## Load or create a Collection from documents"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "519418421a32e4d",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-08-13T01:07:50.672390Z",
     "start_time": "2023-08-13T01:07:48.777799Z"
    },
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "from uuid import uuid4\n",
    "\n",
    "from langchain_community.document_loaders import WebBaseLoader\n",
    "from langchain_community.vectorstores import ZepVectorStore\n",
    "from langchain_community.vectorstores.zep import CollectionConfig\n",
    "from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
    "\n",
    "ZEP_API_URL = \"http://localhost:8000\"  # this is the API url of your Zep instance\n",
    "ZEP_API_KEY = \"<optional_key>\"  # optional API Key for your Zep instance\n",
    "collection_name = f\"babbage{uuid4().hex}\"  # a unique collection name. alphanum only\n",
    "\n",
    "# Collection config is needed if we're creating a new Zep Collection\n",
    "config = CollectionConfig(\n",
    "    name=collection_name,\n",
    "    description=\"<optional description>\",\n",
    "    metadata={\"optional_metadata\": \"associated with the collection\"},\n",
    "    is_auto_embedded=True,  # we'll have Zep embed our documents using its low-latency embedder\n",
    "    embedding_dimensions=1536,  # this should match the model you've configured Zep to use.\n",
    ")\n",
    "\n",
    "# load the document\n",
    "article_url = \"https://www.gutenberg.org/cache/epub/71292/pg71292.txt\"\n",
    "loader = WebBaseLoader(article_url)\n",
    "documents = loader.load()\n",
    "\n",
    "# split it into chunks\n",
    "text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)\n",
    "docs = text_splitter.split_documents(documents)\n",
    "\n",
    "# Instantiate the VectorStore. Since the collection does not already exist in Zep,\n",
    "# it will be created and populated with the documents we pass in.\n",
    "vs = ZepVectorStore.from_documents(\n",
    "    docs,\n",
    "    collection_name=collection_name,\n",
    "    config=config,\n",
    "    api_url=ZEP_API_URL,\n",
    "    api_key=ZEP_API_KEY,\n",
    "    embedding=None,  # we'll have Zep embed our documents using its low-latency embedder\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "201dc57b124cb6d7",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-08-13T01:07:53.807663Z",
     "start_time": "2023-08-13T01:07:50.671241Z"
    },
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Embedding status: 0/401 documents embedded\n",
      "Embedding status: 0/401 documents embedded\n",
      "Embedding status: 0/401 documents embedded\n",
      "Embedding status: 0/401 documents embedded\n",
      "Embedding status: 0/401 documents embedded\n",
      "Embedding status: 0/401 documents embedded\n",
      "Embedding status: 401/401 documents embedded\n"
     ]
    }
   ],
   "source": [
    "# wait for the collection embedding to complete\n",
    "\n",
    "\n",
    "async def wait_for_ready(collection_name: str) -> None:\n",
    "    import asyncio\n",
    "\n",
    "    from zep_python import ZepClient\n",
    "\n",
    "    client = ZepClient(ZEP_API_URL, ZEP_API_KEY)\n",
    "\n",
    "    while True:\n",
    "        c = await client.document.aget_collection(collection_name)\n",
    "        print(\n",
    "            \"Embedding status: \"\n",
    "            f\"{c.document_embedded_count}/{c.document_count} documents embedded\"\n",
    "        )\n",
    "        # pause between polls without blocking the event loop\n",
    "        await asyncio.sleep(1)\n",
    "        if c.status == \"ready\":\n",
    "            break\n",
    "\n",
    "\n",
    "await wait_for_ready(collection_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "94ca9dfa7d0ecaa5",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "## Similarity Search Query over the Collection"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "1998de0a96fe89c3",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-08-13T01:07:54.195988Z",
     "start_time": "2023-08-13T01:07:53.808550Z"
    },
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "the positions of the two principal planets, (and these the most\n",
      "necessary for the navigator,) Jupiter and Saturn, require each not less\n",
      "than one hundred and sixteen tables. Yet it is not only necessary to\n",
      "predict the position of these bodies, but it is likewise expedient to\n",
      "tabulate the motions of the four satellites of Jupiter, to predict the\n",
      "exact times at which they enter his shadow, and at which their shadows\n",
      "cross his disc, as well as the times at which they are interposed -> 0.9003241539387915 \n",
      "====\n",
      "\n",
      "furnish more than a small fraction of that aid to navigation (in the\n",
      "large sense of that term), which, with greater facility, expedition, and\n",
      "economy in the calculation and printing of tables, it might be made to\n",
      "supply.\n",
      "\n",
      "Tables necessary to determine the places of the planets are not less\n",
      "necessary than those for the sun, moon, and stars. Some notion of the\n",
      "number and complexity of these tables may be formed, when we state that -> 0.8911165633479508 \n",
      "====\n",
      "\n",
      "the scheme of notation thus applied, immediately suggested the\n",
      "advantages which must attend it as an instrument for expressing the\n",
      "structure, operation, and circulation of the animal system; and we\n",
      "entertain no doubt of its adequacy for that purpose. Not only the\n",
      "mechanical connexion of the solid members of the bodies of men and\n",
      "animals, but likewise the structure and operation of the softer parts,\n",
      "including the muscles, integuments, membranes, &c. the nature, motion, -> 0.8899750214770481 \n",
      "====\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# query it\n",
    "query = \"what is the structure of our solar system?\"\n",
    "docs_scores = await vs.asimilarity_search_with_relevance_scores(query, k=3)\n",
    "\n",
    "# print results\n",
    "for d, s in docs_scores:\n",
    "    print(d.page_content, \" -> \", s, \"\\n====\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e02b61a9af0b2c80",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "## Search over Collection Re-ranked by MMR\n",
    "\n",
    "Zep offers native, hardware-accelerated MMR re-ranking of search results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "488112da752b1d58",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-08-13T01:07:54.394873Z",
     "start_time": "2023-08-13T01:07:54.180901Z"
    },
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "the positions of the two principal planets, (and these the most\n",
      "necessary for the navigator,) Jupiter and Saturn, require each not less\n",
      "than one hundred and sixteen tables. Yet it is not only necessary to\n",
      "predict the position of these bodies, but it is likewise expedient to\n",
      "tabulate the motions of the four satellites of Jupiter, to predict the\n",
      "exact times at which they enter his shadow, and at which their shadows\n",
      "cross his disc, as well as the times at which they are interposed \n",
      "====\n",
      "\n",
      "the scheme of notation thus applied, immediately suggested the\n",
      "advantages which must attend it as an instrument for expressing the\n",
      "structure, operation, and circulation of the animal system; and we\n",
      "entertain no doubt of its adequacy for that purpose. Not only the\n",
      "mechanical connexion of the solid members of the bodies of men and\n",
      "animals, but likewise the structure and operation of the softer parts,\n",
      "including the muscles, integuments, membranes, &c. the nature, motion, \n",
      "====\n",
      "\n",
      "resistance, economizing time, harmonizing the mechanism, and giving to\n",
      "the whole mechanical action the utmost practical perfection.\n",
      "\n",
      "The system of mechanical contrivances by which the results, here\n",
      "attempted to be described, are attained, form only one order of\n",
      "expedients adopted in this machinery;--although such is the perfection\n",
      "of their action, that in any ordinary case they would be regarded as\n",
      "having attained the ends in view with an almost superfluous degree of \n",
      "====\n",
      "\n"
     ]
    }
   ],
   "source": [
    "query = \"what is the structure of our solar system?\"\n",
    "docs = await vs.asearch(query, search_type=\"mmr\", k=3)\n",
    "\n",
    "for d in docs:\n",
    "    print(d.page_content, \"\\n====\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "42455e31d4ab0d68",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "## Filter by Metadata\n",
    "\n",
    "Use a metadata filter to narrow down results. First, load another book: \"Adventures of Sherlock Holmes\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "146c8a96201c0ab9",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-08-13T01:08:06.323569Z",
     "start_time": "2023-08-13T01:07:54.381822Z"
    },
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Embedding status: 401/1691 documents embedded\n",
      "Embedding status: 401/1691 documents embedded\n",
      "Embedding status: 401/1691 documents embedded\n",
      "Embedding status: 401/1691 documents embedded\n",
      "Embedding status: 401/1691 documents embedded\n",
      "Embedding status: 401/1691 documents embedded\n",
      "Embedding status: 901/1691 documents embedded\n",
      "Embedding status: 901/1691 documents embedded\n",
      "Embedding status: 901/1691 documents embedded\n",
      "Embedding status: 901/1691 documents embedded\n",
      "Embedding status: 901/1691 documents embedded\n",
      "Embedding status: 901/1691 documents embedded\n",
      "Embedding status: 1401/1691 documents embedded\n",
      "Embedding status: 1401/1691 documents embedded\n",
      "Embedding status: 1401/1691 documents embedded\n",
      "Embedding status: 1401/1691 documents embedded\n",
      "Embedding status: 1691/1691 documents embedded\n"
     ]
    }
   ],
   "source": [
    "# Let's add more content to the existing Collection\n",
    "article_url = \"https://www.gutenberg.org/files/48320/48320-0.txt\"\n",
    "loader = WebBaseLoader(article_url)\n",
    "documents = loader.load()\n",
    "\n",
    "# split it into chunks\n",
    "text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)\n",
    "docs = text_splitter.split_documents(documents)\n",
    "\n",
    "await vs.aadd_documents(docs)\n",
    "\n",
    "await wait_for_ready(collection_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5b225f3ae1e61de8",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "An unfiltered search now returns results from both books. Note the `source` metadata on each result."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "53700a9cd817cde4",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-08-13T01:08:06.504769Z",
     "start_time": "2023-08-13T01:08:06.325435Z"
    },
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "or remotely, for this purpose. But in addition to these, a great number\n",
      "of tables, exclusively astronomical, are likewise indispensable. The\n",
      "predictions of the astronomer, with respect to the positions and motions\n",
      "of the bodies of the firmament, are the means, and the only means, which\n",
      "enable the mariner to prosecute his art. By these he is enabled to\n",
      "discover the distance of his ship from the Line, and the extent of his -> {'source': 'https://www.gutenberg.org/cache/epub/71292/pg71292.txt'} \n",
      "====\n",
      "\n",
      "possess all knowledge which is likely to be useful to him in his work,\n",
      "and this I have endeavored in my case to do. If I remember rightly, you\n",
      "on one occasion, in the early days of our friendship, defined my limits\n",
      "in a very precise fashion.”\n",
      "\n",
      "“Yes,” I answered, laughing. “It was a singular document. Philosophy,\n",
      "astronomy, and politics were marked at zero, I remember. Botany\n",
      "variable, geology profound as regards the mud-stains from any region -> {'source': 'https://www.gutenberg.org/files/48320/48320-0.txt'} \n",
      "====\n",
      "\n",
      "of astronomy, and its kindred sciences, with the various arts dependent\n",
      "on them. In none are computations more operose than those which\n",
      "astronomy in particular requires;--in none are preparatory facilities\n",
      "more needful;--in none is error more detrimental. The practical\n",
      "astronomer is interrupted in his pursuit, and diverted from his task of\n",
      "observation by the irksome labours of computation, or his diligence in\n",
      "observing becomes ineffectual for want of yet greater industry of -> {'source': 'https://www.gutenberg.org/cache/epub/71292/pg71292.txt'} \n",
      "====\n",
      "\n"
     ]
    }
   ],
   "source": [
    "query = \"Was he interested in astronomy?\"\n",
    "docs = await vs.asearch(query, search_type=\"similarity\", k=3)\n",
    "\n",
    "for d in docs:\n",
    "    print(d.page_content, \" -> \", d.metadata, \"\\n====\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7b81d7cae351a1ec",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "Now, set up a filter that restricts results to the second book's `source` URL:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "8f1bdcba03979d22",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-08-13T01:08:06.672836Z",
     "start_time": "2023-08-13T01:08:06.505944Z"
    },
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "possess all knowledge which is likely to be useful to him in his work,\n",
      "and this I have endeavored in my case to do. If I remember rightly, you\n",
      "on one occasion, in the early days of our friendship, defined my limits\n",
      "in a very precise fashion.”\n",
      "\n",
      "“Yes,” I answered, laughing. “It was a singular document. Philosophy,\n",
      "astronomy, and politics were marked at zero, I remember. Botany\n",
      "variable, geology profound as regards the mud-stains from any region -> {'source': 'https://www.gutenberg.org/files/48320/48320-0.txt'} \n",
      "====\n",
      "\n",
      "the light shining upon his strong-set aquiline features. So he sat as I\n",
      "dropped off to sleep, and so he sat when a sudden ejaculation caused me\n",
      "to wake up, and I found the summer sun shining into the apartment. The\n",
      "pipe was still between his lips, the smoke still curled upward, and the\n",
      "room was full of a dense tobacco haze, but nothing remained of the heap\n",
      "of shag which I had seen upon the previous night.\n",
      "\n",
      "“Awake, Watson?” he asked.\n",
      "\n",
      "“Yes.”\n",
      "\n",
      "“Game for a morning drive?” -> {'source': 'https://www.gutenberg.org/files/48320/48320-0.txt'} \n",
      "====\n",
      "\n",
      "“I glanced at the books upon the table, and in spite of my ignorance\n",
      "of German I could see that two of them were treatises on science, the\n",
      "others being volumes of poetry. Then I walked across to the window,\n",
      "hoping that I might catch some glimpse of the country-side, but an oak\n",
      "shutter, heavily barred, was folded across it. It was a wonderfully\n",
      "silent house. There was an old clock ticking loudly somewhere in the\n",
      "passage, but otherwise everything was deadly still. A vague feeling of -> {'source': 'https://www.gutenberg.org/files/48320/48320-0.txt'} \n",
      "====\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# named metadata_filter to avoid shadowing Python's built-in filter()\n",
    "metadata_filter = {\n",
    "    \"where\": {\n",
    "        \"jsonpath\": (\n",
    "            \"$[*] ? (@.source == 'https://www.gutenberg.org/files/48320/48320-0.txt')\"\n",
    "        )\n",
    "    },\n",
    "}\n",
    "\n",
    "docs = await vs.asearch(\n",
    "    query, search_type=\"similarity\", metadata=metadata_filter, k=3\n",
    ")\n",
    "\n",
    "for d in docs:\n",
    "    print(d.page_content, \" -> \", d.metadata, \"\\n====\\n\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "96132aa6",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}