mirror of
https://github.com/hwchase17/langchain.git
synced 2025-09-08 22:42:05 +00:00
community[minor]: Add support for Upstash Vector (#20824)
## Description Adding `UpstashVectorStore` to utilize [Upstash Vector](https://upstash.com/docs/vector/overall/getstarted)! #17012 was opened to add Upstash Vector to langchain but was closed to wait for filtering. Now filtering is added to Upstash vector and we open a new PR. Additionally, [embedding feature](https://upstash.com/docs/vector/features/embeddingmodels) was added and we add this to our vectorstore aswell. ## Dependencies [upstash-vector](https://pypi.org/project/upstash-vector/) should be installed to use `UpstashVectorStore`. Didn't update dependencies because of [this comment in the previous PR](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1876522450). ## Tests Tests are added and they pass. Tests are naturally network bound since Upstash Vector is offered through an API. There was [a discussion in the previous PR about mocking the unittests](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1891820567). We didn't make changes to this end yet. We can update the tests if you can explain how the tests should be mocked. --------- Co-authored-by: ytkimirti <yusuftaha9@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>
This commit is contained in:
@@ -1,6 +1,166 @@
|
||||
# Upstash Redis
|
||||
Upstash offers developers serverless databases and messaging
|
||||
platforms to build powerful applications without having to worry
|
||||
about the operational complexity of running databases at scale.
|
||||
|
||||
Upstash offers developers serverless databases and messaging platforms to build powerful applications without having to worry about the operational complexity of running databases at scale.
|
||||
One significant advantage of Upstash is that their databases support HTTP and all of their SDKs use HTTP.
|
||||
This means that you can run this in serverless platforms, edge or any platform that does not support TCP connections.
|
||||
|
||||
Currently, there are two Upstash integrations available for LangChain:
|
||||
Upstash Vector as a vector embedding database and Upstash Redis as a cache and memory store.
|
||||
|
||||
# Upstash Vector
|
||||
|
||||
Upstash Vector is a serverless vector database that can be used to store and query vectors.
|
||||
|
||||
## Installation
|
||||
|
||||
Create a new serverless vector database at the [Upstash Console](https://console.upstash.com/vector).
|
||||
Select your preferred distance metric and dimension count according to your model.
|
||||
|
||||
|
||||
Install the Upstash Vector Python SDK with `pip install upstash-vector`.
|
||||
The Upstash Vector integration in langchain is a wrapper for the Upstash Vector Python SDK. That's why the `upstash-vector` package is required.
|
||||
|
||||
## Integrations
|
||||
|
||||
Create a `UpstashVectorStore` object using credentials from the Upstash Console.
|
||||
You also need to pass in an `Embeddings` object which can turn text into vector embeddings.
|
||||
|
||||
```python
|
||||
from langchain_community.vectorstores.upstash import UpstashVectorStore
|
||||
import os
|
||||
|
||||
os.environ["UPSTASH_VECTOR_REST_URL"] = "<UPSTASH_VECTOR_REST_URL>"
|
||||
os.environ["UPSTASH_VECTOR_REST_TOKEN"] = "<UPSTASH_VECTOR_REST_TOKEN>"
|
||||
|
||||
store = UpstashVectorStore(
|
||||
embedding=embeddings
|
||||
)
|
||||
```
|
||||
|
||||
An alternative way of `UpstashVectorStore` is to pass `embedding=True`. This is a unique
|
||||
feature of the `UpstashVectorStore` thanks to the ability of the Upstash Vector indexes
|
||||
to have an associated embedding model. In this configuration, documents we want to insert or
|
||||
queries we want to search for are simply sent to Upstash Vector as text. In the background,
|
||||
Upstash Vector embeds these text and executes the request with these embeddings. To use this
|
||||
feature, [create an Upstash Vector index by selecting a model](https://upstash.com/docs/vector/features/embeddingmodels#using-a-model)
|
||||
and simply pass `embedding=True`:
|
||||
|
||||
```python
|
||||
from langchain_community.vectorstores.upstash import UpstashVectorStore
|
||||
import os
|
||||
|
||||
os.environ["UPSTASH_VECTOR_REST_URL"] = "<UPSTASH_VECTOR_REST_URL>"
|
||||
os.environ["UPSTASH_VECTOR_REST_TOKEN"] = "<UPSTASH_VECTOR_REST_TOKEN>"
|
||||
|
||||
store = UpstashVectorStore(
|
||||
embedding=True
|
||||
)
|
||||
```
|
||||
|
||||
See [Upstash Vector documentation](https://upstash.com/docs/vector/features/embeddingmodels)
|
||||
for more detail on embedding models.
|
||||
|
||||
### Inserting Vectors
|
||||
|
||||
```python
|
||||
from langchain.text_splitter import CharacterTextSplitter
|
||||
from langchain_community.document_loaders import TextLoader
|
||||
from langchain_openai import OpenAIEmbeddings
|
||||
|
||||
loader = TextLoader("../../modules/state_of_the_union.txt")
|
||||
documents = loader.load()
|
||||
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
|
||||
docs = text_splitter.split_documents(documents)
|
||||
|
||||
# Create a new embeddings object
|
||||
embeddings = OpenAIEmbeddings()
|
||||
|
||||
# Create a new UpstashVectorStore object
|
||||
store = UpstashVectorStore(
|
||||
embedding=embeddings
|
||||
)
|
||||
|
||||
# Insert the document embeddings into the store
|
||||
store.add_documents(docs)
|
||||
```
|
||||
|
||||
When inserting documents, first they are embedded using the `Embeddings` object.
|
||||
|
||||
Most embedding models can embed multiple documents at once, so the documents are batched and embedded in parallel.
|
||||
The size of the batch can be controlled using the `embedding_chunk_size` parameter.
|
||||
|
||||
The embedded vectors are then stored in the Upstash Vector database. When they are sent, multiple vectors are batched together to reduce the number of HTTP requests.
|
||||
The size of the batch can be controlled using the `batch_size` parameter. Upstash Vector has a limit of 1000 vectors per batch in the free tier.
|
||||
|
||||
```python
|
||||
store.add_documents(
|
||||
documents,
|
||||
batch_size=100,
|
||||
embedding_chunk_size=200
|
||||
)
|
||||
```
|
||||
|
||||
### Querying Vectors
|
||||
|
||||
Vectors can be queried using a text query or another vector.
|
||||
|
||||
The returned value is a list of Document objects.
|
||||
|
||||
```python
|
||||
result = store.similarity_search(
|
||||
"The United States of America",
|
||||
k=5
|
||||
)
|
||||
```
|
||||
|
||||
Or using a vector:
|
||||
|
||||
```python
|
||||
vector = embeddings.embed_query("Hello world")
|
||||
|
||||
result = store.similarity_search_by_vector(
|
||||
vector,
|
||||
k=5
|
||||
)
|
||||
```
|
||||
|
||||
When searching, you can also utilize the `filter` parameter which will allow you to filter by metadata:
|
||||
|
||||
```python
|
||||
result = store.similarity_search(
|
||||
"The United States of America",
|
||||
k=5,
|
||||
filter="type = 'country'"
|
||||
)
|
||||
```
|
||||
|
||||
See [Upstash Vector documentation](https://upstash.com/docs/vector/features/filtering)
|
||||
for more details on metadata filtering.
|
||||
|
||||
### Deleting Vectors
|
||||
|
||||
Vectors can be deleted by their IDs.
|
||||
|
||||
```python
|
||||
store.delete(["id1", "id2"])
|
||||
```
|
||||
|
||||
### Getting information about the store
|
||||
|
||||
You can get information about your database like the distance metric dimension using the info function.
|
||||
|
||||
When an insert happens, the database an indexing takes place. While this is happening new vectors can not be queried. `pendingVectorCount` represents the number of vector that are currently being indexed.
|
||||
|
||||
```python
|
||||
info = store.info()
|
||||
print(info)
|
||||
|
||||
# Output:
|
||||
# {'vectorCount': 44, 'pendingVectorCount': 0, 'indexSize': 2642412, 'dimension': 1536, 'similarityFunction': 'COSINE'}
|
||||
```
|
||||
|
||||
# Upstash Redis
|
||||
|
||||
This page covers how to use [Upstash Redis](https://upstash.com/redis) with LangChain.
|
||||
|
||||
@@ -12,7 +172,6 @@ This page covers how to use [Upstash Redis](https://upstash.com/redis) with Lang
|
||||
## Integrations
|
||||
All of Upstash-LangChain integrations are based on `upstash-redis` Python SDK being utilized as wrappers for LangChain.
|
||||
This SDK utilizes Upstash Redis DB by giving UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN parameters from the console.
|
||||
One significant advantage of this is that, this SDK uses a REST API. This means, you can run this in serverless platforms, edge or any platform that does not support TCP connections.
|
||||
|
||||
|
||||
### Cache
|
||||
|
389
docs/docs/integrations/vectorstores/upstash.ipynb
Normal file
389
docs/docs/integrations/vectorstores/upstash.ipynb
Normal file
@@ -0,0 +1,389 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Upstash Vector\n",
|
||||
"\n",
|
||||
"> [Upstash Vector](https://upstash.com/docs/vector/overall/whatisvector) is a serverless vector database designed for working with vector embeddings.\n",
|
||||
">\n",
|
||||
"> The vector langchain integration is a wrapper around the [upstash-vector](https://github.com/upstash/vector-py) package.\n",
|
||||
">\n",
|
||||
"> The python package uses the [vector rest api](https://upstash.com/docs/vector/api/get-started) behind the scenes."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Installation\n",
|
||||
"\n",
|
||||
"Create a free vector database from [upstash console](https://console.upstash.com/vector) with the desired dimensions and distance metric.\n",
|
||||
"\n",
|
||||
"You can then create an `UpstashVectorStore` instance by:\n",
|
||||
"\n",
|
||||
"- Providing the environment variables `UPSTASH_VECTOR_URL` and `UPSTASH_VECTOR_TOKEN`\n",
|
||||
"\n",
|
||||
"- Giving them as parameters to the constructor\n",
|
||||
"\n",
|
||||
"- Passing an Upstash Vector `Index` instance to the constructor\n",
|
||||
"\n",
|
||||
"Also, an `Embeddings` instance is required to turn given texts into embeddings. Here we use `OpenAIEmbeddings` as an example"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%pip install langchain-openai langchain upstash-vector"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 37,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import os\n",
|
||||
"\n",
|
||||
"from langchain_community.vectorstores.upstash import UpstashVectorStore\n",
|
||||
"from langchain_openai import OpenAIEmbeddings\n",
|
||||
"\n",
|
||||
"os.environ[\"OPENAI_API_KEY\"] = \"<YOUR_OPENAI_KEY>\"\n",
|
||||
"os.environ[\"UPSTASH_VECTOR_REST_URL\"] = \"<YOUR_UPSTASH_VECTOR_URL>\"\n",
|
||||
"os.environ[\"UPSTASH_VECTOR_REST_TOKEN\"] = \"<YOUR_UPSTASH_VECTOR_TOKEN>\"\n",
|
||||
"\n",
|
||||
"# Create an embeddings instance\n",
|
||||
"embeddings = OpenAIEmbeddings()\n",
|
||||
"\n",
|
||||
"# Create a vector store instance\n",
|
||||
"store = UpstashVectorStore(embedding=embeddings)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"An alternative way of creating `UpstashVectorStore` is to [create an Upstash Vector index by selecting a model](https://upstash.com/docs/vector/features/embeddingmodels#using-a-model) and passing `embedding=True`. In this configuration, documents or queries will be sent to Upstash as text and embedded there.\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"store = UpstashVectorStore(embedding=True)\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"If you are interested in trying out this approach, you can update the initialization of `store` like above and run the rest of the tutorial."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Load documents\n",
|
||||
"\n",
|
||||
"Load an example text file and split it into chunks which can be turned into vector embeddings."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 17,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"[Document(page_content='Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans. \\n\\nLast year COVID-19 kept us apart. This year we are finally together again. \\n\\nTonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \\n\\nWith a duty to one another to the American people to the Constitution. \\n\\nAnd with an unwavering resolve that freedom will always triumph over tyranny. \\n\\nSix days ago, Russia’s Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \\n\\nHe thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \\n\\nHe met the Ukrainian people. \\n\\nFrom President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world.', metadata={'source': '../../modules/state_of_the_union.txt'}),\n",
|
||||
" Document(page_content='Groups of citizens blocking tanks with their bodies. Everyone from students to retirees teachers turned soldiers defending their homeland. \\n\\nIn this struggle as President Zelenskyy said in his speech to the European Parliament “Light will win over darkness.” The Ukrainian Ambassador to the United States is here tonight. \\n\\nLet each of us here tonight in this Chamber send an unmistakable signal to Ukraine and to the world. \\n\\nPlease rise if you are able and show that, Yes, we the United States of America stand with the Ukrainian people. \\n\\nThroughout our history we’ve learned this lesson when dictators do not pay a price for their aggression they cause more chaos. \\n\\nThey keep moving. \\n\\nAnd the costs and the threats to America and the world keep rising. \\n\\nThat’s why the NATO Alliance was created to secure peace and stability in Europe after World War 2. \\n\\nThe United States is a member along with 29 other nations. \\n\\nIt matters. American diplomacy matters. American resolve matters.', metadata={'source': '../../modules/state_of_the_union.txt'}),\n",
|
||||
" Document(page_content='Putin’s latest attack on Ukraine was premeditated and unprovoked. \\n\\nHe rejected repeated efforts at diplomacy. \\n\\nHe thought the West and NATO wouldn’t respond. And he thought he could divide us at home. Putin was wrong. We were ready. Here is what we did. \\n\\nWe prepared extensively and carefully. \\n\\nWe spent months building a coalition of other freedom-loving nations from Europe and the Americas to Asia and Africa to confront Putin. \\n\\nI spent countless hours unifying our European allies. We shared with the world in advance what we knew Putin was planning and precisely how he would try to falsely justify his aggression. \\n\\nWe countered Russia’s lies with truth. \\n\\nAnd now that he has acted the free world is holding him accountable. \\n\\nAlong with twenty-seven members of the European Union including France, Germany, Italy, as well as countries like the United Kingdom, Canada, Japan, Korea, Australia, New Zealand, and many others, even Switzerland.', metadata={'source': '../../modules/state_of_the_union.txt'})]"
|
||||
]
|
||||
},
|
||||
"execution_count": 17,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from langchain.text_splitter import CharacterTextSplitter\n",
|
||||
"from langchain_community.document_loaders import TextLoader\n",
|
||||
"\n",
|
||||
"loader = TextLoader(\"../../modules/state_of_the_union.txt\")\n",
|
||||
"documents = loader.load()\n",
|
||||
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
|
||||
"docs = text_splitter.split_documents(documents)\n",
|
||||
"\n",
|
||||
"docs[:3]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Inserting documents\n",
|
||||
"\n",
|
||||
"The vectorstore embeds text chunks using the embedding object and batch inserts them into the database. This returns an id array of the inserted vectors."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 35,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"['82b3781b-817c-4a4d-8f8b-cbd07c1d005a',\n",
|
||||
" 'a20e0a49-29d8-465e-8eae-0bc5ac3d24dc',\n",
|
||||
" 'c19f4108-b652-4890-873e-d4cad00f1b1a',\n",
|
||||
" '23d1fcf9-6ee1-4638-8c70-0f5030762301',\n",
|
||||
" '2d775784-825d-4627-97a3-fee4539d8f58']"
|
||||
]
|
||||
},
|
||||
"execution_count": 35,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"inserted_vectors = store.add_documents(docs)\n",
|
||||
"\n",
|
||||
"inserted_vectors[:5]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"store"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 27,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"['fe1f7a7b-42e2-4828-88b0-5b449c49fe86',\n",
|
||||
" '154a0021-a99c-427e-befb-f0b2b18ed83c',\n",
|
||||
" 'a8218226-18a9-4ab5-ade5-5a71b19a7831',\n",
|
||||
" '62b7ef97-83bf-4b6d-8c93-f471796244dc',\n",
|
||||
" 'ab43fd2e-13df-46d4-8cf7-e6e16506e4bb',\n",
|
||||
" '6841e7f9-adaa-41d9-af3d-0813ee52443f',\n",
|
||||
" '45dda5a1-f0c1-4ac7-9acb-50253e4ee493']"
|
||||
]
|
||||
},
|
||||
"execution_count": 27,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"store.add_texts(\n",
|
||||
" [\n",
|
||||
" \"A timeless tale set in the Jazz Age, this novel delves into the lives of affluent socialites, their pursuits of wealth, love, and the elusive American Dream. Amidst extravagant parties and glittering opulence, the story unravels the complexities of desire, ambition, and the consequences of obsession.\",\n",
|
||||
" \"Set in a small Southern town during the 1930s, this novel explores themes of racial injustice, moral growth, and empathy through the eyes of a young girl. It follows her father, a principled lawyer, as he defends a black man accused of assaulting a white woman, confronting deep-seated prejudices and challenging societal norms along the way.\",\n",
|
||||
" \"A chilling portrayal of a totalitarian regime, this dystopian novel offers a bleak vision of a future world dominated by surveillance, propaganda, and thought control. Through the eyes of a disillusioned protagonist, it explores the dangers of totalitarianism and the erosion of individual freedom in a society ruled by fear and oppression.\",\n",
|
||||
" \"Set in the English countryside during the early 19th century, this novel follows the lives of the Bennet sisters as they navigate the intricate social hierarchy of their time. Focusing on themes of marriage, class, and societal expectations, the story offers a witty and insightful commentary on the complexities of romantic relationships and the pursuit of happiness.\",\n",
|
||||
" \"Narrated by a disillusioned teenager, this novel follows his journey of self-discovery and rebellion against the phoniness of the adult world. Through a series of encounters and reflections, it explores themes of alienation, identity, and the search for authenticity in a society marked by conformity and hypocrisy.\",\n",
|
||||
" \"In a society where emotion is suppressed and individuality is forbidden, one man dares to defy the oppressive regime. Through acts of rebellion and forbidden love, he discovers the power of human connection and the importance of free will.\",\n",
|
||||
" \"Set in a future world devastated by environmental collapse, this novel follows a group of survivors as they struggle to survive in a harsh, unforgiving landscape. Amidst scarcity and desperation, they must confront moral dilemmas and question the nature of humanity itself.\",\n",
|
||||
" ],\n",
|
||||
" [\n",
|
||||
" {\"title\": \"The Great Gatsby\", \"author\": \"F. Scott Fitzgerald\", \"year\": 1925},\n",
|
||||
" {\"title\": \"To Kill a Mockingbird\", \"author\": \"Harper Lee\", \"year\": 1960},\n",
|
||||
" {\"title\": \"1984\", \"author\": \"George Orwell\", \"year\": 1949},\n",
|
||||
" {\"title\": \"Pride and Prejudice\", \"author\": \"Jane Austen\", \"year\": 1813},\n",
|
||||
" {\"title\": \"The Catcher in the Rye\", \"author\": \"J.D. Salinger\", \"year\": 1951},\n",
|
||||
" {\"title\": \"Brave New World\", \"author\": \"Aldous Huxley\", \"year\": 1932},\n",
|
||||
" {\"title\": \"The Road\", \"author\": \"Cormac McCarthy\", \"year\": 2006},\n",
|
||||
" ],\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Querying\n",
|
||||
"\n",
|
||||
"The database can be queried using a vector or a text prompt.\n",
|
||||
"If a text prompt is used, it's first converted into embedding and then queried.\n",
|
||||
"\n",
|
||||
"The `k` parameter specifies how many results to return from the query."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 29,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"[Document(page_content='And my report is this: the State of the Union is strong—because you, the American people, are strong. \\n\\nWe are stronger today than we were a year ago. \\n\\nAnd we will be stronger a year from now than we are today. \\n\\nNow is our moment to meet and overcome the challenges of our time. \\n\\nAnd we will, as one people. \\n\\nOne America. \\n\\nThe United States of America. \\n\\nMay God bless you all. May God protect our troops.', metadata={'source': '../../modules/state_of_the_union.txt'}),\n",
|
||||
" Document(page_content='And built the strongest, freest, and most prosperous nation the world has ever known. \\n\\nNow is the hour. \\n\\nOur moment of responsibility. \\n\\nOur test of resolve and conscience, of history itself. \\n\\nIt is in this moment that our character is formed. Our purpose is found. Our future is forged. \\n\\nWell I know this nation. \\n\\nWe will meet the test. \\n\\nTo protect freedom and liberty, to expand fairness and opportunity. \\n\\nWe will save democracy. \\n\\nAs hard as these times have been, I am more optimistic about America today than I have been my whole life. \\n\\nBecause I see the future that is within our grasp. \\n\\nBecause I know there is simply nothing beyond our capacity. \\n\\nWe are the only nation on Earth that has always turned every crisis we have faced into an opportunity. \\n\\nThe only nation that can be defined by a single word: possibilities. \\n\\nSo on this night, in our 245th year as a nation, I have come to report on the State of the Union.', metadata={'source': '../../modules/state_of_the_union.txt'}),\n",
|
||||
" Document(page_content='Groups of citizens blocking tanks with their bodies. Everyone from students to retirees teachers turned soldiers defending their homeland. \\n\\nIn this struggle as President Zelenskyy said in his speech to the European Parliament “Light will win over darkness.” The Ukrainian Ambassador to the United States is here tonight. \\n\\nLet each of us here tonight in this Chamber send an unmistakable signal to Ukraine and to the world. \\n\\nPlease rise if you are able and show that, Yes, we the United States of America stand with the Ukrainian people. \\n\\nThroughout our history we’ve learned this lesson when dictators do not pay a price for their aggression they cause more chaos. \\n\\nThey keep moving. \\n\\nAnd the costs and the threats to America and the world keep rising. \\n\\nThat’s why the NATO Alliance was created to secure peace and stability in Europe after World War 2. \\n\\nThe United States is a member along with 29 other nations. \\n\\nIt matters. American diplomacy matters. American resolve matters.', metadata={'source': '../../modules/state_of_the_union.txt'}),\n",
|
||||
" Document(page_content='When we use taxpayer dollars to rebuild America – we are going to Buy American: buy American products to support American jobs. \\n\\nThe federal government spends about $600 Billion a year to keep the country safe and secure. \\n\\nThere’s been a law on the books for almost a century \\nto make sure taxpayers’ dollars support American jobs and businesses. \\n\\nEvery Administration says they’ll do it, but we are actually doing it. \\n\\nWe will buy American to make sure everything from the deck of an aircraft carrier to the steel on highway guardrails are made in America. \\n\\nBut to compete for the best jobs of the future, we also need to level the playing field with China and other competitors. \\n\\nThat’s why it is so important to pass the Bipartisan Innovation Act sitting in Congress that will make record investments in emerging technologies and American manufacturing. \\n\\nLet me give you one example of why it’s so important to pass it.', metadata={'source': '../../modules/state_of_the_union.txt'}),\n",
|
||||
" Document(page_content='Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans. \\n\\nLast year COVID-19 kept us apart. This year we are finally together again. \\n\\nTonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \\n\\nWith a duty to one another to the American people to the Constitution. \\n\\nAnd with an unwavering resolve that freedom will always triumph over tyranny. \\n\\nSix days ago, Russia’s Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \\n\\nHe thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \\n\\nHe met the Ukrainian people. \\n\\nFrom President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world.', metadata={'source': '../../modules/state_of_the_union.txt'})]"
|
||||
]
|
||||
},
|
||||
"execution_count": 29,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"result = store.similarity_search(\"The United States of America\", k=5)\n",
|
||||
"result"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 30,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"[Document(page_content='A chilling portrayal of a totalitarian regime, this dystopian novel offers a bleak vision of a future world dominated by surveillance, propaganda, and thought control. Through the eyes of a disillusioned protagonist, it explores the dangers of totalitarianism and the erosion of individual freedom in a society ruled by fear and oppression.', metadata={'title': '1984', 'author': 'George Orwell', 'year': 1949}),\n",
|
||||
" Document(page_content='Narrated by a disillusioned teenager, this novel follows his journey of self-discovery and rebellion against the phoniness of the adult world. Through a series of encounters and reflections, it explores themes of alienation, identity, and the search for authenticity in a society marked by conformity and hypocrisy.', metadata={'title': 'The Catcher in the Rye', 'author': 'J.D. Salinger', 'year': 1951}),\n",
|
||||
" Document(page_content='Set in the English countryside during the early 19th century, this novel follows the lives of the Bennet sisters as they navigate the intricate social hierarchy of their time. Focusing on themes of marriage, class, and societal expectations, the story offers a witty and insightful commentary on the complexities of romantic relationships and the pursuit of happiness.', metadata={'title': 'Pride and Prejudice', 'author': 'Jane Austen', 'year': 1813})]"
|
||||
]
|
||||
},
|
||||
"execution_count": 30,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"result = store.similarity_search(\"dystopia\", k=3, filter=\"year < 2000\")\n",
|
||||
"result"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Querying with score\n",
|
||||
"\n",
|
||||
"The score of the query can be included for every result. \n",
|
||||
"\n",
|
||||
"> The score returned in the query requests is a normalized value between 0 and 1, where 1 indicates the highest similarity and 0 the lowest regardless of the similarity function used. For more information look at the [docs](https://upstash.com/docs/vector/overall/features#vector-similarity-functions)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 31,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"{'source': '../../modules/state_of_the_union.txt'} - 0.87391514\n",
|
||||
"{'source': '../../modules/state_of_the_union.txt'} - 0.8549463\n",
|
||||
"{'source': '../../modules/state_of_the_union.txt'} - 0.847913\n",
|
||||
"{'source': '../../modules/state_of_the_union.txt'} - 0.84328896\n",
|
||||
"{'source': '../../modules/state_of_the_union.txt'} - 0.832347\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"result = store.similarity_search_with_score(\"The United States of America\", k=5)\n",
|
||||
"\n",
|
||||
"for doc, score in result:\n",
|
||||
" print(f\"{doc.metadata} - {score}\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Deleting vectors\n",
|
||||
"\n",
|
||||
"Vectors can be deleted by their ids"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 32,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"store.delete(inserted_vectors)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Clearing the vector database\n",
|
||||
"\n",
|
||||
"This will clear the vector database"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 33,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"store.delete(delete_all=True)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Getting info about vector database\n",
|
||||
"\n",
|
||||
"You can get information about your database like the distance metric dimension using the info function.\n",
|
||||
"\n",
|
||||
"> When an insert happens, the database an indexing takes place. While this is happening new vectors can not be queried. `pendingVectorCount` represents the number of vector that are currently being indexed. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 36,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"InfoResult(vector_count=42, pending_vector_count=0, index_size=6470, dimension=384, similarity_function='COSINE')"
|
||||
]
|
||||
},
|
||||
"execution_count": 36,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"store.info()"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "ai",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.6"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
@@ -60,7 +60,7 @@
|
||||
" * document addition by id (`add_documents` method with `ids` argument)\n",
|
||||
" * delete by id (`delete` method with `ids` argument)\n",
|
||||
"\n",
|
||||
"Compatible Vectorstores: `AnalyticDB`, `AstraDB`, `AwaDB`, `Bagel`, `Cassandra`, `Chroma`, `CouchbaseVectorStore`, `DashVector`, `DatabricksVectorSearch`, `DeepLake`, `Dingo`, `ElasticVectorSearch`, `ElasticsearchStore`, `FAISS`, `HanaDB`, `LanceDB`, `Milvus`, `MyScale`, `OpenSearchVectorSearch`, `PGVector`, `Pinecone`, `Qdrant`, `Redis`, `Rockset`, `ScaNN`, `SupabaseVectorStore`, `SurrealDBStore`, `TimescaleVector`, `Vald`, `VDMS`, `Vearch`, `VespaStore`, `Weaviate`, `ZepVectorStore`, `TencentVectorDB`, `OpenSearchVectorSearch`.\n",
|
||||
"Compatible Vectorstores: `AnalyticDB`, `AstraDB`, `AwaDB`, `Bagel`, `Cassandra`, `Chroma`, `CouchbaseVectorStore`, `DashVector`, `DatabricksVectorSearch`, `DeepLake`, `Dingo`, `ElasticVectorSearch`, `ElasticsearchStore`, `FAISS`, `HanaDB`, `LanceDB`, `Milvus`, `MyScale`, `OpenSearchVectorSearch`, `PGVector`, `Pinecone`, `Qdrant`, `Redis`, `Rockset`, `ScaNN`, `SupabaseVectorStore`, `SurrealDBStore`, `TimescaleVector`, `UpstashVectorStore`, `Vald`, `VDMS`, `Vearch`, `VespaStore`, `Weaviate`, `ZepVectorStore`, `TencentVectorDB`, `OpenSearchVectorSearch`.\n",
|
||||
" \n",
|
||||
"## Caution\n",
|
||||
"\n",
|
||||
|
Reference in New Issue
Block a user