docs: add Pinecone tab to vector stores page (#21969)

Thank you for contributing to LangChain!

- [x] **PR title**: docs: add Pinecone tab to [vector stores
page](https://python.langchain.com/v0.1/docs/modules/data_connection/vectorstores/).


- [x] **PR message**: Recreation of
https://github.com/langchain-ai/langchain/pull/21721.
Adds information about PineconeVectorStore to the LangChain vector
stores page. Although this page is deprecated, it still shows up
prominently in Google search results, so it will still be very helpful
to users to have correct information.
![search
results](https://github.com/langchain-ai/langchain/assets/19216250/e05d8d74-03da-44a1-b87f-0f8087d3c13a)


- [x] **Add tests and docs**: N/A


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
junefish 2024-05-21 12:35:20 -04:00 committed by GitHub
parent cb45caa02e
commit 5a40413bfd

@@ -56,6 +56,50 @@ documents = text_splitter.split_documents(raw_documents)
db = Chroma.from_documents(documents, OpenAIEmbeddings())
```
</TabItem>
<TabItem value="pinecone" label="Pinecone">
This walkthrough uses the `Pinecone` vector database, which provides broad functionality to store and search over vectors.
```bash
pip install langchain-pinecone
```
This example uses `OpenAIEmbeddings`, so you need to provide an OpenAI API key.
```python
import os
import getpass
os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')
```
```python
from langchain_community.document_loaders import TextLoader
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter
# Load the document and split it into chunks; the chunks are embedded when they are inserted into the index.
loader = TextLoader("../../modules/state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
embeddings = OpenAIEmbeddings()
```
Next, go to the [Pinecone console](https://app.pinecone.io) and create a new index with `dimension=1536` called "langchain-test-index". Then, copy the API key and index name.
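The index dimension has to match the length of the vectors the embedding model returns; 1536 is the output size of OpenAI's `text-embedding-ada-002` model. As a rough illustration, the sketch below checks vector lengths before anything is sent to the index. The helper `check_dimension` and the stand-in vectors are hypothetical and are not part of LangChain or the Pinecone client.

```python
from typing import Sequence

# Dimension chosen when the index was created in the Pinecone console.
INDEX_DIMENSION = 1536


def check_dimension(
    vectors: Sequence[Sequence[float]], expected: int = INDEX_DIMENSION
) -> None:
    """Raise ValueError if any vector's length differs from the index dimension."""
    for i, vec in enumerate(vectors):
        if len(vec) != expected:
            raise ValueError(
                f"vector {i} has dimension {len(vec)}, but the index expects {expected}"
            )


# Stand-in vectors (assumption); in practice they could come from
# embeddings.embed_documents([...]).
fake_vectors = [[0.0] * 1536, [0.1] * 1536]
check_dimension(fake_vectors)  # returns None when every vector matches
```

A mismatch caught locally like this is usually easier to diagnose than an error returned by the remote index during an upsert.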
```python
from langchain_pinecone import PineconeVectorStore
os.environ['PINECONE_API_KEY'] = '<YOUR_PINECONE_API_KEY>'
index_name = "langchain-test-index"
# Connect to Pinecone index and insert the chunked docs as contents
docsearch = PineconeVectorStore.from_documents(docs, embeddings, index_name=index_name)
```
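Once the chunks are in the index, a vector store answers a query by returning the stored vectors closest to the query's embedding. The sketch below illustrates that idea with a tiny in-memory stand-in written in plain Python. The `TinyVectorStore` class and the hand-written 3-dimensional vectors are hypothetical, chosen only for illustration; this is not how Pinecone is implemented.

```python
import math
from typing import List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical in-memory store: keeps (text, vector) pairs and scans them all."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, List[float]]] = []

    def add(self, text: str, vector: List[float]) -> None:
        self._items.append((text, vector))

    def search(self, query_vector: List[float], k: int = 2) -> List[Tuple[str, float]]:
        # Score every stored vector against the query, highest similarity first.
        scored = [
            (text, cosine_similarity(query_vector, vec)) for text, vec in self._items
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


store = TinyVectorStore()
store.add("chunk about the economy", [0.9, 0.1, 0.0])
store.add("chunk about healthcare", [0.1, 0.9, 0.0])
store.add("chunk about foreign policy", [0.0, 0.1, 0.9])

# The query vector points mostly along the first axis, so the economy chunk ranks first.
top = store.search([0.8, 0.2, 0.0], k=2)
```

With the real store created above, the analogous query would go through `docsearch.similarity_search(query)`, which embeds the query text and returns the closest documents.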
</TabItem>
<TabItem value="faiss" label="FAISS">
@@ -280,4 +324,4 @@ I've worked on these issues a long time.
I know what works: Investing in crime prevention and community police officers who'll walk the beat, who'll know the neighborhood, and who can restore trust and safety.
```
</CodeOutputBlock>
</CodeOutputBlock>