langchain/docs
Jan Chorowski b8b42ccbc5
community[minor]: Pathway vectorstore(#14859)
- **Description:** Integration with pathway.com data processing pipeline
  acting as an always updated vectorstore
- **Issue:** not applicable
- **Dependencies:** optional dependency on
  [`pathway`](https://pypi.org/project/pathway/)
- **Twitter handle:** pathway_com

The PR provides an integration with `pathway` that offers an easy-to-use,
always-updated vector store:

```python
import os

import pathway as pw
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import PathwayVectorClient, PathwayVectorServer

# Source connectors to index; here, a Google Drive object read with its metadata.
data_sources = []
data_sources.append(
    pw.io.gdrive.read(
        object_id="17H4YpBOAKQzEJ93xmC2z170l0bP2npMy",
        service_user_credentials_file="credentials.json",
        with_metadata=True,
    )
)

# Split documents into chunks and embed them with OpenAI embeddings.
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
embeddings_model = OpenAIEmbeddings(openai_api_key=os.environ["OPENAI_API_KEY"])

# Start the server that builds the vector store from the sources.
vector_server = PathwayVectorServer(
    *data_sources,
    embedder=embeddings_model,
    splitter=text_splitter,
)
vector_server.run_server(host="127.0.0.1", port="8765", threaded=True, with_cache=False)

# Query the running server through LangChain's VectorStore interface.
client = PathwayVectorClient(
    host="127.0.0.1",
    port="8765",
)
query = "What is Pathway?"
docs = client.similarity_search(query)
```

The `PathwayVectorServer` builds a data processing pipeline which
continuously scans documents in a given source connector (Google Drive,
S3, ...) and builds a vector store. The `PathwayVectorClient` implements
LangChain's `VectorStore` interface and connects to the server to
retrieve documents.
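
To make the server/client split concrete, here is a toy, in-memory sketch of the same idea. It does not use `pathway` or LangChain: `ToyVectorServer`, `ToyVectorClient`, `embed`, and `refresh` are hypothetical names invented for illustration, the bag-of-words "embedding" stands in for a real embedder, and the explicit `refresh()` call stands in for the continuous scanning that the real pipeline is described as doing.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real embedder)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorServer:
    """Hypothetical server side: indexes a source that may change over time."""

    def __init__(self, source):
        self.source = source  # dict of doc_id -> text, mutated by the caller
        self.index = []       # list of (text, vector) pairs

    def refresh(self):
        # Assumption: a full rebuild is good enough for a sketch.
        self.index = [(text, embed(text)) for text in self.source.values()]

    def query(self, text, k=1):
        q = embed(text)
        ranked = sorted(self.index, key=lambda item: cosine(q, item[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


class ToyVectorClient:
    """Hypothetical client side: exposes a similarity_search-style method."""

    def __init__(self, server):
        self.server = server

    def similarity_search(self, query, k=1):
        return self.server.query(query, k)


source = {"a.txt": "Pathway is a data processing framework"}
server = ToyVectorServer(source)
server.refresh()
client = ToyVectorClient(server)
first = client.similarity_search("What is Pathway?")

# A new document appears in the source; after a refresh it becomes searchable.
source["b.txt"] = "LangChain defines a VectorStore interface"
server.refresh()
second = client.similarity_search("VectorStore interface")
```

The point of the sketch is only the division of labor: the server owns the index and keeps it in step with the source, while the client stays a thin query interface that never needs to be rebuilt when documents change.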

---------

Co-authored-by: Mateusz Lewandowski <lewymati@users.noreply.github.com>
Co-authored-by: mlewandowski <mlewandowski@MacBook-Pro-mlewandowski.local>
Co-authored-by: Berke <berkecanrizai1@gmail.com>
Co-authored-by: Adrian Kosowski <adrian@pathway.com>
Co-authored-by: mlewandowski <mlewandowski@macbook-pro-mlewandowski.home>
Co-authored-by: berkecanrizai <63911408+berkecanrizai@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: mlewandowski <mlewandowski@MBPmlewandowski.ht.home>
Co-authored-by: Szymon Dudycz <szymond@pathway.com>
Co-authored-by: Szymon Dudycz <szymon.dudycz@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-29 10:50:39 -07:00
| Name | Last commit | Date |
| --- | --- | --- |
| api_reference | community[patch], langchain[minor]: Add retriever self_query and score_threshold in DingoDB (#18106) | 2024-03-05 15:47:29 -08:00 |
| data | 👥 Update LangChain people data (#18473) | 2024-03-03 19:58:58 -08:00 |
| docs | community[minor]: Pathway vectorstore(#14859) | 2024-03-29 10:50:39 -07:00 |
| scripts | add script to check imports (#19611) | 2024-03-29 13:30:20 -04:00 |
| src | docs[minor]: Add chat model selection tabs component (#19296) | 2024-03-19 18:12:46 -07:00 |
| static | docs: update use_cases/question_answering/chat_history (#19349) | 2024-03-28 12:51:01 -04:00 |
| .gitignore | docs[minor]: Swap gtag for supabase (#18937) | 2024-03-11 14:23:12 -07:00 |
| .local_build.sh | docs: partner packages (#16960) | 2024-02-02 15:12:21 -08:00 |
| .yarnrc.yml | docs[minor]: Add thumbs up/down to all docs pages (#18526) | 2024-03-04 15:14:28 -08:00 |
| babel.config.js | Restructure docs (#11620) | 2023-10-10 12:55:19 -07:00 |
| code-block-loader.js | Restructure docs (#11620) | 2023-10-10 12:55:19 -07:00 |
| docusaurus.config.js | docs[patch]: properly load/use env vars (#18942) | 2024-03-11 15:38:05 -07:00 |
| package.json | ci[minor]: Bump LC scripts package, add retry option (#19285) | 2024-03-19 10:42:59 -07:00 |
| README.md | docs: developer docs (#14776) | 2023-12-17 12:55:49 -08:00 |
| settings.ini | Restructure docs (#11620) | 2023-10-10 12:55:19 -07:00 |
| sidebars.js | docs: Toolkits menu (#16217) | 2024-02-08 14:52:26 -08:00 |
| vercel_build.sh | docs: fix vercel build script (#19090) | 2024-03-14 20:53:43 +00:00 |
| vercel_requirements.txt | infra: docs build install community editable (#14739) | 2023-12-14 16:13:09 -08:00 |
| vercel.json | community[minor]: migrate bigdl-llm to ipex-llm (#19518) | 2024-03-27 20:12:59 -07:00 |
| yarn.lock | ci[minor]: Bump LC scripts package, add retry option (#19285) | 2024-03-19 10:42:59 -07:00 |

LangChain Documentation

For more information on contributing to our documentation, see the Documentation Contributing Guide.