mirror of https://github.com/hwchase17/langchain.git synced 2025-09-01 19:12:42 +00:00


berkedilekoglu 73b9ca54cb Using batches for update document with a new function in ChromaDB (#6561 )

2a4b32dee2/langchain/vectorstores/chroma.py (L355-L375)

Currently, the `update_document` function takes only a single document
and its ID. However, Chroma can update multiple documents at once, given
a list of IDs and a list of documents. If we changed `update_document`
so that both `document_id` and `document` could be
`Union[str, List[str]]`, we would need a type check, because the
`embed_documents` and `update` functions take lists for their text and
document ID arguments. I believe that writing a new function is the
best option.

I update the Chroma vectorstore with refreshed information from my
website every 20 minutes. Updating the update_document function to
perform simultaneous updates for each changed piece of information would
significantly reduce the update time in such use cases.

In my case I update a total of 8810 chunks. Updating these chunks one at
a time with the current function takes 8.5 minutes in total. However, if
we process the inputs in batches and update them collectively, all 8810
chunks can be updated in just 1 minute. This significantly reduces the
time it takes for users of actively used chatbots to access up-to-date
information.
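
The batching idea described above can be sketched as follows. This is a minimal illustration, not the code from the pull request: `FakeCollection`, `fake_embed`, `iter_batches`, `update_in_batches`, and the batch size of 1000 are all hypothetical stand-ins, chosen so the example runs without a vector store installed.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple


def iter_batches(
    ids: Sequence[str], texts: Sequence[str], batch_size: int
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield aligned (ids, texts) slices of at most batch_size items."""
    if len(ids) != len(texts):
        raise ValueError("ids and texts must have the same length")
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(ids), batch_size):
        stop = start + batch_size
        yield list(ids[start:stop]), list(texts[start:stop])


class FakeCollection:
    """Hypothetical in-memory stand-in for a vector store collection."""

    def __init__(self) -> None:
        self.rows: Dict[str, Tuple[List[float], str]] = {}

    def update(
        self, ids: List[str], embeddings: List[List[float]], documents: List[str]
    ) -> None:
        # One call updates every row in the batch.
        for doc_id, embedding, document in zip(ids, embeddings, documents):
            self.rows[doc_id] = (embedding, document)


def fake_embed(texts: List[str]) -> List[List[float]]:
    """Hypothetical embedding function: one toy vector per text."""
    return [[float(len(text))] for text in texts]


def update_in_batches(
    collection: FakeCollection,
    embed: Callable[[List[str]], List[List[float]]],
    ids: Sequence[str],
    texts: Sequence[str],
    batch_size: int = 1000,
) -> int:
    """Embed and update documents batch by batch; return the number of update calls."""
    calls = 0
    for batch_ids, batch_texts in iter_batches(ids, texts, batch_size):
        collection.update(
            ids=batch_ids, embeddings=embed(batch_texts), documents=batch_texts
        )
        calls += 1
    return calls


collection = FakeCollection()
ids = [f"doc-{i}" for i in range(8810)]
texts = [f"chunk {i}" for i in range(8810)]
n_calls = update_in_batches(collection, fake_embed, ids, texts, batch_size=1000)
print(n_calls, len(collection.rows))
```

With these assumed numbers, the 8810 chunks are written in 9 update calls instead of 8810, which is the kind of saving the timings above point to.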

I can add an integration test and an example for the documentation for
the new update_document_batch function.

@hwchase17 

[berkedilekoglu](https://twitter.com/berkedilekoglu)

2023-09-13 11:39:56 -07:00

.devcontainer

Devcontainer README -> Clarification. (#8414 )

2023-07-28 15:09:42 -07:00

.github

fixed PR template (#10515 )

2023-09-13 09:35:48 -07:00

docs

Integration with ElevenLabs text to speech (#10181 )

2023-09-12 22:56:53 -07:00

libs

Using batches for update document with a new function in ChromaDB (#6561 )

2023-09-13 11:39:56 -07:00

tests/integration_tests/vectorstores

Add Vearch vectorstore (#9846 )

2023-09-08 16:51:14 -07:00

.gitattributes

Update dev container (#6189 )

2023-06-16 15:42:14 -07:00

.gitignore

add experimental ref (#8435 )

2023-07-28 14:26:47 -07:00

.gitmodules

Doc refactor (#6300 )

2023-06-16 11:52:56 -07:00

.readthedocs.yaml

use top nav docs (#8090 )

2023-07-21 13:52:03 -07:00

CITATION.cff

…

LICENSE

…

Makefile

fix makefile help (#8723 )

2023-08-04 15:37:00 -04:00

MIGRATE.md

2023-07-28 17:47:00 -07:00

poetry.lock

poetry lock the top-level environment. (#9477 )

2023-08-22 14:09:11 -04:00

poetry.toml

Unbreak devcontainer (#8154 )

2023-07-23 19:33:47 -07:00

pyproject.toml

Pinecone upsert parallelization (#9859 )

2023-09-03 15:37:41 -07:00

README.md

docs(readme): fixed badges with new github url (#9493 )

2023-08-19 14:51:38 -07:00

SECURITY.md

Update SECURITY.md email address. (#9558 )

2023-08-21 14:52:21 -04:00

README.md

🦜️🔗 LangChain

⚡ Building applications with LLMs through composability ⚡

Looking for the JS/TS version? Check out LangChain.js.

Production Support: As you move your LangChains into production, we'd love to offer more hands-on support. Fill out this form to share more about what you're building, and our team will get in touch.

🚨Breaking Changes for select chains (SQLDatabase) on 7/28/23

In an effort to make langchain leaner and safer, we are moving select chains to langchain_experimental. This migration has already started, but we are remaining backwards compatible until 7/28. On that date, we will remove functionality from langchain. Read more about the motivation and the progress here. Read how to migrate your code here.

Quick Install

`pip install langchain` or `pip install langsmith && conda install langchain -c conda-forge`

🤔 What is this?

Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. However, using these LLMs in isolation is often insufficient for creating a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge.

This library aims to assist in the development of those types of applications. Common examples of these applications include:

❓ Question Answering over specific documents

Documentation
End-to-end Example: Question Answering over Notion Database

💬 Chatbots

Documentation
End-to-end Example: Chat-LangChain

🤖 Agents

Documentation
End-to-end Example: GPT+WolframAlpha

📖 Documentation

Please see here for full documentation on:

Getting started (installation, setting up the environment, simple examples)
How-To examples (demos, integrations, helper functions)
Reference (full API docs)
Resources (high-level explanation of core concepts)

🚀 What can this help with?

There are six main areas that LangChain is designed to help with. These are, in increasing order of complexity:

📃 LLMs and Prompts:

This includes prompt management, prompt optimization, a generic interface for all LLMs, and common utilities for working with LLMs.

🔗 Chains:

Chains go beyond a single LLM call and involve sequences of calls (whether to an LLM or a different utility). LangChain provides a standard interface for chains, lots of integrations with other tools, and end-to-end chains for common applications.
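
As a rough illustration of the idea of a chain as a sequence of calls, here is a plain-Python sketch. It does not use the LangChain API: `make_chain`, `build_prompt`, `fake_llm`, and `strip_brackets` are hypothetical names, and `fake_llm` only stands in for a real model call.

```python
from typing import Callable, List

Step = Callable[[str], str]


def make_chain(steps: List[Step]) -> Step:
    """Compose steps so that each one receives the previous step's output."""

    def run(text: str) -> str:
        for step in steps:
            text = step(text)
        return text

    return run


def build_prompt(question: str) -> str:
    """Wrap the user's question in a fixed instruction."""
    return f"Answer briefly: {question}"


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call."""
    return f"<answer to: {prompt}>"


def strip_brackets(text: str) -> str:
    """Post-process the model output with a non-LLM utility."""
    return text.strip("<>")


chain = make_chain([build_prompt, fake_llm, strip_brackets])
result = chain("What is a chain?")
print(result)
```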

📚 Data Augmented Generation:

Data Augmented Generation involves specific types of chains that first interact with an external data source to fetch data for use in the generation step. Examples include summarization of long pieces of text and question/answering over specific data sources.
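
The fetch-then-generate pattern described here can be sketched in a few lines. This is a toy under stated assumptions: `DOCS`, `retrieve`, `answer`, and `fake_llm` are hypothetical, and the word-overlap ranking stands in for whatever retrieval method a real application would use.

```python
from typing import List

# Hypothetical external data source.
DOCS = [
    "LangChain provides a standard interface for chains.",
    "Agents decide which actions to take.",
    "Memory persists state between calls.",
]


def retrieve(query: str, docs: List[str], k: int = 1) -> List[str]:
    """Return the k documents sharing the most words with the query (toy ranking)."""
    query_words = set(query.lower().split())

    def overlap(doc: str) -> int:
        return len(query_words & set(doc.lower().rstrip(".").split()))

    return sorted(docs, key=overlap, reverse=True)[:k]


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call."""
    return f"[generated from {len(prompt)} chars of prompt]"


def answer(query: str) -> str:
    """Fetch context first, then pass it to the generation step."""
    context = "\n".join(retrieve(query, DOCS))
    prompt = f"Context:\n{context}\n\nQuestion: {query}"
    return fake_llm(prompt)


print(retrieve("what does memory persist", DOCS))
print(answer("what does memory persist"))
```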

🤖 Agents:

Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end-to-end agents.
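
The Action and Observation loop described above might look like the following sketch. It is not LangChain's agent interface: `run_agent`, `fake_policy`, `calculator`, and `TOOLS` are hypothetical, and `fake_policy` replaces the LLM's decision with a fixed rule so the example is deterministic.

```python
from typing import Callable, Dict, List, Tuple

# One history entry: (action, action_input, observation).
HistoryEntry = Tuple[str, str, str]


def calculator(expression: str) -> str:
    """Toy tool: add integers written like '2+3'."""
    return str(sum(int(part) for part in expression.split("+")))


TOOLS: Dict[str, Callable[[str], str]] = {"calculator": calculator}


def fake_policy(question: str, history: List[HistoryEntry]) -> Tuple[str, str]:
    """Hypothetical stand-in for the LLM's decision; returns (action, action_input)."""
    if not history:
        return "calculator", question
    # After one observation, finish with it as the final answer.
    return "finish", history[-1][2]


def run_agent(
    question: str,
    policy: Callable[[str, List[HistoryEntry]], Tuple[str, str]],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Repeat decide -> act -> observe until the policy chooses 'finish'."""
    history: List[HistoryEntry] = []
    for _ in range(max_steps):
        action, action_input = policy(question, history)
        if action == "finish":
            return action_input
        observation = tools[action](action_input)
        history.append((action, action_input, observation))
    raise RuntimeError("agent did not finish within max_steps")


final = run_agent("2+3+4", fake_policy, TOOLS)
print(final)
```

The `max_steps` cap is a design choice of this sketch: it keeps a policy that never chooses to finish from looping forever.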

🧠 Memory:

Memory refers to persisting state between calls of a chain/agent. LangChain provides a standard interface for memory, a collection of memory implementations, and examples of chains/agents that use memory.
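
A minimal sketch of persisting state between calls, assuming a simple conversation buffer. `ToyMemory`, `chat`, and `fake_llm` are hypothetical names for illustration and are not LangChain's memory classes.

```python
from typing import List, Tuple


class ToyMemory:
    """Hypothetical buffer that keeps every (user, ai) turn."""

    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []

    def save(self, user: str, ai: str) -> None:
        self.turns.append((user, ai))

    def as_context(self) -> str:
        return "\n".join(f"User: {user}\nAI: {ai}" for user, ai in self.turns)


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; replies with the number of user turns seen."""
    return f"reply #{prompt.count('User:')}"


def chat(memory: ToyMemory, user_input: str) -> str:
    """Prepend stored turns to the prompt, call the model, then store the new turn."""
    prompt = f"{memory.as_context()}\nUser: {user_input}".lstrip()
    reply = fake_llm(prompt)
    memory.save(user_input, reply)
    return reply


memory = ToyMemory()
first = chat(memory, "hi")
second = chat(memory, "again")
print(first, second)
```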

🧐 Evaluation:

[BETA] Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.
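
Using a language model as the grader can be sketched as below. This is only an illustration: `GRADE_PROMPT`, `grade`, and `fake_judge_llm` are hypothetical, and the fake judge applies a substring rule where a real setup would send the prompt to a model.

```python
from typing import Callable

GRADE_PROMPT = (
    "Question: {question}\n"
    "Reference answer: {reference}\n"
    "Student answer: {prediction}\n"
    "Reply with CORRECT or INCORRECT."
)


def fake_judge_llm(prompt: str) -> str:
    """Hypothetical grader: CORRECT if the reference appears in the student answer."""
    fields = dict(
        line.split(": ", 1) for line in prompt.splitlines() if ": " in line
    )
    matches = fields["Reference answer"].lower() in fields["Student answer"].lower()
    return "CORRECT" if matches else "INCORRECT"


def grade(
    question: str,
    reference: str,
    prediction: str,
    judge: Callable[[str], str] = fake_judge_llm,
) -> bool:
    """Format the grading prompt, ask the judge, and parse its verdict."""
    verdict = judge(
        GRADE_PROMPT.format(
            question=question, reference=reference, prediction=prediction
        )
    )
    return verdict.strip().upper() == "CORRECT"


print(grade("Capital of France?", "Paris", "It is Paris."))
print(grade("Capital of France?", "Paris", "Lyon"))
```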

For more information on these concepts, please see our full documentation.

💁 Contributing

As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.

For detailed information on how to contribute, see here.


Readme MIT Cite this repository 4.8 GiB

Languages

Jupyter Notebook 74.2%

Python 20.7%

omnetpp-msg 4.8%

Makefile 0.1%

MDX 0.1%
