## Description
This PR adds back snippets demonstrating sparse and hybrid retrieval in
the Qdrant notebook.
Without the snippets, it's hard to grok the usage.
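For reference, a minimal hybrid-retrieval sketch along the lines of the restored snippets (assuming the `langchain_qdrant` package; the dense model choice is illustrative):

```python
from langchain_community.embeddings import FastEmbedEmbeddings
from langchain_qdrant import FastEmbedSparse, QdrantVectorStore, RetrievalMode

dense = FastEmbedEmbeddings(model_name="BAAI/bge-small-en-v1.5")
sparse = FastEmbedSparse(model_name="Qdrant/bm25")  # sparse vectors via BM25

qdrant = QdrantVectorStore.from_texts(
    ["Qdrant supports dense, sparse and hybrid retrieval."],
    embedding=dense,
    sparse_embedding=sparse,
    retrieval_mode=RetrievalMode.HYBRID,  # fuse dense and sparse scores
    collection_name="hybrid-demo",
    location=":memory:",
)
qdrant.similarity_search("hybrid retrieval", k=1)
```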
Thank you for contributing to LangChain!
- [ ] **PR title**: "Documentation Update : Semantic Caching Update for
Upstash"
- Docs, llm caching integrations update
- **Description:** Upstash now supports semantic caching; this PR updates
the LLM caching integrations docs to cover it
- **Twitter handle:** You can mention eray_eroglu_ if you want to post a
tweet about the PR
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Updated to use langchain_google_community instead, as in the latest revision
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
**Description:**
In this PR, I am adding three stock market tools from
financialdatasets.ai (my API!):
- get balance sheets
- get cash flow statements
- get income statements
Twitter handle: [@virattt](https://twitter.com/virattt)
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- [x] **PR title**: "docs: changed example for Exa search retriever
usage"
- [x] **PR message**:
- **Description:** Changed Exa integration doc at
`docs/docs/integrations/tools/exa_search.ipynb` to better reflect simple
Exa use case
- **Issue:** move toward more canonical use of Exa method
(`search_and_contents` rather than just `search`)
- **Dependencies:** no dependencies; docs only change
- **Twitter handle:** n/a - small change
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. - will do
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** The UnstructuredClient will have a breaking change in
the near future. Add a note in the docs that the examples here may not
use the latest version and users should refer to the SDK docs for the
latest info.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- **Description:**
Support ChatMlflow.bind_tools method
Tested in Databricks:
<img width="836" alt="image"
src="https://github.com/user-attachments/assets/fa28ef50-0110-4698-8eda-4faf6f0b9ef8">
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
---------
Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
- **Description:** When adding docs for constructing ChatHuggingFace
using a HuggingFacePipeline, I forgot to add `return_full_text=False` as
an argument. In this setup, the chat response would incorrectly contain
all the input text. I am fixing that here by adding that line to the
offending notebook.
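For context, a sketch of the corrected setup (assuming the `langchain_huggingface` package; the model id is illustrative):

```python
from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline

llm = HuggingFacePipeline.from_model_id(
    model_id="HuggingFaceH4/zephyr-7b-beta",  # illustrative model id
    task="text-generation",
    pipeline_kwargs={
        "max_new_tokens": 512,
        "return_full_text": False,  # without this, the reply echoes the whole prompt
    },
)
chat = ChatHuggingFace(llm=llm)
```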
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
## Description
This PR:
- Fixes the validation error in `FastEmbedEmbeddings`.
- Adds support for `batch_size`, `parallel` params.
- Removes support for very old FastEmbed versions.
- Updates the FastEmbed doc with the new params.
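As a quick illustration of the new params (a sketch; values are illustrative):

```python
from langchain_community.embeddings import FastEmbedEmbeddings

embeddings = FastEmbedEmbeddings(
    model_name="BAAI/bge-small-en-v1.5",
    batch_size=256,  # texts encoded per ONNX batch (illustrative value)
    parallel=0,      # 0 = use all cores for large datasets; None = no data-parallelism
)
vectors = embeddings.embed_documents(["hello", "world"])
```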
Associated Issues:
- Resolves #24039
- Resolves https://github.com/qdrant/fastembed/issues/296
Thank you for contributing to LangChain!
- [x] **PR title**: "Add documentaiton on InMemoryVectorStore driver for
MemoryDB to langchain-aws"
- Langchain-aws repo :Add MemoryDB documentation
- Example: "community: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Added documentation on InMemoryVectorStore driver to
aws.mdx and usage example on MemoryDB clusuter
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
Add memorydb notebook to docs/docs/integrations/ folde
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Thank you for contributing to LangChain!
- [x] **PR title**: "community:add Yi LLM", "docs:add Yi Documentation"
- [x] **PR message**:
- **Description:** This PR adds support for the Yi model to LangChain.
- **Dependencies:**
[langchain_core,requests,contextlib,typing,logging,json,langchain_community]
- **Twitter handle:** 01.AI
- [x] **Add tests and docs**: I've added the corresponding documentation
to the relevant paths
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Description:
- This PR adds a self query retriever implementation for SAP HANA Cloud
Vector Engine. The retriever supports all operators except for contains.
- Issue: N/A
- Dependencies: no new dependencies added
**Add tests and docs:**
Added integration tests to:
libs/community/tests/unit_tests/query_constructors/test_hanavector.py
**Documentation for self query retriever:**
/docs/integrations/retrievers/self_query/hanavector_self_query.ipynb
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- [x] **PR title**:
community: Add OCI Generative AI tool and structured output support
- [x] **PR message**:
- **Description:** adding tool calling and structured output support for
chat models offered by OCI Generative AI services. This is an update to
our last PR 22880 with changes in
/langchain_community/chat_models/oci_generative_ai.py
- **Issue:** NA
- **Dependencies:** NA
- **Twitter handle:** NA
- [x] **Add tests and docs**:
1. we have updated our unit tests
2. we have updated our documentation under
/docs/docs/integrations/chat/oci_generative_ai.ipynb
- [x] **Lint and test**: `make format`, `make lint` and `make test` all
run successfully
---------
Co-authored-by: RHARPAZ <RHARPAZ@RHARPAZ-5750.us.oracle.com>
Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com>
**Description:**
- This PR exposes some functions in the VDMS vectorstore, updates VDMS-related
notebooks, updates tests, and upgrades the VDMS version (>=0.0.20)
**Issue:** N/A
**Dependencies:**
- Update vdms>=0.0.20
Added [ScrapingAnt](https://scrapingant.com/) Web Loader integration.
ScrapingAnt is a web scraping API that allows extracting web page data
into accessible and well-formatted markdown.
Description: Added ScrapingAnt web loader for retrieving web page data
as markdown
Dependencies: scrapingant-client
Twitter: @WeRunTheWorld3
---------
Co-authored-by: Oleg Kulyk <oleg@scrapingant.com>
#### Update (2):
A single `UnstructuredLoader` is added to handle both local and API
partitioning. This loader also handles single or multiple documents.
#### Changes in `community`:
Changes here do not affect users. In the initial process of using the
SDK for the API loaders, the loaders in community were refactored.
Other changes include:
The `UnstructuredBaseLoader` has a new check for whether both
`mode="paged"` and `chunking_strategy="by_page"` are set. It also now adds
`Element.element_id` to the `Document.metadata`.
`UnstructuredAPIFileLoader` and `UnstructuredAPIFileIOLoader` were
refactored as well: both now directly inherit from `UnstructuredBaseLoader`,
initialize their `file_path`/`file` attributes respectively, and implement
their own `_post_process_elements` methods.
--------
#### Update:
New SDK Loaders in a [partner
package](https://python.langchain.com/v0.1/docs/contributing/integrations/#partner-package-in-langchain-repo)
are introduced to prevent breaking changes for users (see discussion
below).
##### TODO:
- [x] Test docstring examples
--------
- **Description:** UnstructuredAPIFileIOLoader and
UnstructuredAPIFileLoader calls to the Unstructured API are now made
using the unstructured-client SDK.
- **New Dependencies:** unstructured-client
- [x] **Add tests and docs**: If you're adding a new integration, please
include
- [x] a test for the integration, preferably unit tests that do not rely
on network access,
- [x] update the description in
`docs/docs/integrations/providers/unstructured.mdx`
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
TODO:
- [x] Update
https://python.langchain.com/v0.1/docs/integrations/document_loaders/unstructured_file/#unstructured-api
-
`langchain/docs/docs/integrations/document_loaders/unstructured_file.ipynb`
- The description here needs to indicate that users should install
`unstructured-client` instead of `unstructured`. Read over closely to
look for any other changes that need to be made.
- [x] Update the `lazy_load` method in `UnstructuredBaseLoader` to
handle json responses from the API instead of just lists of elements.
- This method may need to be overwritten by the API loaders instead of
changing it in the `UnstructuredBaseLoader`.
- [x] Update the documentation links in the class docstrings (the
Unstructured documents have moved)
- [x] Update Document.metadata to include `element_id` (see thread
[here](https://unstructuredw-kbe4326.slack.com/archives/C044N0YV08G/p1718187499818419))
---------
Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: ChengZi <chen.zhang@zilliz.com>
This PR is a work in progress and adds the following functionality:
- [X] Supports tool calling across the LangChain ecosystem (however,
streaming is not supported)
- [X] Update documentation
- [ ] **Community**: "Retrievers: Product Quantization"
- [X] This PR adds the Product Quantization (PQ) feature to the retrievers
in LangChain Community. PQ is one of the fastest retrieval methods when the
embeddings are rich enough in context, thanks to quantization and
representation through centroids
- **Description:** Adding PQ as one of the retrievers
- **Dependencies:** using the package nanopq for this PR
- **Twitter handle:** vishnunkumar_
- [X] **Add tests and docs**: If you're adding a new integration, please
include
- [X] Added unit tests for the same in the retrievers.
- [ ] Will add an example notebook subsequently
- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/ -
done
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Thank you for contributing to LangChain!
- [x] **PR title**: Update IBM docs on how to pass a client into the
WatsonxLLM and WatsonxEmbeddings objects.
- [x] **PR message**:
- **Description:** Updated IBM docs with information on passing a client
into the WatsonxLLM and WatsonxEmbeddings objects.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
**Description**
Add support for Pinecone hosted embedding models as
`PineconeEmbeddings`. Replacement for #22890
**Dependencies**
Add `aiohttp` to support async embedding calls made directly against the
REST API
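A minimal usage sketch (assumes `PINECONE_API_KEY` in the environment; the model name is one of Pinecone's hosted embedding models):

```python
from langchain_pinecone import PineconeEmbeddings

embeddings = PineconeEmbeddings(model="multilingual-e5-large")
vectors = embeddings.embed_documents(["hello", "world"])
query_vector = embeddings.embed_query("greeting")
```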
- [x] **Add tests and docs**: Added
`docs/docs/integrations/text_embedding/pinecone.ipynb`
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Twitter: `gdjdg17`
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
### Description
This pull request added new document loaders to load documents of
various formats using [Dedoc](https://github.com/ispras/dedoc):
- `DedocFileLoader` (determine file types automatically and parse)
- `DedocPDFLoader` (for `PDF` and images parsing)
- `DedocAPIFileLoader` (determine file types automatically and parse
using Dedoc API without library installation)
[Dedoc](https://dedoc.readthedocs.io) is an open-source library/service
that extracts texts, tables, attached files and document structure
(e.g., titles, list items, etc.) from files of various formats. The
library is actively developed and maintained by a group of developers.
`Dedoc` supports `DOCX`, `XLSX`, `PPTX`, `EML`, `HTML`, `PDF`, images
and more.
Full list of supported formats can be found
[here](https://dedoc.readthedocs.io/en/latest/#id1).
For `PDF` documents, `Dedoc` can determine the correctness of the textual
layer and split the document into paragraphs.
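A minimal usage sketch (the file path is hypothetical):

```python
from langchain_community.document_loaders import DedocFileLoader

loader = DedocFileLoader("example.docx")  # file type is detected automatically
docs = loader.load()
print(docs[0].page_content[:100])
```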
### Issue
This pull request extends the variety of document loaders supported by
`langchain_community`, allowing users to choose the most suitable option
for parsing raw documents.
### Dependencies
The PR added a new (optional) dependency `dedoc>=2.2.5` ([library
documentation](https://dedoc.readthedocs.io)) to the
`extended_testing_deps.txt`
### Twitter handle
None
### Add tests and docs
1. Test for the integration:
`libs/community/tests/integration_tests/document_loaders/test_dedoc.py`
2. Example notebook:
`docs/docs/integrations/document_loaders/dedoc.ipynb`
3. Information about the library:
`docs/docs/integrations/providers/dedoc.mdx`
### Lint and test
Done locally:
- `make format`
- `make lint`
- `make integration_tests`
- `make docs_build` (from the project root)
---------
Co-authored-by: Nasty <bogatenkova.anastasiya@mail.ru>
https://www.youtube.com/watch?v=ZIyB9e_7a4c
**Description:**
**TextEmbed** is a high-performance embedding inference server designed
to provide a high-throughput, low-latency solution for serving
embeddings. It supports various sentence-transformer models and includes
the ability to deploy image and text embedding models. TextEmbed offers
flexibility and scalability for diverse applications.
- **PyPI Package:** [TextEmbed on
PyPI](https://pypi.org/project/textembed/)
- **Docker Image:** [TextEmbed on Docker
Hub](https://hub.docker.com/r/kevaldekivadiya/textembed)
- **GitHub Repository:** [TextEmbed on
GitHub](https://github.com/kevaldekivadiya2415/textembed)
**PR Description**
This PR adds functionality for embedding documents and queries using the
`TextEmbedEmbeddings` class. The implementation allows for both
synchronous and asynchronous embedding requests to a TextEmbed API
endpoint. The class handles batching and permuting of input texts to
optimize the embedding process.
**Example Usage:**
```python
from langchain_community.embeddings import TextEmbedEmbeddings

# Initialise the embeddings class
embeddings = TextEmbedEmbeddings(model="your-model-id", api_key="your-api-key", api_url="your_api_url")

# Define a list of documents
documents = [
    "Data science involves extracting insights from data.",
    "Artificial intelligence is transforming various industries.",
    "Cloud computing provides scalable computing resources over the internet.",
    "Big data analytics helps in understanding large datasets.",
    "India has a diverse cultural heritage.",
]

# Define a query
query = "What is the cultural heritage of India?"

# Embed all documents
document_embeddings = embeddings.embed_documents(documents)

# Embed the query
query_embedding = embeddings.embed_query(query)

# Print embeddings for each document
for i, embedding in enumerate(document_embeddings):
    print(f"Document {i+1} Embedding:", embedding)

# Print the query embedding
print("Query Embedding:", query_embedding)
```
---------
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
- **Description:** Add Riza Python/JS code execution tool
- **Issue:** N/A
- **Dependencies:** an optional dependency on the `rizaio` pypi package
- **Twitter handle:** [@rizaio](https://x.com/rizaio)
[Riza](https://riza.io) is a safe code execution environment for
agent-generated Python and JavaScript that's easy to integrate into
langchain apps. This PR adds two new tool classes to the community
package.
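A quick sketch of calling one of the new tools (class/module names as added in this PR; assumes `rizaio` is installed and `RIZA_API_KEY` is set):

```python
from langchain_community.tools.riza.command import ExecPython

tool = ExecPython()
result = tool.invoke({"code": "print(2 + 2)"})  # code runs in Riza's sandbox
print(result)
```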
Description: added support for LangChain v0.2 for NVIDIA AI endpoints.
Implemented in-memory storage for chains using
RunnableWithMessageHistory, which is analogous to using
`ConversationChain` in v0.1 with the default
`ConversationBufferMemory`. That class is deprecated in favor of
`RunnableWithMessageHistory` in LangChain v0.2.
Issue: None
Dependencies: None.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
**Description:** Add support for chat message history using Couchbase
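A rough wiring sketch (connection details and bucket/scope/collection names are placeholders; assumes the `langchain_couchbase` package):

```python
from datetime import timedelta

from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions
from langchain_couchbase.chat_message_histories import CouchbaseChatMessageHistory

# Placeholder connection details for a local Couchbase cluster
auth = PasswordAuthenticator("Administrator", "password")
cluster = Cluster("couchbase://localhost", ClusterOptions(auth))
cluster.wait_until_ready(timedelta(seconds=5))

history = CouchbaseChatMessageHistory(
    cluster=cluster,
    bucket_name="langchain-bucket",  # placeholder names
    scope_name="_default",
    collection_name="chat-history",
    session_id="test-session",
)
history.add_user_message("hi!")
```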
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
---------
Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
Description: added support for LangChain v0.2 for the PipelineAI
integration. Removed deprecated classes and incorporated support for
LangChain v0.2 to integrate with PipelineAI. Replaced LLMChain with the
Runnable interface. Also added StrOutputParser, which parses an LLMResult
into the most likely string.
Issue: None
Dependencies: None.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Description: Added support for LangChain v0.2 for Shale Protocol.
Replaced LLMChain with the Runnable interface, which allows any two
Runnables to be 'chained' together into sequences. Also added
StreamingStdOutCallbackHandler, a callback handler for streaming.
Issue: None
Dependencies: None.
Thank you for contributing to LangChain!
- [X] **ApertureDB as vectorstore**: "community: Add ApertureDB as a
vectorstore"
- **Description:** this change provides a new community integration that
uses ApertureData's ApertureDB as a vector store.
- **Issue:** none
- **Dependencies:** depends on ApertureDB Python SDK
- **Twitter handle:** ApertureData
- [X] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
Integration tests rely on a local run of a public docker image.
Example notebook additionally relies on a local Ollama server.
- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
All lint tests pass.
---------
Co-authored-by: Gautam <gautam@aperturedata.io>
**Description:**
Databricks Vector Search recently added support for hybrid
keyword-similarity search.
See [usage
examples](https://docs.databricks.com/en/generative-ai/create-query-vector-search.html#query-a-vector-search-endpoint)
from their documentation.
This PR updates the Langchain vectorstore interface for Databricks to
enable the user to pass the *query_type* parameter to
*similarity_search* to make use of this functionality.
By default, there will not be any changes for existing users of this
interface. To use the new hybrid search feature, it is now possible to
do
```python
# ...
dvs = DatabricksVectorSearch(index)
dvs.similarity_search("my search query", query_type="HYBRID")
```
Or using the retriever:
```python
retriever = dvs.as_retriever(
    search_kwargs={
        "query_type": "HYBRID",
    }
)
retriever.invoke("my search query")
```
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
**Description:** Spell check fixes for docs, comments, and a couple of
strings. No code change e.g. variable names.
**Issue:** none
**Dependencies:** none
**Twitter handle:** hmartin
The latest langchain-cohere SDK mandates passing the model parameter to
the Embeddings and Reranker inits.
This PR updates the docs to reflect these changes.
Thank you for contributing to LangChain!
**Description:** Add support for caching (standard + semantic) LLM
responses using Couchbase
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
---------
Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
mmemory in the description -> memory (corrected spelling mistake)
- **Description:** Support PGVector in PebbloRetrievalQA
- Identity and Semantic Enforcement support for PGVector
- Refactor Vectorstore validation and name check
- Clear the overridden identity and semantic enforcement filters
- **Issue:** NA
- **Dependencies:** NA
- **Tests**: NA(already added)
- **Docs**: Updated
- **Twitter handle:** [@Raj__725](https://twitter.com/Raj__725)
Thank you for contributing to LangChain!
- [x] **PR title**: "IBM: Added WatsonxChat to chat models preview,
update passing params to invoke method"
- [x] **PR message**:
- **Description:** Added support for passing params to the WatsonxChat
invoke method, added integration tests
- **Dependencies:** `ibm_watsonx_ai`
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Description: Fixed a typo in the imports for the
GoogleDriveSearchTool
Issue: It's only for the docs, but it bothered me so I decided to fix it
quickly :D
- **Description:** Enhance JiraAPIWrapper to accept the 'cloud'
parameter through an environment variable. This update allows more
flexibility in configuring the environment for the Jira API.
- **Twitter handle:** Andre_Q_Pereira
---------
Co-authored-by: André Quintino <andre.quintino@tui.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
This PR adds a `SingleStoreDBSemanticCache` class that implements a
cache based on SingleStoreDB vector store, integration tests, and a
notebook example.
Additionally, this PR contains minor changes to SingleStoreDB vector
store:
- changed the add texts/documents methods to return a list of inserted ids
- implemented a delete(ids) method to delete documents by a list of ids
- added a drop() method to drop the corresponding database table
- updated integration tests to use and check the functionality implemented
above
CC: @baskaryan, @hwchase17
---------
Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
**Description:** Update docs content on agent memory
added pre-filtering documentation
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [x] **PR message**:
- **Description:** added filter vector search
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** n/a
- [x] **Add tests and docs**: No need for tests, just a simple doc update
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
After merging the [PR #22594 to include Jina AI multimodal capabilities
in the Langchain
documentation](https://github.com/langchain-ai/langchain/pull/22594), we
updated the notebook to showcase the difference between text and
multimodal capabilities more clearly.
- Update Meta Llama 3 cookbook link
- Add prereq section and information on `messages_modifier` to LangGraph
migration guide
- Update `PydanticToolsParser` explanation and entrypoint in tool
calling guide
- Add more obvious warning to `OllamaFunctions`
- Fix Wikidata tool install flow
- Update Bedrock LLM initialization
@baskaryan can you add a bit of information on how to authenticate into
the `ChatBedrock` and `BedrockLLM` models? I wasn't able to figure it
out :(
- Updates chat few shot prompt tutorial to show off a more cohesive
example
- Fix async Chromium loader guide
- Fix Excel loader install instructions
- Reformat Html2Text page
- Add install instructions to Azure OpenAI embeddings page
- Add missing dep install to SQL QA tutorial
@baskaryan
## Description
Created a helper method to make vector search indexes via client-side
pymongo.
**Recent Update** -- Removed error suppressing/overwriting layer in
favor of letting the original exception provide information.
## ToDo's
- [x] Make `_wait_until` helpers for the integration tests' delete-index
functionality.
- [x] Add documentation for its use. Highlight that it's experimental
- [x] Post Integration Test Results in a screenshot
- [x] Get review from MongoDB internal team (@shaneharvey, @blink1073 ,
@NoahStapp , @caseyclements)
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. Added new integration tests. Not eligible for unit testing since the
operation is Atlas Cloud specific.
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Thank you for contributing to LangChain!
- [X] **PR title**: "community: fix code example in ZenGuard docs"
- [X] **PR message**:
- **Description:** corrected the docs by indicating in the code example
that the tool accepts a list of prompts instead of just one
- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Thank you for review
---------
Co-authored-by: Baur <baur.krykpayev@gmail.com>
Thank you for contributing to LangChain!
- [x] **PR title**: "community: update docs and add tool to init.py"
- [x] **PR message**:
- **Description:** Fixed some errors and comments in the docs and added
our ZenGuardTool and additional classes to init.py for easy access when
importing
- **Question:** when will you update the langchain-community package in
pypi to make our tool available?
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Thank you for review!
---------
Co-authored-by: Baur <baur.krykpayev@gmail.com>
- [x] PR title:
community: Add OCI Generative AI new model support
- [x] PR message:
- Description: adding support for new models offered by OCI Generative
AI services. This is a moderate update of our initial integration PR
16548 and includes a new integration for our chat models under
/langchain_community/chat_models/oci_generative_ai.py
- Issue: NA
- Dependencies: No new Dependencies, just latest version of our OCI sdk
- Twitter handle: NA
- [x] Add tests and docs:
1. we have updated our unit tests
2. we have updated our documentation including a new ipynb for our new
chat integration
- [x] Lint and test:
`make format`, `make lint`, and `make test` run successfully
---------
Co-authored-by: RHARPAZ <RHARPAZ@RHARPAZ-5750.us.oracle.com>
Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com>
**Description**
This is the community integration of ZenGuard AI - the fastest
guardrails for GenAI applications. ZenGuard AI protects against:
- Prompt attacks
- Veering off the pre-defined topics
- PII, sensitive info, and keyword leakage
- Toxicity
- Etc.
**Twitter Handle** : @zenguardai
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. Added an integration test
2. Added colab
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified.
---------
Co-authored-by: Nuradil <nuradil.maksut@icloud.com>
Co-authored-by: Nuradil <133880216+yaksh0nti@users.noreply.github.com>
They now reject calls from users with expired or invalid tokens with code
401 (before, such users were being treated as anonymous).
Thus, the authorization header has to be removed when there is no token.
Related to: #23178
---------
Signed-off-by: Joffref <mariusjoffre@gmail.com>
Minor changes to module import error handling, and fixes for minor issues
in the tutorial documents.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
- **Description:** SambaNova Sambaverse integration improvement: removed
input parsing that was changing the raw user input and effectively made
setting the process prompt parameter to true mandatory
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Add chat history store based on Kafka.
Files added:
`libs/community/langchain_community/chat_message_histories/kafka.py`
`docs/docs/integrations/memory/kafka_chat_message_history.ipynb`
New issue to be created for future improvement:
1. Async method implementation.
2. Message retrieval based on timestamp.
3. Support for other configs when connecting to cloud hosted Kafka (e.g.
add `api_key` field)
4. Improve unit testing & integration testing.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
**Description:** This PR adds a chat model integration for [Snowflake
Cortex](https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions),
which gives an instant access to industry-leading large language models
(LLMs) trained by researchers at companies like Mistral, Reka, Meta, and
Google, including [Snowflake
Arctic](https://www.snowflake.com/en/data-cloud/arctic/), an open
enterprise-grade model developed by Snowflake.
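A minimal sketch (assumes `snowflake-snowpark-python` is installed and connection parameters such as `SNOWFLAKE_ACCOUNT` and `SNOWFLAKE_USERNAME` are set in the environment):

```python
from langchain_community.chat_models import ChatSnowflakeCortex

chat = ChatSnowflakeCortex(model="snowflake-arctic")  # Cortex model name
chat.invoke("What is Snowflake Cortex?")
```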
**Dependencies:** Snowflake's
[snowpark](https://pypi.org/project/snowflake-snowpark-python/) library
is required for using this integration.
**Twitter handle:** [@gethouseware](https://twitter.com/gethouseware)
- [x] **Add tests and docs**:
1. integration tests:
`libs/community/tests/integration_tests/chat_models/test_snowflake.py`
2. unit tests:
`libs/community/tests/unit_tests/chat_models/test_snowflake.py`
3. example notebook: `docs/docs/integrations/chat/snowflake.ipynb`
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Add admonition to the documentation to make sure users are aware that
the tool allows execution of code on the host machine using a python
interpreter (by design).
- **PR title**: [community] add chat model llamacpp
- **PR message**:
- **Description:** This PR introduces a new chat model integration with
llamacpp_python, designed to work similarly to the existing ChatOpenAI
model.
+ Works well with instruction-tuned chat, chains, and function/tool calling.
+ Works with LangGraph (persistent memory, tool calling); updates coming
soon.
- **Dependencies:** This change requires the llamacpp_python library to
be installed.
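Sketch of basic usage (the model path is a placeholder for any local GGUF file):

```python
from langchain_community.chat_models import ChatLlamaCpp

llm = ChatLlamaCpp(
    model_path="/path/to/model.gguf",  # placeholder: a local GGUF model file
    temperature=0.5,
    n_ctx=4096,
)
llm.invoke("Hello!")
```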
@baskaryan
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
This PR adds the Prem Template feature in ChatPremAI.
Additionally, it fixes a minor API auth bug that occurred when the API key
was passed through arguments.
Description: Adjusting the syntax for creating the vectorstore
collection (in the case of automatic embedding computation) to the most
idiomatic way of submitting the stored secret name.
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
**Description:**
Update the NVIDIA Riva tool documentation to use NVIDIA NIM for the LLM.
Show how to use NVIDIA NIMs and link to documentation for LangChain with
NIM.
---------
Co-authored-by: Hayden Wolff <hwolff@nvidia.com>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
- **Description:** Add a new format, `CHUNKS`, to
`langchain_community.document_loaders.youtube.YoutubeLoader`, which
creates multiple `Document` objects from YouTube video transcripts
(captions), each of a fixed duration (see the sketch after this list).
The metadata of each chunk `Document` includes the start time of each one
and a URL to that time in the video on the YouTube website.
I had implemented this for UMich (@umich-its-ai) in a local module, but
it makes sense to contribute this to LangChain community for all to
benefit and to simplify maintenance.
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter:** lsloan_umich
- **Mastodon:**
[lsloan@mastodon.social](https://mastodon.social/@lsloan)
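The sketch referenced above (the video URL is arbitrary; `chunk_size_seconds` reflects the fixed chunk duration):

```python
from langchain_community.document_loaders import YoutubeLoader
from langchain_community.document_loaders.youtube import TranscriptFormat

loader = YoutubeLoader.from_youtube_url(
    "https://www.youtube.com/watch?v=TKCMw0utiak",  # any public video URL
    transcript_format=TranscriptFormat.CHUNKS,
    chunk_size_seconds=30,
)
docs = loader.load()
print(docs[0].metadata)  # includes the chunk's start time and a timestamped URL
```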
With regards to **tests and documentation**, most existing features of
the `YoutubeLoader` class are not tested. Only the
`YoutubeLoader.extract_video_id()` static method had a test. However,
while I was waiting for this PR to be reviewed and merged, I had time to
add a test for the chunking feature I've proposed in this PR.
I have added an example of using chunking to the
`docs/docs/integrations/document_loaders/youtube_transcript.ipynb`
notebook.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
This PR adds support for the Azure Cosmos DB for NoSQL vector store.
Summary:
Description: added vector store integration for Azure Cosmos DB for
NoSQL Vector Store,
Dependencies: azure-cosmos dependency,
Tag maintainer: @hwchase17, @baskaryan @efriis @eyurtsev
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- [ ] **Miscellaneous updates and fixes**:
- **Description:** Handled errors in querying; quotes in table names;
updated the gpudb API
- **Issue:** Previously threw an error with a hard-to-understand message
when a query failed or returned no records
- **Dependencies:** Updated the GPUDB API version to `7.2.0.9`
@baskaryan @hwchase17
LLMs struggle with Graph RAG, because it differs from vector RAG in
that you don't provide the whole context, only the answer, and the
LLM has to take it on trust. That doesn't really work a lot of the time.
However, if you wrap the context as a function response, the accuracy is
much better.
btw... `union[LLMChain, Runnable]` is linting fun, that's why so many
ignores
**Description:** this PR adds Volcengine Rerank capability to Langchain,
you can find Volcengine Rerank API from
[here](https://www.volcengine.com/docs/84313/1254474) &
[here](https://www.volcengine.com/docs/84313/1254605).
[Volcengine](https://www.volcengine.com/) is a cloud service platform
developed by ByteDance, the parent company of TikTok. You can obtain
Volcengine API AK/SK from
[here](https://www.volcengine.com/docs/84313/1254553).
**Dependencies:** VolcengineRerank depends on `volcengine` python
package.
**Twitter handle:** my twitter/x account is https://x.com/LastMonopoly
and I'd like a mention, thank you!
**Tests and docs**
1. integration test: `test_volcengine_rerank.py`
2. example notebook: `volcengine_rerank.ipynb`
**Lint and test**: I have run `make format`, `make lint` and `make test`
from the root of the package I've modified.
Hi 👋
First off, thanks a ton for your work on this 💚 Really appreciate what
you're providing here for the community.
## Description
This PR adds a basic language parser for the
[Elixir](https://elixir-lang.org/) programming language. The parser code
is based upon the approach outlined in
https://github.com/langchain-ai/langchain/pull/13318: it's using
`tree-sitter` under the hood and aligns with all the other `tree-sitter`-based
parsers added in that PR.
The `CHUNK_QUERY` I'm using here is probably not the most sophisticated
one, but it worked for my application. It's a starting point to provide
"core" parsing support for Elixir in LangChain. It enables people to use
the language parser out in real world applications which may then lead
to further tweaking of the queries. I consider this PR just the ground
work.
- **Dependencies:** requires `tree-sitter` and `tree-sitter-languages`
from the extended dependencies
- **Twitter handle:** `@bitcrowd`
## Checklist
- [x] **PR title**: "package: description"
- [x] **Add tests and docs**
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified.
<!-- If no one reviews your PR within a few days, please @-mention one
of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. -->
Adding an `UpstashRatelimitHandler` callback for rate limiting based on
the number of chain invocations or LLM token usage.
For more details, see [upstash/ratelimit-py
repository](https://github.com/upstash/ratelimit-py) or the notebook
guide included in this PR.
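Roughly, wiring the handler into a chain could look like this (a sketch; the chain is a stand-in and credentials come from `UPSTASH_REDIS_REST_URL`/`UPSTASH_REDIS_REST_TOKEN`):

```python
from langchain_community.callbacks import UpstashRatelimitHandler
from langchain_core.runnables import RunnableLambda
from upstash_ratelimit import FixedWindow, Ratelimit
from upstash_redis import Redis

# Allow 10 chain invocations per 10-second window, per user
ratelimit = Ratelimit(
    redis=Redis.from_env(),
    limiter=FixedWindow(max_requests=10, window=10),
)
handler = UpstashRatelimitHandler(identifier="user-123", request_ratelimit=ratelimit)

chain = RunnableLambda(lambda x: x)  # stand-in for a real chain
chain.invoke("hello", config={"callbacks": [handler]})
```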
Twitter handle: @cahidarda
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
This PR adds support for using Databricks Unity Catalog functions as
LangChain tools, which runs inside a Databricks SQL warehouse.
* An example notebook is provided.
**Description:** This PR adds a `USER_AGENT` env variable that is to be
used for web scraping. It creates a util to get that user agent and uses
it in the classes used for scraping in [this piece of
doc](https://python.langchain.com/v0.1/docs/use_cases/web_scraping/).
Identifying your scraper is considered good politeness practice; this
PR aims to make that easier.
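For example (the value is illustrative):

```python
import os

# Set before running the web scraping loaders so requests identify your scraper
os.environ["USER_AGENT"] = "my-langchain-scraper/0.1 (+https://example.com/contact)"
```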
**Issue:** `None`
**Dependencies:** `None`
**Twitter handle:** `None`
Thank you for contributing to LangChain!
**Description:** update to the Vectara / Langchain integration to
integrate new Vectara capabilities:
- Full RAG implemented as a Runnable with as_rag()
- Vectara chat supported with as_chat()
- Both support streaming responses
- Updated documentation and example notebook to reflect all the changes
- Updated Vectara templates
**Twitter handle:** ofermend
**Add tests and docs**: no new tests or docs, but updated both existing
tests and existing docs
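As a sketch, the new surface might be used like this (assumes `VECTARA_CUSTOMER_ID`, `VECTARA_CORPUS_ID` and `VECTARA_API_KEY` in the environment; using the default query config is an assumption):

```python
from langchain_community.vectorstores import Vectara
from langchain_community.vectorstores.vectara import VectaraQueryConfig

vectara = Vectara()  # credentials picked up from the environment
rag = vectara.as_rag(VectaraQueryConfig())  # full RAG pipeline as a Runnable
rag.invoke("What does Vectara do?")
```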
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [x] **PR message**:
- **Description:** Updated dead link referencing chroma docs in Chroma
notebook under vectorstores
…s and Opensearch Semantic Cache
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
### Description
Add tools implementation to `ChatEdenAI`:
- `bind_tools()`
- `with_structured_output()`
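A sketch of the new surface (provider/model are illustrative; `EDENAI_API_KEY` is read from the environment):

```python
from langchain_community.chat_models.edenai import ChatEdenAI
from pydantic import BaseModel, Field

class Joke(BaseModel):
    """A joke to tell the user."""
    setup: str = Field(description="question to set up the joke")
    punchline: str = Field(description="answer to resolve the joke")

llm = ChatEdenAI(provider="openai", model="gpt-4o")  # illustrative provider/model
structured_llm = llm.with_structured_output(Joke)
structured_llm.invoke("Tell me a joke about cats")
```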
### Documentation
Updated `docs/docs/integrations/chat/edenai.ipynb`
### Notes
We don't support streaming with tools as of yet. If stream is called with
tools, we directly yield the whole message from `generate` (implemented
the same way Anthropic did).
- **Description:** Added support for using HuggingFacePipeline in
ChatHuggingFace (previously it was only usable with API endpoints,
probably by oversight).
- **Issue:** #19997
- **Dependencies:** none
- **Twitter handle:** none
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
This PR introduces namespace support for Upstash Vector Store, which
would allow users to partition their data in the vector index.
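Sketch (assumes `UPSTASH_VECTOR_REST_URL`/`UPSTASH_VECTOR_REST_TOKEN` in the environment; the embeddings object and namespace value are placeholders):

```python
from langchain_community.embeddings import FastEmbedEmbeddings
from langchain_community.vectorstores.upstash import UpstashVectorStore

store = UpstashVectorStore(
    embedding=FastEmbedEmbeddings(),
    namespace="customer-42",  # placeholder: partitions data within the index
)
store.add_texts(["hello world"])
```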
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- **Description:**
This PR fixes a rendering issue in the docs (Python notebook) of HANA
Cloud Vector Engine.
- **Issue:** N/A
- **Dependencies:** no new dependencies added
File of the fixed notebook:
`docs/docs/integrations/vectorstores/hanavector.ipynb`
- [X] **PR title**: "community: added optional params to Airtable
table.all()"
- [X] **PR message**:
- **Description:** Adds **kwargs to AirtableLoader to allow passing
options through to `table.all()`:
https://pyairtable.readthedocs.io/en/latest/api.html#pyairtable.Table.all
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** parakoopa88
- [X] **Add tests and docs**
- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified.
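For illustration, a minimal sketch of the pass-through kwargs (all ids are placeholders; `view` and `max_records` are standard `Table.all()` options):
```python
# Hedged sketch: forwarding pyairtable Table.all() options through the loader.
from langchain_community.document_loaders import AirtableLoader

loader = AirtableLoader(
    api_token="YOUR_API_TOKEN",
    table_id="tblXXXXXXXXXXXXXX",
    base_id="appXXXXXXXXXXXXXX",
    view="Grid view",  # forwarded to Table.all()
    max_records=10,    # forwarded to Table.all()
)
docs = loader.load()
```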
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Thank you for contributing to LangChain!
- [x] **PR title**: "community/embeddings: update oracleai.py"
- **Description:** Adding Oracle VECTOR_ARRAY_T support.
- **Add tests and docs:** Tests are not impacted.
- **Lint and test:** Done.
**Description:** [IPEX-LLM](https://github.com/intel-analytics/ipex-llm)
is a PyTorch library for running LLM on Intel CPU and GPU (e.g., local
PC with iGPU, discrete GPU such as Arc, Flex and Max) with very low
latency. This PR adds ipex-llm integrations to langchain for BGE
embedding support on both Intel CPU and GPU.
**Dependencies:** `ipex-llm`, `sentence-transformers`
**Contribution maintainer**: @Oscilloscope98
**tests and docs**:
- langchain/docs/docs/integrations/text_embedding/ipex_llm.ipynb
- langchain/docs/docs/integrations/text_embedding/ipex_llm_gpu.ipynb
-
langchain/libs/community/tests/integration_tests/embeddings/test_ipex_llm.py
---------
Co-authored-by: Shengsheng Huang <shannie.huang@gmail.com>
- [x] Docs Update: Ollama
- llm/ollama
- Switched to using llama3 as model with reference to templating and
prompting
- Added concurrency notes to llm/ollama docs
- chat_models/ollama
- Added concurrency notes to chat_models/ollama docs
- text_embedding/ollama
- include example for specific embedding models from Ollama
- [ ] **PR title**: "Fix list handling in Clova embeddings example
documentation"
- Description:
Fixes a bug in the Clova Embeddings example documentation where
document_text was incorrectly wrapped in an additional list.
- Rationale
The embed_documents method expects a list, but the previous example
wrapped document_text in an unnecessary additional list, causing an
error. The updated example correctly passes document_text directly to
the method, ensuring it functions as intended.
- **Description:** updated documentation for Llama, Falcon and Gemma on
Vertex AI Model Garden
- **Issue:** NA
- **Dependencies:** NA
- **Twitter handle:** NA
@lkuligin for review
---------
Co-authored-by: adityarane@google.com <adityarane@google.com>
Thank you for contributing to LangChain!
- [x] **PR title**: community: Add Zep Cloud components + docs +
examples
- [x] **PR message**:
We have recently released our new zep-cloud SDKs that are compatible
with Zep Cloud (not Zep Open Source). We have also maintained our Cloud
version of LangChain components (ChatMessageHistory, VectorStore) as
part of our SDKs. This PR's goal is to port these components to the
langchain community repo and close the gap with the existing Zep Open
Source components already present in the community repo (added
ZepCloudMemory, ZepCloudVectorStore, ZepCloudRetriever).
Also added a ZepCloudChatMessageHistory component together with an
expression language example ported from our repo. We have left the
original open source components intact on purpose so as not to
introduce any breaking changes.
- **Issue:** -
- **Dependencies:** Added optional dependency of our new cloud sdk
`zep-cloud`
- **Twitter handle:** @paulpaliychuk51
- [x] **Add tests and docs**
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
3 fixes of DuckDB vector store:
- unify defaults in constructor and from_texts (users no longer have to
specify `vector_key`).
- include search similarity in output metadata (fixes #20969)
- significantly improve performance of `from_documents`
Dependencies: added Pandas to speed up `from_documents`.
I was thinking about CSV and JSON options, but I expect trouble loading
JSON values this way, and the CSV and JSON options also require storing
data to disk.
In any case, the poetry file for langchain-community already contains a
dependency on Pandas.
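For illustration, a minimal sketch of the unified defaults (FakeEmbeddings and the in-memory connection are placeholders, not from the PR):
```python
# Hedged sketch: no vector_key needed in either construction path anymore.
import duckdb
from langchain_community.embeddings import FakeEmbeddings
from langchain_community.vectorstores.duckdb import DuckDB

store = DuckDB.from_texts(
    ["ducks fly", "geese honk"],
    FakeEmbeddings(size=16),
    connection=duckdb.connect(":memory:"),
)
doc = store.similarity_search("ducks", k=1)[0]
print(doc.metadata)  # per this PR, also carries the search similarity
```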
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Thank you for contributing to LangChain!
- [X] **PR title**: community: Updated langchain-community PremAI
documentation
- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Added [Scrapfly](https://scrapfly.io/) Web Loader integration. Scrapfly
is a web scraping API that allows extracting web page data into
accessible markdown or text datasets.
- __Description__: Added Scrapfly web loader for retrieving web page
data as markdown or text.
- Dependencies: scrapfly-sdk
- Twitter: @thealchemi1st
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** Backwards-compatible extension of the initialisation
interface of HanaDB to allow the user to specify
`specific_metadata_columns` that are used for metadata storage of
selected keys, which yields increased filter performance. Any
not-mentioned metadata remains in the general metadata column as part of
a JSON string. Furthermore, we switched to `executemany` for batch
inserts into HanaDB.
**Issue:** N/A
**Dependencies:** no new dependencies added
**Twitter handle:** @sapopensource
---------
Co-authored-by: Martin Kolb <martin.kolb@sap.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Integrate RankLLM reranker (https://github.com/castorini/rank_llm) into
LangChain
An example notebook is given in
`docs/docs/integrations/retrievers/rankllm-reranker.ipynb`
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Thank you for contributing to LangChain!
- [x] **PR title**: "update IBM WatsonxLLM docs with deprecated
LLMChain"
- [x] **PR message**:
- **Description:** update IBM WatsonxLLM docs with deprecated LLMChain
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
- Updated docs to have an example to use Jamba instead of J2
---------
Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** Tongyi uses different clients for the chat model and
the vision model. This PR chooses the proper client based on the model
name so that both chat and vision models are supported. See the [Tongyi
documentation](https://help.aliyun.com/zh/dashscope/developer-reference/tongyi-qianwen-vl-plus-api?spm=a2c4g.11186623.0.0.27404c9a7upm11)
for details.
```python
from langchain_core.messages import HumanMessage
from langchain_community.chat_models import ChatTongyi
llm = ChatTongyi(model_name='qwen-vl-max')
image_message = {
"image": "https://lilianweng.github.io/posts/2023-06-23-agent/agent-overview.png"
}
text_message = {
"text": "summarize this picture",
}
message = HumanMessage(content=[text_message, image_message])
llm.invoke([message])
```
- **Issue:** None
- **Dependencies:** None
- **Twitter handle:** None
We add a tool and retriever for the [AskNews](https://asknews.app)
platform with example notebooks.
The retriever can be invoked with:
```py
from langchain_community.retrievers import AskNewsRetriever
retriever = AskNewsRetriever(k=3)
retriever.invoke("impact of fed policy on the tech sector")
```
This retrieves 3 documents from the news related to fed policy impacts
on the tech sector. The included notebook also covers deeper details
about controlling filters such as category and time, as well as
including the retriever in a chain.
The tool is quite interesting, as it allows the agent to decide how to
obtain the news by forming a query and deciding how far back in time to
look for the news:
```py
from langchain_community.tools.asknews import AskNewsSearch
from langchain import hub
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_openai import ChatOpenAI
instructions = """You are an assistant."""
base_prompt = hub.pull("langchain-ai/openai-functions-template")
prompt = base_prompt.partial(instructions=instructions)
llm = ChatOpenAI(temperature=0)
asknews_tool = AskNewsSearch()
tools = [asknews_tool]
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
agent=agent,
tools=tools,
verbose=True,
)
agent_executor.invoke({"input": "How is the tech sector being affected by fed policy?"})
```
---------
Co-authored-by: Emre <e@emre.pm>
Please let me know if you see any possible areas of improvement. I would
very much appreciate your constructive criticism if time allows.
**Description:**
- Added an Aerospike vector store integration that utilizes the
[Aerospike-Vector-Search](https://aerospike.com/products/vector-database-search-llm/)
add-on.
- Added both unit tests and integration tests
- Added a docker compose file for spinning up a test environment
- Added a notebook
**Dependencies:**
- aerospike-vector-search
**Twitter handle:**
- No twitter, you can use my GitHub handle or LinkedIn if you'd like
Thanks!
---------
Co-authored-by: Jesse Schumacher <jschumacher@aerospike.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Closes #20561
This PR fixes the MLX LLM stream `AttributeError`.
Recently, `mlx-lm` changed the token decoding logic, which affected the
LC+MLX integration.
Additionally, I made minor fixes such as a broken docs example link and
enforcing pipeline arguments (max_tokens, temp, etc.) for invoke.
- **Issue:** #20561
- **Twitter handle:** @Prince_Canuma
UpTrain has a new dashboard now that makes it easier to view projects
and evaluations. Using this requires specifying both project_name and
evaluation_name when performing evaluations. I have updated the code to
support it.
The Diffbot DocumentLoader page doesn't actually run for a number of
reasons. This PR fixes it along with some light details on the Graph
Transformer and Provider pages.
## Full Changelog
[Document Loader
Page](https://python.langchain.com/v0.1/docs/integrations/document_loaders/diffbot/)
* Fixed the notebook so that it actually runs (missing required modules,
env variables, etc.)
* Added "open in colab" button like the Graph Transformer page
[Graph Transformer
Page](https://python.langchain.com/v0.2/docs/integrations/graphs/diffbot/)
* Fixed broken colab link
* Moved "open in colab" button to below description so the description
in the [Graphs category
page](https://python.langchain.com/v0.2/docs/integrations/graphs/) shows
up correctly
[Provider
Page](https://python.langchain.com/v0.2/docs/integrations/providers/diffbot/)
* Clarified explanations of Diffbot products
* Added section and link to LangChain Graph Transformer page
---------
Co-authored-by: jeromechoo <hello@jeromechoo.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
**Description:** Add `Origin/langchain` to Apify's client's user-agent
to attribute API activity to LangChain (at Apify, we aim to monitor our
integrations to evaluate whether we should invest more in the LangChain
integration regarding functionality and content)
**Issue:** None
**Dependencies:** None
**Twitter handle:** None
Fixed Documentation for HuggingFaceEndpoint as per the issue #21903
---------
Co-authored-by: keenborder786 <mohammad.mohtashim78@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Thank you for contributing to LangChain!
- [x] **PR title**: "docs: update notebook for latest Pinecone API +
serverless"
- [x] **PR message**: Published notebook is incompatible with latest
`pinecone-client` and not runnable. Updated for use with latest Pinecone
Python SDK. Also updated to be compatible with serverless indexes (only
index type available on Pinecone free tier).
- [x] **Add tests and docs**: N/A (tested in Colab)
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Thank you for contributing to LangChain!
- [x] **PR title**: "docs: update notebook for new Pinecone API +
serverless"
- [x] **PR message**: The published notebook is not runnable after
`pinecone-client` v2, which is deprecated. `langchain-pinecone` is not
compatible with the latest `pinecone-client` (v4), so I hardcoded it to
the last v3. Also updated for serverless indexes (only index type
available on Pinecone free plan).
- [x] **Add tests and docs**: N/A (tested in Colab)
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
There are 2 issues fixed here:
* In the notebook, pandas dataframes are formatted as HTML in the cells.
On the documentation site, the renderer that converts notebooks
incorrectly displays the raw HTML. I can't find any examples where this
works, so I am formatting the dataframes as text.
* Some incorrect table names were referenced, resulting in errors.
This PR improves on the `CassandraCache` and `CassandraSemanticCache`
classes, mainly in the constructor signature, and also introduces
several minor improvements around these classes.
### Init signature
A (sigh) breaking change is tentatively introduced to the constructor.
To me, the advantages outweigh the possible discomfort: the new syntax
places the DB-connection objects `session` and `keyspace` later in the
param list, so that they can be given a default value. This is what
enables the pattern of _not_ specifying them, provided one has
previously initialized the Cassandra connection through the versatile
utility method `cassio.init(...)`.
In this way, a much less unwieldy instantiation can be done, such as
`CassandraCache()` and `CassandraSemanticCache(embedding=xyz)`,
everything else falling back to defaults.
A downside is that, compared to the earlier signature, this might turn
out to be breaking for those doing positional instantiation. As a way to
mitigate this problem, this PR typechecks its first argument trying to
detect the legacy usage.
(And to make this point less tricky in the future, most arguments are
left to be keyword-only).
If this is considered too harsh, I'd like guidance on how to further
smoothen this transition. **Our plan is to make the pattern of optional
session/keyspace a standard across all Cassandra classes**, so that a
repeatable strategy would be ideal. A possibility would be to keep
positional arguments for legacy reasons but issue a deprecation warning
if any of them is actually used, to later remove them with 0.2 - please
advise on this point.
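For illustration, a minimal sketch of the new pattern (the contact point and keyspace are placeholders):
```python
# Hedged sketch: initialize the connection once via cassio, then construct
# caches with no session/keyspace arguments at all.
import cassio
from langchain.globals import set_llm_cache
from langchain_community.cache import CassandraCache

cassio.init(contact_points=["127.0.0.1"], keyspace="demo_ks")
set_llm_cache(CassandraCache())  # session/keyspace fall back to cassio's init
```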
### Other changes
- class docstrings: enriched, completely moved to class level, added
note on `cassio.init(...)` pattern, added tiny sample usage code.
- semantic cache: revised terminology to never mention "distance" (it is
in fact a similarity!). Kept the legacy constructor param with a
deprecation warning if used.
- `llm_caching` notebook: uniform flow with the Cassandra and Astra DB
separate cases; better and Cassandra-first description; all imports made
explicit and from community where appropriate.
- cache integration tests moved to community (incl. the imported tools),
env var bugfix for `CASSANDRA_CONTACT_POINTS`.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Thank you for contributing to LangChain!
- [x] **PR title**: "community: updated Browserbase loader"
- [x] **PR message**:
Updates the Browserbase loader with more options and improved docs.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
## Description
This PR introduces the new `langchain-qdrant` partner package, intending
to deprecate the community package.
## Changes
- Moved the Qdrant vector store implementation to `/libs/partners/qdrant`
along with integration tests.
- The conditional imports of the client library are now regular with
minor implementation improvements.
- Added a deprecation warning to
`langchain_community.vectorstores.qdrant.Qdrant`.
- Replaced references/imports from `langchain_community` with either
`langchain_core` or by moving the definitions to the `langchain_qdrant`
package itself.
- Updated the Qdrant vector store documentation to reflect the changes.
## Testing
- `QDRANT_URL` and
[`QDRANT_API_KEY`](583e36bf6b)
env values need to be set to [run integration
tests](d608c93d1f)
in the [cloud](https://cloud.qdrant.tech).
- If a Qdrant instance is running at `http://localhost:6333`, the
integration tests will use it too.
- By default, tests use an
[`in-memory`](https://github.com/qdrant/qdrant-client?tab=readme-ov-file#local-mode)
instance (not comprehensive).
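For reference, a minimal usage sketch of the new package in local in-memory mode (FakeEmbeddings and the collection name are placeholders):
```python
# Hedged sketch: the partner package against qdrant-client's local mode.
from langchain_community.embeddings import FakeEmbeddings
from langchain_qdrant import Qdrant

store = Qdrant.from_texts(
    ["qdrant is a vector database"],
    FakeEmbeddings(size=32),
    location=":memory:",
    collection_name="demo",
)
print(store.similarity_search("vector database", k=1))
```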
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
This PR makes some small updates for `KuzuQAChain` for graph QA.
- Updated Cypher generation prompt (we now support `WHERE EXISTS`) and
generalize it more
- Support different LLMs for Cypher generation and QA
- Update docs and examples
First PR for the langchain_huggingface partner package
- Moved some of the Hugging Face related classes from `community` to the
new `partner package`
Still needed:
- Documentation
- Tests
- Support for the new apply_chat_template in `ChatHuggingFace`
- Confirm choice of class to support for embeddings with the
sentence-transformers team.
cc: @efriis
---------
Co-authored-by: Cyril Kondratenko <kkn1993@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
- Added new document_transformer: MarkdownifyTransformer, which uses the
`markdownify` package with customizable options to convert HTML to
Markdown. It's similar to Html2TextTransformer but has more flexible
options, and I've noticed that MarkdownifyTransformer sometimes performs
better than the html2text one, which is why I use markdownify in my
project.
- Added docs and tests
- Usage:
```python
from langchain_community.document_transformers import MarkdownifyTransformer
markdownify = MarkdownifyTransformer()
docs_transform = markdownify.transform_documents(docs)
```
- Example of better performance on simple task, that I've noticed:
```html
<html>
<head><title>Reports on product movement</title></head>
<body>
<p data-block-key="2wst7">The reports on product movement will be useful for forming supplier orders and controlling outcomes.</p>
</body>
</html>
```
**Html2TextTransformer**:
```python
[Document(page_content='The reports on product movement will be useful for forming supplier orders and\ncontrolling outcomes.\n\n')]
# Here we can see 'and\ncontrolling', which has extra '\n' in it
```
**MarkdownifyTransformer**:
```python
[Document(page_content='Reports on product movement\n\nThe reports on product movement will be useful for forming supplier orders and controlling outcomes.')]
```
---------
Co-authored-by: Sokolov Fedor <f.sokolov@sokolov-macbook.bbrouter>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Sokolov Fedor <f.sokolov@sokolov-macbook.local>
Co-authored-by: Sokolov Fedor <f.sokolov@192.168.1.6>
Description: We are merging UPSTAGE_DOCUMENT_AI_API_KEY and
UPSTAGE_API_KEY into one; only UPSTAGE_API_KEY will be used going
forward. We also changed the base class of ChatUpstage to BaseChatOpenAI.
---------
Co-authored-by: Sean <chosh0615@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
- Added Together docs in chat models section
- Update Together provider docs to match the LLM & chat models sections
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
This is a doc update. It fixes up formatting and product name
references. The example code is updated to use a local built-in text
file.
@mmhangami Please take a look
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
**Description:** Updated the together integration docs by leading with
the streaming example, explicitly specifying a model to show users how
to do that, and updating the sections to more closely match other
integrations.
**Description:** Adding chat completions to the Together AI package,
which is our most popular API. Also staying backwards compatible with
the old API so folks can continue to use the completions API as well.
Also moved the embedding API to use the OpenAI library to standardize it
further.
**Twitter handle:** @nutlope
- [x] **Add tests and docs**: If you're adding a new integration, please
include
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
**Description:** Update LarkSuite loader doc to give an example for
loading data from LarkSuite wiki.
**Issue:** None
**Dependencies:** None
**Twitter handle:** None
Thank you for contributing to LangChain!
- Oracle AI Vector Search
Oracle AI Vector Search is designed for Artificial Intelligence (AI)
workloads and allows you to query data based on semantics rather than
keywords. One of the biggest benefits of Oracle AI Vector Search is that
semantic search on unstructured data can be combined with relational
search on business data in one single system. This is not only powerful
but also significantly more effective, because you don't need to add a
specialized vector database, eliminating the pain of data fragmentation
between multiple systems.
This pull request adds the following functionalities:
- Oracle AI Vector Search: Vector Store
- Oracle AI Vector Search: Document Loader
- Oracle AI Vector Search: Document Splitter
- Oracle AI Vector Search: Summary
- Oracle AI Vector Search: Oracle Embeddings
- We have added unit tests and have our own local unit test suite which
verifies all the code is correct. We have made sure to add guides for
each of the components and one end to end guide that shows how the
entire thing runs.
- We have made sure that make format and make lint run clean.
---------
Co-authored-by: skmishraoracle <shailendra.mishra@oracle.com>
Co-authored-by: hroyofc <harichandan.roy@oracle.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Description: Added documentation on Anthropic models on Vertex AI
@lkuligin for review
---------
Co-authored-by: adityarane@google.com <adityarane@google.com>
Refactors the docs build in order to:
- run the same `make build` command in both vercel and local build
- incrementally build artifacts in 2 distinct steps, instead of building
all docs in-place (in vercel) or in a _dist dir (locally)
Highlights:
- introduces `make build` in order to build the docs
- collects and generates all files for the build in
`docs/build/intermediate`
- renders those jupyter notebook + markdown files into
`docs/build/outputs`
And now the outputs to host are in `docs/build/outputs`, which will need
a vercel settings change.
Todo:
- [ ] figure out how to point at the right directory (right now deleting
and moving the docs dir in vercel_build.sh isn't great)
**Description:**
This pull request introduces a new feature for LangChain: the
integration with the Rememberizer API through a custom retriever.
This enables LangChain applications to let users load and sync their
data from Dropbox, Google Drive, Slack, or their hard drive into a
vector database that LangChain can query. Queries involve sending text
chunks generated within LangChain and retrieving a collection of
semantically relevant user data for inclusion in LLM prompts.
User knowledge dramatically improves AI applications.
The Rememberizer integration will also allow users to access general
purpose vectorized data such as Reddit channel discussions and US
patents.
**Issue:**
N/A
**Dependencies:**
N/A
**Twitter handle:**
https://twitter.com/Rememberizer
Vertex DIY RAG APIs help build complex RAG systems, provide more
granular control, and are suited for custom use cases.
The Ranking API takes in a list of documents and reranks those documents
based on how relevant the documents are to a given query. Compared to
embeddings that look purely at the semantic similarity of a document and
a query, the ranking API can give you a more precise score for how well
a document answers a given query.
[Reference](https://cloud.google.com/generative-ai-app-builder/docs/ranking)
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description**
This pull request updates the Bagel Network package name from
"betabageldb" to "bagelML" to align with the latest changes made by the
Bagel Network team.
The following modifications have been made:
- Updated all references to the old package name ("betabageldb") with
the new package name ("bagelML") throughout the codebase.
- Modified the documentation, and any relevant scripts to reflect the
package name change.
- Tested the changes to ensure that the functionality remains intact and
no breaking changes were introduced.
By merging this pull request, our project will stay up to date with the
latest Bagel Network package naming convention, ensuring compatibility
and smooth integration with their updated library.
Please review the changes and provide any feedback or suggestions. Thank
you!
## Description
Adding `UpstashVectorStore` to utilize [Upstash
Vector](https://upstash.com/docs/vector/overall/getstarted)!
#17012 was opened to add Upstash Vector to langchain but was closed to
wait for filtering. Now filtering is added to Upstash Vector and we open
a new PR. Additionally, the [embedding
feature](https://upstash.com/docs/vector/features/embeddingmodels) was
added and we add this to our vector store as well (a sketch follows).
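For illustration, a minimal sketch of the hosted-embedding mode (assumes an embedding-enabled index and the REST env vars are set):
```python
# Hedged sketch: with embedding=True, the Upstash index embeds text itself,
# so no LangChain Embeddings object is needed.
from langchain_community.vectorstores.upstash import UpstashVectorStore

store = UpstashVectorStore(embedding=True)
store.add_texts(["Upstash can embed this text server-side"])
print(store.similarity_search("server-side embedding", k=1))
```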
## Dependencies
[upstash-vector](https://pypi.org/project/upstash-vector/) should be
installed to use `UpstashVectorStore`. Didn't update dependencies
because of [this comment in the previous
PR](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1876522450).
## Tests
Tests are added and they pass. Tests are naturally network bound since
Upstash Vector is offered through an API.
There was [a discussion in the previous PR about mocking the
unittests](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1891820567).
We didn't make changes to this end yet. We can update the tests if you
can explain how the tests should be mocked.
---------
Co-authored-by: ytkimirti <yusuftaha9@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description**: ToolKit and Tools for accessing data in a Cassandra
Database primarily for Agent integration. Initially, this includes the
following tools:
- `cassandra_db_schema` Gathers all schema information for the connected
database or a specific schema. Critical for the agent when determining
actions.
- `cassandra_db_select_table_data` Selects data from a specific keyspace
and table. The agent can pass parameters for a predicate and limits on
the number of returned records.
- `cassandra_db_query` Experimental alternative to
`cassandra_db_select_table_data` which takes a query string completely
formed by the agent instead of parameters. May be removed in future
versions.
Includes unit test and two notebooks to demonstrate usage.
**Dependencies**: cassio
**Twitter handle**: @PatrickMcFadin
---------
Co-authored-by: Phil Miesle <phil.miesle@datastax.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** This pull request introduces a new feature to community
tools, enhancing its search capabilities by integrating the Mojeek
search engine
**Dependencies:** None
---------
Co-authored-by: Igor Brai <igor@mojeek.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Document: Updated google_drive.ipynb for loading the following extended
metadata.
- full_path - full path of the file(s) in Google Drive.
- owner - owner of the file(s).
- size - size of the file(s).
Code changes:
[langchain-google/pull/179](https://github.com/langchain-ai/langchain-google/pull/179)
Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Thank you for contributing to LangChain!
- [ ] **PR title**: "docs: switched GCSLoaders docs to
langchain-google-community"
- **Description:** switched GCSLoaders docs to
langchain-google-community
Implemented `bind_tools` for OllamaFunctions.
Made OllamaFunctions a subclass of ChatOllama.
Implemented `with_structured_output` for OllamaFunctions.
The integration unit test has been updated.
The notebook has been updated.
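For illustration, a minimal sketch (requires a local Ollama server with `llama3` pulled; the schema is illustrative):
```python
# Hedged sketch: structured output via the new OllamaFunctions surface.
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_experimental.llms.ollama_functions import OllamaFunctions

class Person(BaseModel):
    """Information about a person."""
    name: str = Field(..., description="The person's name")
    age: int = Field(..., description="The person's age")

llm = OllamaFunctions(model="llama3", format="json")
structured_llm = llm.with_structured_output(Person)
print(structured_llm.invoke("Anna is 23 years old"))
```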
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
MindsDB integrates with LangChain, enabling users to deploy, serve, and
fine-tune models available via LangChain within MindsDB, making them
accessible to numerous data sources.
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
* Groundedness Check takes `str` or `list[Document]` as input.
* Deprecate `GroundednessCheck` due to its naming.
* Added `UpstageGroundednessCheck`.
* Hotfix for Groundedness Check parameter.
The name `query` was misleading and it should be `answer` instead.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- [ ] **Kinetica Document Loader**: "community: a class to load
Documents from Kinetica"
- **Description:** implemented KineticaLoader in `kinetica_loader.py`
- **Dependencies:** install the Kinetica API using `pip install
gpudb==7.2.0.1`
- **Description**:
- **add support for more data types**: by default `IpexLLM` will load
the model in int4 format. This PR adds support for more data types such
as `sym_int5`, `sym_int8`, etc. Data formats like NF3, NF4, FP4 and FP8
are only supported on GPU and will be added in a future PR.
- Fix a small issue in saving/loading, update api docs
- **Dependencies**: `ipex-llm` library
- **Document**: In `docs/docs/integrations/llms/ipex_llm.ipynb`, added
instructions for saving/loading low-bit model.
- **Tests**: added new test cases to
`libs/community/tests/integration_tests/llms/test_ipex_llm.py`, added
config params.
- **Contribution maintainer**: @shane-huang
Description: Add support for Semantic topics and entities.
Classification done by pebblo-server is now used to enhance the metadata
of Documents loaded by document loaders.
Dependencies: None
Documentation: Updated.
Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Thank you for contributing to LangChain!
- **Description:** In VoyageAI text-embedding examples, use the
voyage-law-2 model
- [x] **Add tests and docs**
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified.
Thank you for contributing to LangChain!
- [x] **PR title**: Fix misplaced zep cloud example links
- [x] **PR message**:
- **Description:** Fixes misplaced links for vector store and memory zep
cloud examples
- [x] **Add tests and docs**
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified.
- **Description:** Adapt JinaEmbeddings to run with the new Jina AI
Rerank API
- **Twitter handle:** https://twitter.com/JinaAI_
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
* Updating the provider docs page.
The RAG example was meant to be moved to cookbook, but was merged by
mistake.
* Fix bug in Groundedness Check
---------
Co-authored-by: JuHyung-Son <sonju0427@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Description:
- added functionalities - delete, index creation, using existing
connection object, etc.
- updated usage
- added LanceDB Cloud OSS support
`make lint_diff` and `make test` checks done
Implemented the ability to enable full-text search within the
SingleStore vector store, offering users a versatile range of search
strategies. This enhancement allows users to seamlessly combine
full-text search with vector search, enabling the following search
strategies:
* Search solely by vector similarity.
* Conduct searches exclusively based on text similarity, utilizing
Lucene internally.
* Filter search results by text similarity score, with the option to
specify a threshold, followed by a search based on vector similarity.
* Filter results by vector similarity score before conducting a search
based on text similarity.
* Perform searches using a weighted sum of vector and text similarity
scores.
Additionally, integration tests have been added to comprehensively cover
all scenarios.
Updated notebook with examples.
CC: @baskaryan, @hwchase17
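A sketch of what combining the strategies looks like (assumes a reachable SingleStoreDB instance configured via the SINGLESTOREDB_URL env var; the strategy and weight parameter names follow the updated notebook and may differ in detail):
```python
# Hedged sketch: weighted-sum hybrid search over text + vector similarity.
from langchain_community.embeddings import FakeEmbeddings
from langchain_community.vectorstores.singlestoredb import SingleStoreDB

store = SingleStoreDB.from_texts(
    ["apples are sweet", "lemons are sour"],
    FakeEmbeddings(size=16),
    use_full_text_search=True,  # also build the Lucene full-text index
)
docs = store.similarity_search(
    "sweet fruit",
    search_strategy=SingleStoreDB.SearchStrategy.WEIGHTED_SUM,
    text_weight=0.3,
    vector_weight=0.7,
)
```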
---------
Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- added a guard on the `pyTigerGraph` import
- added a missing example page in the `docs/integrations/graphs/`
- formatted the `docs/integrations/providers/` page to the consistent
format. Added links.
- **Description:**
This PR adds support for advanced filtering to the integration of HANA
Vector Engine.
The newly supported filtering operators are: $eq, $ne, $gt, $gte, $lt,
$lte, $between, $in, $nin, $like, $and, $or
- **Issue:** N/A
- **Dependencies:** no new dependencies added
Added integration tests to:
`libs/community/tests/integration_tests/vectorstores/test_hanavector.py`
Description of the new capabilities in notebook:
`docs/docs/integrations/vectorstores/hanavector.ipynb`
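For illustration, a minimal sketch of the new operators (connection details and metadata field names are placeholders):
```python
# Hedged sketch: advanced filtering with the newly supported operators.
from hdbcli import dbapi
from langchain_community.embeddings import FakeEmbeddings
from langchain_community.vectorstores.hanavector import HanaDB

connection = dbapi.connect(
    address="<host>", port=30015, user="<user>", password="<password>"
)
db = HanaDB(connection=connection, embedding=FakeEmbeddings(size=16), table_name="DEMO")
advanced_filter = {
    "$and": [
        {"quality": {"$in": ["good", "excellent"]}},
        {"start": {"$between": [100, 200]}},
        {"name": {"$like": "%value%"}},
    ]
}
docs = db.similarity_search("What's new?", k=2, filter=advanced_filter)
```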
Thank you for contributing to LangChain!
- [x] **PR title**: docs: Update Zep Messaging, add links to Zep Cloud
Docs
- [x] **PR message**:
- **Description:** This PR updates Zep messaging in the docs + links to
Langchain Zep Cloud examples in our documentation
- **Twitter handle:** @paulpaliychuk51
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
langchain_community.document_loaders is deprecated; use the new
langchain_google_community instead.
Thank you for contributing to LangChain!
- [ ] **PR title**: "docs: added a description of differences
langchain_google_genai vs langchain_google_vertexai"
- **Description:** added a description of the differences between
langchain_google_genai and langchain_google_vertexai
**Description:** implemented GraphStore class for the Apache AGE graph DB
**Dependencies:** depends on psycopg2
Unit and integration tests included. Formatting and linting have been
run.
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: docs"
- [ ] **PR message**:
- **Description:** Updated Tutorials for Vertex Vector Search
- **Issue:** NA
- **Dependencies:** NA
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
@lkuligin for review
---------
Co-authored-by: adityarane@google.com <adityarane@google.com>
Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Description: you don't need to pass a version for Replicate official
models. That was broken in LangChain until now!
You can now run:
```python
llm = Replicate(
model="meta/meta-llama-3-8b-instruct",
model_kwargs={"temperature": 0.75, "max_length": 500, "top_p": 1},
)
prompt = """
User: Answer the following yes/no question by reasoning step by step. Can a dog drive a car?
Assistant:
"""
llm(prompt)
```
I've updated the replicate.ipynb to reflect that.
twitter: @charliebholtz
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
VSDX data contains EMF files. Some of these apparently can contain
exploits with some Adobe tools.
This is likely a false positive from antivirus software, but we
can remove it nonetheless.
Hey @eyurtsev, I noticed that the notebook isn't displaying the outputs
properly. I've gone ahead and rerun the cells to ensure that readers can
easily understand the functionality without having to run the code
themselves.
Fix timeout issue.
Fix ZhipuAI use case notebook.
**Community: Unify Titan Takeoff Integrations and Adding Embedding
Support**
**Description:**
Titan Takeoff is no longer accurately reflected by either of the
integrations in the community folder. The two integrations
(TitanTakeoffPro and TitanTakeoff) were causing confusion with clients,
so we have moved the code into one place and created an alias for
backwards compatibility. We added the Takeoff Client Python package to
do the bulk of the work with the requests, because this package is
actively updated with new versions of Takeoff; the integration will
therefore be far more robust and will not degrade as badly over time.
**Issue:**
Fixes bugs in the old Titan integrations and unifies the code, with
added unit test coverage to avoid future problems.
**Dependencies:**
Added the optional dependency takeoff-client; all imports still work
without the dependency, including the Titan Takeoff classes, but they
will fail on initialisation if takeoff-client is not pip installed.
**Twitter**
@MeryemArik9
Thanks all :)
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
## Description:
The PR introduces 3 changes:
1. added `recursive` property to `O365BaseLoader`. (To keep the behavior
unchanged, it is set to `False` by default.) When `recursive=True`,
`_load_from_folder()` also recursively loads all nested folders.
2. added `folder_id` to SharePointLoader (similar to [this
PR](https://github.com/langchain-ai/langchain/pull/10780)). This
provides an alternative to `folder_path`, which doesn't seem to work
reliably.
3. when none of `document_ids`, `folder_id`, `folder_path` is provided,
the loader fetches documents from the root folder. Combined with
`recursive=True`, this provides an easy way of loading all compatible
documents from SharePoint. A sketch of the new options follows this list.
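For illustration, a minimal sketch of the new options (ids are placeholders; authentication follows the standard O365 flow documented for the loader):
```python
# Hedged sketch: folder_id as an alternative to folder_path, plus recursion.
from langchain_community.document_loaders.sharepoint import SharePointLoader

loader = SharePointLoader(
    document_library_id="YOUR_LIBRARY_ID",
    folder_id="YOUR_FOLDER_ID",  # new: alternative to folder_path
    recursive=True,              # new: also load nested folders
)
docs = loader.load()
```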
The PR contains the same logic as [this stale
PR](https://github.com/langchain-ai/langchain/pull/10780) by
@WaleedAlfaris. I'd like to ask his blessing for moving forward with
this one.
## Issue:
- As described in https://github.com/langchain-ai/langchain/issues/19938
and https://github.com/langchain-ai/langchain/pull/10780 the sharepoint
loader often does not seem to work with folder_path.
- Recursive loading of subfolders is a missing functionality
## Dependencies: None
Twitter handle:
@martintriska1 @WRhetoric
This is my first PR here, please be gentle :-)
Please review @baskaryan
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
This PR updates OctoAIEndpoint LLM to subclass BaseOpenAI as OctoAI is
an OpenAI-compatible service. The documentation and tests have also been
updated.
**Description:** Adds ThirdAI NeuralDB retriever integration. NeuralDB
is a CPU-friendly and fine-tunable text retrieval engine. We previously
added a vector store integration, but we think it will be easier for our
customers if they can also find us under
langchain-community/retrievers.
---------
Co-authored-by: kartikTAI <129414343+kartikTAI@users.noreply.github.com>
Co-authored-by: Kartik Sarangmath <kartik@thirdai.com>
- **Description:**
This PR adds a callback handler for UpTrain. It performs evaluations in
the RAG pipeline to check the quality of retrieved documents, generated
queries and responses.
- **Dependencies:**
- The UpTrainCallbackHandler requires the uptrain package
---------
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
- **Description:**Add Google Firestore Vector store docs
- **Issue:** NA
- **Dependencies:** NA
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
**Description:** Updates the documentation for Portkey and Langchain.
Also updates the notebook. The current documentation is fairly old and
is non-functional.
**Twitter handle:** @portkeyai
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Added the [FireCrawl](https://firecrawl.dev) document loader. Firecrawl
crawls and converts any website into LLM-ready data. It crawls all
accessible subpages and gives you clean markdown for each.
- **Description:** Adds FireCrawl data loader
- **Dependencies:** firecrawl-py
- **Twitter handle:** @mendableai
ccing contributors: (@ericciarla @nickscamara)
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
avaliable -> available
- **Description:** fixed a typo (avaliable -> available)
- make Tencent Cloud VectorDB support metadata filtering.
- implement delete function for Tencent Cloud VectorDB.
- support both Langchain Embedding model and Tencent Cloud VDB embedding
model.
- Tencent Cloud VectorDB support filter search keyword, compatible with
langchain filtering syntax.
- add Tencent Cloud VectorDB TranslationVisitor, now work with self
query retriever.
- more documentations.
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- **Description:** Langchain-Predibase integration was failing, because
it was not current with the Predibase SDK; in addition, Predibase
integration tests were instantiating the Langchain Community `Predibase`
class with one required argument (`model`) missing. This change updates
the Predibase SDK usage and fixes the integration tests.
- **Twitter handle:** `@alexsherstinsky`
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Last year Microsoft [changed the
name](https://learn.microsoft.com/en-us/azure/search/search-what-is-azure-search)
of Azure Cognitive Search to Azure AI Search. This PR updates the
LangChain Azure retriever API and its associated docs to reflect this
change. It may be confusing for users to see the name Cognitive here
while the Microsoft documentation says AI, which is why this is needed.
I've also added a more detailed example to the Azure retriever doc page.
There are more places that need a similar update, but I'm breaking it up
so the PRs are not too big 😄 This PR also fixes my errors from the
previous one.
Twitter: @marlene_zw
Two new tests were added to check backward compatibility in
`libs/community/tests/integration_tests/retrievers/test_azure_cognitive_search.py`
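The added example looks roughly like this (the environment variable
names are assumptions based on the renamed retriever):

```python
import os

from langchain_community.retrievers import AzureAISearchRetriever

os.environ["AZURE_AI_SEARCH_SERVICE_NAME"] = "<your-search-service>"
os.environ["AZURE_AI_SEARCH_INDEX_NAME"] = "<your-index>"
os.environ["AZURE_AI_SEARCH_API_KEY"] = "<your-api-key>"

retriever = AzureAISearchRetriever(content_key="content", top_k=3)
docs = retriever.invoke("What is Azure AI Search?")
```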
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
- Added missing providers
- Added links and descriptions to related examples
- Formatted everything in a consistent style
Co-authored-by: Erick Friis <erick@langchain.dev>
Updated a page with existing document loaders with links to examples.
Fixed formatting of one example.
Co-authored-by: Erick Friis <erick@langchain.dev>
Issue: The `graph` code was moved into the `community` package long
ago, but the related documentation is still in the
[use_cases](https://python.langchain.com/docs/use_cases/graph/integrations/diffbot_graphtransformer)
section and not in the `integrations`.
Changes:
- moved the `use_cases/graph/integrations` notebooks into
`integrations/graphs`
- renamed files and changed titles to follow the consistent format
- redirected old page URLs to new URLs in `vercel.json` and in several
other pages
- added descriptions and links where necessary
- formatted everything into the consistent format
Should hopefully avoid weird broken link edge cases.
Relative links now trip up the Docusaurus broken link checker, so this
PR also removes them.
Also snuck in a small addition about asyncio.
**Description:** Update of Cohere documentation (main provider page)
**Issue:** After addition of the Cohere partner package, the
documentation was out of date
**Dependencies:** None
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
cohere: update imports and installs to langchain_cohere
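After this change, imports come from the partner package instead of
`langchain_community`, along these lines (a minimal sketch):

```python
# pip install langchain-cohere
from langchain_cohere import ChatCohere

chat = ChatCohere()  # reads COHERE_API_KEY from the environment
print(chat.invoke("Hello from langchain_cohere!"))
```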
---------
Co-authored-by: Harry M <127103098+harry-cohere@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
**Description:** adds integration with [Layerup
Security](https://uselayerup.com). Docs can be found
[here](https://docs.uselayerup.com). Integrates directly with our Python
SDK.
**Dependencies:**
[LayerupSecurity](https://pypi.org/project/LayerupSecurity/)
**Note**: all methods for our product require a paid API key, so I only
included one test, which checks for an invalid API key response. I have
tested extensively locally.
**Twitter handle**: [@layerup_](https://twitter.com/layerup_)
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
[Dria](https://dria.co/) is a hub of public RAG models for developers to
both contribute and utilize a shared embedding lake. This PR adds a
retriever that can retrieve documents from Dria.
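A retrieval sketch (the `api_key`/`contract_id` constructor arguments
are assumptions; see the retriever docs for the exact names):

```python
from langchain_community.retrievers import DriaRetriever

retriever = DriaRetriever(
    api_key="<DRIA_API_KEY>",           # assumed parameter name
    contract_id="<knowledge-base-id>",  # assumed: id of the embedding lake to query
)
docs = retriever.invoke("What is a shared embedding lake?")
```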
Description: Update `ChatZhipuAI` to support the latest `glm-4` model.
Issue: N/A
Dependencies: httpx, httpx-sse, PyJWT
The previous `ChatZhipuAI` implementation required the `zhipuai`
package and could not call the latest GLM models, because:
- The old version `zhipuai==1.*` doesn't support the latest models.
- `zhipuai==2.*` requires `pydantic` V2, which is incompatible with
`langchain-community`.
This re-implementation invokes the GLM model by sending HTTP requests to
[open.bigmodel.cn](https://open.bigmodel.cn/dev/api) via the `httpx`
package, and uses the `httpx-sse` package to handle stream events.
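Usage after the re-implementation looks like this (a sketch; the
`ZHIPUAI_API_KEY` variable name is an assumption):

```python
import os

from langchain_community.chat_models import ChatZhipuAI

os.environ["ZHIPUAI_API_KEY"] = "<your-api-key>"

chat = ChatZhipuAI(model="glm-4", temperature=0.5)
print(chat.invoke("Tell me a joke about HTTP."))
```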
---------
Co-authored-by: zR <2448370773@qq.com>
**Description:** An extraneous `U` argument in the pip install
instructions for the MediaWiki Dump Document loader was causing the
installation to fail. Removing the argument fixes the install command.
**Issue:** #19820
**Dependencies:** No dependency change required
**Twitter handle:** [@vardhaman722](https://twitter.com/vardhaman722)
- **Description:** Support reranking based on cross encoder models
available from HuggingFace.
- Added `CrossEncoder` schema
- Implemented `HuggingFaceCrossEncoder` and
`SagemakerEndpointCrossEncoder`
- Implemented `CrossEncoderReranker`, which provides functionality
similar to `CohereRerank` (see the sketch after this list)
- Added `cross-encoder-reranker.ipynb` to demonstrate how to use it.
Please let me know if anything else needs to be done to make it visible
on the table-of-contents navigation bar on the left, or on the card list
on [retrievers documentation
page](https://python.langchain.com/docs/integrations/retrievers).
- **Issue:** N/A
- **Dependencies:** None other than the existing ones.
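The sketch mentioned above, assuming an existing `base_retriever`; the
model name is only an example:

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CrossEncoderReranker
from langchain_community.cross_encoders import HuggingFaceCrossEncoder

model = HuggingFaceCrossEncoder(model_name="BAAI/bge-reranker-base")
compressor = CrossEncoderReranker(model=model, top_n=3)

# Wrap any existing retriever so its results are reranked by the cross encoder.
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=base_retriever
)
docs = compression_retriever.invoke("What is the plan for the economy?")
```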
---------
Co-authored-by: Kenny Choe <kchoe@amazon.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
### Description
This implementation adds functionality from the AlphaVantage API,
renowned for its comprehensive financial data. The class encapsulates
various methods, each dedicated to fetching specific types of financial
information from the API.
### Implemented Functions
- **`search_symbols`**:
- Searches the AlphaVantage API for financial symbols using the provided
keywords.
- **`_get_market_news_sentiment`**:
- Retrieves market news sentiment for a specified stock symbol from the
AlphaVantage API.
- **`_get_time_series_daily`**:
- Fetches daily time series data for a specific symbol from the
AlphaVantage API.
- **`_get_quote_endpoint`**:
- Obtains the latest price and volume information for a given symbol
from the AlphaVantage API.
- **`_get_time_series_weekly`**:
- Gathers weekly time series data for a particular symbol from the
AlphaVantage API.
- **`_get_top_gainers_losers`**:
- Provides details on top gainers, losers, and most actively traded
tickers in the US market from the AlphaVantage API.
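For example, the public entry point can be exercised like this (a
sketch; it assumes the wrapper reads `ALPHAVANTAGE_API_KEY` from the
environment):

```python
import os

from langchain_community.utilities.alpha_vantage import AlphaVantageAPIWrapper

os.environ["ALPHAVANTAGE_API_KEY"] = "<your-key>"

alpha_vantage = AlphaVantageAPIWrapper()
print(alpha_vantage.search_symbols("AAPL"))  # search financial symbols by keyword
```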
### Issue:
- #11994
### Dependencies:
- `requests` library for HTTP requests (`import requests`)
- `pytest` library for testing (`import pytest`)
---------
Co-authored-by: Adam Badar <94140103+adam-badar@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Code written by following the official documentation
of [Google Drive
Loader](https://python.langchain.com/docs/integrations/document_loaders/google_drive)
gives errors. I have opened an issue regarding this; see #14725. This is
a pull request modifying the documentation to use an approach that makes
the code work: we need to always set the GOOGLE_APPLICATION_CREDENTIALS
env var to an empty string, rather than only in case of a RefreshError
(see the snippet after this list). I also rewrote two paragraphs to make
the instructions clearer.
- **Issue:** See this related [issue #
14725](https://github.com/langchain-ai/langchain/issues/14725)
- **Dependencies:** NA
- **Tag maintainer:** @baskaryan
- **Twitter handle:** NA
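The snippet mentioned above boils down to this, placed before
instantiating the loader:

```python
import os

# Always set this to an empty string up front, not only after catching a
# RefreshError, so the default-credentials lookup does not interfere.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = ""
```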
Co-authored-by: Snehil <snehil@example.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- [x] **PR title**: "community: added support for llmsherpa library"
- [x] **Add tests and docs**:
1. Integration test:
'docs/docs/integrations/document_loaders/test_llmsherpa.py'.
2. an example notebook:
`docs/docs/integrations/document_loaders/llmsherpa.ipynb`.
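For context, the loader is used roughly like this (class and parameter
names are assumptions; see the example notebook for the exact API):

```python
from langchain_community.document_loaders.llmsherpa import LLMSherpaFileLoader

loader = LLMSherpaFileLoader(
    file_path="example.pdf",
    strategy="chunks",  # assumed: how llmsherpa sections the document
    llmsherpa_api_url="https://readers.llmsherpa.com/api/document/developer/parseDocument?renderFormat=all",
)
docs = loader.load()
```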
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
This PR also drops the community-added action for checking broken links
in MDX. It does not work well for our use case: it throws errors for
local paths, on top of the errors our in-house solution already had.
- **Description:**
1. Fix the BiliBiliLoader so that it can receive cookie parameters; it
requires three other parameters to run. The change is backward
compatible (see the sketch below).
2. Add tests.
3. Add an example to the docs.
- **Issue:** [#14213]
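The sketch mentioned above (cookie values are placeholders; the
parameter names follow the three cookie fields BiliBili requires, and
are assumptions):

```python
from langchain_community.document_loaders import BiliBiliLoader

loader = BiliBiliLoader(
    video_urls=["https://www.bilibili.com/video/BV1g84y1R7oE/"],
    sessdata="<SESSDATA>",  # the three cookie parameters needed to run
    bili_jct="<BILI_JCT>",
    buvid3="<BUVID3>",
)
docs = loader.load()
```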
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- [x] **PR title**: "community: Support streaming in Azure ML and few
naming changes"
- [x] **PR message**:
- **Description:** Added support for streaming for azureml_endpoint.
Renamed AzureMLEndpointApiType.realtime to
AzureMLEndpointApiType.dedicated. Added new classes
CustomOpenAIChatContentFormatter and CustomOpenAIContentFormatter, and
updated LlamaChatContentFormatter and LlamaContentFormatter to show a
deprecation warning when instantiated. A usage sketch follows below.
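The sketch mentioned above (endpoint values are placeholders):

```python
from langchain_community.chat_models.azureml_endpoint import (
    AzureMLChatOnlineEndpoint,
    AzureMLEndpointApiType,
    CustomOpenAIChatContentFormatter,
)

chat = AzureMLChatOnlineEndpoint(
    endpoint_url="https://<endpoint>.<region>.inference.ml.azure.com/score",
    endpoint_api_type=AzureMLEndpointApiType.dedicated,  # formerly .realtime
    endpoint_api_key="<your-key>",
    content_formatter=CustomOpenAIChatContentFormatter(),
)
```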
---------
Co-authored-by: Sachin Paryani <saparan@microsoft.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Add our Solar chat models. Available model choices:
* solar-1-mini-chat
* solar-1-mini-translate-enko
* solar-1-mini-translate-koen
More documentation and pricing can be found at
https://console.upstage.ai/services/solar.
The reference for our Solar model can be found at
* https://arxiv.org/abs/2402.17032
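A chat sketch, assuming the model is exposed as `SolarChat` in
`langchain_community.chat_models.solar` and reads `SOLAR_API_KEY` from
the environment (both assumptions):

```python
import os

from langchain_community.chat_models.solar import SolarChat

os.environ["SOLAR_API_KEY"] = "<your-key>"  # assumed variable name

chat = SolarChat(model="solar-1-mini-chat")
print(chat.invoke("Translate 'Good morning' from English to Korean."))
```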
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
This is a follow up to #18371. These are the changes:
- New **Azure AI Services** toolkit and tools to replace those of
**Azure Cognitive Services**.
- Updated documentation for Microsoft platform.
- The image analysis tool has been rewritten to use the new package
`azure-ai-vision-imageanalysis`, doing a proper replacement of
`azure-ai-vision`.
These changes:
- Update outdated naming from "Azure Cognitive Services" to "Azure AI
Services".
- Update documentation to use non-deprecated methods to create and use
agents.
- Removes the need to depend on the yanked Python package
(`azure-ai-vision`).
There is one new dependency needed as a replacement for
`azure-ai-vision`:
- `azure-ai-vision-imageanalysis`. This is optional and declared within
a function.
There is a new `azure_ai_services.ipynb` notebook showing usage (a
sketch follows below); changes have been linted and formatted.
I am leaving the actions of adding deprecation notices and future
removal of Azure Cognitive Services up to the LangChain team, as I am
not sure what the current practice around this is.
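The sketch mentioned above (the environment variable names are
assumptions; see the new notebook for the exact setup):

```python
import os

from langchain_community.agent_toolkits import AzureAiServicesToolkit

os.environ["AZURE_AI_SERVICES_KEY"] = "<your-key>"  # assumed variable names
os.environ["AZURE_AI_SERVICES_ENDPOINT"] = "<your-endpoint>"

toolkit = AzureAiServicesToolkit()
tools = toolkit.get_tools()  # e.g. the rewritten image analysis tool
```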
---
If this PR makes it, my handle is @galo@mastodon.social
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
- **Description**: `bigdl-llm` library has been renamed to
[`ipex-llm`](https://github.com/intel-analytics/ipex-llm). This PR
migrates the `bigdl-llm` integration to `ipex-llm`.
- **Issue**: N/A. The original PR of `bigdl-llm` is
https://github.com/langchain-ai/langchain/pull/17953
- **Dependencies**: `ipex-llm` library
- **Contribution maintainer**: @shane-huang
Updated doc: docs/docs/integrations/llms/ipex_llm.ipynb
Updated test:
libs/community/tests/integration_tests/llms/test_ipex_llm.py
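Usage after the migration looks roughly like this (the model id and
kwargs are illustrative):

```python
from langchain_community.llms import IpexLLM

llm = IpexLLM.from_model_id(
    model_id="lmsys/vicuna-7b-v1.5",
    model_kwargs={"temperature": 0, "max_length": 64, "trust_remote_code": True},
)
print(llm.invoke("What is IPEX-LLM?"))
```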
- **Description:** Add support for Intel Lab's [Visual Data Management
System (VDMS)](https://github.com/IntelLabs/vdms) as a vector store
- **Dependencies:** `vdms` library which requires protobuf = "4.24.2".
There is a conflict with dashvector in `langchain` package but conflict
is resolved in `community`.
- **Contribution maintainer:** [@cwlacewe](https://github.com/cwlacewe)
- **Added tests:**
libs/community/tests/integration_tests/vectorstores/test_vdms.py
- **Added docs:** docs/docs/integrations/vectorstores/vdms.ipynb
- **Added cookbook:** cookbook/multi_modal_RAG_vdms.ipynb
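A vector store sketch (the `VDMS_Client` helper and parameter names are
assumptions; see the added notebook for the exact API):

```python
from langchain_community.vectorstores import VDMS
from langchain_community.vectorstores.vdms import VDMS_Client

client = VDMS_Client(host="localhost", port=55555)  # assumed helper and defaults
db = VDMS.from_documents(
    docs,                  # an existing list of Documents
    client=client,
    embedding=embeddings,  # any LangChain embedding model
    collection_name="my_docs",
)
results = db.similarity_search("What is VDMS?", k=3)
```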
---------
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Updates the Meilisearch vectorstore for compatibility
with v1.6 and above. Adds the embedders settings and the embedder_name
parameter, which are now required (see the sketch below).
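The sketch mentioned above; field names follow Meilisearch's v1.6
`userProvided` embedder configuration and may differ:

```python
from langchain_community.vectorstores import Meilisearch

vector_store = Meilisearch.from_texts(
    texts,                 # an existing list of strings
    embedding=embeddings,  # any LangChain embedding model
    embedders={"custom": {"source": "userProvided", "dimensions": 768}},
    embedder_name="custom",
)
```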
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
### Introduction
[Intel® Extension for
Transformers](https://github.com/intel/intel-extension-for-transformers)
(ITREX) is an innovative toolkit designed to accelerate GenAI/LLM
workloads everywhere, with optimal performance of Transformer-based
models on various Intel platforms.
### Description
Adds ITREX runtime embeddings using intel-extension-for-transformers,
along with MDX documentation, example notebooks, and an embedding import
test (a sketch follows below).
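The sketch mentioned above; the class name `QuantizedBgeEmbeddings` and
the model id are assumptions based on the ITREX integration:

```python
from langchain_community.embeddings import QuantizedBgeEmbeddings

embeddings = QuantizedBgeEmbeddings(
    model_name="Intel/bge-small-en-v1.5-sts-int8-static-inc",  # assumed quantized model
    encode_kwargs={"normalize_embeddings": True},
)
vector = embeddings.embed_query("What is ITREX?")
```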
---------
Signed-off-by: yuwenzho <yuwen.zhou@intel.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Update Azure Document Intelligence implementation by
Microsoft team and RAG cookbook with Azure AI Search
---------
Co-authored-by: Lu Zhang (AI) <luzhan@microsoft.com>
Co-authored-by: Yateng Hong <yatengh@microsoft.com>
Co-authored-by: teethache <hongyateng2006@126.com>
Co-authored-by: Lu Zhang <44625949+luzhang06@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- **Description:** Implemented a try-except block for
`GCSDirectoryLoader`. Reason: users processing large numbers of
unstructured files in a folder may experience many different errors. A
try-except block is added to capture these errors. A new argument
`use_try_except=True` enables *silent failure*, so that an error caused
by processing one file does not break the whole run (see the sketch
below).
- **Issue:** N/A
- **Dependencies:** no new dependencies
- **Twitter handle:** timothywong731
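The sketch mentioned above (the `project_name`/`bucket` parameter names
follow the loader's usual constructor, to the best of our knowledge):

```python
from langchain_community.document_loaders import GCSDirectoryLoader

loader = GCSDirectoryLoader(
    project_name="my-gcp-project",
    bucket="my-bucket",
    use_try_except=True,  # errors on individual files are captured, not fatal
)
docs = loader.load()
```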
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Adds an Oracle Autonomous Database document loader
integration. This allows users to connect to Oracle Autonomous Database
through a connection string or TNS configuration.
https://www.oracle.com/autonomous-database/
- **Issue:** None
- **Dependencies:** oracledb python package
https://pypi.org/project/oracledb/
Unit tests and docs are added; a loading sketch follows below.
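The sketch mentioned above; the class name follows this PR's
description, and the constructor arguments are assumptions:

```python
from langchain_community.document_loaders import OracleAutonomousDatabaseLoader

loader = OracleAutonomousDatabaseLoader(
    query="SELECT owner, table_name FROM all_tables",  # rows become Documents
    user="<user>",                                     # assumed parameter names
    password="<password>",
    connection_string="<host>:<port>/<service_name>",  # or TNS configuration
)
docs = loader.load()
```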
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Update to correctly import the Document class:
`from langchain_core.documents import Document`
- **Issue:** Fixes the notebook and the hosted documentation
[here](https://python.langchain.com/docs/integrations/tools/apify)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Fixing some issues for AzureCosmosDBSemanticCache
- Added the entry for "AzureCosmosDBSemanticCache" which was missing in
langchain/cache.py
- Added application name when creating the MongoClient for the
AzureCosmosDBVectorSearch, for tracking purposes.
@baskaryan, can you please review this PR? We need this to go in ASAP.
These are just small fixes that we found today in our testing.
Description: Added support for the AI21 Labs Segmentation model as a
text splitter.
Dependencies: ai21, langchain-text-splitter
Twitter handle: https://github.com/AI21Labs
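A splitting sketch, assuming the splitter is exposed as
`AI21SemanticTextSplitter` and reads `AI21_API_KEY` from the
environment:

```python
from langchain_ai21 import AI21SemanticTextSplitter

splitter = AI21SemanticTextSplitter()
chunks = splitter.split_text(
    "We're big fans of AI21's models. "
    "Segmentation splits long text into coherent semantic chunks."
)
print(len(chunks))
```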
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
### Prem SDK integration in LangChain
This PR integrates [PremAI's](https://www.premai.io/) prem-sdk with
LangChain. Users can now access deployed models (LLMs/embeddings) and
use them within LangChain's ecosystem.
### This PR adds the following:
- [x] Add chat support
- [X] Adding embedding support
- [X] writing integration tests
- [X] writing tests for chat
- [X] writing tests for embedding
- [X] writing unit tests
- [X] writing tests for chat
- [X] writing tests for embedding
- [X] Adding documentation
- [X] writing documentation for chat
- [X] writing documentation for embedding
- [X] run `make test`
- [X] run `make lint`, `make lint_diff`
- [X] Final checks (spell check, lint, format and overall testing)
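A chat sketch, assuming the model is exposed as `ChatPremAI` and keyed
to a Prem project id (parameter names may differ; see the added
documentation):

```python
from langchain_community.chat_models import ChatPremAI

chat = ChatPremAI(project_id=8)  # assumed: a Prem project id; reads PREMAI_API_KEY
print(chat.invoke("What is a good name for a company that makes colorful socks?"))
```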
---------
Co-authored-by: Anindyadeep Sannigrahi <anindyadeepsannigrahi@Anindyadeeps-MacBook-Pro.local>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
This PR aims to enhance the documentation for TiDB integration, driven
by feedback from our users. It provides detailed introductions to key
features, ensuring developers can fully leverage TiDB for AI application
development.