* Adds BlobParsers for images. These implementations can take an image
and produce one or more documents per image. This interface can be used
for exposing OCR capabilities.
* Update PyMuPDFParser and Loader to standardize metadata, handle
images, improve table extraction etc.
- **Twitter handle:** pprados
This is one part of a larger Pull Request (PR) that is too large to be
submitted all at once.
This specific part focuses to prepare the update of all parsers.
For more details, see [PR
28970](https://github.com/langchain-ai/langchain/pull/28970).
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
## Description
- Responding to `NCP API Key` changes.
- To fix `ChatClovaX` `astream` function to raise `SSEError` when an
error event occurs.
- To add `token length` and `ai_filter` to ChatClovaX's
`response_metadata`.
- To update document for apply NCP API Key changes.
cc. @efriis @vbarda
Add tools to interact with Dappier APIs with an example notebook.
For `DappierRealTimeSearchTool`, the tool can be invoked with:
```python
from langchain_dappier import DappierRealTimeSearchTool
tool = DappierRealTimeSearchTool()
tool.invoke({"query": "What happened at the last wimbledon"})
```
```
At the last Wimbledon in 2024, Carlos Alcaraz won the title by defeating Novak Djokovic. This victory marked Alcaraz's fourth Grand Slam title at just 21 years old! 🎉🏆🎾
```
For DappierAIRecommendationTool the tool can be invoked with:
```python
from langchain_dappier import DappierAIRecommendationTool
tool = DappierAIRecommendationTool(
data_model_id="dm_01j0pb465keqmatq9k83dthx34",
similarity_top_k=3,
ref="sportsnaut.com",
num_articles_ref=2,
search_algorithm="most_recent",
)
```
```
[{"author": "Matt Weaver", "image_url": "https://images.dappier.com/dm_01j0pb465keqmatq9k83dthx34...", "pubdate": "Fri, 17 Jan 2025 08:04:03 +0000", "source_url": "https://sportsnaut.com/chili-bowl-thursday-bell-column/", "summary": "The article highlights the thrilling unpredictability... ", "title": "Thursday proves why every lap of Chili Bowl..."},
{"author": "Matt Higgins", "image_url": "https://images.dappier.com/dm_01j0pb465keqmatq9k83dthx34...", "pubdate": "Fri, 17 Jan 2025 02:48:42 +0000", "source_url": "https://sportsnaut.com/new-york-mets-news-pete-alonso...", "summary": "The New York Mets are likely parting ways with star...", "title": "MLB insiders reveal New York Mets’ last-ditch..."},
{"author": "Jim Cerny", "image_url": "https://images.dappier.com/dm_01j0pb465keqmatq9k83dthx34...", "pubdate": "Fri, 17 Jan 2025 05:10:39 +0000", "source_url": "https://www.foreverblueshirts.com/new-york-rangers-news...", "summary": "The New York Rangers achieved a thrilling 5-3 comeback... ", "title": "Rangers score 3 times in 3rd period for stirring 5-3..."}]
```
The integration package can be found over here -
https://github.com/DappierAI/langchain-dappier
Expanded the Amazon Neptune documentation with new sections detailing
usage of chat message history with the
`create_neptune_opencypher_qa_chain` and
`create_neptune_sparql_qa_chain` functions.
Title: community: add Financial Modeling Prep (FMP) API integration
Description: Adding LangChain integration for Financial Modeling Prep
(FMP) API to enable semantic search and structured tool creation for
financial data endpoints. This integration provides semantic endpoint
search using vector stores and automatic tool creation with proper
typing and error handling. Users can discover relevant financial
endpoints using natural language queries and get properly typed
LangChain tools for discovered endpoints.
Issue: N/A
Dependencies:
fmp-data>=0.3.1
langchain-core>=0.1.0
faiss-cpu
tiktoken
Twitter handle: @mehdizarem
Unit tests and example notebook have been added:
Tests are in tests/integration_tests/est_tools.py and
tests/unit_tests/test_tools.py
Example notebook is in docs/tools.ipynb
All format, lint and test checks pass:
pytest
mypy .
Dependencies are imported within functions and not added to
pyproject.toml. The changes are backwards compatible and only affect the
community package.
---------
Co-authored-by: mehdizare <mehdizare@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
- [x] **PR title**: "docs: Fix typo in documentation"
- [x] **PR message**:
- **Description:** Fixed a typo in the documentation, changing "An
vectorstore" to "A vector store" for grammatical accuracy.
- **Issue:** N/A (no issue filed for this typo fix)
- **Dependencies:** None
- **Twitter handle:** N/A
- [x] **Add tests and docs**: This is a minor documentation fix that
doesn't require additional tests or example notebooks.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Related: https://github.com/langchain-ai/langchain-aws/pull/322
The legacy `NeptuneOpenCypherQAChain` and `NeptuneSparqlQAChain` classes
are being replaced by the new LCEL format chains
`create_neptune_opencypher_qa_chain` and
`create_neptune_sparql_qa_chain`, respectively, in the `langchain_aws`
package.
This PR adds deprecation warnings to all Neptune classes and functions
that have been migrated to `langchain_aws`. All relevant documentation
has also been updated to replace `langchain_community` usage with the
new `langchain_aws` implementations.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
### Description
This PR adds docs for the
[langchain-hyperbrowser](https://pypi.org/project/langchain-hyperbrowser/)
package. It includes a document loader that uses Hyperbrowser to scrape
or crawl any urls and return formatted markdown or html content as well
as relevant metadata.
[Hyperbrowser](https://hyperbrowser.ai) is a platform for running and
scaling headless browsers. It lets you launch and manage browser
sessions at scale and provides easy to use solutions for any webscraping
needs, such as scraping a single page or crawling an entire site.
### Issue
None
### Dependencies
None
### Twitter Handle
`@hyperbrowser`
Fixed a broken link in `document_loader_markdown.ipynb` to point to the
updated documentation page for the Unstructured package.
Issue: N/A
Dependencies: None
…angChain parser issue.
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
Thank you for contributing to LangChain!
Update presented model in `WatsonxRerank` documentation.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Update model for one notebook that specified `gpt-4`.
Otherwise just updating cassettes.
---------
Co-authored-by: Jacob Lee <jacoblee93@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:**
Adding voyage-3-large model to the .ipynb file (its just extending a
list, so not even a code change)
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
fix to word "can"
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
Add upstage document parse loader to pdf loaders
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
## Langchain Kùzu
### Description
This PR adds docs for the `langchain-kuzu` package [on
PyPI](https://pypi.org/project/langchain-kuzu/) that was recently
published, allowing Kùzu users to more easily use and work with
LangChain QA chains. The package will also make it easier for the Kùzu
team to continue supporting and updating the integration over future
releases.
### Twitter Handle
Please tag [@kuzudb](https://x.com/kuzudb) on Twitter once this PR is
merged, so LangChain users can be notified!
---------
Co-authored-by: Erick Friis <erickfriis@gmail.com>
- [x] **PR title**: "docs: add langchain-pull-md Markdown loader"
- [x] **PR message**:
- **Description:** This PR introduces the `langchain-pull-md` package to
the LangChain community. It includes a new document loader that utilizes
the pull.md service to convert URLs into Markdown format, particularly
useful for handling web pages rendered with JavaScript frameworks like
React, Angular, or Vue.js. This loader helps in efficient and reliable
Markdown conversion directly from URLs without local rendering, reducing
server load.
- **Issue:** NA
- **Dependencies:** requests >=2.25.1
- **Twitter handle:** https://x.com/eugeneevstafev?s=21
- [x] **Add tests and docs**:
1. Added unit tests to verify URL checking and conversion
functionalities.
2. Created a comprehensive example notebook detailing the usage of the
new loader.
- [x] **Lint and test**:
- Completed local testing using `make format`, `make lint`, and `make
test` commands as per the LangChain contribution guidelines.
**Related Links:**
- [Package Repository](https://github.com/chigwell/langchain-pull-md)
- [PyPI Package](https://pypi.org/project/langchain-pull-md/)
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
spell check
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
Add a retriever to interact with Dappier APIs with an example notebook.
The retriever can be invoked with:
```python
from langchain_dappier import DappierRetriever
retriever = DappierRetriever(
data_model_id="dm_01jagy9nqaeer9hxx8z1sk1jx6",
k=5
)
retriever.invoke("latest tech news")
```
To retrieve 5 documents related to latest news in the tech sector. The
included notebook also includes deeper details about controlling filters
such as selecting a data model, number of documents to return, site
domain reference, minimum articles from the reference domain, and search
algorithm, as well as including the retriever in a chain.
The integration package can be found over here -
https://github.com/DappierAI/langchain-dappier
Problem:
"Optional" object is used in one example without importing, which raises
the following error when copying the example into IDE or Jupyter Lab

Solution:
Just importing Optional from typing_extensions module, this solves the
problem!
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
## Description
This pull request updates the documentation for FAISS regarding filter
construction, following the changes made in commit `df5008f`.
## Issue
None. This is a follow-up PR for documentation of
[#28207](https://github.com/langchain-ai/langchain/pull/28207)
## Dependencies:
None.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
This commit updates the documentation and package registry for the
FalkorDB Chat Message History integration.
**Changes:**
- Added a comprehensive example notebook
falkordb_chat_message_history.ipynb demonstrating how to use FalkorDB
for session-based chat message storage.
- Added a provider notebook for FalkorDB
- Updated libs/packages.yml to register FalkorDB as an integration
package, following LangChain's new guidelines for community
integrations.
**Notes:**
- This update aligns with LangChain's process for registering new
integrations via documentation updates and package registry
modifications.
- No functional or core package changes were made in this commit.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
This pull request updates the documentation in
`docs/docs/how_to/custom_tools.ipynb` to reflect the recommended
approach for generating JSON schemas in Pydantic. Specifically, it
replaces instances of the deprecated `schema()` method with the newer
and more versatile `model_json_schema()`.
## Description
To integrate ModelScope inference API endpoints for both Embeddings,
LLMs and ChatModels, install the package
`langchain-modelscope-integration` (as discussed in issue #28928 ). This
is necessary because the package name `langchain-modelscope` was already
registered by another party.
ModelScope is a premier platform designed to connect model checkpoints
with model applications. It provides the necessary infrastructure to
share open models and promote model-centric development. For more
information, visit GitHub page:
[ModelScope](https://github.com/modelscope).
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
- Example: "community: add foobar LLM"
- **Description:** Update docs to add BoxBlobLoader and extra_fields to
all Box connectors.
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** @BoxPlatform
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
- Title: Fix typo to correct "embedding" to "embeddings" in PGVector
initialization example
- Problem: There is a typo in the example code for initializing the
PGVector class. The current parameter "embedding" is incorrect as the
class expects "embeddings".
- Correction: The corrected code snippet is:
vector_store = PGVector(
embeddings=embeddings,
collection_name="my_docs",
connection="postgresql+psycopg://...",
)
Hi Erick. Coming back from a previous attempt, we now made a separate
package for the CrateDB adapter, called `langchain-cratedb`, as advised.
Other than registering the package within `libs/packages.yml`, this
patch includes a minimal amount of documentation to accompany the advent
of this new package. Let us know about any mistakes we made, or changes
you would like to see. Thanks, Andreas.
## About
- **Description:** Register a new database adapter package,
`langchain-cratedb`, providing traditional vector store, document
loader, and chat message history features for a start.
- **Addressed to:** @efriis, @eyurtsev
- **References:** GH-27710
- **Preview:** [Providers » More »
CrateDB](https://langchain-git-fork-crate-workbench-register-la-4bf945-langchain.vercel.app/docs/integrations/providers/cratedb/)
## Status
- **PyPI:** https://pypi.org/project/langchain-cratedb/
- **GitHub:** https://github.com/crate/langchain-cratedb
- **Documentation (CrateDB):**
https://cratedb.com/docs/guide/integrate/langchain/
- **Documentation (LangChain):** _This PR._
## Backlog?
Is this applicable for this kind of patch?
> - [ ] **Add tests and docs**: If you're adding a new integration,
please include
> 1. a test for the integration, preferably unit tests that do not rely
on network access,
> 2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
## Q&A
1. Notebooks that use the LangChain CrateDB adapter are currently at
[CrateDB LangChain
Examples](https://github.com/crate/cratedb-examples/tree/main/topic/machine-learning/llm-langchain),
and the documentation refers to them. Because they are derived from very
old blueprints coming from LangChain 0.0.x times, we guess they need a
refresh before adding them to `docs/docs/integrations`. Is it applicable
to merge this minimal package registration + documentation patch, which
already includes valid code snippets in `cratedb.mdx`, and add
corresponding notebooks on behalf of a subsequent patch later?
2. How would it work getting into the tabular list of _Integration
Packages_ enumerated on the [documentation entrypoint page about
Providers](https://python.langchain.com/docs/integrations/providers/)?
/cc Please also review, @ckurze, @wierdvanderhaar, @kneth,
@simonprickett, if you can find the time. Thanks!
- **Description:** The aload function, contrary to its name, is not an
asynchronous function, so it cannot work concurrently with other
asynchronous functions.
- **Issue:** #28336
- **Test: **: Done
- **Docs: **
[here](e0a95e5646/docs/docs/integrations/document_loaders/web_base.ipynb (L201))
- **Lint: ** All checks passed
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
**Issue**:
This tutorial depends on langgraph, however Langgraph is not mentioned
on the installation section for the tutorial, which raises an error when
copying and pasting the code snippets as following:

**Solution**:
Just adding langgraph package to installation section, for both pip and
Conda tabs as this tutorial requires it.
- *[x] **PR title**: "community: adding langchain-predictionguard
partner package documentation"
- *[x] **PR message**:
- **Description:** This PR adds documentation for the
langchain-predictionguard package to main langchain repo, along with
deprecating current Prediction Guard LLMs package. The LLMs package was
previously broken, so I also updated it one final time to allow it to
continue working from this point onward. . This enables users to chat
with LLMs through the Prediction Guard ecosystem.
- **Package Links**:
- [PyPI](https://pypi.org/project/langchain-predictionguard/)
- [Github
Repo](https://www.github.com/predictionguard/langchain-predictionguard)
- **Issue:** None
- **Dependencies:** None
- **Twitter handle:** [@predictionguard](https://x.com/predictionguard)
- *[x] **Add tests and docs**: All docs have been added for the partner
package, and the current LLMs package test was updated to reflect
changes.
- *[x] **Lint and test**: Linting tests are all passing.
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
Issue: several Google integrations are implemented on the
[github.com/googleapis](https://github.com/googleapis) organization
repos and these integrations are almost lost. But they are essential
integrations.
Change: added a list of all packages that have Google integrations.
Added a description of this situation.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
**Description:**
With current HTML splitters, they rely on secondary use of the
`RecursiveCharacterSplitter` to further chunk the document into
manageable chunks. The issue with this is it fails to maintain important
structures such as tables, lists, etc within HTML.
This Implementation of a HTML splitter, allows the user to define a
maximum chunk size, HTML elements to preserve in full, options to
preserve `<a>` href links in the output and custom handlers.
The core splitting begins with headers, similar to `HTMLHeaderSplitter`.
If these sections exceed the length of the `max_chunk_size` further
recursive splitting is triggered. During this splitting, elements listed
to preserve, will be excluded from the splitting process. This can cause
chunks to be slightly larger then the max size, depending on preserved
length. However, all contextual relevance of the preserved item remains
intact.
**Custom Handlers**: Sometimes, companies such as Atlassian have custom
HTML elements, that are not parsed by default with `BeautifulSoup`.
Custom handlers allows a user to provide a function to be ran whenever a
specific html tag is encountered. This allows the user to preserve and
gather information within custom html tags that `bs4` will potentially
miss during extraction.
**Dependencies:** User will need to install `bs4` in their project to
utilise this class
I have also added in `how_to` and unit tests, which require `bs4` to
run, otherwise they will be skipped.
Flowchart of process:

---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
- Example: "community: add foobar LLM"
**Description:**
Adding VoyageAI's text_embedding to 'integrations/text_embedding/'
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
Issue: integrations related to a provider can be spread across several
packages and classes. It is very hard to find a provider using only
ToCs.
Fix: we have a very useful and helpful tool to search by provider name.
It is the `Search` field. So, I've added recommendations for using this
field. It seems obvious but it is not.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- "community: 1. add new parameter `default_headers` for oci model
deployments and oci chat model deployments. 2. updated k parameter in
OCIModelDeploymentLLM class."
- [x] **PR message**:
- **Description:** 1. add new parameters `default_headers` for oci model
deployments and oci chat model deployments. 2. updated k parameter in
OCIModelDeploymentLLM class.
- [x] **Add tests and docs**:
1. unit tests
2. notebook
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Added a link to make it easier to organize github
issues for langchain.
- **Issue:** After reading that there was a taxonomy of labels I had to
figure out how to find it.
- **Dependencies:** None
- **Twitter handle:** None
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
## Overview
This PR adds documentation for the `langchain-yt-dlp` package, a YouTube
document loader that uses `yt-dlp` for Youtube videos metadata
extraaction.
## Changes
- Added documentation notebook for YoutubeLoader
- Updated packages.yml to include langchain-yt-dlp
## Motivation
The existing LangChain YoutubeLoader was unable to fetch YouTube
metadata due to changes in YouTube's structure. This package resolves
those issues by leveraging the `yt-dlp` library.
## Features
- Reliable YouTube metadata extraction
## Related
- Package Repository: https://github.com/aqib0770/langchain-yt-dlp
- PyPI Package: https://pypi.org/project/langchain-yt-dlp/
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Hi, langchain team! I'm a maintainer of
[OceanBase](https://github.com/oceanbase/oceanbase).
With the integration guidance, I create a python lib named
[langchain-oceanbase](https://github.com/oceanbase/langchain-oceanbase)
to integrate `Oceanbase Vector Store` with `Langchain`.
So I'd like to add the required docs. I will appreciate your feedback.
Thank you!
---------
Signed-off-by: shanhaikang.shk <shanhaikang.shk@oceanbase.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
- [X] **PR title**:
community: Add new model and structured output support
- [X] **PR message**:
- **Description:** add support for meta llama 3.2 image handling, and
JSON mode for structured output
- **Issue:** NA
- **Dependencies:** NA
- **Twitter handle:** NA
- [x] **Add tests and docs**:
1. we have updated our unit tests,
2. no changes required for documentation.
- [x] **Lint and test**:
make format, make lint and make test we run successfully
---------
Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
**Description**: Some confluence instances don't support personal access
token, then cookie is a convenient way to authenticate. This PR adds
support for Confluence cookies.
**Twitter handle**: soulmachine
Issue: the current
[Cache](https://python.langchain.com/docs/integrations/llm_caching/)
page has an inconsistent heading.
Mixed terms are used; mixed casing; and mixed `selecting`. Excessively
long titles make right-side ToC hard to read and unnecessarily long.
Changes: consitent and more-readable ToC
**Description:** Added support for FalkorDB Vector Store, including its
implementation, unit tests, documentation, and an example notebook. The
FalkorDB integration allows users to efficiently manage and query
embeddings in a vector database, with relevance scoring and maximal
marginal relevance search. The following components were implemented:
- Core implementation for FalkorDBVector store.
- Unit tests ensuring proper functionality and edge case coverage.
- Example notebook demonstrating an end-to-end setup, search, and
retrieval using FalkorDB.
**Twitter handle:** @tariyekorogha
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Update to OpenLLM 0.6, which we decides to make use of OpenLLM's
OpenAI-compatible endpoint. Thus, OpenLLM will now just become a thin
wrapper around OpenAI wrapper.
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
---------
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: ccurme <chester.curme@gmail.com>
**Description:**
This PR addresses the `TypeError: sequence item 0: expected str
instance, FluentValue found` error when invoking `WikidataQueryRun`. The
root cause was an incompatible version of the
`wikibase-rest-api-client`, which caused the tool to fail when handling
`FluentValue` objects instead of strings.
The current implementation only supports `wikibase-rest-api-client<0.2`,
but the latest version is `0.2.1`, where the current implementation
breaks. Additionally, the error message advises users to install the
latest version: [code
reference](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/utilities/wikidata.py#L125C25-L125C32).
Therefore, this PR updates the tool to support the latest version of
`wikibase-rest-api-client`.
Key changes:
- Updated the handling of `FluentValue` objects to ensure compatibility
with the latest `wikibase-rest-api-client`.
- Removed the restriction to `wikibase-rest-api-client<0.2` and updated
to support the latest version (`0.2.1`).
**Issue:**
Fixes [#24093](https://github.com/langchain-ai/langchain/issues/24093) –
`TypeError: sequence item 0: expected str instance, FluentValue found`.
**Dependencies:**
- Upgraded `wikibase-rest-api-client` to the latest version to resolve
the issue.
---------
Co-authored-by: peiwen_zhang <peiwen_zhang@email.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Description: Add tool calling and structured output support for
SambaStudio chat models, docs included
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Fixes#26171:
- added some clarification text for the keyword argument
`breakpoint_threshold_amount`
- added min_chunk_size: together with `breakpoint_threshold_amount`, too
small/big chunk sizes can be avoided
Note: the langchain-experimental was moved to a separate repo, so only
the doc change stays here.
Thank you for contributing to LangChain!
- Added [full
text](https://learn.microsoft.com/en-us/azure/cosmos-db/gen-ai/full-text-search)
and [hybrid
search](https://learn.microsoft.com/en-us/azure/cosmos-db/gen-ai/hybrid-search)
support for Azure CosmosDB NoSql Vector Store
- Added a new enum called CosmosDBQueryType which supports the following
values:
- VECTOR = "vector"
- FULL_TEXT_SEARCH = "full_text_search"
- FULL_TEXT_RANK = "full_text_rank"
- HYBRID = "hybrid"
- User now needs to provide this query_type to the similarity_search
method for the vectorStore to make the correct query api call.
- Added a couple of work arounds as for the FULL_TEXT_RANK and HYBRID
query functions we don't support parameterized queries right now. I have
added TODO's in place, and will remove these work arounds by end of
January.
- Added necessary test cases and updated the
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Erick Friis <erickfriis@gmail.com>
Issue: Here is an ambiguity about W&B integrations. There are two
existing provider pages.
Fix: Added the "root" W&B provider page. Added there the references to
the documentation in the W&B site. Cleaned up formats in existing pages.
Added one more integration reference.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Added Langchain complete tutorial playlist from total technology zonne
channel .In this playlist every video is focusing one specific use case
and hands on demo.All tutorials are equally good for every levels .
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
~Note that this PR is now Draft, so I didn't add change to `aindex`
function and didn't add test codes for my change.
After we have an agreement on the direction, I will add commits.~
`batch_size` is very difficult to decide because setting a large number
like >10000 will impact VectorDB and RecordManager, while setting a
small number will delete records unnecessarily, leading to redundant
work, as the `IMPORTANT` section says.
On the other hand, we can't use `full` because the loader returns just a
subset of the dataset in our use case.
I guess many people are in the same situation as us.
So, as one of the possible solutions for it, I would like to introduce a
new argument, `scoped_full_cleanup`.
This argument will be valid only when `claneup` is Full. If True, Full
cleanup deletes all documents that haven't been updated AND that are
associated with source ids that were seen during indexing. Default is
False.
This change keeps backward compatibility.
---------
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Thank you for contributing to LangChain!
- [x] **PR title**: community: add TablestoreVectorStore
- [x] **PR message**:
- **Description:** add TablestoreVectorStore
- **Dependencies:** none
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration: yes
2. an example notebook showing its use: yes
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
This PR resolves the conflicting information in the Chat models
documentation regarding structured output support for ChatOllama.
- The Featured Providers table has been updated to reflect the correct
status.
- Structured output support for ChatOllama was introduced on Dec 6,
2024.
- A note has been added to ensure users update to the latest Ollama
version for structured outputs.
**Issue:** Fixes#28691
Description:
snowflake.py
Add _stream and _stream_content methods to enable streaming
functionality
fix pydantic issues and added functionality with the overall langchain
version upgrade
added bind_tools method for agentic workflows support through langgraph
updated the _generate method to account for agentic workflows support
through langgraph
cosmetic changes to comments and if conditions
snowflake.ipynb
Added _stream example
cosmetic changes to comments
fixed lint errors
check_pydantic.sh
Decreased counter from 126 to 125 as suggested when formatting
---------
Co-authored-by: Prathamesh Nimkar <prathamesh.nimkar@snowflake.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
- [ ] Main note
- **Description:** I added notes on the Jina and LocalAI pages telling
users that they must be using this integrations with openai sdk version
0.x, because if they dont they will get an error saying that "openai has
no attribute error". This PR was recommended by @efriis
- **Issue:** warns people about the issue in #28529
- **Dependencies:** None
- **Twitter handle:** JoaqCore
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
**Description:**
- **Memgraph** no longer relies on `Neo4jGraphStore` but **implements
`GraphStore`**, just like other graph databases.
- **Memgraph** no longer relies on `GraphQAChain`, but implements
`MemgraphQAChain`, just like other graph databases.
- The refresh schema procedure has been updated to try using `SHOW
SCHEMA INFO`. The fallback uses Cypher queries (a combination of schema
and Cypher) → **LangChain integration no longer relies on MAGE
library**.
- The **schema structure** has been reformatted. Regardless of the
procedures used to get schema, schema structure is the same.
- The `add_graph_documents()` method has been implemented. It transforms
`GraphDocument` into Cypher queries and creates a graph in Memgraph. It
implements the ability to use `baseEntityLabel` to improve speed
(`baseEntityLabel` has an index on the `id` property). It also
implements the ability to include sources by creating a `MENTIONS`
relationship to the source document.
- Jupyter Notebook for Memgraph has been updated.
- **Issue:** /
- **Dependencies:** /
- **Twitter handle:** supe_katarina (DX Engineer @ Memgraph)
Closes#25606
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
- Example: "community: add foobar LLM"
Docs on how to do hybrid search with weaviate is covered
[here](https://python.langchain.com/docs/integrations/vectorstores/weaviate/)
@efriis
---------
Co-authored-by: pookam90 <pookam@microsoft.com>
Co-authored-by: Pooja Kamath <60406274+Pookam90@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
**Description:** Adding Documentation for new SQL Server Vector Store
Package.
Changed files -
Added new Vector Store -
docs\docs\integrations\vectorstores\sqlserver.ipynb
FeatureTable.Js - docs\src\theme\FeatureTables.js
Microsoft.mdx - docs\docs\integrations\providers\microsoft.mdx
Detailed documentation on API -
https://python.langchain.com/api_reference/sqlserver/index.html
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
## Description
First of all, thanks for the great framework that is LangChain!
At [Linkup](https://www.linkup.so/) we're working on an API to connect
LLMs and agents to the internet and our partner sources. We'd be super
excited to see our API integrated in LangChain! This essentially
consists in adding a LangChain retriever and tool, which is done in our
own [package](https://pypi.org/project/langchain-linkup/). Here we're
simply following the [integration
documentation](https://python.langchain.com/docs/contributing/how_to/integrations/)
and update the documentation of LangChain to mention the Linkup
integration.
We do have tests (both units & integration) in our [source
code](https://github.com/LinkupPlatform/langchain-linkup), and tried to
follow as close as possible the [integration
documentation](https://python.langchain.com/docs/contributing/how_to/integrations/)
which specifically requests to focus on documentation changes for an
integration PR, so I'm not adding tests here, even though the PR
checklist seems to suggest so. Feel free to correct me if I got this
wrong!
By the way, we would be thrilled by being mentioned in the list of
providers which have standalone packages
[here](https://langchain-git-fork-linkupplatform-cj-doc-langchain.vercel.app/docs/integrations/providers/),
is there something in particular for us to do for that? 🙂
## Twitter handle
Linkup_platform
<!--
## PR Checklist
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
- Example: "community: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
--!>
Thank you for contributing to LangChain!
- [ ] **PR message**:
- **Description:**: We have launched the new **Solar Pro** model, and
the documentation has been updated to include its details and features.
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
Hi Langchain team!
I'm the co-founder and mantainer at
[ScrapeGraphAI](https://scrapegraphai.com/).
By following the integration
[guide](https://python.langchain.com/docs/contributing/how_to/integrations/publish/)
on your site, I have created a new lib called
[langchain-scrapegraph](https://github.com/ScrapeGraphAI/langchain-scrapegraph).
With this PR I would like to integrate Scrapegraph as provider in
Langchain, adding the required documentation files.
Let me know if there are some changes to be made to be properly
integrated both in the lib and in the documentation.
Thank you 🕷️🦜
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
**Issue:** Added support for creating indexes in the SAP HANA Vector
engine.
**Changes**:
1. Introduced a new function `create_hnsw_index` in `hanavector.py` that
enables the creation of indexes for SAP HANA Vector.
2. Added integration tests for the index creation function to ensure
functionality.
3. Updated the documentation to reflect the new index creation feature,
including examples and output from the notebook.
4. Fix the operator issue in ` _process_filter_object` function and
change the array argument to a placeholder in the similarity search SQL
statement.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>