This PR adds documentation for the langchain-taiga Tool integration,
including an example notebook at
'docs/docs/integrations/tools/taiga.ipynb' and updates to
'libs/packages.yml' to track the new package.
Issue:
N/A
Dependencies:
None
Twitter handle:
N/A
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
PR Title:
langchain: add attachments support in OpenAIAssistantRunnable
PR Description:
This PR fixes an issue with the "retrieval" tool (internally named
"file_search") in the OpenAI Assistant by adding support for the
"attachments" parameter in the invoke method. This change allows files
to be linked to messages when they are inserted into threads, which is
essential for utilizing OpenAI's Retrieval Augmented Generation (RAG)
feature.
Issue:
N/A
Dependencies:
None
Twitter handle:
N/A
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Document refinement: optimize milvus server description. The description
of "milvus standalone", and "milvus server" is confusing, so I clarify
it with a detailed description.
Signed-off-by: ChengZi <chen.zhang@zilliz.com>
- **Description:** Fix typo in code samples for max_tokens_for_prompt.
Code blocks had singular "token" but the method has plural "tokens".
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** N/A
**Description:**
5 fix of example from function with_alisteners() in
libs/core/langchain_core/runnables/base.py
Replace incoherent example output with workable example's output.
1. SyntaxError: unterminated string literal
print(f"on start callback starts at {format_t(time.time())}
correct as
print(f"on start callback starts at {format_t(time.time())}")
2. SyntaxError: unterminated string literal
print(f"on end callback starts at {format_t(time.time())}
correct as
print(f"on end callback starts at {format_t(time.time())}")
3. NameError: name 'Runnable' is not defined
Fix as
from langchain_core.runnables import Runnable
4. NameError: name 'asyncio' is not defined
Fix as
import asyncio
5. NameError: name 'format_t' is not defined.
Implement format_t() as
from datetime import datetime, timezone
def format_t(timestamp: float) -> str:
return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()
Thank you for contributing to LangChain!
- [x] **PR title**: "docs: added proper width to sidebar content"
- [x] **PR message**: added proper width to sidebar content
- **Description:** While accessing the [LangChain Python API
Reference](https://python.langchain.com/api_reference/index.html) the
sidebar content does not display correctly.
- **Issue:** Follow-up to #30061
- **Dependencies:** None
- **Twitter handle:** https://x.com/implicitdefcnc
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
add batch_size to fix oom when embed large amount texts
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
Structured output will currently always raise a BadRequestError when
Claude 3.7 Sonnet's `thinking` is enabled, because we rely on forced
tool use for structured output and this feature is not supported when
`thinking` is enabled.
Here we:
- Emit a warning if `with_structured_output` is called when `thinking`
is enabled.
- Raise `OutputParserException` if no tool calls are generated.
This is arguably preferable to raising an error in all cases.
```python
from langchain_anthropic import ChatAnthropic
from pydantic import BaseModel
class Person(BaseModel):
name: str
age: int
llm = ChatAnthropic(
model="claude-3-7-sonnet-latest",
max_tokens=5_000,
thinking={"type": "enabled", "budget_tokens": 2_000},
)
structured_llm = llm.with_structured_output(Person) # <-- this generates a warning
```
```python
structured_llm.invoke("Alice is 30.") # <-- works
```
```python
structured_llm.invoke("Hello!") # <-- raises OutputParserException
```
Took a "census" of models supported by init_chat_model-- of those that
return model names in response metadata, these were the only two that
had it keyed under `"model"` instead of `"model_name"`.
- [ ] **PR title**: [langchain_community.llms.xinference]: Add
asynchronous generate interface
- [ ] **PR message**: The asynchronous generate interface support stream
data and non-stream data.
chain = prompt | llm
async for chunk in chain.astream(input=user_input):
yield chunk
- [ ] **Add tests and docs**:
from langchain_community.llms import Xinference
from langchain.prompts import PromptTemplate
llm = Xinference(
server_url="http://0.0.0.0:9997", # replace your xinference server url
model_uid={model_uid} # replace model_uid with the model UID return from
launching the model
stream = True
)
prompt = PromptTemplate(input=['country'], template="Q: where can we
visit in the capital of {country}? A:")
chain = prompt | llm
async for chunk in chain.astream(input=user_input):
yield chunk
Thank you for contributing to LangChain!
- [ ] **PR title**: "docs: add xAI to ChatModelTabs"
- [ ] **PR message**:
- **Description:** Added `ChatXAI` to `ChatModelTabs` dropdown to
improve visibility of xAI chat models (e.g., "grok-2", "grok-3").
- **Issue:** Follow-up to #30010
- **Dependencies:** none
- **Twitter handle:** @tiestvangool
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Thank you for contributing to LangChain!
- **Implementing the MMR algorithm for OLAP vector storage**:
- Support Apache Doris and StarRocks OLAP database.
- Example: "vectorstore.as_retriever(search_type="mmr",
search_kwargs={"k": 10})"
- **Implementing the MMR algorithm for OLAP vector storage**:
- **Apache Doris
- **StarRocks
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- **Add tests and docs**:
- Example: "vectorstore.as_retriever(search_type="mmr",
search_kwargs={"k": 10})"
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: fakzhao <fakzhao@cisco.com>
This pull request includes a change to the `TavilySearchResults` class
in the `tool.py` file, which updates the code block format in the
documentation.
Documentation update:
*
[`libs/community/langchain_community/tools/tavily_search/tool.py`](diffhunk://#diff-e3b6a980979268b639c6a86e9b182756b0f7c7e9e5605e613bc0a72ea6aa5301L54-R59):
Changed the code block format from Python to JSON in the example
provided in the docstring.Thank you for contributing to LangChain!
## **Description:**
When using the Tavily retriever with include_raw_content=True, the
retriever occasionally fails with a Pydantic ValidationError because
raw_content can be None.
The Document model in langchain_core/documents/base.py requires
page_content to be a non-None value, but the Tavily API sometimes
returns None for raw_content.
This PR fixes the issue by ensuring that even when raw_content is None,
an empty string is used instead:
```python
page_content=result.get("content", "")
if not self.include_raw_content
else (result.get("raw_content") or ""),
This pull request includes updates to the
`libs/community/langchain_community/callbacks/bedrock_anthropic_callback.py`
file to add a new model version to the list of supported models.
Updates to supported models:
* Added support for the `anthropic.claude-3-7-sonnet-20250219-v1:0`
model with a rate of `0.003` for 1000 input tokens.
* Added support for the `anthropic.claude-3-7-sonnet-20250219-v1:0`
model with a rate of `0.015` for 1000 output tokens.
AWS Bedrock pricing reference : https://aws.amazon.com/bedrock/pricing
- [x] **PR title**:
- [x] **PR message**:
- Added a new section for how to set up and use Milvus with Docker, and
added an example of how to instantiate Milvus for hybrid retrieval
- Fixed the documentation setup to run `make lint` and `make format`
- [x] **Add tests and docs**: If you're adding a new integration, please
include
N/A
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Mark Perfect <mark.anthony.perfect1@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
## PyMuPDF4LLM integration to LangChain for PDF content extraction in
Markdown format
### Description
[PyMuPDF4LLM](https://github.com/pymupdf/RAG) makes it easier to extract
PDF content in Markdown format, needed for LLM & RAG applications.
(License: GNU Affero General Public License v3.0)
[langchain-pymupdf4llm](https://github.com/lakinduboteju/langchain-pymupdf4llm)
integrates PyMuPDF4LLM to LangChain as a Document Loader.
(License: MIT License)
This pull request introduces the integration of
[PyMuPDF4LLM](https://pymupdf.readthedocs.io/en/latest/pymupdf4llm) into
the LangChain project as an integration package:
[`langchain-pymupdf4llm`](https://github.com/lakinduboteju/langchain-pymupdf4llm).
The most important changes include adding new Jupyter notebooks to
document the integration and updating the package configuration file to
include the new package.
### Documentation:
* `docs/docs/integrations/providers/pymupdf4llm.ipynb`: Added a new
Jupyter notebook to document the integration of `PyMuPDF4LLM` with
LangChain, including installation instructions and class imports.
* `docs/docs/integrations/document_loaders/pymupdf4llm.ipynb`: Added a
new Jupyter notebook to document the usage of `langchain-pymupdf4llm` as
a LangChain integration package in detail.
### Package registration:
* `libs/packages.yml`: Updated the package configuration file to include
the `langchain-pymupdf4llm` package.
### Additional information
* Related to: https://github.com/langchain-ai/langchain/pull/29848
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
- **Description:** Same changes as #26593 but for FileCallbackHandler
- **Issue:** Fixes#29941
- **Dependencies:** None
- **Twitter handle:** None
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
See https://docs.astral.sh/ruff/rules/#flake8-type-checking-tc
Some fixes done for TC001,TC002 and TC003 but these rules are excluded
since they don't play well with Pydantic.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
**Issue**: This trigger can only be used by the first table created.
Cannot create additional triggers for other tables.
**fixed**: Update the trigger name so that it can be used for new
tables.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
**Description:**
Tavily search results returned from API include useful information like
title, score and (optionally) raw_content that is missed in wrapper
although it's documented there properly. Add this data to the result
structure.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>