730 Commits

Author SHA1 Message Date
Satyam Kumar
90f7713399
refactor: improve docstring parsing logic for Google style (#28730)

Description:  
Improved the `_parse_google_docstring` function in `langchain/core` to
support parsing multi-paragraph descriptions before the `Args:` section
while maintaining compliance with Google-style docstring guidelines.
This change ensures better handling of docstrings with detailed function
descriptions.
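
To illustrate the behavior described above, here is a minimal sketch of splitting a Google-style docstring into a multi-paragraph description and its `Args:` entries. The helper name `split_google_docstring` is hypothetical and this is not the actual `_parse_google_docstring` code; it assumes `Args:` sits on its own line and that continuation lines are indented deeper than the argument names.

```python
import inspect


def split_google_docstring(docstring: str) -> tuple[str, dict[str, str]]:
    """Return (description, {arg_name: arg_description}).

    Hypothetical helper for illustration only.
    """
    lines = inspect.cleandoc(docstring).splitlines()
    # Everything before the first "Args:" line is the description; it may
    # contain several paragraphs separated by blank lines.
    args_index = next(
        (i for i, line in enumerate(lines) if line.strip() == "Args:"), None
    )
    if args_index is None:
        return "\n".join(lines).strip(), {}
    description = "\n".join(lines[:args_index]).strip()

    args: dict[str, str] = {}
    current = None
    arg_indent = None
    for line in lines[args_index + 1:]:
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        if indent == 0:
            break  # next section header, e.g. "Returns:"
        if arg_indent is None:
            arg_indent = indent
        if indent == arg_indent and ":" in line:
            name, _, desc = line.strip().partition(":")
            # Drop an optional type annotation such as "(float)".
            current = name.split("(")[0].strip()
            args[current] = desc.strip()
        elif current is not None:
            # Deeper-indented line: continuation of the previous argument.
            args[current] += " " + line.strip()
    return description, args
```

Blank lines inside the description are preserved, so paragraph breaks survive the split.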

Issue:  
Fixes #28628

Dependencies:  
None.

Twitter handle:  
@isatyamks

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-12-18 09:35:19 -05:00
Bagatur
ac278cbe8b
core[patch]: export InjectedToolCallId (#28772) 2024-12-17 19:29:20 +00:00
Harrison Chase
de7996c2ca
core: add kwargs support to VectorStore (#25934)
The kwargs passthrough has been missing until now.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-12-16 18:57:57 +00:00
Erick Friis
1c120e9615
core: xml output parser tags docstring (#28745) 2024-12-16 18:25:16 +00:00
Keiichi Hirobe
67fd554512
core[patch]: throw exception indexing code if deletion fails in vectorstore (#28103)
The delete methods in the VectorStore and DocumentIndex interfaces
return a status indicating the result. Therefore, we can assume that
their implementations don't throw exceptions but instead return a result
indicating whether the delete operations have failed. The current
implementation doesn't check the returned value, so I modified it to
throw an exception when the operation fails.
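
As a rough illustration of the check described above, the sketch below wraps a delete call and raises when the store reports failure. `DeletionError` and `delete_or_raise` are hypothetical names, not the ones used in the indexing code, and the sketch assumes a `delete` method that returns `True`, `False`, or `None` (not implemented).

```python
class DeletionError(RuntimeError):
    """Hypothetical error raised when a store reports a failed delete."""


def delete_or_raise(store, ids: list[str]) -> None:
    """Call store.delete(ids) and raise if the store reports failure."""
    result = store.delete(ids)
    # Only an explicit False is treated as a failure here; None is taken
    # to mean "no status reported" and is allowed through.
    if result is False:
        raise DeletionError(f"Failed to delete {len(ids)} document(s).")
```

Treating `None` as success is a design choice of this sketch, made so that stores which do not report a status keep working.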

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-12-13 16:14:27 -05:00
Keiichi Hirobe
258b3be5ec
core[minor]: add new clean up strategy "scoped_full" to indexing (#28505)
~Note that this PR is now Draft, so I didn't add change to `aindex`
function and didn't add test codes for my change.
After we have an agreement on the direction, I will add commits.~

`batch_size` is very difficult to decide because setting a large number
like >10000 will impact VectorDB and RecordManager, while setting a
small number will delete records unnecessarily, leading to redundant
work, as the `IMPORTANT` section says.
On the other hand, we can't use `full` because the loader returns just a
subset of the dataset in our use case.

I guess many people are in the same situation as us.

So, as one of the possible solutions for it, I would like to introduce a
new argument, `scoped_full_cleanup`.
This argument will be valid only when `cleanup` is Full. If True, Full
cleanup deletes all documents that haven't been updated AND that are
associated with source ids that were seen during indexing. Default is
False.

This change keeps backward compatibility.
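
The selection rule described above can be sketched as plain set logic. This is an illustration under assumptions, not the indexing implementation: `select_stale_keys` and its parameters are hypothetical, and records are modeled as a mapping from document key to source id.

```python
def select_stale_keys(
    records: dict[str, str],
    seen_source_ids: set[str],
    updated_keys: set[str],
) -> list[str]:
    """Return keys to delete under a scoped full cleanup.

    A key is stale when it was NOT updated in this run AND its source id
    WAS seen during this run. Keys from unseen sources are left alone.
    """
    return sorted(
        key
        for key, source_id in records.items()
        if source_id in seen_source_ids and key not in updated_keys
    )
```

With a loader that returns only a subset of the dataset, documents belonging to sources outside that subset are never candidates for deletion.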

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-12-13 20:35:25 +00:00
Eugene Yurtsev
ce90b25313
core[patch]: Update error message in indexing code for unreachable code assertion (#28712)
Minor update for error message that should never be triggered
2024-12-13 20:21:14 +00:00
Keiichi Hirobe
da28cf1f54
core[patch]: Reverts PR #25754 and add unit tests (#28702)
I reported the bug 2 weeks ago here:
https://github.com/langchain-ai/langchain/issues/28447

I believe this is a critical bug for the indexer, so I submitted a PR to
revert the change and added unit tests to prevent similar bugs from
being introduced in the future.

@eyurtsev Could you check this?
2024-12-13 15:13:06 -05:00
Bagatur
e6a62d8422
core,langchain,community[patch]: allow langsmith 0.2 (#28598) 2024-12-10 18:50:58 +00:00
Bagatur
e24f86e55f
core[patch]: return ToolMessage from tool (#28605) 2024-12-10 09:59:38 +00:00
Erick Friis
ef2f875dfb
core: deprecate PipelinePromptTemplate (#28644) 2024-12-10 03:56:48 +00:00
Filip Ratajczak
4e743b5427
Core: google docstring parsing fix (#28404)
- [x] **PR message**:
- **Description:** Added a solution for invalid parsing of Google-style
docstrings such as:

      Args:
          net_annual_income (float): The user's net annual income (in current year dollars).

- **Issue:** Previous code would return arg = "net_annual_income
(float)", which would cause an exception in
`_validate_docstring_args_against_annotations`.
- **Dependencies:** None

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-12-10 00:27:25 +00:00
Mohammad Mohtashim
ec9b41431e
[Core]: Small Docstring Clarification for BaseTool (#28148)
- **Description:** `kwargs` are not being passed to `run` of the
`BaseTool` which has been fixed
- **Issue:** #28114

---------

Co-authored-by: Stevan Kapicic <kapicic.ste1@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-12-09 06:10:19 +00:00
Fahim Zaman
481c4bfaba
core[patch]: Fixed trim functions, and added corresponding unit test for the solved issue (#28429)
- **Description:** 
- Trim functions were incorrectly deleting nodes with more than 1
outgoing/incoming edge, so an extra condition was added to check for
this directly. A unit test "test_trim_multi_edge" was written to test
this test case specifically.
- **Issue:** 
  - Fixes #28411 
  - Fixes https://github.com/langchain-ai/langgraph/issues/1676
- **Dependencies:** 
  - No changes were made to the dependencies

- [x] Unit tests were added to verify the changes.
- [x] Updated documentation where necessary.
- [x] Ran make format, make lint, and make test to ensure compliance
with project standards.

---------

Co-authored-by: Tasif Hussain <tasif006@gmail.com>
2024-12-08 20:45:28 -08:00
Erick Friis
18386c16c7
core, tests: more tolerant _aget_relevant_documents function (#28462) 2024-12-06 00:49:30 +00:00
Erick Friis
478def8dcc
core: deprecation doc removal (#28553)
![ScreenShot 2024-12-05 at 02 33 43PM@2x](https://github.com/user-attachments/assets/e1ce495b-90ca-41c7-9a65-b403a934675c)
2024-12-05 15:35:28 -08:00
William FH
ecee41ab72
fix: Handle response metadata in merge_messages_runs (#28453) 2024-12-02 13:56:23 -08:00
Eugene Yurtsev
a813d11c14
core[patch]: Compat pydantic 2.10 (#28308)
pydantic 2.10 compat for langchain-core
2024-11-22 21:44:55 -05:00
ccurme
a433039a56
core[patch]: support final AIMessage responses in tool_example_to_messages (#28267)
We have a test
[test_structured_few_shot_examples](ad4333ca03/libs/standard-tests/langchain_tests/integration_tests/chat_models.py (L546))
in standard integration tests that implements a version of tool-calling
few shot examples that works with ~all tested providers. The formulation
supported by ~all providers is: `human message, tool call, tool message,
AI response`.

Here we update
`langchain_core.utils.function_calling.tool_example_to_messages` to
support this formulation.

The `tool_example_to_messages` util is undocumented outside of our API
reference. IMO, if we are testing that this function works across all
providers, it can be helpful to feature it in our guides. The structured
few-shot examples we document at the moment require users to implement
this function and can be simplified.
2024-11-22 15:38:49 +00:00
Erick Friis
b3ee1f8713
core: add space at end of error message link (#28270) 2024-11-21 22:19:59 +00:00
Eugene Yurtsev
5599a0a537
core[minor]: Add other langgraph packages to sys_info (#28190)
Add other langgraph packages to sys_info output
2024-11-19 09:20:25 -05:00
Vadym Barda
ed4952e475
core[patch]: add caching to get_function_nonlocals (#28131) 2024-11-15 07:53:53 -08:00
Eric Pinzur
eadc2f6a90
core: added DeleteResponse to the module (#28069)
Description:
* added `DeleteResponse` to the `langchain_core.indexing` module, for
implementing DocumentIndex classes.
2024-11-13 11:08:08 -05:00
ZhangShenao
c89e7ce8b5
core[patch]: Update doc-strings in callbacks (#28073)
- Fix api docs
2024-11-13 11:07:15 -05:00
Vadym Barda
48ee322a78
partners: add xAI chat integration (#28032) 2024-11-12 15:11:29 -05:00
ccurme
1538ee17f9
anthropic[major]: support python 3.13 (#27916)
Last week Anthropic released version 0.39.0 of its python sdk, which
enabled support for Python 3.13. This release deleted a legacy
`client.count_tokens` method, which we currently access during init of
the `Anthropic` LLM. Anthropic has replaced this functionality with the
[client.beta.messages.count_tokens()
API](https://github.com/anthropics/anthropic-sdk-python/pull/726).

To enable support for `anthropic >= 0.39.0` and Python 3.13, here we
drop support for the legacy token counting method, and add support for
the new method via `ChatAnthropic.get_num_tokens_from_messages`.

To fully support the token counting API, we update the signature of
`get_num_tokens_from_messages` to accept tools everywhere.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-11-12 14:31:07 -05:00
takahashi
482c168b3e
langchain_core: add file_type option to make file type default as png (#27855)
- [ ] **description**
langchain_core.runnables.graph_mermaid.draw_mermaid_png calls this
function, but the Mermaid API returns JPEG by default. To be consistent,
add the option `file_type` with the default `png` type.

- [ ] **Add tests and docs**: With this small change, I didn't add tests
and docs.

- [ ] **Lint and test**: One long sentence was divided into two.

2024-11-06 22:37:07 +00:00
Bagatur
60123bef67
docs: fix trim_messages docstring (#27948) 2024-11-06 22:25:13 +00:00
Bagatur
67ce05a0a7
core[patch]: make oai tool description optional (#27756) 2024-11-06 18:06:47 +00:00
Jun Yamog
830cad7bc0
core: fix CommaSeparatedListOutputParser to handle columns that may contain commas in it (#26365)
- **Description:**
Currently CommaSeparatedListOutputParser can't handle strings that may
contain commas within a column. It would parse any commas as the
delimiter.
Ex. 
"foo, foo2", "bar", "baz"

It will create 4 columns: "foo", "foo2", "bar", "baz"

This should be 3 columns:

"foo, foo2", "bar", "baz"

- **Dependencies:**
Added 2 additional imports, but they are built in python packages.

import csv
from io import StringIO

- **Twitter handle:** @jkyamog

- [ ] **Add tests and docs**: 
1. added simple unit test test_multiple_items_with_comma
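
The approach described above can be sketched with the standard-library `csv` module, which the PR says it imports. The function name `split_items` is hypothetical and this is not the parser's actual code; `skipinitialspace=True` is an assumption made here so that a quote following `, ` is still recognized as the start of a quoted item.

```python
import csv
from io import StringIO


def split_items(text: str) -> list[str]:
    """Split a comma-separated string, honoring double-quoted items."""
    reader = csv.reader(
        StringIO(text),
        quotechar='"',
        delimiter=",",
        skipinitialspace=True,  # ignore the space after each comma
    )
    # Flatten all rows so multi-line input still yields a single list.
    return [item for row in reader for item in row]


print(split_items('"foo, foo2", "bar", "baz"'))
# ['foo, foo2', 'bar', 'baz']
```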

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-11-01 22:42:24 +00:00
William FH
b4cb2089a2
langchain[patch]: Add warning in react agent (#26980) 2024-10-31 22:29:34 +00:00
Ant White
e3ea365725
core: use friendlier names for duplicated nodes in mermaid output (#27747)
- **Description:** When generating the Mermaid visualization of a chain,
if the chain had multiple nodes of the same type, the reid function
would replace their names with the UUID node_id. This made the generated
graph difficult to understand. This change deduplicates the nodes in a
chain by appending an index to their names.
- **Issue:** None
- **Discussion:**
https://github.com/langchain-ai/langchain/discussions/27714
- **Dependencies:** None

- [ ] **Add tests and docs**:  
- Currently this functionality is not covered by unit tests, happy to
add tests if you'd like
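
A minimal sketch of the deduplication idea described above: names that occur more than once get a 1-based index appended, while unique names are left alone. `dedupe_names` is a hypothetical helper, not the code in the graph module.

```python
from collections import Counter


def dedupe_names(names: list[str]) -> list[str]:
    """Append _1, _2, ... to names that appear more than once."""
    totals = Counter(names)  # how often each name occurs overall
    seen: Counter = Counter()  # how many of each name were emitted so far
    result = []
    for name in names:
        if totals[name] > 1:
            seen[name] += 1
            result.append(f"{name}_{seen[name]}")
        else:
            result.append(name)
    return result


print(dedupe_names(["fake_llm", "fake_llm", "Lambda", "Passthrough"]))
# ['fake_llm_1', 'fake_llm_2', 'Lambda', 'Passthrough']
```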


# Example Code:
```python
from langchain_core.runnables import RunnablePassthrough

def fake_llm(prompt: str) -> str: # Fake LLM for the example
    return "completion"

runnable = {
    'llm1':  fake_llm,
    'llm2':  fake_llm,
} | RunnablePassthrough.assign(
    total_chars=lambda inputs: len(inputs['llm1'] + inputs['llm2'])
)

print(runnable.get_graph().draw_mermaid(with_styles=False))
```

# Before
```mermaid
graph TD;
	Parallel_llm1_llm2_Input --> 0b01139db5ed4587ad37964e3a40c0ec;
	0b01139db5ed4587ad37964e3a40c0ec --> Parallel_llm1_llm2_Output;
	Parallel_llm1_llm2_Input --> a98d4b56bd294156a651230b9293347f;
	a98d4b56bd294156a651230b9293347f --> Parallel_llm1_llm2_Output;
	Parallel_total_chars_Input --> Lambda;
	Lambda --> Parallel_total_chars_Output;
	Parallel_total_chars_Input --> Passthrough;
	Passthrough --> Parallel_total_chars_Output;
	Parallel_llm1_llm2_Output --> Parallel_total_chars_Input;
```

# After
```mermaid
graph TD;
	Parallel_llm1_llm2_Input --> fake_llm_1;
	fake_llm_1 --> Parallel_llm1_llm2_Output;
	Parallel_llm1_llm2_Input --> fake_llm_2;
	fake_llm_2 --> Parallel_llm1_llm2_Output;
	Parallel_total_chars_Input --> Lambda;
	Lambda --> Parallel_total_chars_Output;
	Parallel_total_chars_Input --> Passthrough;
	Passthrough --> Parallel_total_chars_Output;
	Parallel_llm1_llm2_Output --> Parallel_total_chars_Input;
```
2024-10-31 16:52:00 -04:00
Bagatur
e4e2aa0b78
core[patch]: update image util err msg (#27803) 2024-10-31 10:56:43 -07:00
Bagatur
c1e742347f
core[patch]: rm image loading (#27797) 2024-10-31 10:34:51 -07:00
ZhangShenao
ad0387ac97
Improvement [docs] Improve api docs (#27787)
- Add missing param
- Remove unused param

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-10-31 16:56:44 +00:00
Bagatur
5d337326b0
core[patch]: make get_all_basemodel_annotations public (#27761) 2024-10-30 14:43:29 -07:00
Bagatur
94ea950c6c
core[patch]: support bedrock converse -> openai tool (#27754) 2024-10-30 12:20:39 -07:00
William FH
5a2cfb49e0
Support message trimming on single messages (#27729)
Permit trimming message lists of length 1
2024-10-30 04:27:52 +00:00
Harsimran-19
c1d8c33df6
core: JsonOutputParser UTF characters bug (#27306)
**Description:**
This PR fixes an issue where non-ASCII characters in Pydantic field
descriptions were being escaped to their Unicode representations when
using `JsonOutputParser`. The change allows non-ASCII characters to be
preserved in the output, which is especially important for multilingual
support and when working with non-English languages.

**Issue:** Fixes #27256

**Example Code:**
```python
from pydantic import BaseModel, Field
from langchain_core.output_parsers import JsonOutputParser

class Article(BaseModel):
    title: str = Field(description="科学文章的标题")

output_data_structure = Article
parser = JsonOutputParser(pydantic_object=output_data_structure)
print(parser.get_format_instructions())
```
**Previous Output**:
```... "title": {"description": "\\u79d1\\u5b66\\u6587\\u7ae0\\u7684\\u6807\\u9898", "title": "Title", "type": "string"}} ...```

**Current Output**:
```... "title": {"description": "科学文章的标题", "title": "Title", "type": "string"}} ...```

**Changes made**:
- Modified `json.dumps()` call in
`langchain_core/output_parsers/json.py` to use `ensure_ascii=False`
- Added a unit test to verify Unicode handling
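
The effect of `ensure_ascii=False` can be seen with the standard library alone. The `schema` dictionary below is a simplified stand-in chosen for illustration, not the exact structure the parser serializes.

```python
import json

# Simplified stand-in for a field schema with a non-ASCII description.
schema = {"title": {"description": "科学文章的标题", "type": "string"}}

escaped = json.dumps(schema)  # default: non-ASCII becomes \uXXXX escapes
preserved = json.dumps(schema, ensure_ascii=False)  # characters kept as-is

print(escaped)
print(preserved)
```

Both strings decode to the same object; only the textual representation differs.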

Co-authored-by: Harsimran-19 <harsimran1869@gmail.com>
2024-10-29 14:48:53 +00:00
Neil Vachharajani
eec35672a4
core[patch]: Improve type checking for the tool decorator (#27460)
**Description:**

When annotating a function with the @tool decorator, the symbol should
have type BaseTool. The previous type annotations did not convey that to
type checkers. This patch creates 4 overloads for the tool function for
the 4 different use cases.

1. @tool decorator with no arguments
2. @tool decorator with only keyword arguments
3. @tool decorator with a name argument (and possibly keyword arguments)
4. Invoking tool as function with a name and runnable positional
arguments

The main function is updated to match the overloads. The changes are
100% backwards compatible (all existing calls should continue to work,
just with better type annotations).
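
The four use cases listed above can be sketched with `typing.overload`. `MiniTool` and `mini_tool` are hypothetical stand-ins for `BaseTool` and `tool`; the real signatures in langchain-core are more detailed, and positional-only parameters are an assumption made here to keep the overloads unambiguous.

```python
from typing import Callable, Optional, overload

Func = Callable[..., object]


class MiniTool:
    """Hypothetical stand-in for BaseTool."""

    def __init__(self, name: str, func: Func) -> None:
        self.name = name
        self.func = func


@overload
def mini_tool(func: Func, /) -> MiniTool: ...  # 1. bare @mini_tool
@overload
def mini_tool(*, name: Optional[str] = None) -> Callable[[Func], MiniTool]: ...  # 2. keywords only
@overload
def mini_tool(name: str, /) -> Callable[[Func], MiniTool]: ...  # 3. @mini_tool("name")
@overload
def mini_tool(name: str, func: Func, /) -> MiniTool: ...  # 4. direct call


def mini_tool(name_or_func=None, func=None, /, *, name=None):
    if callable(name_or_func):
        # Case 1: the decorated function is passed in directly.
        return MiniTool(name or name_or_func.__name__, name_or_func)
    resolved = name_or_func or name
    if func is not None:
        # Case 4: called as a plain function with a name and a callable.
        return MiniTool(resolved, func)

    def decorator(f: Func) -> MiniTool:
        # Cases 2 and 3: return a decorator that builds the tool.
        return MiniTool(resolved or f.__name__, f)

    return decorator
```

With the overloads in place, a type checker can infer `MiniTool` for the decorated symbol in each case, while runtime behavior stays in the single implementation.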

**Twitter handle:** @nvachhar

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-29 13:59:56 +00:00
Vincent Min
7bc4e320f1
core[patch]: improve performance of InMemoryVectorStore (#27538)
**Description:** We improve the performance of the InMemoryVectorStore.
**Issue:** Originally, similarity was computed document by document:
```
for doc in self.store.values():
    vector = doc["vector"]
    similarity = float(cosine_similarity([embedding], [vector]).item(0))
```
This is inefficient and does not make use of numpy vectorization.
This PR computes the similarity in one vectorized go:
```
docs = list(self.store.values())
similarity = cosine_similarity([embedding], [doc["vector"] for doc in docs])
```
**Dependencies:** None
**Twitter handle:** @b12_consulting, @Vincent_Min

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-25 17:07:04 -04:00
Erick Friis
600b7bdd61
all: test 3.13 ci (#27197)
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-10-25 12:56:58 -07:00
Erick Friis
265e0a164a
core: add flake8-bandit (S) ruff rules to core (#27368)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-24 22:33:41 +00:00
Tibor Reiss
20b56a0233
core[patch]: fix repr and str for Serializable (#26786)
Fixes #26499

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-10-24 08:36:35 -07:00
Bagatur
968dccee04
core[patch]: convert_to_openai_tool Anthropic support (#27591) 2024-10-23 12:27:06 -07:00
Chun Kang Lu
380449a7a9
core: fix Image prompt template hardcoded template format (#27495)
Fixes #27411 

**Description:** Adds `template_format` to the `ImagePromptTemplate`
class and updates passing in the `template_format` parameter from
ChatPromptTemplate instead of the hardcoded "f-string".
Also updated docs and typing related to `template_format` to be more
up-to-date and specific.

**Dependencies:** None

**Add tests and docs**: Added unit tests to validate fix. Needed to
update `test_chat` snapshot due to adding new attribute
`template_format` in `ImagePromptTemplate`.

---------

Co-authored-by: Vadym Barda <vadym@langchain.dev>
2024-10-21 17:31:40 -04:00
Erick Friis
0ebddabf7d
docs, core: error messaging [wip] (#27397) 2024-10-17 03:39:36 +00:00
Bagatur
a4392b070d
core[patch]: add convert_to_openai_messages util (#27263)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-16 17:10:10 +00:00
Eugene Yurtsev
5b9b8fe80f
core[patch]: Ignore ASYNC110 to upgrade to newest ruff version (#27229)
Ignoring ASYNC110 with explanation
2024-10-09 11:25:58 -04:00
Bagatur
e3e9ee8398
core[patch]: utils for adding/subtracting usage metadata (#27203) 2024-10-08 13:15:33 -07:00