Compare commits

691 Commits

Author SHA1 Message Date
William Fu-Hinthorn
79fa84a7ff Merge branch 'master' into wfh/add_tool_param_descripts 2024-06-20 13:53:21 -07:00
William Fu-Hinthorn
5a2f2bd615 Merge 2024-06-20 13:52:28 -07:00
Bagatur
07990b437d docs: fix chat model methods table (#23233)
rst table not md
![Screenshot 2024-06-20 at 12 37 46
PM](https://github.com/langchain-ai/langchain/assets/22008038/7a03b869-c1f4-45d0-8d27-3e16f4c6eb19)
2024-06-20 13:52:28 -07:00
Zheng Robert Jia
55003c5ada docs[minor],community[patch]: Minor tutorial docs improvement, minor import error quick fix. (#22725)
minor changes to module import error handling and minor issues in
tutorial documents.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-06-20 13:52:28 -07:00
Eugene Yurtsev
2ed41433da core[patch]: Fix doc-strings for code blocks (#23232)
Code blocks need extra space around them to be rendered properly by
sphinx
2024-06-20 13:52:28 -07:00
Luis Moros
3a44ec62e4 community[patch]: Fix sql_database.from_databricks issue when run from Job (#23224)
**Description**: When ``sql_database.from_databricks`` is executed
from a Workflow Job, the ``context`` object does not have a
"browserHostName" property, resulting in an error. This change handles
the error so that the "DATABRICKS_HOST" env variable value is used instead of
stopping the flow.

Co-authored-by: lmorosdb <lmorosdb>
2024-06-20 13:52:28 -07:00
Cory Waddingham
1bb15e3ce4 pinecone[patch]: Update Poetry requirements for pinecone-client >=3.2.2 (#22094)
This change updates the requirements in
`libs/partners/pinecone/pyproject.toml` to allow all versions of
`pinecone-client` greater than or equal to 3.2.2.

This change resolves issue
[21955](https://github.com/langchain-ai/langchain/issues/21955).

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:28 -07:00
ccurme
412cd3f7e2 docs: clarify streaming with RunnableLambda (#23228) 2024-06-20 13:52:28 -07:00
ccurme
2cf9210999 docs: add serialization guide (#23223) 2024-06-20 13:52:28 -07:00
Eugene Yurtsev
aa10e39f12 core[patch]: Add clarification about streaming to RunnableLambda (#23227)
Add streaming clarification to runnable lambda docstring.
2024-06-20 13:52:28 -07:00
Jacob Lee
c2b2f1394b docs[patch]: Update Anthropic chat model docs (#23226)
CC @baskaryan
2024-06-20 13:52:28 -07:00
maang-h
9375705162 community[patch]: Update root_validators ChatModels: ChatBaichuan, QianfanChatEndpoint, MiniMaxChat, ChatSparkLLM, ChatZhipuAI (#22853)
This PR updates root validators for:

- ChatModels: ChatBaichuan, QianfanChatEndpoint, MiniMaxChat,
ChatSparkLLM, ChatZhipuAI

Issues #22819

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-20 13:52:28 -07:00
ChrisDEV
45db600671 Fix return value type of dumpd (#20123)
The return type of `json.loads` is `Any`.

In fact, the return type of `dumpd` must be based on `json.loads`, so
the correction here is understandable.

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-06-20 13:52:28 -07:00
Guangdong Liu
a98f7df1b9 core(patch): Fix encoding problem of load_prompt method (#21559)
- description: Add encoding parameters.
- @baskaryan, @efriis, @eyurtsev, @hwchase17.


![54d25ac7b1d5c2e47741a56fe8ed8ba](https://github.com/langchain-ai/langchain/assets/48236177/ffea9596-2001-4e19-b245-f8a6e231b9f9)
2024-06-20 13:52:28 -07:00
Philippe PRADOS
3cb9c4c9bd core[minor]: Adds an in-memory implementation of RecordManager (#13200)
**Description:**
langchain offers three technologies to save data:
- [vectorstore](https://python.langchain.com/docs/modules/data_connection/vectorstores/)
- [docstore](https://js.langchain.com/docs/api/schema/classes/Docstore)
- [record manager](https://python.langchain.com/docs/modules/data_connection/indexing)

If you want to combine these technologies in a sample persistence
strategy, you need a common implementation for each. `DocStore` provides
`InMemoryDocstore`.

We propose the class `MemoryRecordManager` to complete the system.

This is the prelude to another pull request, which needs a consistent
combination of persistence components.

**Tag maintainer:**
@baskaryan

**Twitter handle:**
@pprados

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-20 13:52:28 -07:00
Eugene Yurtsev
505f489071 docs: API reference remove Prev/Up/Next buttons (#23225)
These do not work anyway. Let's remove them for now for simplicity.
2024-06-20 13:52:28 -07:00
Eugene Yurtsev
02b8e2e83d docs: Update clean up API reference (#23221)
- Fix bug with TypedDicts rendering inherited methods if inheriting from
  typing_extensions.TypedDict rather than typing.TypedDict
- Do not surface inherited pydantic methods for subclasses of BaseModel
- Subclasses of RunnableSerializable will not show methods inherited from
  Runnable or from BaseModel
- Subclasses of Runnable that are not pydantic models will include a link to
  RunnableInterface (they still show inherited methods, we can fix this
  later)
2024-06-20 13:52:28 -07:00
Leonid Ganeline
d9eafd42ea community: docstrings (#23202)
Added missing docstrings. Formatted docstrings to a consistent format
(used in the API Reference)
2024-06-20 13:52:28 -07:00
Julian Weng
5f7a1ea140 partners[minor]: Fix value error message for with_structured_output (#22877)
Currently, calling `with_structured_output()` with an invalid method
argument raises `Unrecognized method argument. Expected one of
'function_calling' or 'json_format'`, but the JSON mode option [is now
referred
to](https://python.langchain.com/v0.2/docs/how_to/structured_output/#the-with_structured_output-method)
by `'json_mode'`. This fixes that.

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-06-20 13:52:28 -07:00
Qingchuan Hao
82a3d270e7 doc: replace function call with tool call (#23184)
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-06-20 13:52:28 -07:00
Yahkeef Davis
d84052e6e5 Docs: Update Rag tutorial so it includes an additional notebook cell with pip installs of required langchain_chroma and langchain_community. (#23204)
Description: Update Rag tutorial notebook so it includes an additional
notebook cell with pip installs of required langchain_chroma and
langchain_community.

This fixes the issue where the RAG tutorial gives you a 'missing modules'
error if you run the code in the notebook as is.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-20 13:52:28 -07:00
Leonid Ganeline
e864baaf40 huggingface: docstrings (#23148)
Added missing docstrings. Formatted docstrings to a consistent format
(used in the API Reference)

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-20 13:52:28 -07:00
ccurme
32953f42ba huggingface[patch]: fix CI for python 3.12 (#23197) 2024-06-20 13:52:28 -07:00
xyd
a6158c09f6 fix https://github.com/langchain-ai/langchain/issues/23215 (#23216)
fix bug 
The ZhipuAIEmbeddings class is not working.

Co-authored-by: xu yandong <shaonian@acsx1.onexmail.com>
2024-06-20 13:52:28 -07:00
Bagatur
38bf2ca14a standard-tests[patch]: test stop not stop_sequences (#23200) 2024-06-20 13:52:28 -07:00
Bagatur
34c3bab6a4 docs: standard params (#23199) 2024-06-20 13:52:28 -07:00
David DeCaprio
4c91188a6e core:Add optional max_messages to MessagePlaceholder (#16098)
- **Description:** Add optional max_messages to MessagePlaceholder
- **Issue:**
[16096](https://github.com/langchain-ai/langchain/issues/16096)
- **Dependencies:** None
- **Twitter handle:** @davedecaprio

Sometimes it's better to limit the history in the prompt itself rather
than the memory. This is needed if you want different prompts in the
chain to have different history lengths.
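
A minimal sketch of the idea, limiting history when the prompt is rendered rather than in memory. The `limit_history` helper and the `(role, content)` tuples are hypothetical stand-ins for illustration and are not LangChain's API:

```python
from typing import List, Optional, Tuple

Message = Tuple[str, str]  # (role, content); a stand-in for real message objects


def limit_history(
    history: List[Message], max_messages: Optional[int] = None
) -> List[Message]:
    """Return the last `max_messages` entries, or the full history if no limit is set."""
    if max_messages is None:
        return list(history)
    if max_messages <= 0:
        # Guard explicitly: history[-0:] would return the whole list.
        return []
    return history[-max_messages:]


history = [
    ("human", "hi"),
    ("ai", "hello"),
    ("human", "what's new?"),
    ("ai", "not much"),
]

# Two prompts in the same chain can use different history lengths.
short_context = limit_history(history, max_messages=2)
full_context = limit_history(history)
```

Because the limit is applied per prompt, the stored history itself is left untouched.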

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-06-20 13:52:28 -07:00
shaunakgodbole
039bd6b373 fireworks[patch]: fix api_key alias in Fireworks LLM (#23118)
Thank you for contributing to LangChain!

**Description**
The current code snippet for `Fireworks` had incorrect parameters. This
PR fixes those parameters.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:28 -07:00
Eugene Yurtsev
c37adacfb7 core[patch]: Document agent schema (#23194)
* Document agent schema
* Refer folks to langgraph for more information on how to create agents.
2024-06-20 13:52:28 -07:00
Bagatur
30e5e92d66 infra: run CI on large diffs (#23192)
currently we skip CI on diffs >= 300 files. think we should just run it
on all packages instead

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:28 -07:00
Eugene Yurtsev
6f994dbca3 core[patch]: Document messages namespace (#23154)
- Moved doc-strings below attributes in TypedDicts -- seems to render
better on APIReference pages.
- Provided more description and some simple code examples
2024-06-20 13:52:28 -07:00
Eugene Yurtsev
7c2ca1ef0d core[patch]: Add doc-strings to outputs, fix @root_validator (#23190)
- Document outputs namespace
- Update a vanilla @root_validator that was missed
2024-06-20 13:52:28 -07:00
Bagatur
02836772ce infra: add more formatter rules to openai (#23189)
Turns on
https://docs.astral.sh/ruff/settings/#format_docstring-code-format and
https://docs.astral.sh/ruff/settings/#format_skip-magic-trailing-comma

```toml
[tool.ruff.format]
docstring-code-format = true
skip-magic-trailing-comma = true
```
2024-06-20 13:52:27 -07:00
Michał Krassowski
7aa67d9dac community[patch]: restore compatibility with SQLAlchemy 1.x (#22546)
- **Description:** Restores compatibility with SQLAlchemy 1.4.x that was
broken since #18992 and adds a test run for this version on CI (only for
Python 3.11)
- **Issue:** fixes #19681
- **Dependencies:** None
- **Twitter handle:** `@krassowski_m`

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:27 -07:00
Erick Friis
49441cdfeb upstage: move to external repo (#22506) 2024-06-20 13:52:27 -07:00
Bagatur
bd16729412 openai[patch]: image token counting (#23147)
Resolves #23000

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-20 13:52:27 -07:00
Jorge Piedrahita Ortiz
eeb5c1ba71 community[patch]: sambanova llm integration improvement (#23137)
- **Description:** sambanova sambaverse integration improvement: removed
input parsing that was changing raw user input and that made it
mandatory to set the process prompt parameter to true
2024-06-20 13:52:27 -07:00
Jorge Piedrahita Ortiz
03503ec44a community[patch]: update sambastudio embeddings (#23133)
Description: update sambastudio embeddings integration, now compatible
with generic endpoints and CoE endpoints
2024-06-20 13:52:27 -07:00
Philippe PRADOS
ca8717afae langchain[small]: Change type to BasePromptTemplate (#23083)
Change
```python
from_llm(
    prompt: PromptTemplate
    ...
)
```
to
```python
from_llm(
    prompt: BasePromptTemplate
    ...
)
```
2024-06-20 13:52:27 -07:00
Sergey Kozlov
7cd38097f9 core[patch]: add exceptions propagation test for astream_events v2 (#23159)
**Description:** `astream_events(version="v2")` didn't propagate
exceptions in `langchain-core<=0.2.6`; this was fixed in #22916. This PR adds
a unit test to check that exceptions are propagated upwards.

Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>
2024-06-20 13:52:27 -07:00
Leonid Ganeline
54cc5dd1a2 prompty: docstring (#23152)
Added missing docstrings. Formatted docstrings to a consistent format
(used in the API Reference)

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-20 13:52:27 -07:00
Qingchuan Hao
89fbe079cd docs: add bing search tool to ms platform (#23183)
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-06-20 13:52:27 -07:00
chenxi
4fa60db884 fix: MoonshotChat fails when setting the moonshot_api_key through the OS environment. (#23176)
Close #23174

Co-authored-by: tianming <tianming@bytenew.com>
2024-06-20 13:52:27 -07:00
Bagatur
0b453f8112 core[patch]: fix chat history circular import (#23182) 2024-06-20 13:52:27 -07:00
Eugene Yurtsev
3debf68cac core[patch]: Add an example to the Document schema doc-string (#23131)
Add an example to the document schema
2024-06-20 13:52:27 -07:00
ccurme
63c8caabc6 core[patch]: update test to catch circular imports (#23172)
This raises ImportError due to a circular import:
```python
from langchain_core import chat_history
```

This does not:
```python
from langchain_core import runnables
from langchain_core import chat_history
```

Here we update `test_imports` to run each import in a separate
subprocess. Open to other ways of doing this!
2024-06-20 13:52:27 -07:00
Eugene Yurtsev
e3b6ca35ce core[patch]: Add documentation to load namespace (#23143)
Document some of the modules within the load namespace
2024-06-20 13:52:27 -07:00
Eugene Yurtsev
c09c8b79ae core[patch]: Add doc-string to document compressor (#23085) 2024-06-20 13:52:27 -07:00
Eugene Yurtsev
1843d85fce community[patch]: Prevent unit tests from making network requests (#23180)
* Prevent unit tests from making network requests
2024-06-20 13:52:27 -07:00
ccurme
3676f642dc community: move test to integration tests (#23178)
Tests failing on master with

> FAILED
tests/unit_tests/embeddings/test_ovhcloud.py::test_ovhcloud_embed_documents
- ValueError: Request failed with status code: 401, {"message":"Bad
token; invalid JSON"}
2024-06-20 13:52:27 -07:00
Eugene Yurtsev
8f42570f23 core[patch]: Expand documentation in the indexing namespace (#23134) 2024-06-20 13:52:27 -07:00
Eugene Yurtsev
93179c26ef core[patch]: Document embeddings namespace (#23132)
Document embeddings namespace
2024-06-20 13:52:27 -07:00
Eugene Yurtsev
eab3159589 core[patch]: Update documentation in LLM namespace (#23138)
Update documentation in LLM namespace.
2024-06-20 13:52:27 -07:00
Leonid Ganeline
53183c166e ai21: docstrings (#23142)
Added missing docstrings. Formatted docstrings to a consistent format
(used in the API Reference)
2024-06-20 13:52:27 -07:00
Jacob Lee
f26e6e58d4 docs[patch]: Standardize prerequisites in tutorial docs (#23150)
CC @baskaryan
2024-06-20 13:52:27 -07:00
bilk0h
3eed107f3a text-splitters: Fix/recursive json splitter data persistence issue (#21529)
Thank you for contributing to LangChain!

**Description:** I noticed that calling
`RecursiveJsonSplitter().split_json()` multiple times produced
unexpected results. The cause is the `chunks` list in the `_json_split`
method: if `chunks` is not provided to `_json_split` (which is the case
when `split_json` calls `_json_split`), the same list is reused for
subsequent calls to `_json_split`.


You can see this in the test case I also added to this commit.

Output should be: 
```
[{'a': 1, 'b': 2}]
[{'c': 3, 'd': 4}]
```

Instead you get:
```
[{'a': 1, 'b': 2}]
[{'a': 1, 'b': 2, 'c': 3, 'd': 4}]
```
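
The behavior described above looks like Python's mutable default argument pitfall. The sketch below reproduces the two outputs with hypothetical `split_buggy` and `split_fixed` functions; it assumes that this is the underlying cause and is not the actual `_json_split` code:

```python
import copy
from typing import Any, Dict, List, Optional


def split_buggy(
    data: Dict[str, Any], chunks: List[Dict[str, Any]] = [{}]
) -> List[Dict[str, Any]]:
    # The default list is created once, when the function is defined,
    # so every call that omits `chunks` mutates the same object.
    chunks[-1].update(data)
    return chunks


def split_fixed(
    data: Dict[str, Any], chunks: Optional[List[Dict[str, Any]]] = None
) -> List[Dict[str, Any]]:
    # Using None as the sentinel creates a fresh list on each call.
    if chunks is None:
        chunks = [{}]
    chunks[-1].update(data)
    return chunks


# Snapshot each result, because the buggy version keeps returning the same list.
buggy_first = copy.deepcopy(split_buggy({"a": 1, "b": 2}))
buggy_second = copy.deepcopy(split_buggy({"c": 3, "d": 4}))

fixed_first = split_fixed({"a": 1, "b": 2})
fixed_second = split_fixed({"c": 3, "d": 4})
```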

---------

Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-06-20 13:52:27 -07:00
Yuki Watanabe
c34d4699c3 docs: Overhaul Databricks components documentation (#22884)
**Description:** Documentation at
[integrations/llms/databricks](https://python.langchain.com/v0.2/docs/integrations/llms/databricks/)
is not up-to-date and includes examples about chat model and embeddings,
which should be located in the different corresponding subdirectories.
This PR splits the page into the correct scopes and overhauls the contents.

**Note**: This PR might be hard to review on the diffs view, please use
the following preview links for the changed pages.
- `ChatDatabricks`:
https://langchain-git-fork-b-step62-chat-databricks-doc-langchain.vercel.app/v0.2/docs/integrations/chat/databricks/
- `Databricks`:
https://langchain-git-fork-b-step62-chat-databricks-doc-langchain.vercel.app/v0.2/docs/integrations/llms/databricks/
- `DatabricksEmbeddings`:
https://langchain-git-fork-b-step62-chat-databricks-doc-langchain.vercel.app/v0.2/docs/integrations/text_embedding/databricks/

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
2024-06-20 13:52:27 -07:00
鹿鹿鹿鲨
32628ec5f5 community: add **request_kwargs and expect TimeError AsyncHtmlLoader (#23068)
- **Description:** add `**request_kwargs` and expect `TimeError` in
`_fetch` function for AsyncHtmlLoader. This allows you to fill in the
kwargs parameter when using the `load()` method of the `AsyncHtmlLoader`
class.

Co-authored-by: Yucolu <yucolu@tencent.com>
2024-06-20 13:52:27 -07:00
Leonid Ganeline
8225f75102 ibm: docstrings (#23149)
Added missing docstrings. Formatted docstrings to a consistent format
(used in the API Reference)
2024-06-20 13:52:27 -07:00
Ryan Elston
f9b99411c0 text-splitters: Introduce Experimental Markdown Syntax Splitter (#22257)
#### Description
This MR defines an `ExperimentalMarkdownSyntaxTextSplitter` class. The
main goal is to replicate the functionality of the original
`MarkdownHeaderTextSplitter` which extracts the header stack as metadata
but with one critical difference: it keeps the whitespace of the
original text intact.

This draft reimplements the `MarkdownHeaderTextSplitter` with a very
different algorithmic approach. Instead of marking up each line of the
text individually and aggregating them back together into chunks, this
method builds each chunk sequentially and applies the metadata to each
chunk. This makes the implementation simpler. However, since it's
designed to keep whitespace intact, it's not a full drop-in replacement
for the original. Since it is a radical implementation change to the
original code, I would like to get feedback on whether this is a
worthwhile replacement, should be its own class, or is not a good idea
at all.

Note: I implemented the `return_each_line` parameter but I don't think
it's a necessary feature. I'd prefer to remove it.

This implementation also adds the following additional features:
- Splits out code blocks and includes the language in the `"Code"`
metadata key
- Splits text on the horizontal rule `---` as well
- The `headers_to_split_on` parameter is now optional - with sensible
defaults that can be overridden.

#### Issue
Keeping the whitespace keeps the paragraphs structure and the formatting
of the code blocks intact which allows the caller much more flexibility
in how they want to further split the individual sections of the
resulting documents. This addresses the issues brought up by the
community in the following issues:
- https://github.com/langchain-ai/langchain/issues/20823
- https://github.com/langchain-ai/langchain/issues/19436
- https://github.com/langchain-ai/langchain/issues/22256

#### Dependencies
N/A

#### Twitter handle
@RyanElston

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-06-20 13:52:27 -07:00
Bagatur
38da833159 anthropic[patch]: test image input (#23155) 2024-06-20 13:52:27 -07:00
Leonid Ganeline
5fa5a976ba anthropic: docstrings (#23145)
Added missing docstrings. Formatted docstrings to a consistent format
(used in the API Reference)
2024-06-20 13:52:27 -07:00
Bagatur
86bea8a671 openai[patch], standard-tests[patch]: don't pass in falsey stop vals (#23153)
adds an image input test to standard-tests as well
2024-06-20 13:52:27 -07:00
Bagatur
d193bc89bb core[patch]: runnablewithchathistory from core.runnables (#23136) 2024-06-20 13:52:27 -07:00
Jacob Lee
5a15fc44e4 docs[patch]: Fix typo in feedback (#23146) 2024-06-20 13:52:27 -07:00
Jacob Lee
b7c92d109d docs[patch]: Adds feedback input after thumbs up/down (#23141)
CC @baskaryan
2024-06-20 13:52:27 -07:00
Bagatur
32cd043b4d docs: use trim_messages in chatbot how to (#23139) 2024-06-20 13:52:27 -07:00
Vadym Barda
1182004070 core[minor]: handle boolean data in draw_mermaid (#23135)
This change should address graph rendering issues for edges with boolean
data

Example from langgraph:

```python
from typing import Annotated, TypedDict

from langchain_core.messages import AnyMessage
from langgraph.graph import END, START, StateGraph
from langgraph.graph.message import add_messages


class State(TypedDict):
    messages: Annotated[list[AnyMessage], add_messages]


def branch(state: State) -> bool:
    return 1 + 1 == 3


graph_builder = StateGraph(State)
graph_builder.add_node("foo", lambda state: {"messages": [("ai", "foo")]})
graph_builder.add_node("bar", lambda state: {"messages": [("ai", "bar")]})

graph_builder.add_conditional_edges(
    START,
    branch,
    path_map={True: "foo", False: "bar"},
    then=END,
)

app = graph_builder.compile()
print(app.get_graph().draw_mermaid())
```

Previous behavior:

```python
AttributeError: 'bool' object has no attribute 'split'
```

Current behavior:

```python
%%{init: {'flowchart': {'curve': 'linear'}}}%%
graph TD;
	__start__[__start__]:::startclass;
	__end__[__end__]:::endclass;
	foo([foo]):::otherclass;
	bar([bar]):::otherclass;
	__start__ -. ('a',) .-> foo;
	foo --> __end__;
	__start__ -. ('b',) .-> bar;
	bar --> __end__;
	classDef startclass fill:#ffdfba;
	classDef endclass fill:#baffc9;
	classDef otherclass fill:#fad7de;
```
2024-06-20 13:52:27 -07:00
Bagatur
d9d60fdb14 core[patch]: Pin pydantic in py3.12.4 (#23130) 2024-06-20 13:52:27 -07:00
hmasdev
6bd195e6f1 langchain[patch]: fix OutputType of OutputParsers and fix legacy API in OutputParsers (#19792)
# Description

This pull request aims to address specific issues related to the
ambiguity and error-proneness of the output types of certain output
parsers, as well as the absence of unit tests for some parsers. These
issues could potentially lead to runtime errors or unexpected behaviors
due to type mismatches when used, causing confusion for developers and
users. Through clarifying output types, this PR seeks to improve the
stability and reliability.

Therefore, this pull request

- fixes the `OutputType` of OutputParsers to be the expected type;
- e.g. `OutputType` property of `EnumOutputParser` raises `TypeError`.
This PR introduces logic to extract `OutputType` from its attribute.
- and fixes the legacy API in OutputParsers like `LLMChain.run` to the
modern API like `LLMChain.invoke`;
- Note: For `OutputFixingParser`, `RetryOutputParser` and
`RetryWithErrorOutputParser`, this PR introduces `legacy` attribute with
False as default value in order to keep the backward compatibility
- and adds the tests for the `OutputFixingParser` and
`RetryOutputParser`.
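
As a rough illustration of what "the same type as `self.parser.OutputType`" means for wrapping parsers, the sketch below uses hypothetical stand-in classes (`BaseParser`, `DatetimeParser`, `FixingParser`) rather than the real LangChain classes:

```python
from datetime import datetime
from typing import Any, Generic, Type, TypeVar

T = TypeVar("T")


class BaseParser(Generic[T]):
    """Hypothetical stand-in for an output parser base class."""

    @property
    def OutputType(self) -> Any:
        # Mirrors the failure mode where no output type can be inferred.
        raise TypeError(
            f"{type(self).__name__} doesn't have an inferable OutputType."
        )


class DatetimeParser(BaseParser[datetime]):
    """A concrete parser that knows its own output type."""

    @property
    def OutputType(self) -> Type[datetime]:
        return datetime


class FixingParser(BaseParser[T]):
    """Wraps another parser and reports the wrapped parser's output type."""

    def __init__(self, parser: BaseParser[T]) -> None:
        self.parser = parser

    @property
    def OutputType(self) -> Any:
        # Delegate instead of exposing the unbound type variable.
        return self.parser.OutputType


wrapped_type = FixingParser(DatetimeParser()).OutputType
```

Delegating to the wrapped parser gives callers a concrete type to work with, where the unbound type variable gave them nothing usable.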

The following table shows my expected output and the actual output of
the `OutputType` of OutputParsers.
I have used this table to fix `OutputType` of OutputParsers.

| Class Name of OutputParser | My Expected `OutputType` (after this PR) | Actual `OutputType` [evidence](#evidence) (before this PR) | Fix Required |
|---------|--------------|---------|--------|
| BooleanOutputParser | `<class 'bool'>` | `<class 'bool'>` | NO |
| CombiningOutputParser | `typing.Dict[str, Any]` | `TypeError` is raised | YES |
| DatetimeOutputParser | `<class 'datetime.datetime'>` | `<class 'datetime.datetime'>` | NO |
| EnumOutputParser(enum=MyEnum) | `MyEnum` | `TypeError` is raised | YES |
| OutputFixingParser | The same type as `self.parser.OutputType` | `~T` | YES |
| CommaSeparatedListOutputParser | `typing.List[str]` | `typing.List[str]` | NO |
| MarkdownListOutputParser | `typing.List[str]` | `typing.List[str]` | NO |
| NumberedListOutputParser | `typing.List[str]` | `typing.List[str]` | NO |
| JsonOutputKeyToolsParser | `typing.Any` | `typing.Any` | NO |
| JsonOutputToolsParser | `typing.Any` | `typing.Any` | NO |
| PydanticToolsParser | `typing.Any` | `typing.Any` | NO |
| PandasDataFrameOutputParser | `typing.Dict[str, Any]` | `TypeError` is raised | YES |
| PydanticOutputParser(pydantic_object=MyModel) | `<class '__main__.MyModel'>` | `<class '__main__.MyModel'>` | NO |
| RegexParser | `typing.Dict[str, str]` | `TypeError` is raised | YES |
| RegexDictParser | `typing.Dict[str, str]` | `TypeError` is raised | YES |
| RetryOutputParser | The same type as `self.parser.OutputType` | `~T` | YES |
| RetryWithErrorOutputParser | The same type as `self.parser.OutputType` | `~T` | YES |
| StructuredOutputParser | `typing.Dict[str, Any]` | `TypeError` is raised | YES |
| YamlOutputParser(pydantic_object=MyModel) | `MyModel` | `~T` | YES |

NOTE: In "Fix Required", "YES" means that it is required to fix in this
PR while "NO" means that it is not required.

# Issue

No issues for this PR.

# Twitter handle

- [hmdev3](https://twitter.com/hmdev3)

# Questions:

1. Is it required to create tests for legacy APIs `LLMChain.run` in the
following scripts?
   - libs/langchain/tests/unit_tests/output_parsers/test_fix.py;
   - libs/langchain/tests/unit_tests/output_parsers/test_retry.py.

2. Is there a more appropriate expected output type than I expect in the
above table?
- e.g. the `OutputType` of `CombiningOutputParser` should be
SOMETHING...

# Actual outputs (before this PR)

<div id='evidence'></div>

<details><summary>Actual outputs</summary>

## Requirements

- Python==3.9.13
- langchain==0.1.13

```python
Python 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import langchain
>>> langchain.__version__
'0.1.13'
>>> from langchain import output_parsers
```

### `BooleanOutputParser`

```python
>>> output_parsers.BooleanOutputParser().OutputType
<class 'bool'>
```

### `CombiningOutputParser`

```python
>>> output_parsers.CombiningOutputParser(parsers=[output_parsers.DatetimeOutputParser(), output_parsers.CommaSeparatedListOutputParser()]).OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable CombiningOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `DatetimeOutputParser`

```python
>>> output_parsers.DatetimeOutputParser().OutputType
<class 'datetime.datetime'>
```

### `EnumOutputParser`

```python
>>> from enum import Enum
>>> class MyEnum(Enum):
...     a = 'a'
...     b = 'b'
...
>>> output_parsers.EnumOutputParser(enum=MyEnum).OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable EnumOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `OutputFixingParser`

```python
>>> output_parsers.OutputFixingParser(parser=output_parsers.DatetimeOutputParser()).OutputType
~T
```

### `CommaSeparatedListOutputParser`

```python
>>> output_parsers.CommaSeparatedListOutputParser().OutputType
typing.List[str]
```

### `MarkdownListOutputParser`

```python
>>> output_parsers.MarkdownListOutputParser().OutputType
typing.List[str]
```

### `NumberedListOutputParser`

```python
>>> output_parsers.NumberedListOutputParser().OutputType
typing.List[str]
```

### `JsonOutputKeyToolsParser`

```python
>>> output_parsers.JsonOutputKeyToolsParser(key_name='tool').OutputType
typing.Any
```

### `JsonOutputToolsParser`

```python
>>> output_parsers.JsonOutputToolsParser().OutputType
typing.Any
```

### `PydanticToolsParser`

```python
>>> from langchain.pydantic_v1 import BaseModel
>>> class MyModel(BaseModel):
...     a: int
...
>>> output_parsers.PydanticToolsParser(tools=[MyModel, MyModel]).OutputType
typing.Any
```

### `PandasDataFrameOutputParser`

```python
>>> output_parsers.PandasDataFrameOutputParser().OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable PandasDataFrameOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `PydanticOutputParser`

```python
>>> output_parsers.PydanticOutputParser(pydantic_object=MyModel).OutputType
<class '__main__.MyModel'>
```

### `RegexParser`

```python
>>> output_parsers.RegexParser(regex='$', output_keys=['a']).OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable RegexParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `RegexDictParser`

```python
>>> output_parsers.RegexDictParser(output_key_to_format={'a':'a'}).OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable RegexDictParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `RetryOutputParser`

```python
>>> output_parsers.RetryOutputParser(parser=output_parsers.DatetimeOutputParser()).OutputType
~T
```

### `RetryWithErrorOutputParser`

```python
>>> output_parsers.RetryWithErrorOutputParser(parser=output_parsers.DatetimeOutputParser()).OutputType
~T
```

### `StructuredOutputParser`

```python
>>> from langchain.output_parsers.structured import ResponseSchema
>>> response_schemas = [ResponseSchema(name="foo",description="a list of strings",type="List[string]"),ResponseSchema(name="bar",description="a string",type="string"), ]
>>> output_parsers.StructuredOutputParser.from_response_schemas(response_schemas).OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable StructuredOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `YamlOutputParser`

```python
>>> output_parsers.YamlOutputParser(pydantic_object=MyModel).OutputType
~T
```


</details>

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-20 13:52:27 -07:00
Artem Mukhin
7b92579ecc docs: Fix URL formatting in deprecation warnings (#23075)
**Description**

Updated the URLs in deprecation warning messages. The URLs were
previously written as raw strings and are now formatted to be clickable
HTML links.

Example of a broken link in the current API Reference:
https://api.python.langchain.com/en/latest/chains/langchain.chains.openai_functions.extraction.create_extraction_chain_pydantic.html

<img width="942" alt="Screenshot 2024-06-18 at 13 21 07"
src="https://github.com/langchain-ai/langchain/assets/4854600/a1b1863c-cd03-4af2-a9bc-70375407fb00">
2024-06-20 13:52:27 -07:00
Gabriel Petracca
329963c182 community[minor]: Implement Doctran async execution (#22372)
**Description**

The DoctranTextTranslator has an async transform function that was not
implemented because [the doctran
library](https://github.com/psychic-api/doctran) uses a sync version of
the `execute` method.

- I implemented the `DoctranTextTranslator.atransform_documents()`
method using `asyncio.to_thread` to run the function in a separate
thread.
- I updated the example in the Notebook with the new async version.
- The performance improvements can be appreciated when a big document is
divided into multiple chunks.
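
The `asyncio.to_thread` approach described above can be sketched roughly as follows. This is an illustration only, not the PR's code: `translate_chunk` is a hypothetical stand-in for a blocking library call such as doctran's synchronous `execute`.

```python
import asyncio
import time


def translate_chunk(text: str) -> str:
    """Hypothetical blocking call standing in for a synchronous library method."""
    time.sleep(0.05)  # simulate a slow network request
    return text.upper()


async def atransform(chunks: list[str]) -> list[str]:
    # Each blocking call runs in its own worker thread, so the chunks overlap in time.
    return await asyncio.gather(
        *(asyncio.to_thread(translate_chunk, chunk) for chunk in chunks)
    )


result = asyncio.run(atransform(["hola", "mundo"]))
```

With several chunks, the total wall time is close to that of the slowest call rather than the sum of all calls.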

Relates to:
- Issue #14645: https://github.com/langchain-ai/langchain/issues/14645
- Issue #14437: https://github.com/langchain-ai/langchain/issues/14437
- https://github.com/langchain-ai/langchain/pull/15264

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-20 13:52:26 -07:00
Eugene Yurtsev
0b03c05c63 core[minor]: Support multiple keys in get_from_dict_or_env (#23086)
Support passing multiple keys for get_from_dict_or_env
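
A minimal sketch of the lookup order this implies, assuming the keys are tried in order before falling back to the environment. `first_from_dict_or_env` is a hypothetical helper written for illustration; it is not the function changed in this commit.

```python
import os
from typing import Any, Optional, Sequence, Union


def first_from_dict_or_env(
    data: dict[str, Any],
    keys: Union[str, Sequence[str]],
    env_key: str,
    default: Optional[str] = None,
) -> str:
    """Hypothetical helper: try each key in order, then the environment, then the default."""
    candidates = [keys] if isinstance(keys, str) else list(keys)
    for key in candidates:
        if data.get(key):
            return data[key]
    if os.environ.get(env_key):
        return os.environ[env_key]
    if default is not None:
        return default
    raise ValueError(f"Did not find any of {candidates} or the {env_key} environment variable.")


value = first_from_dict_or_env(
    {"api_key": "abc"}, ["openai_api_key", "api_key"], "DEMO_API_KEY"
)
```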
2024-06-20 13:52:26 -07:00
nold
1d858b6d27 community: add args_schema to SearxSearch (#22954)
This change adds args_schema (pydantic BaseModel) to SearxSearchRun for
correct schema formatting on LLM function calls

Issue: currently using SearxSearchRun with OpenAI function calling
returns the following error "TypeError: SearxSearchRun._run() got an
unexpected keyword argument '__arg1' ".
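
The error can be reproduced in plain Python. In this sketch, `run` is a hypothetical stand-in for a tool method that only accepts `query`; no LangChain classes are involved.

```python
def run(query: str) -> str:
    """Stand-in for a tool method that only accepts a `query` argument."""
    return f"searching for {query}"


# Without an args schema, the single tool input arrives under a generic name.
generic_args = {"__arg1": "foobar"}
try:
    run(**generic_args)
    error = None
except TypeError as exc:
    error = str(exc)

# With a schema that names the field `query`, the call matches the signature.
result = run(**{"query": "foobar"})
```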

This happens because the schema sent to the LLM is "input:
'{"__arg1":"foobar"}'" while the method should be called with the
"query" parameter.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-06-20 13:52:26 -07:00
Bagatur
49b7eae997 core[patch]: Release 0.2.9 (#23091) 2024-06-20 13:52:26 -07:00
Finlay Macklon
63ef548e45 community: glob multiple patterns when using DirectoryLoader (#22852)
- **Description:** Updated
*community.langchain_community.document_loaders.directory.py* to enable
the use of multiple glob patterns in the `DirectoryLoader` class. Now,
the glob parameter is of type `list[str] | str` and still defaults to
the same value as before. I updated the docstring of the class to
reflect this, and added a unit test to
*community.tests.unit_tests.document_loaders.test_directory.py* named
`test_directory_loader_glob_multiple`. This test also shows an example
of how to use the new functionality.
- ~~Issue:~~**Discussion Thread:**
https://github.com/langchain-ai/langchain/discussions/18559
- **Dependencies:** None
- **Twitter handle:** N/a
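
The idea of accepting either one pattern or a list of patterns can be sketched with `pathlib` alone. `find_files` and its default pattern are assumptions made for this example, not the loader's actual code.

```python
import tempfile
from pathlib import Path
from typing import Union


def find_files(root: Path, glob: Union[str, list[str]] = "**/*") -> list[Path]:
    """Hypothetical helper: collect files matching one or several glob patterns."""
    patterns = [glob] if isinstance(glob, str) else glob
    matches: set[Path] = set()
    for pattern in patterns:
        matches.update(p for p in root.glob(pattern) if p.is_file())
    return sorted(matches)  # the set removes files matched by more than one pattern


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("a.txt", "b.md", "c.csv"):
        (root / name).write_text("x")
    names = [p.name for p in find_files(root, ["*.txt", "*.md"])]
```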

- [x] **Add tests and docs**
    - Added test (described above)
    - Updated class docstring

- [x] **Lint and test**

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-06-20 13:52:26 -07:00
Eugene Yurtsev
e6ba8582fa core[patch]: Document BaseStore (#23082)
Add doc-string to BaseStore
2024-06-20 13:52:26 -07:00
Takuya Igei
3436770303 core[patch],community[patch],langchain[patch]: tenacity dependency to version >=8.1.0,<8.4.0 (#22973)
Fix https://github.com/langchain-ai/langchain/issues/22972.

2024-06-20 13:52:26 -07:00
Raghav Dixit
252f0d137e LanceDB example minor change (#23069)
Removed package version `0.6.13` in the example.
2024-06-20 13:52:26 -07:00
Bagatur
f610d80d31 docs: add trim_messages to chatbot (#23061) 2024-06-20 13:52:26 -07:00
Lance Martin
072a6f7421 Update Fireworks link (#23058) 2024-06-20 13:52:26 -07:00
Leonid Ganeline
61a9e64fbb docs: AWS platform page update (#23063)
Added a reference to the `GlueCatalogLoader` new document loader.
2024-06-20 13:52:26 -07:00
Raviraj
5f30785b70 SemanticChunker : Feature Addition ("Semantic Splitting with gradient") (#22895)
```SemanticChunker``` currently provides three methods to split text semantically:
- percentile
- standard_deviation
- interquartile

I propose a new method, ```gradient```. In this method, the gradient of the distances is used, together with the percentile method, to split chunks. This is useful when chunks are highly correlated with each other or specific to a domain, e.g. legal or medical. The idea is to apply anomaly detection to the gradient array so that the distribution becomes wider and boundaries are easier to identify in highly semantic data.
I have tested this merge on a set of 10 domain specific documents (mostly legal).
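
A rough standard-library sketch of the idea, assuming the gradient is a central difference over consecutive sentence distances and that breakpoints are the positions whose gradient exceeds a percentile threshold. The function names and the 95th-percentile default are illustrative assumptions, not the merged implementation.

```python
def gradient(values: list[float]) -> list[float]:
    """Central differences in the interior, one-sided differences at the edges."""
    n = len(values)
    if n < 2:
        return [0.0] * n
    grads = [values[1] - values[0]]
    for i in range(1, n - 1):
        grads.append((values[i + 1] - values[i - 1]) / 2)
    grads.append(values[-1] - values[-2])
    return grads


def percentile(values: list[float], pct: float) -> float:
    """Linearly interpolated percentile of a non-empty list."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100
    low = int(rank)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)


def gradient_breakpoints(distances: list[float], pct: float = 95.0) -> list[int]:
    # Threshold the gradient of the distances instead of the raw distances.
    grads = gradient(distances)
    threshold = percentile(grads, pct)
    return [i for i, g in enumerate(grads) if g > threshold]


distances = [0.1, 0.1, 0.1, 0.1, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1]
breakpoints = gradient_breakpoints(distances)
```

In this toy input the only sharp rise in distance happens just before index 4, so the gradient flags index 3 as the boundary.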

Details : 
    - **Issue:** Improvement
    - **Dependencies:** NA
    - **Twitter handle:** [x.com/prajapat_ravi](https://x.com/prajapat_ravi)


@hwchase17

---------

Co-authored-by: Raviraj Prajapat <raviraj.prajapat@sirionlabs.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-06-20 13:52:26 -07:00
Raghav Dixit
6a45913686 LanceDB integration update (#22869)
Added : 

- [x] relevance search (w/wo scores)
- [x] maximal marginal search
- [x] image ingestion
- [x] filtering support
- [x] hybrid search w reranking 

make test, lint_diff and format checked.
2024-06-20 13:52:26 -07:00
Chang Liu
93bf95b82a community: add KafkaChatMessageHistory (#22216)
Add chat history store based on Kafka.

Files added: 
`libs/community/langchain_community/chat_message_histories/kafka.py`
`docs/docs/integrations/memory/kafka_chat_message_history.ipynb`

New issue to be created for future improvement:
1. Async method implementation.
2. Message retrieval based on timestamp.
3. Support for other configs when connecting to cloud hosted Kafka (e.g.
add `api_key` field)
4. Improve unit testing & integration testing.
2024-06-20 13:52:26 -07:00
shimajiroxyz
b9010d6c75 langchain: add id_key option to EnsembleRetriever for metadata-based document merging (#22950)
**Description:**
- What I changed
- By specifying the `id_key` during the initialization of
`EnsembleRetriever`, it is now possible to determine which documents to
merge scores for based on the value corresponding to the `id_key`
element in the metadata, instead of `page_content`. Below is an example
of how to use the modified `EnsembleRetriever`:
    ```python
# Each Document returned by the retrievers must keep the "id" key in its metadata.
retriever = EnsembleRetriever(retrievers=[ret1, ret2], id_key="id")
    ```

- Additionally, I added a script to easily test the behavior of the
`invoke` method of the modified `EnsembleRetriever`.

- Why I changed
- There are cases where you may want to calculate scores by treating
Documents with different `page_content` as the same when using
`EnsembleRetriever`. For example, when you want to ensemble the search
results of the same document described in two different languages.
- The previous `EnsembleRetriever` used `page_content` as the basis for
score aggregation, making the above usage difficult. Therefore, the
score is now calculated based on the specified key value in the
Document's metadata.
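
The effect of merging on a metadata key can be sketched with plain dictionaries. This example assumes a weighted reciprocal-rank fusion with a constant `c`; `fuse` and the document shape are hypothetical and only illustrate how the choice of key changes which results are treated as the same document.

```python
from collections import defaultdict
from typing import Optional


def fuse(result_lists: list[list[dict]], weights: list[float],
         id_key: Optional[str] = None, c: int = 60) -> list[dict]:
    """Hypothetical rank fusion keyed by metadata[id_key], or by page_content if unset."""
    scores: dict = defaultdict(float)
    first_seen: dict = {}
    for docs, weight in zip(result_lists, weights):
        for rank, doc in enumerate(docs, start=1):
            key = doc["metadata"][id_key] if id_key else doc["page_content"]
            scores[key] += weight / (rank + c)
            first_seen.setdefault(key, doc)
    ordered = sorted(scores, key=scores.get, reverse=True)
    return [first_seen[key] for key in ordered]


english = [{"page_content": "cat", "metadata": {"id": 1}},
           {"page_content": "dog", "metadata": {"id": 2}}]
spanish = [{"page_content": "perro", "metadata": {"id": 2}},
           {"page_content": "pajaro", "metadata": {"id": 3}}]

by_id = fuse([english, spanish], [0.5, 0.5], id_key="id")
by_content = fuse([english, spanish], [0.5, 0.5])
```

Keyed by `id`, "dog" and "perro" share one score and rank first; keyed by `page_content`, they remain separate entries.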

**Twitter handle:** @shimajiroxyz
2024-06-20 13:52:26 -07:00
mackong
2d0b478c5a langchain[patch]: add tool messages formatter for tool calling agent (#22849)
- **Description:** Add `tool_messages_formatter` to the tool calling agent so that
tool messages can be formatted in different ways for your LLM.
  - **Issue:** N/A
  - **Dependencies:** N/A
2024-06-20 13:52:26 -07:00
Lucas Tucker
d649ec5d33 docs: Standardize DocumentLoader docstrings (#22932)
**Standardizing DocumentLoader docstrings (of which there are many)**

This PR addresses issue #22866 and adds docstrings according to the
issue's specified format (in the appendix) for files csv_loader.py and
json_loader.py in langchain_community.document_loaders. In particular,
the following sections have been added to both CSVLoader and JSONLoader:
Setup, Instantiate, Load, Async load, and Lazy load. It may be worth
adding a 'Metadata' section to the JSONLoader docstring to clarify how
we want to extract the JSON metadata (using the `metadata_func`
argument). The files I used to walk through the various sections were
`example_2.json` from
[HERE](https://support.oneskyapp.com/hc/en-us/articles/208047697-JSON-sample-files)
and `hw_200.csv` from
[HERE](https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html).

---------

Co-authored-by: lucast2021 <lucast2021@headroyce.org>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-06-20 13:52:26 -07:00
Leonid Ganeline
8004929315 docs: embeddings classes (#22927)
Added a table with all Embedding classes.
2024-06-20 13:52:26 -07:00
Mohammad Mohtashim
9544e26bd8 [Community]: Fixed DDG DuckDuckGoSearchResults Docstring (#22968)
- **Description:** A very small fix in the Docstring of
`DuckDuckGoSearchResults` identified in the following issue.
- **Issue:** #22961

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-06-20 13:52:26 -07:00
Eun Hye Kim
58a9a33516 community: Fix #22975 (Add SSL Verification Option to Requests Class in langchain_community) (#22977)
- **PR title**: "community: Fix #22975 (Add SSL Verification Option to
Requests Class in langchain_community)"
- **PR message**: 
    - **Description:**
- Added an optional verify parameter to the Requests class with a
default value of True.
- Modified the get, post, patch, put, and delete methods to include the
verify parameter.
- Updated the _arequest async context manager to include the verify
parameter.
- Added the verify parameter to the GenericRequestsWrapper class and
passed it to the Requests class.
    - **Issue:** This PR fixes issue #22975.
- **Dependencies:** No additional dependencies are required for this
change.
    - **Twitter handle:** @lunara_x

You can check this change with below code.
```python
import yaml
from langchain_openai.chat_models import ChatOpenAI
from langchain.requests import RequestsWrapper
from langchain_community.agent_toolkits.openapi import planner
from langchain_community.agent_toolkits.openapi.spec import reduce_openapi_spec

with open("swagger.yaml") as f:
    data = yaml.load(f, Loader=yaml.FullLoader)
swagger_api_spec = reduce_openapi_spec(data)

llm = ChatOpenAI(model='gpt-4o')
swagger_requests_wrapper = RequestsWrapper(verify=False) # modified point
superset_agent = planner.create_openapi_agent(swagger_api_spec, swagger_requests_wrapper, llm, allow_dangerous_requests=True, handle_parsing_errors=True)

superset_agent.run(
    "Tell me the number and types of charts and dashboards available."
)
```

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-06-20 13:52:26 -07:00
Mohammad Mohtashim
e96444413b [Community]: FIxed the DocumentDBVectorSearch _similarity_search_without_score (#22970)
- **Description:** PR #22777 introduced a bug in
`_similarity_search_without_score` that raised an
`OperationFailure` error. The cause was a syntax error in the MongoDB
pipeline, which has now been corrected.
    - **Issue:** #22770
2024-06-20 13:52:26 -07:00
Nuno Campos
923dfaa0e5 Include "no escape" and "inverted section" mustache vars in Prompt.input_variables and Prompt.input_schema (#22981) 2024-06-20 13:52:26 -07:00
Bella Be
58cfb8048f docs: Update how to docs for pydantic compatibility (#22983)
Add missing imports of `BaseTool` from `langchain_core.tools` in docs

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-06-20 13:52:26 -07:00
Jacob Lee
2aa22542af docs[patch]: Adds evaluation sections (#23050)
Also want to add an index/rollup page to LangSmith docs to enable
linking to a how-to category as a group (e.g.
https://docs.smith.langchain.com/how_to_guides/evaluation/)

CC @agola11 @hinthornw
2024-06-20 13:52:26 -07:00
Jacob Lee
f3fc8e8251 docs[patch]: Update docs links (#23013) 2024-06-20 13:52:26 -07:00
Bagatur
7df84b5939 core[minor]: message transformer utils (#22752) 2024-06-20 13:52:26 -07:00
Qingchuan Hao
2d8550d415 docs: add bing search integration to agent (#22929)
2024-06-20 13:52:26 -07:00
Anders Swanson
40f866b690 community: OCI GenAI embedding batch size (#22986)
- **Issue:** #22985

---------

Signed-off-by: Anders Swanson <anders.swanson@oracle.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-20 13:52:26 -07:00
Bagatur
e765755575 core[patch]: Release 0.2.8 (#23012) 2024-06-20 13:52:26 -07:00
Bagatur
d42dbe331b infra: test all dependents on any change (#22994) 2024-06-20 13:52:26 -07:00
Nuno Campos
6ef42daa81 core: run_in_executor: Wrap StopIteration in RuntimeError (#22997)
- StopIteration can't be set on an asyncio.Future: doing so raises a TypeError
and leaves the Future pending forever, so we need to convert it to a
RuntimeError
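
A minimal sketch of the wrapping pattern, using only `asyncio`. `run_in_executor_safe` is a hypothetical name for this example and is not the helper changed in this commit.

```python
import asyncio
from typing import Any, Callable


async def run_in_executor_safe(func: Callable[..., Any], *args: Any) -> Any:
    """Hypothetical helper: run func in the default executor, converting StopIteration."""

    def wrapper() -> Any:
        try:
            return func(*args)
        except StopIteration as exc:
            # A Future cannot carry StopIteration, so re-raise as RuntimeError.
            raise RuntimeError from exc

    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, wrapper)


try:
    asyncio.run(run_in_executor_safe(next, iter([])))
    wrapped = False
except RuntimeError as exc:
    wrapped = isinstance(exc.__cause__, StopIteration)

value = asyncio.run(run_in_executor_safe(next, iter([7])))
```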
2024-06-20 13:52:26 -07:00
Bagatur
51044d708d standard-tests[patch]: Update chat model standard tests (#22378)
- Refactor standard test classes to make them easier to configure
- Update openai to support stop_sequences init param
- Update groq to support stop_sequences init param
- Update fireworks to support max_retries init param
- Update ChatModel.bind_tools to type tool_choice
- Update groq to handle tool_choice="any". **this may be controversial**

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-20 13:52:26 -07:00
Bob Lin
b30a5d4bd0 docs: Add some 3rd party tutorials (#22931)
Langchain is very popular among developers in China, but there are still
no good Chinese books or documents, so I want to add my own Chinese
resources on langchain topics, hoping to give Chinese readers a better
experience using langchain. This is not a translation of the official
langchain documentation, but my understanding.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-20 13:52:26 -07:00
Jacob Lee
04a31a7e31 docs[patch]: Reorder streaming guide, add tags (#22993)
CC @hinthornw
2024-06-20 13:52:26 -07:00
Oguz Vuruskaner
f27ce30470 community[minor]: add tool calling for DeepInfraChat (#22745)
DeepInfra now supports tool calling for supported models.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:26 -07:00
Bagatur
b2612351cd docs: update universal init title (#22990) 2024-06-20 13:52:26 -07:00
Lance Martin
c5b33ba2e4 Add RAG to conceptual guide (#22790)
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
2024-06-20 13:52:26 -07:00
maang-h
97149c20bc community: Add Baichuan Embeddings batch size (#22942)
- **Support batch size** 
Baichuan's updated documentation indicates that up to 16 documents can be
imported at a time

- **Standardized model init arg names**
    - baichuan_api_key -> api_key
    - model_name  -> model
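
Batching with a fixed limit can be sketched as below. The limit of 16 comes from the description above; `batched` is a hypothetical helper, not the integration's code.

```python
def batched(texts: list[str], batch_size: int = 16) -> list[list[str]]:
    """Split texts into consecutive batches of at most batch_size items."""
    return [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]


batches = batched([f"doc {i}" for i in range(40)])
```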
2024-06-20 13:52:26 -07:00
ccurme
dc31cddb39 openai[patch]: add stream_usage parameter (#22854)
Here we add `stream_usage` to ChatOpenAI as:

1. a boolean attribute
2. a kwarg to _stream and _astream.

Question: should the `stream_usage` attribute be `bool`, or `bool |
None`?

Currently I've kept it `bool` and defaulted to False. It was implemented
on
[ChatAnthropic](e832bbb486/libs/partners/anthropic/langchain_anthropic/chat_models.py (L535))
as a bool. However, to maintain support for users who access the
behavior via OpenAI's `stream_options` param, this ends up being
possible:
```python
llm = ChatOpenAI(model_kwargs={"stream_options": {"include_usage": True}})
assert not llm.stream_usage
```
(and this model will stream token usage).

Some options for this:
- it's ok
- make the `stream_usage` attribute bool or None
- make an \_\_init\_\_ for ChatOpenAI, set a `._stream_usage` attribute
and read `.stream_usage` from a property

Open to other ideas as well.
2024-06-20 13:52:26 -07:00
Shubham Pandey
85c8622465 community[minor]: add ChatSnowflakeCortex chat model (#21490)
**Description:** This PR adds a chat model integration for [Snowflake
Cortex](https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions),
which gives instant access to industry-leading large language models
(LLMs) trained by researchers at companies like Mistral, Reka, Meta, and
Google, including [Snowflake
Arctic](https://www.snowflake.com/en/data-cloud/arctic/), an open
enterprise-grade model developed by Snowflake.

**Dependencies:** Snowflake's
[snowpark](https://pypi.org/project/snowflake-snowpark-python/) library
is required for using this integration.

**Twitter handle:** [@gethouseware](https://twitter.com/gethouseware)

- [x] **Add tests and docs**:
1. integration tests:
`libs/community/tests/integration_tests/chat_models/test_snowflake.py`
2. unit tests:
`libs/community/tests/unit_tests/chat_models/test_snowflake.py`
  3. example notebook: `docs/docs/integrations/chat/snowflake.ipynb`


2024-06-20 13:52:26 -07:00
Lance Martin
db01043faa docs: Update llamacpp ntbk (#22907)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:26 -07:00
Bagatur
1d0508b028 standard-tests[patch]: Release 0.1.1 (#22984) 2024-06-20 13:52:26 -07:00
Hakan Özdemir
f8cc285fe1 [Partner]: Add metadata to stream response (#22716)
Adds `response_metadata` to stream responses from OpenAI. This is
returned with `invoke` normally, but wasn't implemented for `stream`.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-20 13:52:26 -07:00
Baskar Gopinath
c887ecbb86 docs: Standardise formatting (#22948)
Standardised formatting 


![image](https://github.com/langchain-ai/langchain/assets/73015364/ea3b5c5c-e7a6-4bb7-8c6b-e7d8cbbbf761)
2024-06-20 13:52:26 -07:00
Ikko Eltociear Ashimine
8597358868 docs: update databricks.ipynb (#22949)
arbitary -> arbitrary
2024-06-20 13:52:26 -07:00
Baskar Gopinath
81427a7108 Update sql_qa.ipynb (#22966)
fixes #22798 
fixes #22963
2024-06-20 13:52:26 -07:00
Bagatur
c6c5c36236 standard-tests[patch]: don't require str chunk contents (#22965) 2024-06-20 13:52:25 -07:00
Daniel Glogowski
edf4095bed docs: nim model name update (#22943)
NIM Model name change in a notebook and mdx file.

Thanks!
2024-06-20 13:52:25 -07:00
Christopher Tee
1d38cb1918 community(you): Better support for You.com News API (#22622)
## Description
While `YouRetriever` supports both You.com's Search and News APIs, news
is supported as an afterthought.
More specifically, not all of the News API parameters are exposed for
the user, only those that happen to overlap with the Search API.

This PR:
- improves support for both APIs, exposing the remaining News API
parameters while retaining backward compatibility
- refactors some REST parameter generation logic
- updates the docstring of `YouSearchAPIWrapper`
- adds input validation and warnings to ensure parameters are properly
set by the user
- 🚨 Breaking: Limit the news results to `k` items

2024-06-20 13:52:25 -07:00
ccurme
1c27f8945a infra: update integration test workflow (#22945) 2024-06-20 13:52:25 -07:00
Tomaz Bratanic
5a77f48124 Improve llm graph transformer docstring (#22939) 2024-06-20 13:52:25 -07:00
maang-h
5dee640c9d docs: update ZhipuAI ChatModel docstring (#22934)
- **Description:** Update ZhipuAI ChatModel rich docstring
- **Issue:** the issue #22296
2024-06-20 13:52:25 -07:00
Appletree24
3e7ebbe770 docs:Fix mispelling in streaming doc (#22936)
Description: Fix misspelling
Issue: None
Dependencies: None
Twitter handle: None

Co-authored-by: qcloud <ubuntu@localhost.localdomain>
2024-06-20 13:52:25 -07:00
Bitmonkey
41252beaae Update ollama.py with optional raw setting. (#21486)
Ollama has a raw option now. 

https://github.com/ollama/ollama/blob/main/docs/api.md


---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-06-20 13:52:25 -07:00
caiyueliang
c7d4a0e1ec community: 'Solve the issue where the _search function in ElasticsearchStore supports passing a query_vector parameter, but the parameter does not take effect. (#21532)
**Issue:**
When using the similarity_search_with_score function in
ElasticsearchStore, I expected to pass in the query_vector that I have
already obtained. I noticed that the _search function does support the
query_vector parameter, but it seems to be ineffective. I am attempting
to resolve this issue.

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-06-20 13:52:25 -07:00
Erick Friis
3345085cf9 docs: add ollama json mode (#22926)
fixes #22910
2024-06-20 13:52:25 -07:00
Erick Friis
2d667bb166 experimental: release 0.0.61 (#22924) 2024-06-20 13:52:25 -07:00
BuxianChen
785026c7af cli[minor]: remove redefined DEFAULT_GIT_REF (#21471)
remove redefined DEFAULT_GIT_REF

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-06-20 13:52:25 -07:00
Erick Friis
976d181b2a community: release 0.2.5 (#22923) 2024-06-20 13:52:25 -07:00
Jiejun Tan
9072d5831d text-splitters[patch]: Fix HTMLSectionSplitter (#22812)
Update former pull request:
https://github.com/langchain-ai/langchain/pull/22654.

Modified `langchain_text_splitters.HTMLSectionSplitter`. In the latest
version, the function `split_html_by_headers` uses a `dict` to store the
sections of an HTML document, with the header/section element names as
keys. This is a problem when a single HTML document contains duplicate
header/section element names: later sections replace earlier ones with
the same name, so some content can be lost after splitting.

Storing the sections in a list avoids the problem. A unit test covering
duplicate header names has been added.
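
The difference between the two containers can be shown with a toy example. The section data below is made up for illustration and does not come from the splitter itself.

```python
# (header, content) pairs as they might be extracted from one HTML document.
pairs = [
    ("Introduction", "first intro"),
    ("Notes", "a note"),
    ("Introduction", "second intro"),
]

# Keyed by header name: the second "Introduction" overwrites the first.
as_dict = {}
for header, content in pairs:
    as_dict[header] = content

# Stored as a list: every section is kept, duplicates included.
as_list = [{"header": header, "content": content} for header, content in pairs]
```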

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:25 -07:00
Erick Friis
ff7f03db60 langchain: release 0.2.5 (#22922) 2024-06-20 13:52:25 -07:00
Erick Friis
7dd4def4a9 templates: remove lockfiles (#22920)
Poetry will default to the latest versions without them
2024-06-20 13:52:25 -07:00
Baskar Gopinath
c980581af2 docs: Fix wrongly referenced class name in confluence.py (#22879)
Fixes #22542

Changed ConfluenceReader to ConfluenceLoader
2024-06-20 13:52:25 -07:00
ccurme
50f15a5c2d infra: remove nvidia from monorepo scheduled tests (#22915)
Scheduled tests run in
https://github.com/langchain-ai/langchain-nvidia/tree/main
2024-06-20 13:52:25 -07:00
Erick Friis
b8e16c9429 core: release 0.2.7 (#22917) 2024-06-20 13:52:25 -07:00
Nuno Campos
030f01d026 core: in astream_events v2 always await task even if already finished (#22916)
- this ensures exceptions propagate to the caller
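
Why awaiting a finished task matters can be seen in a small `asyncio` sketch. `producer` and `consume` are hypothetical names for this example; the real event-streaming code is not reproduced here.

```python
import asyncio


async def producer() -> None:
    raise ValueError("boom")


async def consume() -> None:
    task = asyncio.create_task(producer())
    # asyncio.wait returns once the task is done and does not raise its exception.
    await asyncio.wait([task])
    assert task.done()
    # Awaiting the finished task re-raises the stored exception to the caller.
    await task


try:
    asyncio.run(consume())
    propagated = None
except ValueError as exc:
    propagated = str(exc)
```

Without the final `await task`, the `ValueError` would stay inside the task object and the caller would never see it.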
2024-06-20 13:52:25 -07:00
Istvan/Nebulinq
09dc485af1 experimental: LLMGraphTransformer - added relationship properties. (#21856)
- **Description:** 
The generated relationships in the graph had no properties, but the
Relationship class was properly defined with properties. This made it
very difficult to transform conditional sentences into a graph. Adding
properties to relationships can solve this issue elegantly.
The changes expand on the existing LLMGraphTransformer implementation
but add the possibility to define allowed relationship properties like
this: `LLMGraphTransformer(llm=llm, relationship_properties=["Condition", "Time"])`
- **Issue:** 
    no issue found
 - **Dependencies:**
    n/a
- **Twitter handle:** 
    @IstvanSpace


Quick test:

```python
from dotenv import load_dotenv
import os
from langchain_community.graphs import Neo4jGraph
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.documents import Document

load_dotenv()
os.environ["NEO4J_URI"] = os.getenv("NEO4J_URI")
os.environ["NEO4J_USERNAME"] = os.getenv("NEO4J_USERNAME")
os.environ["NEO4J_PASSWORD"] = os.getenv("NEO4J_PASSWORD")
graph = Neo4jGraph()
llm = ChatOpenAI(temperature=0, model_name="gpt-4o")
llm_transformer = LLMGraphTransformer(llm=llm)
#text = "Harry potter likes pies, but only if it rains outside"
text = "Jack has a dog named Max. Jack only walks Max if it is sunny outside."
documents = [Document(page_content=text)]
llm_transformer_props = LLMGraphTransformer(
    llm=llm,
    relationship_properties=["Condition"],
)
graph_documents_props = llm_transformer_props.convert_to_graph_documents(documents)
print(f"Nodes:{graph_documents_props[0].nodes}")
print(f"Relationships:{graph_documents_props[0].relationships}")
graph.add_graph_documents(graph_documents_props)
```

---------

Co-authored-by: Istvan Lorincz <istvan.lorincz@pm.me>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:25 -07:00
ccurme
9c3de35a80 docs: add groq to chatmodeltabs (#22913) 2024-06-20 13:52:25 -07:00
Eugene Yurtsev
5865324631 dcos: Add admonition to PythonREPL tool (#22909)
Add admonition to the documentation to make sure users are aware that
the tool allows execution of code on the host machine using a python
interpreter (by design).
2024-06-20 13:52:25 -07:00
kiarina
d085917fc5 core[patch]: Fix FunctionCallbackHandler._on_tool_end (#22908)
If the global `debug` flag is enabled, the agent will get the following
error in `FunctionCallbackHandler._on_tool_end` at runtime.

```
Error in ConsoleCallbackHandler.on_tool_end callback: AttributeError("'list' object has no attribute 'strip'")
```

By calling str() before strip(), the error was avoided.
This error can be seen at
[debugging.ipynb](https://github.com/langchain-ai/langchain/blob/master/docs/docs/how_to/debugging.ipynb).
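
The fix amounts to converting the output to a string before stripping it. A minimal sketch, with `format_tool_output` as a hypothetical helper name:

```python
def format_tool_output(output: object) -> str:
    """Convert any tool output to text before stripping whitespace."""
    # Calling .strip() directly on a list raises AttributeError; str() avoids that.
    return str(output).strip()


formatted = format_tool_output(["result one", "result two"])
```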

- Issue: NA
- Dependencies: NA
- Twitter handle: https://x.com/kiarina37
2024-06-20 13:52:25 -07:00
Philippe PRADOS
44472e9de4 community[minor]: Fix long_context_reorder.py async (#22839)
Implement `async def atransform_documents( self, documents:
Sequence[Document], **kwargs: Any ) -> Sequence[Document]` for
`LongContextReorder`
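
A rough sketch of what an async variant could look like, assuming the reordering places the most relevant items at both ends and the least relevant in the middle, and assuming the sync logic is offloaded with `asyncio.to_thread`. Both `reorder` and `areorder` are illustrative names; the PR's actual implementation may differ.

```python
import asyncio
from typing import Sequence, TypeVar

T = TypeVar("T")


def reorder(items: Sequence[T]) -> list[T]:
    """Input is sorted from most to least relevant; output pushes the least relevant to the middle."""
    result: list[T] = []
    for i, item in enumerate(reversed(items)):
        if i % 2 == 1:
            result.append(item)
        else:
            result.insert(0, item)
    return result


async def areorder(items: Sequence[T]) -> list[T]:
    # Offload the synchronous reordering to a worker thread.
    return await asyncio.to_thread(reorder, items)


reordered = asyncio.run(areorder([1, 2, 3, 4, 5]))
```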
2024-06-20 13:52:25 -07:00
Eugene Yurtsev
b699d81d10 community[major], experimental[patch]: Remove Python REPL from community (#22904)
Remove the REPL from community, and suggest an alternative import from
langchain_experimental.

Fix for this issue:
https://github.com/langchain-ai/langchain/issues/14345

This is not a bug in the code or an actual security risk. The python
REPL itself is behaving as expected.

The PR is done to appease blanket security policies that are just
looking for the presence of exec in the code.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:25 -07:00
Eugene Yurtsev
73c4230674 community[patch]: SitemapLoader restrict depth of parsing sitemap (CVE-2024-2965) (#22903)
This PR restricts the depth to which the sitemap can be parsed.

Fix for: CVE-2024-2965
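
Depth limiting can be sketched with an in-memory sitemap index that refers back to itself. `SITEMAPS`, `collect_pages` and `max_depth` are hypothetical names for this example and are not the loader's real parameters.

```python
# Toy sitemap index: each sitemap lists nested sitemaps and plain pages.
SITEMAPS = {
    "root.xml": ["a.xml", "page-1"],
    "a.xml": ["b.xml", "page-2"],
    "b.xml": ["root.xml", "page-3"],  # refers back to the root, forming a cycle
}


def collect_pages(name: str, max_depth: int = 10, depth: int = 0) -> list[str]:
    """Collect page entries, refusing to descend beyond max_depth nested sitemaps."""
    if depth >= max_depth:
        return []
    pages: list[str] = []
    for entry in SITEMAPS.get(name, []):
        if entry in SITEMAPS:
            pages.extend(collect_pages(entry, max_depth, depth + 1))
        else:
            pages.append(entry)
    return pages


pages = collect_pages("root.xml", max_depth=3)
```

Without the depth check, the cycle between `b.xml` and `root.xml` would recurse until the interpreter's recursion limit is hit.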
2024-06-20 13:52:25 -07:00
Eugene Yurtsev
acc210cd6c core[patch]: fix validation of @deprecated decorator (#22513)
This PR moves the validation of the decorator to a better place to avoid
creating bugs while deprecating code.

Prevent issues like this from arising:
https://github.com/langchain-ai/langchain/issues/22510

We should replace this with a linter at some point that just does static
analysis.
2024-06-20 13:52:25 -07:00
Jacob Lee
95661aacba anthropic[minor]: Adds streaming tool call support for Anthropic (#22687)
Preserves string content chunks for non tool call requests for
convenience.

One thing - Anthropic events look like this:

```
RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
RawContentBlockDeltaEvent(delta=TextDelta(text='<thinking>\nThe', type='text_delta'), index=0, type='content_block_delta')
RawContentBlockDeltaEvent(delta=TextDelta(text=' provide', type='text_delta'), index=0, type='content_block_delta')
...
RawContentBlockStartEvent(content_block=ToolUseBlock(id='toolu_01GJ6x2ddcMG3psDNNe4eDqb', input={}, name='get_weather', type='tool_use'), index=1, type='content_block_start')
RawContentBlockDeltaEvent(delta=InputJsonDelta(partial_json='', type='input_json_delta'), index=1, type='content_block_delta')
```

Note that `delta` has a `type` field. With this implementation, I'm
dropping it because `merge_list` behavior will concatenate strings.

We currently have `index` as a special field when merging lists, would
it be worth adding `type` too?

If so, what do we set as the type of a content block chunk? `text` vs.
`text_delta`/`tool_use` vs `input_json_delta`?

CC @ccurme @efriis @baskaryan
2024-06-20 13:52:25 -07:00
ccurme
3b5542ec91 fireworks[patch]: add usage_metadata to (a)invoke and (a)stream (#22906) 2024-06-20 13:52:25 -07:00
Mohammad Mohtashim
24438144a9 [Community]: HuggingFaceCrossEncoder score accounting for <not-relevant score,relevant score> pairs. (#22578)
- **Description:** Some of the Cross-Encoder models provide scores in
pairs, i.e., <not-relevant score (higher means the document is less
relevant to the query), relevant score (higher means the document is
more relevant to the query)>. However, the `HuggingFaceCrossEncoder`
`score` method does not currently take into account the pair situation.
This PR addresses this issue by modifying the method to consider only
the relevant score if score is being provided in pair. The reason for
focusing on the relevant score is that the compressors select the top-n
documents based on relevance.
    - **Issue:** #22556 
- Please also refer to this
[comment](https://github.com/UKPLab/sentence-transformers/issues/568#issuecomment-729153075)
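
A rough sketch of the idea, assuming pair scores arrive as `(not-relevant, relevant)` tuples or lists. `relevant_scores` is a hypothetical helper for illustration, not the actual `HuggingFaceCrossEncoder.score` code:

```python
from typing import List, Sequence, Union

Score = Union[float, Sequence[float]]


def relevant_scores(raw_scores: Sequence[Score]) -> List[float]:
    """Reduce each raw score to a single relevance value."""
    result: List[float] = []
    for score in raw_scores:
        if isinstance(score, (list, tuple)):
            # Assumed pair layout: (not-relevant score, relevant score).
            result.append(float(score[1]))
        else:
            # Already a single relevance score; keep it unchanged.
            result.append(float(score))
    return result
```

Keeping only the second element means a downstream top-n selection still sorts by "higher is more relevant".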
2024-06-20 13:52:25 -07:00
Baskar Gopinath
d1e66ce6d3 docs: Fix typo in tutorial about structured data extraction (#22888)
Fixed a typo in the tutorial about structured data extraction.
2024-06-20 13:52:25 -07:00
Thanh Nguyen
ad6dade13f community[minor]: add chat model llamacpp (#22589)
- **Description:** This PR introduces a new chat model integration with
llamacpp_python, designed to work similarly to the existing ChatOpenAI
model.
    + Works well with instructed chat, chains, and function/tool calling.
    + Works with LangGraph (persistent memory, tool calling); will update
soon.

- **Dependencies:** This change requires the llamacpp_python library to
be installed.
    
@baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:25 -07:00
Bagatur
7b0b3944b3 docs: doc loader feat table alignment (#22900) 2024-06-20 13:52:25 -07:00
Isaac Francisco
cc4231061d docs: generate table for document loaders (#22871)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:25 -07:00
Jacob Lee
99d5952af1 docs[patch]: Expand embeddings docs (#22881) 2024-06-20 13:52:25 -07:00
ccurme
9ab922b91f anthropic[patch]: always add tool_result type to ToolMessage content (#22721)
Anthropic tool results can contain image data, which are typically
represented with content blocks having `"type": "image"`. Currently,
these content blocks are passed as-is as human/user messages to
Anthropic, which raises BadRequestError as it expects a tool_result
block to follow a tool_use.

Here we update ChatAnthropic to nest the content blocks inside a
tool_result content block.

Example:
```python
import base64

import httpx
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import AIMessage, HumanMessage, ToolMessage
from langchain_core.pydantic_v1 import BaseModel, Field


# Fetch image
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
image_data = base64.b64encode(httpx.get(image_url).content).decode("utf-8")


class FetchImage(BaseModel):
    should_fetch: bool = Field(..., description="Whether an image is requested.")


llm = ChatAnthropic(model="claude-3-sonnet-20240229").bind_tools([FetchImage])

messages = [
    HumanMessage(content="Could you summon a beautiful image please?"),
    AIMessage(
        content=[
            {
                "type": "tool_use",
                "id": "toolu_01Rn6Qvj5m7955x9m9Pfxbcx",
                "name": "FetchImage",
                "input": {"should_fetch": True},
            },
        ],
        tool_calls=[
            {
                "name": "FetchImage",
                "args": {"should_fetch": True},
                "id": "toolu_01Rn6Qvj5m7955x9m9Pfxbcx",
            },
        ],
    ),
    ToolMessage(
        name="FetchImage",
        content=[
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/jpeg",
                    "data": image_data,
                },
            },
        ],
        tool_call_id="toolu_01Rn6Qvj5m7955x9m9Pfxbcx",
    ),
]

llm.invoke(messages)
```

Trace:
https://smith.langchain.com/public/d27e4fc1-a96d-41e1-9f52-54f5004122db/r
2024-06-20 13:52:25 -07:00
Lucas Tucker
d1c6958868 docs: Standardize ChatGroq (#22751)
Updated the ChatGroq docstring in langchain_groq as per issue
https://github.com/langchain-ai/langchain/issues/22296, to match the
description (in the appendix) provided in that issue.

Issue: This PR is in response to issue
https://github.com/langchain-ai/langchain/issues/22296, and more
specifically the ChatGroq model. In particular, this PR updates the
docstring for langchain/libs/partners/groq/langchain_groq/chat_model.py
by adding the following sections: Instantiate, Invoke, Stream, Async,
Tool calling, Structured Output, and Response metadata. I used the
template from the Anthropic implementation and referenced the Appendix
of the original issue post. I also noted that: `usage_metadata` returns
`None` for all ChatGroq models I tested; there is no mention of image
input in the ChatGroq documentation; unlike that of ChatHuggingFace,
`.stream(messages)` for ChatGroq returned blocks of output.

---------

Co-authored-by: lucast2021 <lucast2021@headroyce.org>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:25 -07:00
Anush
f7bc198118 qdrant[patch]: Use collection_exists API instead of exceptions (#22764)
## Description

Currently, the Qdrant integration relies on exceptions raised by
[`get_collection`
](https://qdrant.tech/documentation/concepts/collections/#collection-info)
to check if a collection exists.

Using
[`collection_exists`](https://qdrant.tech/documentation/concepts/collections/#check-collection-existence)
is recommended to avoid missing any unhandled exceptions. This PR
addresses this.

## Testing
All integration and unit tests pass. No user-facing changes.
2024-06-20 13:52:25 -07:00
Anindyadeep
5a1b63cd65 community[minor]: Prem Templates (#22783)
This PR adds the Prem Templates feature to ChatPremAI.
Additionally, it fixes a minor bug that caused an API auth error when the
API key was passed through arguments.
2024-06-20 13:52:25 -07:00
Stefano Lottini
1af81c12a1 docs: Astra DB vectorstore, adjust syntax for automatic-embedding example (#22833)
Description: Adjusting the syntax for creating the vectorstore
collection (in the case of automatic embedding computation) for the most
idiomatic way to submit the stored secret name.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:25 -07:00
maang-h
25c4336740 community[minor]: Implement ZhipuAIEmbeddings interface (#22821)
- **Description:** Implement ZhipuAIEmbeddings interface, include:
     - The `embed_query` method
     - The `embed_documents` method

refer to [ZhipuAI
Embedding-2](https://open.bigmodel.cn/dev/api#text_embedding)

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-06-20 13:52:25 -07:00
Leonid Ganeline
b8d305096c docs: ReAct reference (#22830)
`ReAct` is used all across LangChain but is not properly referenced.
Added references to the original paper.
2024-06-20 13:52:25 -07:00
Giacomo Berardi
aae831f84a docs: fixes for Elasticsearch integrations, cache doc and providers list (#22817)
Some minor fixes in the documentation:
 - ElasticsearchCache initialization is now correct
 - List of integrations for ES updated
2024-06-20 13:52:25 -07:00
Isaac Francisco
2f69c78da0 infra: lint new docs to match doc loader template (#22867) 2024-06-20 13:52:24 -07:00
Bagatur
ce253bf211 cli[patch]: Release 0.0.25 (#22876) 2024-06-20 13:52:24 -07:00
Isaac Francisco
a136b5103a docs, cli[patch]: document loaders doc template (#22862)
From: https://github.com/langchain-ai/langchain/pull/22290

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-20 13:52:24 -07:00
Hayden Wolff
7668f3f469 docs: update NVIDIA Riva tool to use NVIDIA NIM for LLM (#22873)
**Description:**
Update the NVIDIA Riva tool documentation to use NVIDIA NIM for the LLM.
Show how to use NVIDIA NIMs and link to documentation for LangChain with
NIM.

---------

Co-authored-by: Hayden Wolff <hwolff@nvidia.com>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-06-20 13:52:24 -07:00
Zeeshan Qureshi
5a45e5e645 docs: s/path_images/images/ for ImageCaptionLoader keyword arguments (#22857)
Quick update to `ImageCaptionLoader` documentation to reflect what's in
code.
2024-06-20 13:52:24 -07:00
liuzc9
2f79986a15 Fix typo in vearch.md (#22840)
Fix typo
2024-06-20 13:52:24 -07:00
Kagura Chen
4b19384d4f Fix: lint errors and update Field alias in models.py and AutoSelectionScorer initialization (#22846)
This PR addresses several lint errors in the core package of LangChain.
Specifically, the following issues were fixed:

1. Unexpected keyword argument "required" for "Field" [call-arg]
2. tests/integration_tests/chains/test_cpal.py:263: error: Unexpected
keyword argument "narrative_input" for "QueryModel" [call-arg]
2024-06-20 13:52:24 -07:00
Erick Friis
cd523dde2f langchain: release 0.2.4 (#22872) 2024-06-20 13:52:24 -07:00
Erick Friis
c1d5a01638 core: release 0.2.6 (#22868) 2024-06-20 13:52:24 -07:00
Jacob Lee
2a1a6324be core[patch]: Treat type as a special field when merging lists (#22750)
Should we even log a warning? At least for Anthropic, it's expected to
get e.g. `text_block` followed by `text_delta`.
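
For illustration only, a simplified sketch of merging streamed content-block chunks where `index` selects the block to merge into and `type` is kept rather than concatenated. `merge_chunks` is a hypothetical stand-in and does not reproduce the real `merge_lists` logic in core:

```python
from typing import Any, Dict, List

SPECIAL_FIELDS = ("index", "type")  # assumed set of fields that are never concatenated


def merge_chunks(
    left: List[Dict[str, Any]], right: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Merge chunk dicts that share an `index`, concatenating string fields."""
    merged = [dict(item) for item in left]
    for item in right:
        match = next(
            (m for m in merged if m.get("index") == item.get("index")), None
        )
        if match is None:
            merged.append(dict(item))
            continue
        for key, value in item.items():
            if key in SPECIAL_FIELDS:
                # Keep the first value seen instead of producing "texttext_delta".
                match.setdefault(key, value)
            elif isinstance(value, str) and isinstance(match.get(key), str):
                match[key] += value
            else:
                match[key] = value
    return merged
```

In this sketch a differing `type` on a later chunk is silently ignored; whether to warn in that case is exactly the open question above.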

@ccurme @baskaryan @efriis
2024-06-20 13:52:24 -07:00
Nuno Campos
69700c2ae6 core: In astream_events v2 propagate cancel/break to the inner astream call (#22865)
- previous behavior was for the inner astream to continue running with
no interruption
- also propagate break in core runnable methods
2024-06-20 13:52:24 -07:00
Eugene Yurtsev
1a28084858 experimental[patch]/docs[patch]: Update links to security docs (#22864)
Minor update to newest version of security docs (content should be
identical).
2024-06-20 13:52:24 -07:00
Eugene Yurtsev
addbc3a8b1 ci: Add script to check for pickle usage in community (#22863)
Add script to check for pickle usage in community.
2024-06-20 13:52:24 -07:00
Eugene Yurtsev
b0322a8fa5 community[patch]: FAISS VectorStore deserializer should be opt-in (#22861)
FAISS deserializer uses pickle module. Users have to opt-in to
de-serialize.
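
A minimal sketch of the opt-in pattern, assuming a keyword-only flag that defaults to off. `load_pickled` is a hypothetical helper for illustration, not the FAISS loader itself:

```python
import pickle
from typing import Any


def load_pickled(data: bytes, *, allow_dangerous_deserialization: bool = False) -> Any:
    """Unpickle data only if the caller has explicitly opted in."""
    if not allow_dangerous_deserialization:
        raise ValueError(
            "Deserialization relies on pickle, which can execute arbitrary code. "
            "Pass allow_dangerous_deserialization=True only for trusted data."
        )
    return pickle.loads(data)
```

Defaulting to a refusal means existing callers fail loudly instead of silently unpickling untrusted files.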
2024-06-20 13:52:24 -07:00
Eugene Yurtsev
969e6a4dc5 experimental[major]: Force users to opt-in into code that relies on the python repl (#22860)
This should make it obvious that a few of the agents in langchain
experimental rely on the python REPL as a tool under the hood, and will
force users to opt-in.
2024-06-20 13:52:24 -07:00
Isaac Francisco
ce135dae5f [docs]: added info for TavilySearchResults (#22765) 2024-06-20 13:52:24 -07:00
ccurme
3f16050891 partners: fix numpy dep (#22858)
Following https://github.com/langchain-ai/langchain/pull/22813, which
added python 3.12 to CI, here we update numpy accordingly in partner
packages.
2024-06-20 13:52:24 -07:00
Isaac Francisco
aa0636f69b minor functionality change: adding API functionality to tavilysearch (#22761) 2024-06-20 13:52:24 -07:00
Isaac Francisco
80474210a4 docs: improved recursive url loader docs (#22648)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:24 -07:00
Isaac Francisco
e0feda2a38 [docs]: bind tools (#22831) 2024-06-20 13:52:24 -07:00
ccurme
f3c0ceffd1 groq[patch]: add usage_metadata to (a)invoke and (a)stream (#22834) 2024-06-20 13:52:24 -07:00
Jacob Lee
07686db3a5 docs[patch]: Improve Groq integration page (#22844)
Was bare bones and got marked by folks as unhelpful.

CC @efriis @colemccracken
2024-06-20 13:52:24 -07:00
Jacob Lee
3465f5f32a docs[patch]: Readd Pydantic compatibility docs (#22836)
As a how-to guide.

CC @eyurtsev @hwchase17
2024-06-20 13:52:24 -07:00
Jacob Lee
768fb441ab docs[patch]: Adds multimodal column to chat models table, move up in concepts (#22837)
CC @hwchase17 @baskaryan
2024-06-20 13:52:24 -07:00
James Braza
9e8bb1a0f4 core[patch]: allowing latest packaging versions (#22792)
Allowing version 24 of https://github.com/pypa/packaging

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:24 -07:00
Jacob Lee
20055808c5 docs[patch]: Add structured output to conceptual docs (#22791)
This downgrades `Function/tool calling` from an h3 to an h4, which means
it'll no longer show up in the right sidebar, but any direct links will
still work. I think that is ok, but LMK if you disapprove.

CC @hwchase17 @eyurtsev @rlancemartin
2024-06-20 13:52:24 -07:00
Karim Lalani
a2bbf37439 [experimental][llms][OllamaFunctions] tool calling related fixes (#22339)
Fixes issues with tool calling to handle tool objects correctly. Added
support to handle ToolMessage correctly.
Added additional checks for error conditions.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-20 13:52:24 -07:00
Christophe Bornet
db71ea8b73 ci: add testing with Python 3.12 (#22813)
We need to use a different version of numpy for py3.8 and py3.12 in
pyproject.
And so do projects that use that Python version range and import
langchain.

    - **Twitter handle:** _cbornet
2024-06-20 13:52:24 -07:00
HyoJin Kang
215504f564 community[patch]: fix database uri type in SQLDatabase (#22661)
**Description**
sqlalchemy uses "sqlalchemy.engine.URL" type for db uri argument.
Added 'URL' type for compatibility.

**Issue**: None

**Dependencies:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:24 -07:00
Eugene Yurtsev
359733a6df core[patch]: Update remaining root_validators (#22829)
This PR updates the remaining root_validators in core to either be explicit pre-init or post-init validators.
2024-06-20 13:52:24 -07:00
Eugene Yurtsev
83da58c2c2 community[patch]: Update root_validators embeddings: llamacpp, jina, dashscope, mosaicml, huggingface_hub, Toolkits: Connery, ChatModels: PAI_EAS, (#22828)
This PR updates root validators for:

* Embeddings: llamacpp, jina, dashscope, mosaicml, huggingface_hub
* Toolkits: Connery
* ChatModels: PAI_EAS

Following this issue:
https://github.com/langchain-ai/langchain/issues/22819
2024-06-20 13:52:24 -07:00
JonZeolla
a6a02b3095 community[minor]: implement huggingface show_progress consistently (#22682)
- **Description:** This implements `show_progress` more consistently
(i.e. it is also added to the `HuggingFaceBgeEmbeddings` object).
- **Issue:** This implements `show_progress` more consistently in the
embeddings huggingface classes. Previously this could have been set via
`encode_kwargs`.
 - **Dependencies:** None
 - **Twitter handle:** @jonzeolla
2024-06-20 13:52:24 -07:00
Eugene Yurtsev
42a6f0698d core[patch]: update some root_validators (#22787)
Update some of the @root_validators to be explicit pre=True or
pre=False, skip_on_failure=True for pydantic 2 compatibility.
2024-06-20 13:52:24 -07:00
bincat
f489bd1fce docs: fix function name in tutorials/agents.ipynb (#22809)
The function called in the following example is `create_react_agent`, not
`create_tool_calling_executor`.
2024-06-20 13:52:24 -07:00
mrhbj
e70e0e65ef community[patch]: fix hunyuan message include chinese signature error (#22795) (#22796)
2024-06-20 13:52:24 -07:00
Kagura Chen
a38c09fbd5 docs: update repo_structure.mdx to reflect latest code changes (#22810)
**Description:** This PR updates the documentation to reflect the recent
code changes.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-20 13:52:24 -07:00
Mr. Lance E Sloan «UMich»
c6d6befbef community[patch]: bugfix for YoutubeLoader's LINES format (#22815)
- **Description:** A change I submitted recently introduced a bug in
`YoutubeLoader`'s `LINES` output format. In those conditions, curly
braces ("`{}`") create a set, not a dictionary. This bugfix explicitly
specifies that a dictionary is created.
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter:** lsloan_umich
- **Mastodon:**
[lsloan@mastodon.social](https://mastodon.social/@lsloan)
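
A small illustration of the pitfall in general terms (the variable names are made up, and this is not the loader's code): braces around bare elements build a set, while `key: value` pairs or `dict()` build a dictionary.

```python
key, value = "start_seconds", 30

as_set = {key}                   # bare element inside braces: a set
as_dict = {key: value}           # key: value pair inside braces: a dict
explicit = dict([(key, value)])  # dict() leaves no room for ambiguity
```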
2024-06-20 13:52:24 -07:00
Philippe PRADOS
6aee08ca4d langchain[minor]: Make EmbeddingsFilters async (#22737)
Add native async implementation for EmbeddingsFilter
2024-06-20 13:52:24 -07:00
endrajeet
a77882a9b0 Update index.mdx (#22818)
changed "# 🌟Recognition" to "### 🌟 Recognition" to match the rest of the
subheadings.

2024-06-20 13:52:24 -07:00
Bagatur
50df123c4c infra: lint new docs to match templates (#22786) 2024-06-20 13:52:24 -07:00
ccurme
99f2167ea7 mistral[patch]: add usage_metadata to (a)invoke and (a)stream (#22781) 2024-06-20 13:52:24 -07:00
Jiří Spilka
a045a67330 docs: Correct code examples in the Apify's notebooks (#22768)
**Description:** Correct code examples in the Apify document load
notebook and Apify Dataset notebook

**Issue**: None
**Dependencies**: None
**Twitter handle**: None
2024-06-20 13:52:24 -07:00
mrhbj
bf5a5814b7 community[patch]: fix hunyuan client json analysis (#22452) (#22767)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:23 -07:00
Rohan Aggarwal
53d56e0057 community[patch]: Support for old clients (Thin and Thick) Oracle Vector Store (#22766)
Adds support for old clients (Thin and Thick) to the Oracle Vector Store.

Tested with the contributors' own local tests.

---------

Co-authored-by: rohan.aggarwal@oracle.com <rohaagga@phoenix95642.dev3sub2phx.databasede3phx.oraclevcn.com>
2024-06-20 13:52:23 -07:00
Jacob Lee
40fc60d0fd docs[patch]: Adds streaming conceptual doc (#22760)
CC @hwchase17 @baskaryan
2024-06-20 13:52:23 -07:00
Mr. Lance E Sloan «UMich»
c0cbc5de99 community[patch]: Load YouTube transcripts (captions) as fixed-duration chunks with start times (#21710)
- **Description:** Add a new format, `CHUNKS`, to
`langchain_community.document_loaders.youtube.YoutubeLoader` which
creates multiple `Document` objects from YouTube video transcripts
(captions), each of a fixed duration. The metadata of each chunk
`Document` includes the start time of each one and a URL to that time in
the video on the YouTube website.
  
I had implemented this for UMich (@umich-its-ai) in a local module, but
it makes sense to contribute this to LangChain community for all to
benefit and to simplify maintenance.

- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter:** lsloan_umich
- **Mastodon:**
[lsloan@mastodon.social](https://mastodon.social/@lsloan)

With regards to **tests and documentation**, most existing features of
the `YoutubeLoader` class are not tested. Only the
`YoutubeLoader.extract_video_id()` static method had a test. However,
while I was waiting for this PR to be reviewed and merged, I had time to
add a test for the chunking feature I've proposed in this PR.

I have added an example of using chunking to the
`docs/docs/integrations/document_loaders/youtube_transcript.ipynb`
notebook.
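
To make the `CHUNKS` idea concrete, here is a rough sketch under stated assumptions: transcript pieces are dicts with `text` and `start` (seconds), windows are aligned to multiples of the chunk size, and the timestamp link uses YouTube's `t` query parameter. `chunk_transcript` is a hypothetical helper, not the code added to `YoutubeLoader`:

```python
from typing import Any, Dict, List


def chunk_transcript(
    pieces: List[Dict[str, Any]], video_id: str, chunk_seconds: int = 120
) -> List[Dict[str, Any]]:
    """Group transcript pieces into fixed-duration chunks keyed by start time."""
    buckets: Dict[int, List[str]] = {}
    for piece in pieces:
        # Start of the fixed-size window this caption falls into.
        start = int(piece["start"] // chunk_seconds) * chunk_seconds
        buckets.setdefault(start, []).append(piece["text"])
    return [
        {
            "text": " ".join(texts),
            "start_seconds": start,
            # Assumed deep-link format: the `t` query parameter, in seconds.
            "source": f"https://www.youtube.com/watch?v={video_id}&t={start}s",
        }
        for start, texts in sorted(buckets.items())
    ]
```

Each returned dict corresponds to what would become one `Document`: the joined caption text plus metadata with the start time and a link to that moment.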

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:23 -07:00
Aayush Kataria
0e7adc6c61 community[minor]: Adds a vector store for Azure Cosmos DB for NoSQL (#21676)
This PR adds support for the Azure Cosmos DB for NoSQL vector store.

Summary:

Description: added vector store integration for Azure Cosmos DB for
NoSQL Vector Store,
Dependencies: azure-cosmos dependency,
Tag maintainer: @hwchase17, @baskaryan @efriis @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:23 -07:00
Mohammad Mohtashim
da21e401d9 [Community]: Added Metadata filter support for DocumentDB Vector Store (#22777)
- **Description:** As pointed out in this issue #22770, DocumentDB
`similarity_search` does not support filtering through metadata which
this PR adds by passing in the parameter `filter`. Also this PR fixes a
minor Documentation error.
- **Issue:** #22770

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:23 -07:00
Dmitry Stepanov
18d087ef0f Ollama vision support (#22734)
**Description:** Adds Ollama vision support for OpenAI-style messages of the form `{
"image_url": { "url": ... } }`.
**Issue:** #22460

Adds a flexible way for ChatOllama to support chat messages with
images. It works when `image_url` is provided either as a string or as a
dict with "url" inside (as OpenAI does). This makes it possible to use
tuples with `ChatPromptTemplate.from_messages()`.
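
A minimal sketch of accepting both shapes, assuming the only two inputs are a plain string or a dict carrying a `"url"` key. `extract_image_url` is a hypothetical helper, not ChatOllama's internal code:

```python
from typing import Any, Dict, Union


def extract_image_url(image_url: Union[str, Dict[str, Any]]) -> str:
    """Return the URL whether it was given directly or wrapped OpenAI-style."""
    if isinstance(image_url, str):
        return image_url
    if isinstance(image_url, dict) and isinstance(image_url.get("url"), str):
        return image_url["url"]
    raise ValueError("image_url must be a string or a dict with a 'url' key")
```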

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:23 -07:00
Philippe PRADOS
9725e1046a langchain[minor]: Add native async implementation to LLMFilter, add concurrency to both sync and async paths (#22739)
- **Description:** `chain_filter` is not compatible with async. This PR
fixes `chain_filter.py` to be compatible with async.
- **Twitter handle:** pprados

---------

Signed-off-by: zhangwangda <zhangwangda94@163.com>
Co-authored-by: Prakul <discover.prakul@gmail.com>
Co-authored-by: Lei Zhang <zhanglei@apache.org>
Co-authored-by: Gin <ictgtvt@gmail.com>
Co-authored-by: wangda <38549158+daziz@users.noreply.github.com>
Co-authored-by: Max Mulatz <klappradla@posteo.net>
2024-06-20 13:52:23 -07:00
Jaeyeon Kim(김재연)
8c7967905c community[minor]: fix redis store docstring and streamline initialization code (#22730)
### Description

Fix the example in the docstring of the Redis store.
Change the initialization logic, remove a redundant check, and improve the
error message.

### Issue

The docstring example showing how to use the Redis store was wrong.

![image](https://github.com/langchain-ai/langchain/assets/37469330/78c5d9ce-ee66-45b3-8dfe-ea29f125e6e9)

### Dependencies
Nothing




---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-20 13:52:23 -07:00
am-kinetica
87c2df84dc community[patch]: Kinetica Integrations handled error in querying; quotes in table names; updated gpudb API (#22724)
- [ ] **Miscellaneous updates and fixes**: 
- **Description:** Handled error in querying; quotes in table names;
updated gpudb API
- **Issue:** Previously, a query that failed or returned no records
raised an error with a hard-to-understand message
    - **Dependencies:** Updated GPUDB API version to `7.2.0.9`


@baskaryan @hwchase17
2024-06-20 13:52:23 -07:00
NithinBairapaka
ccc8e9d41b docs: Updated integration docs with required package installations (#22392)
**Title:** Updated integration docs with required package installations
   **Issue:**  #22005
2024-06-20 13:52:23 -07:00
Albert Gil López
0783805716 docs: correct path in readme (#22383)
Description: Fix incorrect path in README instructions.
Issue: N/A
Dependencies: None
Twitter handle: @jddam

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-06-20 13:52:23 -07:00
Greg Tracy
1eeb6c9619 docs: Fix pixelation in stack graphic (#21554)
This change updates the stack graphic displayed in the top-level README.
The LangChain tile is pixelated in the current graphic.
2024-06-20 13:52:23 -07:00
Leonid Ganeline
d2be0d2677 docs: integrations cache: added class table (#22368)
Added a table with the cache classes. See [this table
here](https://langchain-rnpqvikie-langchain.vercel.app/v0.2/docs/integrations/llm_caching/#cache-classes-summary-table).
2024-06-20 13:52:23 -07:00
Jacob Lee
d53a3d6bf3 docs: Adds pointers from LLM pages to equivalent chat model pages (#22759)
@baskaryan
2024-06-20 13:52:23 -07:00
Qingchuan Hao
1af2eb87d2 docs: fix langchain expression language link (#22683) 2024-06-20 13:52:23 -07:00
Mathis Joffre
2f22d84dbc community[minor]: Add support for OVHcloud AI Endpoints Embedding (#22667)
**Description:** Add support for [OVHcloud AI
Endpoints](https://endpoints.ai.cloud.ovh.net/) Embedding models.

Inspired by:
https://gist.github.com/gmasse/e1f99339e161f4830df6be5d0095349a

Signed-off-by: Joffref <mariusjoffre@gmail.com>
2024-06-20 13:52:23 -07:00
Erick Friis
8b2a00f40c core: fix mustache falsy cases (#22747) 2024-06-20 13:52:23 -07:00
Eugene Yurtsev
4025edce4a core[patch]: Add missing type annotations (#22756)
Add missing type annotations.

The missing type annotations will raise exceptions with pydantic 2.
2024-06-20 13:52:23 -07:00
Eugene Yurtsev
90bb01e1b2 community[patch]: Add missing type annotations (#22758)
Add missing type annotations to objects in community.
These missing type annotations will raise type errors in pydantic 2.
2024-06-20 13:52:23 -07:00
Naka Masato
241d1d375e langchain[patch]: allow to use partial variables in create_sql_query_chain (#22688)
- **Description:** allow using partial variables to pass `top_k` and
`table_info`
- **Issue:** no
- **Dependencies:** no
- **Twitter handle:** @gymnstcs

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:23 -07:00
Bharat Ramanathan
bd14a6d9d1 community[patch]: fix WandbTracer to work with new "RunV2" API (#22673)
- **Description:** This PR updates the `WandbTracer` to work with the
new RunV2 API so that wandb Traces logging works correctly for new
LangChain versions. Here's an example
[run](https://wandb.ai/parambharat/langchain-tracing/runs/wpm99ftq) from
the existing tests
- **Issue:** https://github.com/wandb/wandb/issues/7762
- **Twitter handle:** @ParamBharat

2024-06-20 13:52:23 -07:00
Oguz Vuruskaner
e901b1df2a community[patch]: fix deepinfra inference (#22680)
This PR includes:

1. Update the default model to Llama 3.
2. Handle some 400x errors with more user-friendly error messages.
3. Handle user errors.
2024-06-20 13:52:23 -07:00
Lucas Tucker
b3a0b52806 docs: standardize ChatHuggingFace (#22693)
**Updated ChatHuggingFace doc string as per issue #22296**:
"langchain_huggingface: updated docstring for ChatHuggingFace in
langchain_huggingface to match that of the description (in the appendix)
provided in issue #22296. "

**Issue:** This PR responds to issue #22296, specifically for the
ChatHuggingFace model. It updates the docstring in
langchain/libs/partners/hugging_face/langchain_huggingface/chat_models/huggingface.py
by adding the following sections: Instantiate, Invoke, Stream, Async,
Tool calling, and Response metadata. I used the template from the
Anthropic implementation and referenced the Appendix of the original
issue post. I also noted two things: langchain_community Hugging Face
LLMs do not work with langchain_huggingface's ChatHuggingFace model (at
least for me), and ChatHuggingFace's `.stream(messages)` returned only a
single block of the response.

---------

Co-authored-by: lucast2021 <lucast2021@headroyce.org>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:23 -07:00
Erick Friis
1da3ff4807 docs: couchbase partner package (#22757) 2024-06-20 13:52:23 -07:00
Tomaz Bratanic
55f8d785bd community[patch]: Add function response to graph cypher qa chain (#22690)
LLMs struggle with Graph RAG because, unlike vector RAG, you don't
provide the whole context, only the answer, and the LLM has to take it
on trust. That often doesn't work well. Wrapping the context as a
function response gives much better accuracy.

btw... `union[LLMChain, Runnable]` is linting fun, that's why so many
ignores
2024-06-20 13:52:23 -07:00
X-HAN
da656899dc community[minor]: add Volcengine Rerank (#22700)
**Description:** This PR adds Volcengine Rerank capability to LangChain.
The Volcengine Rerank API is documented
[here](https://www.volcengine.com/docs/84313/1254474) &
[here](https://www.volcengine.com/docs/84313/1254605).
[Volcengine](https://www.volcengine.com/) is a cloud service platform
developed by ByteDance, the parent company of TikTok. You can obtain
Volcengine API AK/SK from
[here](https://www.volcengine.com/docs/84313/1254553).

**Dependencies:** VolcengineRerank depends on `volcengine` python
package.

**Twitter handle:** my twitter/x account is https://x.com/LastMonopoly
and I'd like a mention, thank you!


**Tests and docs**
  1. integration test: `test_volcengine_rerank.py`
  2. example notebook: `volcengine_rerank.ipynb`

**Lint and test**: I have run `make format`, `make lint` and `make test`
from the root of the package I've modified.
2024-06-20 13:52:23 -07:00
Prakul
1e07c923f1 docs:Update reference to langchain-mongodb (#22705)
**Description**: Update reference to langchain-mongodb
2024-06-20 13:52:23 -07:00
Ikko Eltociear Ashimine
3c9f69f605 docs: update azure_container_apps_dynamic_sessions_data_analyst.ipynb (#22718)
colum -> column
2024-06-20 13:52:23 -07:00
Jacob Lee
d617b4d6c4 docs[patch]: Add caution on OpenAI LLMs integration page (#22754)
@baskaryan do we like?

<img width="1040" alt="Screenshot 2024-06-10 at 12 16 45 PM"
src="https://github.com/langchain-ai/langchain/assets/6952323/8893063f-1acf-4a56-9ee5-a8a2b1560277">
2024-06-20 13:52:23 -07:00
Mohammad Mohtashim
4363e100df community[patch]: Small Fix in OutlookMessageLoader (Close the Message once Open) (#22744)
- **Description:** A very small fix that closes the message once it has
been opened
- **Issue:** #22729
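
A minimal sketch of the open-then-always-close pattern this fix describes, using `contextlib.closing` from the standard library. `FakeMessage` and `load_body` are hypothetical names for illustration only; the real loader and its message class are not shown in this commit message.

```python
from contextlib import closing


class FakeMessage:
    """Hypothetical stand-in for a message object that holds a file handle."""

    def __init__(self, path):
        self.path = path
        self.body = f"body of {path}"
        self.closed = False

    def close(self):
        self.closed = True


def load_body(path, opener=FakeMessage):
    # closing() calls msg.close() on exit, even if reading the body raises.
    with closing(opener(path)) as msg:
        return msg.body


print(load_body("example.msg"))
```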
2024-06-20 13:52:23 -07:00
Bagatur
4eed0fb8f5 docs: standardize ChatVertexAI (#22686)
Part of #22296. Part two of
https://github.com/langchain-ai/langchain-google/pull/287
2024-06-20 13:52:23 -07:00
ccurme
4cb5d9e377 openai: add parallel_tool_calls to api ref (#22746)
![Screenshot 2024-06-10 at 1 41 24
PM](https://github.com/langchain-ai/langchain/assets/26529506/2626bf9c-41c6-4431-b2e1-f59de1e4e468)
2024-06-20 13:52:23 -07:00
Max Mulatz
e125464e12 Community[minor]: Add language parser for Elixir (#22742)
Hi 👋 

First off, thanks a ton for your work on this 💚 Really appreciate what
you're providing here for the community.

## Description

This PR adds a basic language parser for the
[Elixir](https://elixir-lang.org/) programming language. The parser code
is based upon the approach outlined in
https://github.com/langchain-ai/langchain/pull/13318: it uses
`tree-sitter` under the hood and aligns with the other `tree-sitter`
based parsers added in that PR.

The `CHUNK_QUERY` I'm using here is probably not the most sophisticated
one, but it worked for my application. It's a starting point to provide
"core" parsing support for Elixir in LangChain. It enables people to use
the language parser out in real world applications which may then lead
to further tweaking of the queries. I consider this PR just the ground
work.

- **Dependencies:** requires `tree-sitter` and `tree-sitter-languages`
from the extended dependencies
- **Twitter handle:**`@bitcrowd`

## Checklist

- [x] **PR title**: "package: description"
- [x] **Add tests and docs**
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified.

2024-06-20 13:52:23 -07:00
wangda
5d57d06b57 docs:Correcting spelling mistakes in readme (#22664)
Signed-off-by: zhangwangda <zhangwangda94@163.com>
2024-06-20 13:52:23 -07:00
Gin
308021c742 docs: Add a missing dot in concepts.mdx (#22677) 2024-06-20 13:52:23 -07:00
Philippe PRADOS
5ccf8d5d94 langchain[minor]: Add pgvector to list of supported vectorstores in self query retriever (#22678)
The fact that we outsourced pgvector to another project has an
unintended effect. The mapping dictionary found by
`_get_builtin_translator()` cannot recognize the new version of pgvector
because it comes from another package.
`SelfQueryRetriever` no longer knows `PGVector`.

I propose to fix this by creating a global dictionary that can be
populated by various database implementations. Thus, importing
`langchain_postgres` will allow the registration of the `PGVector`
mapping.

For the moment, though, this PR only adds a lazy import.

Furthermore, the implementation of _get_builtin_translator()
reconstructs the BUILTIN_TRANSLATORS variable with each invocation,
which is not very efficient. A global map would be an optimization.
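
The global-dictionary idea proposed above could look roughly like the sketch below: a module-level registry built once and extended by integrations at import time, instead of a mapping rebuilt on every call. All names here (`register_translator`, `get_translator`, `FakePGVector`, `FakePGTranslator`) are hypothetical and are not part of LangChain's API.

```python
from typing import Callable, Dict

# Hypothetical module-level registry: created once, never rebuilt per call.
_TRANSLATORS: Dict[type, Callable[[], object]] = {}


def register_translator(vectorstore_cls: type, factory: Callable[[], object]) -> None:
    """Let an integration package register its own translator on import."""
    _TRANSLATORS[vectorstore_cls] = factory


def get_translator(vectorstore: object) -> object:
    # Walk the MRO so subclasses of a registered store are also recognized.
    for cls in type(vectorstore).__mro__:
        if cls in _TRANSLATORS:
            return _TRANSLATORS[cls]()
    raise ValueError(f"No translator registered for {type(vectorstore).__name__}")


class FakePGVector:
    """Stand-in for a vector store living in a separate package."""


class FakePGTranslator:
    """Stand-in for that store's query translator."""


# What an integration package might do at import time.
register_translator(FakePGVector, FakePGTranslator)
print(type(get_translator(FakePGVector())).__name__)
```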

- **Twitter handle:** pprados

@eyurtsev, can you review this PR? And unlock the PR [Add async mode for
pgvector](https://github.com/langchain-ai/langchain-postgres/pull/32)
and PR [community[minor]: Add SQL storage
implementation](https://github.com/langchain-ai/langchain/pull/22207)?

Are you in favour of a global dictionary-based implementation of
Translator?
2024-06-20 13:52:23 -07:00
Lei Zhang
535f07bd13 infra: Scheduled GitHub Actions to run only on the upstream repository (#22707)
**Description:** Scheduled GitHub Actions to run only on the upstream
repository

**Issue:** Fixes #22706 

**Twitter handle:** @coolbeevip
2024-06-20 13:52:23 -07:00
Prakul
91e7598003 docs: Update MongoDB information in llm_caching (#22708)
**Description:**: Update MongoDB information in llm_caching
2024-06-20 13:52:23 -07:00
fzowl
f8532e2ffe docs: VoyageAI new embedding and reranking models (#22719) 2024-06-20 13:52:23 -07:00
Enzo Poggio
fad582ea1d community[patch]: Use Custom Logger Instead of Root Logger in get_user_agent Function (#22691)
## Description
This PR addresses a logging inconsistency in the `get_user_agent`
function. Previously, the function was using the root logger to log a
warning message when the "USER_AGENT" environment variable was not set.
This bypassed the custom logger `log` that was created at the start of
the module, leading to potential inconsistencies in logging behavior.

Changes:
- Replaced `logging.warning` with `log.warning` in the `get_user_agent`
function to ensure that the custom logger is used.

This change ensures that all logging in the `get_user_agent` function
respects the configurations of the custom logger, leading to more
consistent and predictable logging behavior.
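
A minimal sketch of the difference, assuming a module-level logger named `log` as described above. The warning text and the fallback value are assumptions for illustration, not quoted from the patched module.

```python
import logging
import os

# Module-level logger: it inherits handlers and levels configured for this
# module's name, unlike logging.warning(), which always uses the root logger.
log = logging.getLogger(__name__)


def get_user_agent() -> str:
    """Return the USER_AGENT env var, warning through the module logger if unset."""
    env_user_agent = os.environ.get("USER_AGENT")
    if not env_user_agent:
        # Before the change this would have been logging.warning(...).
        log.warning("USER_AGENT environment variable not set.")
        return "DefaultLangchainUserAgent"  # assumed fallback value
    return env_user_agent
```

Because the record is emitted under the module's logger name, an application can silence or redirect it with `logging.getLogger("<module name>")` without touching the root logger.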

## Dependencies

None

## Issue 

None

## Tests and docs

☝🏻 see description


## `make format`, `make lint` & `cd libs/community; make test`

```shell
> make format 
poetry run ruff format docs templates cookbook
1417 files left unchanged
poetry run ruff check --select I --fix docs templates cookbook
All checks passed!
```

```shell
> make lint
poetry run ruff check docs templates cookbook
All checks passed!
poetry run ruff format docs templates cookbook --diff
1417 files already formatted
poetry run ruff check --select I docs templates cookbook
All checks passed!
git grep 'from langchain import' docs/docs templates cookbook | grep -vE 'from langchain import (hub)' && exit 1 || exit 0
```

~cd libs/community; make test~ too many dependencies for integration ...

```shell
>  poetry run pytest tests/unit_tests   
....
==== 884 passed, 466 skipped, 4447 warnings in 15.93s ====
```

I chose you at random: @ccurme
2024-06-20 13:52:23 -07:00
Philippe PRADOS
807ec09687 community[minor]: Add SQL storage implementation (#22207)
Hello @eyurtsev

- package: langchain-community
- **Description**: Add SQL implementation for docstore. A new
implementation, in line with my other PRs ([async
PGVector](https://github.com/langchain-ai/langchain-postgres/pull/32),
[SQLChatMessageMemory](https://github.com/langchain-ai/langchain/pull/22065))
- Twitter handle: pprados

---------

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Piotr Mardziel <piotrm@gmail.com>
Co-authored-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-20 13:52:23 -07:00
Nithish Raghunandanan
6af0388c0b couchbase: Add the initial version of Couchbase partner package (#22087)
Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:22 -07:00
Cahid Arda Öz
e8787ef536 community[minor]: Add UpstashRatelimitHandler (#21885)
Adding `UpstashRatelimitHandler` callback for rate limiting based on
number of chain invocations or LLM token usage.

For more details, see [upstash/ratelimit-py
repository](https://github.com/upstash/ratelimit-py) or the notebook
guide included in this PR.

Twitter handle: @cahidarda

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-20 13:52:22 -07:00
Erick Friis
59e80b4663 docs: remove nonexistent headings (#22685) 2024-06-20 13:52:22 -07:00
Erick Friis
c2749da90d core: add error message for non-structured llm to StructuredPrompt (#22684)
Previously this surfaced as the blank `NotImplementedError` from
`BaseLanguageModel.with_structured_output`.
2024-06-20 13:52:22 -07:00
Jacob Lee
d0c838ca9b docs[patch]: Adds LangGraph and LangSmith links, adds more crosslinks between pages (#22656)
@baskaryan @hwchase17
2024-06-20 13:52:22 -07:00
Mateusz Szewczyk
bf651780b0 docs: Updated product version in Embeddings notebook (#22062) 2024-06-20 13:52:22 -07:00
ccurme
6fa7244a05 anthropic: refactor streaming to use events api; add streaming usage metadata (#22628)
- Refactor streaming to use raw events;
- Add `stream_usage` class attribute and kwarg to stream methods that,
if True, will include separate chunks in the stream containing usage
metadata.

There are two ways to implement streaming with anthropic's python sdk.
They have slight differences in how they surface usage metadata.
1. [Use helper
functions](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-helpers).
This is what we are doing now.
```python
count = 1
with client.messages.stream(**params) as stream:
    for text in stream.text_stream:
        snapshot = stream.current_message_snapshot
        print(f"{count}: {snapshot.usage} -- {text}")
        count = count + 1

final_snapshot = stream.get_final_message()
print(f"{count}: {final_snapshot.usage}")
```
```
1: Usage(input_tokens=8, output_tokens=1) -- Hello
2: Usage(input_tokens=8, output_tokens=1) -- !
3: Usage(input_tokens=8, output_tokens=1) --  How
4: Usage(input_tokens=8, output_tokens=1) --  can
5: Usage(input_tokens=8, output_tokens=1) --  I
6: Usage(input_tokens=8, output_tokens=1) --  assist
7: Usage(input_tokens=8, output_tokens=1) --  you
8: Usage(input_tokens=8, output_tokens=1) --  today
9: Usage(input_tokens=8, output_tokens=1) -- ?
10: Usage(input_tokens=8, output_tokens=12)
```
To do this correctly, we need to emit a new chunk at the end of the
stream containing the usage metadata.

2. [Handle raw
events](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-responses)
```python
stream = client.messages.create(**params, stream=True)
count = 1
for event in stream:
    print(f"{count}: {event}")
    count = count + 1
```
```
1: RawMessageStartEvent(message=Message(id='msg_01Vdyov2kADZTXqSKkfNJXcS', content=[], model='claude-3-haiku-20240307', role='assistant', stop_reason=None, stop_sequence=None, type='message', usage=Usage(input_tokens=8, output_tokens=1)), type='message_start')
2: RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
3: RawContentBlockDeltaEvent(delta=TextDelta(text='Hello', type='text_delta'), index=0, type='content_block_delta')
4: RawContentBlockDeltaEvent(delta=TextDelta(text='!', type='text_delta'), index=0, type='content_block_delta')
5: RawContentBlockDeltaEvent(delta=TextDelta(text=' How', type='text_delta'), index=0, type='content_block_delta')
6: RawContentBlockDeltaEvent(delta=TextDelta(text=' can', type='text_delta'), index=0, type='content_block_delta')
7: RawContentBlockDeltaEvent(delta=TextDelta(text=' I', type='text_delta'), index=0, type='content_block_delta')
8: RawContentBlockDeltaEvent(delta=TextDelta(text=' assist', type='text_delta'), index=0, type='content_block_delta')
9: RawContentBlockDeltaEvent(delta=TextDelta(text=' you', type='text_delta'), index=0, type='content_block_delta')
10: RawContentBlockDeltaEvent(delta=TextDelta(text=' today', type='text_delta'), index=0, type='content_block_delta')
11: RawContentBlockDeltaEvent(delta=TextDelta(text='?', type='text_delta'), index=0, type='content_block_delta')
12: RawContentBlockStopEvent(index=0, type='content_block_stop')
13: RawMessageDeltaEvent(delta=Delta(stop_reason='end_turn', stop_sequence=None), type='message_delta', usage=MessageDeltaUsage(output_tokens=12))
14: RawMessageStopEvent(type='message_stop')
```

Here we implement the second option, in part because it should make
things easier when implementing streaming tool calls in the near future.

This would add two new chunks to the stream-- one at the beginning and
one at the end-- with blank content and containing usage metadata. We
add kwargs to the stream methods and a class attribute allowing for this
behavior to be toggled. I enabled it by default. If we merge this we can
add the same kwargs / attribute to OpenAI.

Usage:
```python
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(
    model="claude-3-haiku-20240307",
    temperature=0
)

full = None
for chunk in model.stream("hi"):
    full = chunk if full is None else full + chunk
    print(chunk)

print(f"\nFull: {full}")
```
```
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 0, 'total_tokens': 8}
content='Hello' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='!' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' How' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' can' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' I' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' assist' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' you' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' today' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='?' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 0, 'output_tokens': 12, 'total_tokens': 12}

Full: content='Hello! How can I assist you today?' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 12, 'total_tokens': 20}
```
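
The `Full:` line above adds up the usage metadata of the first and last chunks. A minimal sketch of that aggregation is shown here with a hypothetical `Chunk` dataclass; it only mimics the behavior visible in the output and is not LangChain's message chunk implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional

USAGE_KEYS = ("input_tokens", "output_tokens", "total_tokens")


@dataclass
class Chunk:
    """Hypothetical stream chunk: text plus optional usage counts."""

    content: str
    usage_metadata: Optional[Dict[str, int]] = None

    def __add__(self, other: "Chunk") -> "Chunk":
        usage = None
        if self.usage_metadata or other.usage_metadata:
            left = self.usage_metadata or {}
            right = other.usage_metadata or {}
            # Sum each counter; a chunk without usage contributes zero.
            usage = {k: left.get(k, 0) + right.get(k, 0) for k in USAGE_KEYS}
        return Chunk(self.content + other.content, usage)


chunks = [
    Chunk("", {"input_tokens": 8, "output_tokens": 0, "total_tokens": 8}),
    Chunk("Hello"),
    Chunk("!"),
    Chunk("", {"input_tokens": 0, "output_tokens": 12, "total_tokens": 12}),
]

full = None
for chunk in chunks:
    full = chunk if full is None else full + chunk

print(full)
```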
2024-06-20 13:52:22 -07:00
Bagatur
a0f55a3038 community[patch]: Release 0.2.4 (#22643) 2024-06-20 13:52:22 -07:00
Francesco Kruk
e0d0052be0 docs: Update jina embedding notebook to include multimodal capability (#22594)
After merging the [PR #22416 to include Jina AI multimodal
capabilities](https://github.com/langchain-ai/langchain/pull/22416), we
updated the Jina AI embedding notebook accordingly.
2024-06-20 13:52:22 -07:00
William FH
d5995c7e67 [Core] Unified Enable/Disable Tracing (#22576) 2024-06-20 13:52:22 -07:00
Leonid Ganeline
ca43d82ad1 docs: arxiv page update (#22574)
Added a link to search the arXiv papers with references to LangChain.
Updated table: better format (no horizontal scroll in table anymore).
2024-06-20 13:52:22 -07:00
Bagatur
0b542e304a langchain[patch]: Release 0.2.3 (#22644) 2024-06-20 13:52:22 -07:00
Erick Friis
90468a960b multiple: get rid of pyproject extras (#22581)
They cause `poetry lock` to take a ton of time, and `uv pip install` can
resolve the constraints from these toml files in trivial time
(addressing problem with #19153)

This allows us to properly upgrade lockfile dependencies moving forward,
which revealed some issues that were either fixed or type-ignored (see
file comments)
2024-06-20 13:52:22 -07:00
Bagatur
a0e9205d0a core[patch]: Release 0.2.5 (#22642) 2024-06-20 13:52:22 -07:00
Eugene Yurtsev
29113872f2 core[patch]: Correctly order parent ids in astream events (from root to immediate parent), add defensive check for cycles (#22637)
This PR makes two changes:

1. Fixes the order of parent IDs to be from root to immediate parent
2. Adds a simple defensive check for cycles
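
The two changes can be sketched as follows. This is a hedged illustration only: `get_parent_ids` and the `parent_map` structure (run id mapped to its parent's id, or `None` for the root) are assumptions made for this example, not the code from the PR.

```python
from typing import Dict, List, Optional


def get_parent_ids(parent_map: Dict[str, Optional[str]], run_id: str) -> List[str]:
    """Return ancestor ids ordered from root to immediate parent."""
    parent_ids: List[str] = []
    seen = {run_id}
    current = parent_map.get(run_id)
    while current is not None:
        if current in seen:
            # Defensive check: a cycle would otherwise loop forever.
            raise ValueError(f"Cycle detected in parent ids at {current!r}")
        seen.add(current)
        parent_ids.append(current)
        current = parent_map.get(current)
    # Collected from immediate parent upward, so reverse to get root first.
    return parent_ids[::-1]


parent_map = {"root": None, "chain": "root", "llm": "chain"}
print(get_parent_ids(parent_map, "llm"))
```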
2024-06-20 13:52:22 -07:00
Satyam Kumar
4e7cf0067e updated oracleai_demo.ipynb (#22635)
The outer try/except block handles connection errors, and the inner
try/except block handles SQL execution errors, providing detailed error
messages for both.
try:
    conn = oracledb.connect(user=username, password=password, dsn=dsn)
    print("Connection successful!")

    cursor = conn.cursor()
    try:
        cursor.execute(
            """
            begin
                -- Drop user
                begin
                    execute immediate 'drop user testuser cascade';
                exception
                    when others then
                        dbms_output.put_line('Error dropping user: ' || SQLERRM);
                end;

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-20 13:52:22 -07:00
Eugene Yurtsev
306ad68c40 core[minor]: Add parent_ids to astream_events API (#22563)
Include a list of parent ids for each event in astream events.
2024-06-20 13:52:22 -07:00
Tomaz Bratanic
15896f3bee docs[patch]: Fix diffbot docs (#22584) 2024-06-20 13:52:22 -07:00
Eugene Yurtsev
daa04e98d7 docs: Add information about run time binding values to tools (#22623)
Add how-to guide that shows a design pattern for creating tools at run time
2024-06-20 13:52:22 -07:00
CharlesCNorton
62819f723b docs[patch]: typo in AutoGPT example notebook (#22631)
Corrected a typo in the AutoGPT example notebook. Changed "Needed synce
jupyter runs an async eventloop" to "Needed since Jupyter runs an async
event loop".

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-06-20 13:52:22 -07:00
CharlesCNorton
c8116d053f docs: typo in dev container documentation (#22630)
removed an extra space before the period in the "Click **Create
codespace on master**." line.

2024-06-20 13:52:22 -07:00
Nicolas Nkiere
3254931b8d core[minor]: Add an async root listener and with_alisteners method (#22151)
- [x] **Adding AsyncRootListener**: "langchain_core: Adding
AsyncRootListener"

- **Description:** Adding an AsyncBaseTracer, AsyncRootListener and
`with_alisteners` function. This enables binding async root listeners
to runnables. This is currently only supported for sync listeners.
- **Issue:** None
- **Dependencies:** None
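
To illustrate the idea of async listeners around a run, here is a small standard-library sketch. `run_with_alisteners` is a hypothetical helper written for this example and does not reflect the signatures added to `langchain_core`.

```python
import asyncio

events = []


async def run_with_alisteners(func, value, *, on_start=None, on_end=None, on_error=None):
    """Await optional async callbacks before and after an async call."""
    if on_start is not None:
        await on_start(value)
    try:
        result = await func(value)
    except Exception as exc:
        if on_error is not None:
            await on_error(exc)
        raise
    if on_end is not None:
        await on_end(result)
    return result


async def double(x):
    await asyncio.sleep(0)  # stand-in for real async work
    return x * 2


async def started(value):
    events.append(("start", value))


async def ended(result):
    events.append(("end", result))


result = asyncio.run(run_with_alisteners(double, 21, on_start=started, on_end=ended))
print(result, events)
```

Because the callbacks are awaited, they can perform I/O (for example, writing to a remote tracer) without blocking the event loop.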

- [x] **Add tests and docs**: Added unit tests and an example code snippet
within the function description of `with_alisteners`


- [x] **Lint and test**: Run make format_diff, make lint_diff and make
test
2024-06-20 13:52:22 -07:00
seyf97
0085137f41 openai[patch]: correct grammar in exception message in embeddings/base.py (#22629)
Correct the grammar of the ValueError message raised when the transformers package is missing.
2024-06-20 13:52:22 -07:00
Anush
e8298c1cc2 qdrant[patch]: Make path optional in from_existing_collection() (#21875)
## Description

The `path` param is used to specify the local persistence directory,
which isn't required if using Qdrant server.

This is a breaking but necessary change.
2024-06-20 13:52:22 -07:00
ccurme
94c00af674 multiple: implement ls_params (#22621)
implement ls_params for ai21, fireworks, groq.
2024-06-20 13:52:22 -07:00
Xiangrui Meng
6e1903146b community: support Databricks Unity Catalog functions as LangChain tools (#22555)
This PR adds support for using Databricks Unity Catalog functions as
LangChain tools, which run inside a Databricks SQL warehouse.

* An example notebook is provided.
2024-06-20 13:52:22 -07:00
ccurme
df610d06c8 anthropic: update attribute name and alias (#22625)
update name to `stop_sequences` and alias to `stop` (instead of the
other way around), since `stop_sequences` is the name used by anthropic.
2024-06-20 13:52:22 -07:00
lucasiscovici
9929f9e6fe community[patch]: pgvector replace nin_ by not_in (#22619)
- [ ] **community**: "pgvector: replace nin_ by not_in"

- [ ] **PR message**: `nin_` does not exist in the SQLAlchemy ORM; the correct method is `not_in`
2024-06-20 13:52:22 -07:00
ccurme
1e7ce454cf multiple: add stop attribute (#22573) 2024-06-20 13:52:22 -07:00
ccurme
912ffb4afb Revert "anthropic: stream token usage" (#22624)
Reverts langchain-ai/langchain#20180
2024-06-20 13:52:22 -07:00
Bagatur
129de1590b anthropic: stream token usage (#20180)
open to other ideas
<img width="1181" alt="Screenshot 2024-04-08 at 5 34 08 PM"
src="https://github.com/langchain-ai/langchain/assets/22008038/03eb11c4-5eb5-43e3-9109-a13f76098fa4">

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-20 13:52:22 -07:00
liuzc9
620f27555c docs: Fix typo in llmonitor.md (#22590) 2024-06-20 13:52:22 -07:00
Bagatur
63b72544ba docs: Add ChatGoogleGenerativeAI to model feat table (#22617) 2024-06-20 13:52:22 -07:00
Satyam Kumar
154e177bbe openai, azure: update model_name in ChatResult to use name from API response (#22569)
The response.get("model", self.model_name) checks if the model key
exists in the response dictionary. If it does, it uses that value;
otherwise, it uses self.model_name.
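
A tiny sketch of that fallback, with a hypothetical helper name (`resolve_model_name`) and made-up model strings:

```python
def resolve_model_name(response: dict, configured_name: str) -> str:
    # Prefer the model name reported by the API; fall back to the configured one.
    return response.get("model", configured_name)


print(resolve_model_name({"model": "model-from-api"}, "configured-model"))
print(resolve_model_name({}, "configured-model"))
```

Note that `dict.get` only falls back when the key is absent; a response containing `"model": None` would yield `None`.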


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:22 -07:00
Suganth Solamanraja
34cb5a7bed docs: Correct return type in docstring (#22597)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: 
- **Description:** This PR corrects the return type in the docstring of
the `docs/api_reference/create_api_rst.py/_load_package_modules`
function. The return type was previously described as a list of

Co-authored-by: suganthsolamanraja <suganth.solamanraja@techjays..com>
2024-06-20 13:52:22 -07:00
svmpsp-rc
ad01792ed4 docs: correct typos in Italian words (#22606)
**Description**

Fix typos in Italian words.
2024-06-20 13:52:22 -07:00
Gabriele Ghisleni
4e583b87e8 docs: ElasticsearchCacheStore in stores integrations documentation (#22612)
The package for LangChain integrations with Elasticsearch
https://github.com/langchain-ai/langchain-elastic contains an
Elasticsearch byte store cache integration (see
https://github.com/langchain-ai/langchain-elastic/pull/27). This PR is
the documentation contribution on the page dedicated to stores integrations.

Co-authored-by: Gabriele Ghisleni <gabriele.ghisleni@spaziodati.eu>
2024-06-20 13:52:22 -07:00
Christophe Bornet
327ab264fb core[patch]: Use explicit classes for InMemoryByteStore and InMemoryStore (#22608)
The current implementation doesn't work well with type checking.
Replace it with explicit class definitions that work correctly with
type checking.
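
A sketch of what explicit classes over a generic base can look like. The base class name and method set here are assumptions for illustration and may not match the actual `langchain_core` code.

```python
from typing import Any, Dict, Generic, List, Optional, Sequence, Tuple, TypeVar

V = TypeVar("V")


class InMemoryBaseStore(Generic[V]):
    """Assumed generic base: a dict-backed key-value store."""

    def __init__(self) -> None:
        self.store: Dict[str, V] = {}

    def mget(self, keys: Sequence[str]) -> List[Optional[V]]:
        return [self.store.get(key) for key in keys]

    def mset(self, key_value_pairs: Sequence[Tuple[str, V]]) -> None:
        for key, value in key_value_pairs:
            self.store[key] = value


class InMemoryStore(InMemoryBaseStore[Any]):
    """Explicit subclass: values may be of any type."""


class InMemoryByteStore(InMemoryBaseStore[bytes]):
    """Explicit subclass: values are bytes."""


byte_store = InMemoryByteStore()
byte_store.mset([("k1", b"v1")])
print(byte_store.mget(["k1", "missing"]))
```

Real subclasses, unlike subscripted aliases of the generic base, give type checkers and `isinstance` a concrete class to work with.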
2024-06-20 13:52:22 -07:00
andyjessen
c72b3326c9 docs: Fix description (#22611)
This commit fixes the description of the hair_color field.
2024-06-20 13:52:22 -07:00
ccurme
732af24313 together: bump langchain-core (#22616)
langchain-together depends on langchain-openai ^0.1.8
langchain-openai 0.1.8 has langchain-core >= 0.2.2

Here we bump langchain-core to 0.2.2, just to pass minimum dependency
version tests.
2024-06-20 13:52:22 -07:00
ccurme
7397b7f20a together[patch]: Release 0.1.3 (#22615) 2024-06-20 13:52:22 -07:00
Asi Greenholts
b7c552506c docs: Fix typo (#22596)
Fix typo
2024-06-20 13:52:22 -07:00
CharlesCNorton
078cce9292 fix: typo in Agents section of README (#22599)
Corrected the phrase "complete done" to "completely done" for better
grammatical accuracy and clarity in the Agents section of the README.


---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-20 13:52:22 -07:00
Kirushikesh DB
42286e31fd docs: Removed unwanted cell in refine segment (#22604)
**Description:**
There is one unwanted duplicate cell in the refine section of the summarization
documentation; I have removed it.
2024-06-20 13:52:22 -07:00
andyjessen
96da77cb26 docs: Fix typo (#22603)
This commit fixes a minor typo in the field description.
2024-06-20 13:52:21 -07:00
Isaac Francisco
e8de5f9178 community[patch]: recursive url loader fix and unit tests (#22521)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:21 -07:00
Jacob Lee
d003350322 docs[minor]: Add "Build a PDF ingestion and Question/Answering system" tutorial (#22570)
More direct entrypoint for a common use-case. Meant to give people a
more hands-on intro to document loaders/loading data from different data
sources as well.

Some duplicate content for RAG and extraction (to show what you can do
with the loaded documents), but defers to the appropriate sections
rather than going too in-depth.

@baskaryan @hwchase17
2024-06-20 13:52:21 -07:00
Jeffrey Mak
e37d7ad66e community[patch]:Support filter for AzureAISearchRetriever (#22303)
**Description**: 
The AzureAISearchRetriever does not support the "$filter" argument
offered in the AISearch API:
https://learn.microsoft.com/en-us/rest/api/searchservice/documents/search-get?view=rest-searchservice-2023-11-01&tabs=HTTP
The $filter allows filtering of indexes based on values in metadata.
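
As a rough illustration of what passing `$filter` involves, the sketch below builds a Search Documents GET URL carrying an OData filter expression. The service name, index name, field name, and the helper `build_search_url` are hypothetical; only the `search`, `$filter`, and `api-version` query parameters are taken from the linked API reference. This is not the retriever's actual implementation.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse


def build_search_url(
    service: str, index: str, query: str, filter_expr: Optional[str] = None
) -> str:
    """Build a Search Documents (GET) URL; `service` and `index` are placeholders."""
    params = {"api-version": "2023-11-01", "search": query}
    if filter_expr:
        # OData filter expression, e.g. "source eq 'handbook'"
        params["$filter"] = filter_expr
    base = f"https://{service}.search.windows.net/indexes/{index}/docs"
    return f"{base}?{urlencode(params)}"


url = build_search_url("my-service", "my-index", "langchain", "source eq 'handbook'")
# Decode the query string again to check that the filter survived URL encoding.
decoded = parse_qs(urlparse(url).query)
```

The filter is only attached when one is supplied, so existing callers that pass no filter would produce the same request as before.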

**Issue**: 
https://github.com/langchain-ai/langchain/issues/19885

**Dependencies**: 
No

**Twitter handle**: 
@Jeffreym9M
 

- [ ] **Add tests and docs**: Not relevant


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-06-20 13:52:21 -07:00
Isaac Francisco
035992e8fc docs: duckduckgosearch options listed (#22568)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:21 -07:00
Mikhail Khludnev
1b410bb6e5 docs: mentioning query_instruction with regards to BGE-M3 (#22405)
see
https://github.com/langchain-ai/langchain/pull/18017#issuecomment-2143942760
https://huggingface.co/BAAI/bge-m3#faq

Co-authored-by: mikhail-khludnev <mikhail_khludnev@rntgroup.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:21 -07:00
X-HAN
c517ce0c8d community[minor]: add DashScope Rerank (#22403)
**Description:** This PR adds DashScope Rerank capability to LangChain.
The DashScope Rerank API is documented
[here](https://help.aliyun.com/document_detail/2780058.html?spm=a2c4g.2780059.0.0.6d995024FlrJ12)
&
[here](https://help.aliyun.com/document_detail/2780059.html?spm=a2c4g.2780058.0.0.63f75024cr11N9).
[DashScope](https://dashscope.aliyun.com/) is the generative AI service
from Alibaba Cloud (Aliyun). You can create DashScope API key from
[here](https://bailian.console.aliyun.com/?apiKey=1#/api-key).

**Dependencies:** DashScopeRerank depends on `dashscope` python package.

**Twitter handle:** my twitter/x account is https://x.com/LastMonopoly
and I'd like a mention, thank you!


**Tests and docs**
  1. integration test: `test_dashscope_rerank.py`
  2. example notebook: `dashscope_rerank.ipynb`

**Lint and test**: I have run `make format`, `make lint` and `make test`
from the root of the package I've modified.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:21 -07:00
Ethan Yang
1f85b55db2 [Community]add option to delete the prompt from HF output (#22225)
This helps resolve the pattern-mismatch issue when parsing the
output in an Agent.

https://github.com/langchain-ai/langchain/issues/21912
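
A minimal sketch of the idea, assuming the model echoes the prompt at the start of its output. The helper name `strip_prompt` is hypothetical, and the commit message does not state the name of the actual option:

```python
def strip_prompt(prompt: str, generated: str) -> str:
    """Return only the completion when the model echoes the prompt first."""
    if generated.startswith(prompt):
        return generated[len(prompt):]
    # Output did not start with the prompt: leave it untouched.
    return generated


full_output = "Question: 2+2?\nAnswer: 4"
completion = strip_prompt("Question: 2+2?\n", full_output)
```

With the prompt removed, an agent's output parser only sees the newly generated text.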
2024-06-20 13:52:21 -07:00
Jacob Lee
a07f3ac0a6 docs[patch]: Adds heading keywords to concepts page (#22577)
@efriis @baskaryan
2024-06-20 13:52:21 -07:00
Erick Friis
b41d805992 docs: update agentexecutor title to legacy (#22575) 2024-06-20 13:52:21 -07:00
Bagatur
bbc8819d0c community[patch]: AzureSearch async functions (#22075) 2024-06-20 13:52:21 -07:00
Bagatur
a1df71ad8e langchain[minor]: add universal init_model (#22039)
decisions to discuss
- only chat models
- model_provider isn't based on any existing values like llm-type,
package names, class names
- implemented as function not as a wrapper ChatModel
- function name (init_model)
- in langchain as opposed to community or core
- marked beta
2024-06-20 13:52:21 -07:00
Isaac Francisco
edca3e33dd docs: deprecation of max_length parameter used in Exa search (#22567) 2024-06-20 13:52:21 -07:00
ccurme
15450bdef5 community: update how OpenAIAssistantV2Runnable creates threads with tool_resources (#22549)
https://github.com/langchain-ai/langchain/issues/22503
2024-06-20 13:52:21 -07:00
Bagatur
24abcc60e8 community[patch]: Release 0.2.3 (#22562) 2024-06-20 13:52:21 -07:00
Bagatur
9837bc92b3 nomic[patch]: Release 0.1.2 (#22561) 2024-06-20 13:52:21 -07:00
Zach Nussbaum
4edd6af4fb embeddings: nomic embed vision (#22482)
Thank you for contributing to LangChain!

**Description:** Adds Langchain support for Nomic Embed Vision
**Twitter handle:** nomic_ai,zach_nussbaum


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Lance Martin <122662504+rlancemartin@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:21 -07:00
leila-messallem
3d1fff0cdd community[patch]: improve test setup to accurately test filtering of labels in neo4j (#22531)
**Description:** This PR addresses an issue with an existing test that
was not effectively testing the intended functionality. The previous
test setup did not adequately validate the filtering of the labels in
neo4j, because the nodes and relationship in the test data did not have
any properties set. Without properties these labels would not have been
returned, regardless of the filtering.

---------

Co-authored-by: Oskar Hane <oh@oskarhane.com>
2024-06-20 13:52:21 -07:00
Mohammad Mohtashim
f4c6b05497 [Experimental]: Async agenerate method ollama functions (#21682)
- **Description:**
Added an async `agenerate` method for OllamaFunctions, which was missing
and was raising errors for users.

- **Issue:** 
#21422
2024-06-20 13:52:21 -07:00
Stefano Lottini
db9a7df552 community[minor]: Add support for metadata indexing policy in Cassandra vector store (#22548)
This PR adds a constructor `metadata_indexing` parameter to the
Cassandra vector store to allow optional fine-tuning of which fields of
the metadata are to be indexed.

This is a feature supported by the underlying CassIO library. Indexing
mode of "all", "none" or deny- and allow-list based choices are
available.

The rationale is, in some cases it's advisable to programmatically
exclude some portions of the metadata from the index if one knows in
advance they won't ever be used at search-time. This keeps the index
more lightweight and performant and avoids limitations on the length of
_indexed_ strings.
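
To make the allow-list and deny-list idea concrete, here is a small sketch of how such a policy could decide which metadata fields are indexed. The policy shape (a string, or a `(mode, fields)` tuple) and the helper `is_indexed` are assumptions for illustration; the values actually accepted by `metadata_indexing` are defined by CassIO.

```python
from typing import Iterable, Tuple, Union

# Hypothetical policy shape: "all", "none", or ("allowlist" | "denylist", field names).
Policy = Union[str, Tuple[str, Iterable[str]]]


def is_indexed(field: str, policy: Policy) -> bool:
    """Decide whether a metadata field should be indexed under the given policy."""
    if policy == "all":
        return True
    if policy == "none":
        return False
    mode, fields = policy
    names = set(fields)
    if mode == "allowlist":
        return field in names
    if mode == "denylist":
        return field not in names
    raise ValueError(f"unknown indexing mode: {mode!r}")


metadata = {"author": "ada", "raw_html": "<p>...</p>"}
policy = ("denylist", ["raw_html"])
# Only the fields that pass the policy would be written to the index.
indexed = {k: v for k, v in metadata.items() if is_indexed(k, policy)}
```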

I added an integration test of the feature. I also added the possibility
of running the integration test with Cassandra on an arbitrary IP
address (e.g. Dockerized), via
`CASSANDRA_CONTACT_POINTS=10.1.1.5,10.1.1.6 poetry run pytest [...]` or
similar.

While I was at it, I added a line to the `.gitignore` since the mypy
_test_ cache was not ignored yet.

My X (Twitter) handle: @rsprrs.
2024-06-20 13:52:21 -07:00
Emilien Chauvet
09492f78fa community[minor]: add user agent for web scraping loaders (#22480)
**Description:** This PR adds a `USER_AGENT` env variable that is to be
used for web scraping. It creates a util to get that user agent and uses
it in the classes used for scraping in [this piece of
doc](https://python.langchain.com/v0.1/docs/use_cases/web_scraping/).
Identifying your scraper is considered good practice; this
PR aims to make it easier.
**Issue:** `None`
**Dependencies:** `None`
**Twitter handle:** `None`
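
A minimal sketch of such a utility, assuming a fallback value when `USER_AGENT` is unset. The function names and the fallback string are hypothetical and may differ from the util added in this PR:

```python
import os
from typing import Dict

FALLBACK_USER_AGENT = "example-scraper/0.1"  # hypothetical fallback value


def get_user_agent() -> str:
    """Read the scraper's user agent from the USER_AGENT environment variable."""
    return os.environ.get("USER_AGENT", FALLBACK_USER_AGENT)


def default_headers() -> Dict[str, str]:
    """HTTP headers a loader could send with each request."""
    return {"User-Agent": get_user_agent()}


os.environ["USER_AGENT"] = "my-crawler/1.0 (contact@example.com)"
headers = default_headers()
```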
2024-06-20 13:52:21 -07:00
Philippe PRADOS
f71ce8fd76 community[minor]: Add native async support to SQLChatMessageHistory (#22065)
# package community: Fix SQLChatMessageHistory

## Description
Here is a rewrite of `SQLChatMessageHistory` to properly implement the
asynchronous approach. The code circumvents [issue
22021](https://github.com/langchain-ai/langchain/issues/22021) by
accepting a synchronous call to `def add_messages()` in an asynchronous
scenario. This bypasses the bug.

For the same reasons as in [PR
32](https://github.com/langchain-ai/langchain-postgres/pull/32) of
`langchain-postgres`, we use a lazy strategy for table creation. Indeed,
the promise of the constructor cannot be fulfilled without this. It is
not possible to invoke an asynchronous call in a constructor. We
compensate for this by waiting for the next asynchronous method call to
create the table.
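
The lazy strategy can be sketched as follows. `LazyHistory`, `_ensure_table`, and the `asyncio.sleep(0)` placeholder are illustrative stand-ins, not the actual `SQLChatMessageHistory` code:

```python
import asyncio
from typing import List


class LazyHistory:
    """Toy history store; `_ensure_table` stands in for an async DDL call."""

    def __init__(self) -> None:
        # A constructor cannot await, so no table is created here.
        self._table_ready = False
        self._lock = asyncio.Lock()
        self.messages: List[str] = []
        self.create_calls = 0

    async def _ensure_table(self) -> None:
        async with self._lock:
            if not self._table_ready:
                await asyncio.sleep(0)  # placeholder for "CREATE TABLE IF NOT EXISTS ..."
                self.create_calls += 1
                self._table_ready = True

    async def aadd_message(self, message: str) -> None:
        # The first async call creates the table; later calls skip the work.
        await self._ensure_table()
        self.messages.append(message)


async def main() -> LazyHistory:
    history = LazyHistory()
    await asyncio.gather(history.aadd_message("hi"), history.aadd_message("there"))
    return history


history = asyncio.run(main())
```

The lock keeps two concurrent first calls from both attempting the table creation.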

The goal of the `PostgresChatMessageHistory` class (in
`langchain-postgres`) is, among other things, to be able to recycle
database connections. The implementation of the class is problematic, as
we have demonstrated in [issue
22021](https://github.com/langchain-ai/langchain/issues/22021).

Our new implementation of `SQLChatMessageHistory` achieves this by using
a singleton of type (`Async`)`Engine` for the database connection. The
connection pool is managed by this singleton, and the code is then
reentrant.

We also accept the type `str` (optionally complemented by `async_mode`.
I know you don't like this much, but it's the only way to allow an
asynchronous connection string).

In order to unify the different classes handling database connections,
we have renamed `connection_string` to `connection`, and `Session` to
`session_maker`.

Now, a single transaction is used to add a list of messages. Thus, a
crash during this write operation will not leave the database in an
unstable state with a partially added message list. This makes the code
resilient.
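
The all-or-nothing behavior can be sketched with the standard-library `sqlite3` module. The table layout and the `add_messages` helper below are illustrative assumptions, not the PR's SQLAlchemy code:

```python
import sqlite3
from typing import List, Optional


def add_messages(
    conn: sqlite3.Connection, session_id: str, messages: List[Optional[str]]
) -> None:
    """Insert all messages in one transaction; any failure rolls back the whole batch."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.executemany(
            "INSERT INTO message_store (session_id, message) VALUES (?, ?)",
            [(session_id, m) for m in messages],
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message_store (session_id TEXT, message TEXT NOT NULL)")

add_messages(conn, "s1", ["hello", "world"])
try:
    add_messages(conn, "s1", ["partial", None])  # None violates NOT NULL
except sqlite3.IntegrityError:
    pass

# The failed batch left no rows behind, not even the valid "partial" message.
count = conn.execute("SELECT COUNT(*) FROM message_store").fetchone()[0]
```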

We believe that the `PostgresChatMessageHistory` class is no longer
necessary and can be replaced by:
```
PostgresChatMessageHistory = SQLChatMessageHistory
```
This also fixes the bug.


## Issue
- [issue 22021](https://github.com/langchain-ai/langchain/issues/22021)
  - Bug in _exit_history()
  - Bugs in PostgresChatMessageHistory and sync usage
  - Bugs in PostgresChatMessageHistory and async usage
- [issue
36](https://github.com/langchain-ai/langchain-postgres/issues/36)
## Twitter handle:
pprados

## Tests
- libs/community/tests/unit_tests/chat_message_histories/test_sql.py
(add async test)

@baskaryan, @eyurtsev or @hwchase17, can you check this PR?
I've also been waiting a long time for review of other PRs. Can
you take a look?
- [PR 32](https://github.com/langchain-ai/langchain-postgres/pull/32)
- [PR 15575](https://github.com/langchain-ai/langchain/pull/15575)
- [PR 13200](https://github.com/langchain-ai/langchain/pull/13200)

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-20 13:52:21 -07:00
Vincent Min
d17efe37f1 community[minor]: Improve InMemoryVectorStore with ability to persist to disk and filter on metadata. (#22186)
- **Description:** The InMemoryVectorStore is a nice and simple vector
store implementation for quick development and debugging. The current
implementation is quite limited in its functionalities. This PR extends
the functionalities by adding utility functions to persist the vector
store to a JSON file and to load it from a JSON file. We chose the JSON
file format because it allows inspection of the database contents in a
text editor, which is great for debugging. Furthermore, it adds a
`filter` keyword that can be used to filter out documents on their
`page_content` or `metadata`.
- **Issue:** -
- **Dependencies:** -
- **Twitter handle:** @Vincent_Min
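
A toy sketch of the two additions, persistence to JSON and filtering, under stated assumptions: `TinyStore` is not the real `InMemoryVectorStore` API, it stores no embeddings, and treating `filter` as a callable over documents is an assumption made for illustration.

```python
import json
import os
import tempfile
from typing import Callable, Dict, List, Optional

Doc = Dict[str, object]  # {"page_content": str, "metadata": dict}


class TinyStore:
    """Toy stand-in for an in-memory store; not the real InMemoryVectorStore API."""

    def __init__(self) -> None:
        self.docs: Dict[str, Doc] = {}

    def dump(self, path: str) -> None:
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self.docs, f, indent=2)  # indented so it is readable in an editor

    @classmethod
    def load(cls, path: str) -> "TinyStore":
        store = cls()
        with open(path, encoding="utf-8") as f:
            store.docs = json.load(f)
        return store

    def search(self, filter: Optional[Callable[[Doc], bool]] = None) -> List[Doc]:
        # Keep every document when no filter is given.
        return [d for d in self.docs.values() if filter is None or filter(d)]


store = TinyStore()
store.docs["1"] = {"page_content": "foo", "metadata": {"lang": "en"}}
store.docs["2"] = {"page_content": "bar", "metadata": {"lang": "fr"}}

path = os.path.join(tempfile.mkdtemp(), "store.json")
store.dump(path)
restored = TinyStore.load(path)
english = restored.search(filter=lambda d: d["metadata"]["lang"] == "en")
```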
2024-06-20 13:52:21 -07:00
Christophe Bornet
dacd50d0b9 core[patch]: Improve VectorStore API doc (#22547) 2024-06-20 13:52:21 -07:00
maang-h
882b6cdbca community[patch]: add detailed paragraph and example for BaichuanTextEmbeddings (#22031)
- **Description:** add detailed paragraph and example for
BaichuanTextEmbeddings
- **Issue:** #21983
2024-06-20 13:52:21 -07:00
Anthony Bernabeu
dd8fdfa375 community[minor]: Added filter search for LanceDB (#22461)
- [ ] **community**: "vectorstore: added filtering support for LanceDB
vector store"

- [ ] **This PR adds filtering capabilities to LanceDB**:
- **Description:** In LanceDB, filtering can be applied when searching
for data in the vector store. Filters use SQL syntax, as described
in the LanceDB documentation.
    - **Issue:** #18235 
    - **Dependencies:** No

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-06-20 13:52:21 -07:00
Erick Friis
a3ba5c0048 huggingface: remove text-generation dep (#22543) 2024-06-20 13:52:21 -07:00
Erick Friis
dcf10b7a7f ai21: fix core version (#22544) 2024-06-20 13:52:21 -07:00
Asaf Joseph Gardin
758fad6d03 ai21: fix ai21 unittests (#22526)
Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:21 -07:00
Erick Friis
bc2f874835 community: fix huggingface deprecations (#22522) 2024-06-20 13:52:21 -07:00
Jacob Lee
1ee9926af3 docs[patch]: Adds links to deprecations page (#22514)
@baskaryan
2024-06-20 13:52:21 -07:00
William FH
704a9d4955 [Docs] Structured output Keywords (#22511) 2024-06-20 13:52:21 -07:00
Christophe Bornet
7e28598358 core[patch]: Add similarity_score_threshold to VectorStore search types (#22477) 2024-06-20 13:52:21 -07:00
Eugene Yurtsev
a46ac08183 core[patch]: Deduplicate of callback handlers in merge_configs (#22478)
This PR adds deduplication of callback handlers in merge_configs.

Fix for this issue:
https://github.com/langchain-ai/langchain/issues/22227

The issue appears when the code is:

1) running python >=3.11
2) invokes a runnable from within a runnable
3) binds the callbacks to the child runnable from the parent runnable
using with_config

In this case, the same callbacks end up appearing twice: (1) the first
time from with_config, (2) the second time with langchain automatically
propagating them on behalf of the user.
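
The deduplication idea can be sketched as below. This is an illustration only: `merge_handlers` is a hypothetical helper, and comparing handlers by identity is an assumption; the actual logic in `merge_configs` may differ.

```python
from typing import List, TypeVar

T = TypeVar("T")


def merge_handlers(first: List[T], second: List[T]) -> List[T]:
    """Concatenate two handler lists, keeping only the first occurrence of each object."""
    merged: List[T] = []
    for handler in [*first, *second]:
        # Compare by identity: two distinct handler instances are both kept.
        if not any(handler is seen for seen in merged):
            merged.append(handler)
    return merged


class PrintHandler:
    """Placeholder for a callback handler."""


h1, h2 = PrintHandler(), PrintHandler()
# h1 arrives twice: once via with_config and once via automatic propagation.
merged = merge_handlers([h1], [h1, h2])
```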


Prior to this PR this will emit duplicate events:

```python
@tool
async def get_items(question: str, callbacks: Callbacks):  # <--- Accept callbacks
    """Ask question"""
    template = ChatPromptTemplate.from_messages(
        [
            (
                "human",
                "'{question}"
            )
        ]
    )
    chain = template | chat_model.with_config(
        {
            "callbacks": callbacks,  # <-- Propagate callbacks
        }
    )
    return await chain.ainvoke({"question": question})
```

Prior to this PR this will work correctly (no duplicate events):

```python
@tool
async def get_items(question: str, callbacks: Callbacks):  # <--- Accept callbacks
    """Ask question"""
    template = ChatPromptTemplate.from_messages(
        [
            (
                "human",
                "'{question}"
            )
        ]
    )
    chain = template | chat_model
    return await chain.ainvoke({"question": question}, {"callbacks": callbacks})
```

This will also work (as long as the user is using python >= 3.11) -- as
langchain will automatically propagate callbacks

```python
@tool
async def get_items(question: str,):  
    """Ask question"""
    template = ChatPromptTemplate.from_messages(
        [
            (
                "human",
                "'{question}"
            )
        ]
    )
    chain = template | chat_model
    return await chain.ainvoke({"question": question})
```
2024-06-20 13:52:21 -07:00
Jacob Lee
1b56ca3f84 docs[patch]: Update quickstart tutorial (#22504)
Mentions LCEL more, hopefully flags it to more people as a simple
entrypoint

@baskaryan @hwchase17
2024-06-20 13:52:21 -07:00
Ofer Mendelevitch
b12e1ae568 community[minor]: Vectara Integration Update - Streaming, FCS, Chat, updates to documentation and example notebooks (#21334)
Thank you for contributing to LangChain!

**Description:** update to the Vectara / Langchain integration to
integrate new Vectara capabilities:
- Full RAG implemented as a Runnable with as_rag()
- Vectara chat supported with as_chat()
- Both support streaming response
- Updated documentation and example notebook to reflect all the changes
- Updated Vectara templates

**Twitter handle:** ofermend

**Add tests and docs**: no new tests or docs, but updated both existing
tests and existing docs
2024-06-20 13:52:21 -07:00
Bagatur
95e8cc361e docs: update anthropic chat model (#22483)
Related to #22296

And update anthropic to accept base_url
2024-06-20 13:52:21 -07:00
Erick Friis
6cb5075ce2 robocorp: typo (#22509) 2024-06-20 13:52:21 -07:00
Erick Friis
73308012dc robocorp: release 0.0.9.post1 (#22507) 2024-06-20 13:52:21 -07:00
Erick Friis
6a8b77d30a ai21: release 0.1.6 (#22508) 2024-06-20 13:52:21 -07:00
ccurme
80977fa0bd together, upstage: bump minimum langchain-openai version (#22505) 2024-06-20 13:52:21 -07:00
Erick Friis
450e4af347 docs: fix api ref link generation (#22438)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:21 -07:00
Bagatur
607ea7da83 mongodb[patch]: Release 0.1.6 (#22501) 2024-06-20 13:52:20 -07:00
Bagatur
605fc224ba groq[patch]: Release 0.1.5 (#22500) 2024-06-20 13:52:20 -07:00
Bagatur
612558b251 milvus[patch]: Release 0.1.1 (#22499) 2024-06-20 13:52:20 -07:00
Bagatur
6d8ba31896 upstage[patch]: Release 0.1.6 (#22498) 2024-06-20 13:52:20 -07:00
Bagatur
f44dc02d7e experimental[patch]: Release 0.0.60 (#22497) 2024-06-20 13:52:20 -07:00
Bagatur
cd6d7bfc10 community[patch]: Release 0.2.2 (#22496) 2024-06-20 13:52:20 -07:00
Bagatur
2043480211 langchain[patch]: Release 0.2.2 (#22495) 2024-06-20 13:52:20 -07:00
Bagatur
ef5684ef31 mistralai[patch]: Release 0.1.8 (#22494) 2024-06-20 13:52:20 -07:00
Bagatur
72170ca991 huggingface[patch]: release 0.0.2 (#22493) 2024-06-20 13:52:20 -07:00
Jacob Lee
4ece7d9ecd docs[patch]: Add robots.txt and root sitemap (#22492)
CC @efriis @baskaryan
2024-06-20 13:52:20 -07:00
Bagatur
32a47d0e68 text-splitters[patch]: Release 0.2.1 (#22490) 2024-06-20 13:52:20 -07:00
Bagatur
817018c410 core[patch]: Release 0.2.4 (#22489) 2024-06-20 13:52:20 -07:00
Ragul Kachiappan
814c43f2d2 docs: Update chroma docs link for collection reference (#22472)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: 
- **Description:** Updated dead link referencing chroma docs in Chroma
notebook under vectorstores
2024-06-20 13:52:20 -07:00
nareshnagpal06
c9c909bf7a docs: Added Semantic Cache Example with BedrockChat using Bedrock Embedding… (#22190)
…s and Opensearch Semantic Cache

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:20 -07:00
Joydeep Banik Roy
ee55b59b07 community, milvus, pinecone, qdrant, mongo: Broadcast operation failure while using simsimd beyond v3.7.7 (#22271)
- [ ] **Packages affected**: 
  - community: fix `cosine_similarity` to support simsimd beyond 3.7.7
- partners/milvus: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/mongodb: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/pinecone: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/qdrant: fix `cosine_similarity` to support simsimd beyond
3.7.7


- [ ] **Broadcast operation failure while using simsimd beyond v3.7.7**:
- **Description:** I was using simsimd 4.3.1 and the unsupported operand
type issue popped up. When I checked out the repo and ran the tests,
they failed as well (screenshot attached). This looks like
a variant of https://github.com/langchain-ai/langchain/issues/18022.
Prior to 3.7.7, `simd.cdist` returned an ndarray, but now it returns
`simsimd.DistancesTensor`, which is ineligible for a broadcast operation
with numpy. This change also removes the need to explicitly cast
`Z` to a numpy array.
    - **Issue:** #19905
    - **Dependencies:** No
    - **Twitter handle:** https://x.com/GetzJoydeep
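
The failure mode can be sketched without simsimd installed. `FakeDistances` below is a hypothetical stand-in for a result type that is not an ndarray, and plain nested lists are used instead of numpy to keep the sketch dependency-free; in the real code the conversion would be to a numpy array.

```python
from typing import Iterator, List, Sequence


class FakeDistances:
    """Hypothetical stand-in for a non-ndarray result type; supports iteration only."""

    def __init__(self, rows: Sequence[Sequence[float]]) -> None:
        self._rows = [list(r) for r in rows]

    def __iter__(self) -> Iterator[List[float]]:
        return iter(self._rows)


def to_similarity(distances: FakeDistances) -> List[List[float]]:
    # Walk the rows as plain lists, then compute 1 - d element-wise.
    return [[1.0 - d for d in row] for row in distances]


dist = FakeDistances([[0.0, 0.25], [1.0, 0.5]])

try:
    1 - dist  # no __rsub__ defined, so this raises an "unsupported operand" TypeError
    broadcast_failed = False
except TypeError:
    broadcast_failed = True

similarity = to_similarity(dist)
```

Converting the result before doing arithmetic sidesteps the dependency on which type a given simsimd version returns.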

<img width="1622" alt="Screenshot 2024-05-29 at 2 50 00 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/fb27b383-a9ae-4a6f-b355-6d503b72db56">

- [ ] **Considerations**: 
1. I started with community, but since similar changes were needed in
Milvus, MongoDB, Pinecone, and Qdrant, I modified their files as well.
If touching multiple packages in one PR is not the norm, then I can
remove them from this PR and raise separate ones.
2. I have run and verified that the tests work. Since only MongoDB had
tests, I ran theirs and verified they work as well. Screenshots attached:
<img width="1573" alt="Screenshot 2024-05-29 at 2 52 13 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/ce87d1ea-19b6-4900-9384-61fbc1a30de9">
<img width="1614" alt="Screenshot 2024-05-29 at 3 33 51 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/6ce1d679-db4c-4291-8453-01028ab2dca5">
  

I have added a test for simsimd. It may not fit well with the
CI/CD setup, as simsimd is not a required dependency. I
have just imported simsimd to ensure simsimd cosine similarity is
invoked. However, it's not a good approach. Suggestions are welcome, and I
can make the required changes on the PR. Please provide guidance,
as I am new to the community.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:20 -07:00
KyrianC
25d50e05c9 community[minor]: Add tools calls to ChatEdenAI (#22320)
### Description  
Add tools implementation to `ChatEdenAI`:
- `bind_tools()`
- `with_structured_output()`

### Documentation 
Updated `docs/docs/integrations/chat/edenai.ipynb`

### Notes
We don't support streaming with tools yet. If `stream` is called with
tools we directly yield the whole message from `generate` (implemented
the same way as Anthropic did).
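
The fallback described in the notes can be sketched like this. `generate` and `stream` here are simplified placeholders, not the `ChatEdenAI` methods, and the whitespace "tokenizer" is purely illustrative:

```python
from typing import Iterator, List, Optional


def generate(prompt: str) -> str:
    """Placeholder for a non-streaming call that returns the full message."""
    return f"echo: {prompt}"


def stream(prompt: str, tools: Optional[List[dict]] = None) -> Iterator[str]:
    if tools:
        # Tool calls are not streamed: emit the complete message as a single chunk.
        yield generate(prompt)
        return
    for token in generate(prompt).split(" "):
        yield token


with_tools = list(stream("hello world", tools=[{"name": "search"}]))
without_tools = list(stream("hello world"))
```

Callers can keep iterating over `stream` in both cases; with tools they simply receive one chunk.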
2024-06-20 13:52:20 -07:00
pranavvuppala
57ec452c09 docs : Update docstrings for OpenAI base.py (#22221)
- [x] **PR title**: Update docstrings for OpenAI base.py
- **Description:** Updated the docstrings of a few OpenAI functions for a
better understanding of each function.
    - **Issue:** #21983

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:20 -07:00
Anindyadeep
8f59c47c2c communty[patch]: Native RAG Support in Prem AI langchain (#22238)
This PR adds native RAG support in langchain premai package. The same
has been added in the docs too.
2024-06-20 13:52:20 -07:00
Rahul Triptahi
87ef2458b5 community[minor]: Enable retrieval api calls in PebbloRetrievalQA (#21958)
Description: Enable app discovery and Prompt/Response apis in
PebbloSafeRetrieval
Documentation: NA
Unit test: N/A

---------

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2024-06-20 13:52:20 -07:00
liugz18
6b05e24e0e experimental[patch]: Fix graph_transformers llms #21482 (#22417)
Fix AttributeError on calling
`LLMGraphTransformer.convert_to_graph_documents` (#21482),
since `raw_schema` is always a str.

@baskaryan
2024-06-20 13:52:20 -07:00
ccurme
be75c749e8 core[patch]: bump langsmith (#22476)
Noticing errors logged in some situations when tracing with Langsmith:
```python
from langchain_core.pydantic_v1 import BaseModel
from langchain_anthropic import ChatAnthropic


class AnswerWithJustification(BaseModel):
    """An answer to the user question along with justification for the answer."""
    answer: str
    justification: str


llm = ChatAnthropic(model="claude-3-haiku-20240307")
structured_llm = llm.with_structured_output(AnswerWithJustification)

list(structured_llm.stream("What weighs more a pound of bricks or a pound of feathers"))
```
```
Error in LangChainTracer.on_chain_end callback: AttributeError("'NoneType' object has no attribute 'append'")
[AnswerWithJustification(answer='A pound of bricks and a pound of feathers weigh the same amount.', justification='This is because a pound is a unit of mass, not volume. By definition, a pound of any material, whether bricks or feathers, will weigh the same - one pound. The physical size or volume of the materials does not matter when measuring by mass. So a pound of bricks and a pound of feathers both weigh exactly one pound.')]
```
2024-06-20 13:52:20 -07:00
Bagatur
4c546856e2 community[patch]: deprecate all HF classes (#22444) 2024-06-20 13:52:20 -07:00
Nuno Campos
82522f4735 Use immutable sequence type for batch/batch_as_completed types (#22433)
2024-06-20 13:52:20 -07:00
Christophe Bornet
ef6315d721 community[minor]: Improve Cassandra VectorStore as_retriever (#22465)
The VectorStore `as_retriever` API doesn't explicitly expose the
parameters `search_type` and `search_kwargs`, so these are not well
documented.
This PR improves `as_retriever` for the Cassandra VectorStore by making
these parameters explicit.

NB: An alternative would have been to modify `as_retriever` in
`VectorStore`. But there's probably a good reason these were not exposed
in the first place? Is it because implementations may decide not to
support them and use fixed values when creating the
VectorStoreRetriever?
2024-06-20 13:52:20 -07:00
Christophe Bornet
b471d8d020 core[patch]: Fix VectorStore's as_retriever mutating tags param (#22470)
The current VectorStore `as_retriever` implementation mutates the `tags`
param when it's passed in kwargs.
This fix ensures that a copy is made.
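
A minimal sketch of the copy-before-extend pattern, assuming the store appends its own tags to whatever the caller passed. `as_retriever_kwargs` and the tag values are hypothetical, not the actual `as_retriever` code:

```python
from typing import Any, Dict, List


def as_retriever_kwargs(store_tags: List[str], **kwargs: Any) -> Dict[str, Any]:
    """Build retriever kwargs without mutating the caller's `tags` list."""
    tags = list(kwargs.pop("tags", None) or [])  # copy, never alias the input
    tags.extend(store_tags)
    return {**kwargs, "tags": tags}


user_tags = ["my-app"]
result = as_retriever_kwargs(["ExampleStore"], tags=user_tags)
```

Because `tags` is copied first, calling the function repeatedly with the same list does not accumulate the store's tags in the caller's object.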
2024-06-20 13:52:20 -07:00
Michal Gregor
40c02bee0f huggingface[patch]: Support for HuggingFacePipeline in ChatHuggingFace. (#22194)
- **Description:** Added support for using HuggingFacePipeline in
ChatHuggingFace (previously it was only usable with API endpoints,
probably by oversight).
- **Issue:** #19997 
- **Dependencies:** none
- **Twitter handle:** none

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:20 -07:00
Fahreddin Özcan
aa2de46241 community[patch]: Upstash Vector Store Namespace Support (#22251)
This PR introduces namespace support for Upstash Vector Store, which
would allow users to partition their data in the vector index.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:20 -07:00
Isaac Francisco
333f272f89 docs: rag tutorial small fixes (#22450) 2024-06-20 13:52:20 -07:00
Jacob Lee
febf52853a docs[patch]: Adds search keywords for common queries (#22449)
CC @baskaryan @efriis @ccurme
2024-06-20 13:52:20 -07:00
Guangdong Liu
fce1b63203 core(patch):fix partial_variables not working with SystemMessagePromptTemplate (#20711)
- **Issue:**  close #17560
- @baskaryan, @eyurtsev
2024-06-20 13:52:20 -07:00
Martin Kolb
392332d171 docs: Fix doc issue for HANA Cloud Vector Engine (#22260)
- **Description:**
This PR fixes a rendering issue in the docs (Python notebook) of HANA
Cloud Vector Engine.

  - **Issue:** N/A
  - **Dependencies:** no new dependencies added

File of the fixed notebook:
`docs/docs/integrations/vectorstores/hanavector.ipynb`
2024-06-20 13:52:20 -07:00
Dristy Srivastava
d6ab3746b0 community[minor]: Updating payload for pebblo discover API (#22309)
**Description:** Updating response for pebblo discover API. Also
updating field name case type
**Documentation:** N/A
**Unit tests:** N/A
2024-06-20 13:52:20 -07:00
Miroslav
95ec5e6991 huggingface[patch]: Skip Login to HuggingFaceHub when token is not set (#22365) 2024-06-20 13:52:20 -07:00
Stefano Lottini
dd530b2a4a docs: Astra DB vectorstore, add automatic-embedding example (#22350)
Description: Adding an example showcasing the newly-introduced API-side
embedding computation option for the Astra DB vector store
2024-06-20 13:52:20 -07:00
bhardwaj-vipul
4473ffb3a4 langchain[patch]: Fix MongoDBAtlasVectorSearch reference in self query retriever (#22401)
**Description:** 
SelfQuery Retriever with MongoDBAtlasVectorSearch (from
langchain_mongodb import MongoDBAtlasVectorSearch) and
Chroma (from langchain_chroma import Chroma) is not supported.
The imports in the [builtin
translators](8cbce684d4/libs/langchain/langchain/retrievers/self_query/base.py (L73))
point to the
[deprecated](acaf214a45/libs/community/langchain_community/vectorstores/mongodb_atlas.py (L36))
vectorstore.

**Issue:** 
#22272

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:20 -07:00
ccurme
1c119a4b93 community: add standard chat model params to Ollama (#22446) 2024-06-20 13:52:20 -07:00
Isaac Francisco
ae9da0cff0 docs: agents tutorial wording (#22447) 2024-06-20 13:52:20 -07:00
Ethan Yang
01d48500ad community[patch]: Update OpenVINO embedding and reranker to support static input shape (#22171)
This helps when deploying embedding models on NPU devices.
2024-06-20 13:52:20 -07:00
Tom Clelford
b303f3eecd text-splitters[patch]: fix HTMLSectionSplitter parsing of xslt paths (#22176)
## Description
This PR allows passing the HTMLSectionSplitter paths to xslt files. It
does so by fixing two trivial bugs with how passed paths were being
handled. It also changes the default value of the param `xslt_path` to
`None` so the special case where the file was part of the langchain
package could be handled.
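
A rough sketch of the `None`-default pattern described above. The `resolve_xslt_path` helper and the `BUNDLED_XSLT` location are assumptions made for illustration; they are not the splitter's real code or file layout.

```python
from pathlib import Path
from typing import Optional

# Hypothetical location of the stylesheet shipped inside the package.
BUNDLED_XSLT = Path("package_data") / "default.xslt"


def resolve_xslt_path(xslt_path: Optional[str] = None) -> Path:
    # None means "use the bundled file"; anything else is treated as a
    # user-supplied path and made absolute.
    if xslt_path is None:
        return BUNDLED_XSLT
    return Path(xslt_path).absolute()


default_path = resolve_xslt_path()
custom_path = resolve_xslt_path("styles/custom.xslt")
```

Using `None` as the sentinel keeps the special case explicit instead of comparing against a default string.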

## Issue
#22175
2024-06-20 13:52:20 -07:00
maang-h
3da42deaca community[minor]: Implement MiniMaxChat interface (#22391)
- **Description:** Implement MiniMaxChat interface, include:
    - No longer inherits the LLM class (like other chat models)
    - Update request parameters (v1 -> v2)
        - update `base url`
        - update message role (system, user, assistant)
        - add `stream` function
        - no longer use `group id`
    - Implement the `_stream`, `_agenerate`, and `_astream` interfaces

[minimax v2 api
document](https://platform.minimaxi.com/document/guides/chat-model/V2?id=65e0736ab2845de20908e2dd)
2024-06-20 13:52:20 -07:00
Brandon Sharp
a25d7658dc community[patch]: Airtable to allow for addtl params (#22092)
- [X] **PR title**: "community: added optional params to Airtable
table.all()"


- [X] **PR message**: 
- **Description:** Adds `**kwargs` to AirtableLoader so that extra parameters can be passed to `Table.all()`:
https://pyairtable.readthedocs.io/en/latest/api.html#pyairtable.Table.all
    - **Issue:** N/A
    - **Dependencies:** N/A
    - **Twitter handle:** parakoopa88


- [X] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:20 -07:00
Harichandan Roy
f71eaf2452 community[patch]: update embeddings/oracleai.py (#22240)
"community/embeddings: update oracleai.py"

Adding oracle VECTOR_ARRAY_T support.

Tests are not impacted.
2024-06-20 13:52:20 -07:00
maang-h
78c65ac1df community[patch]: Update the default api_url and reqeust_body of sparkllm embedding (#22136)
- **Description:** When I ran `SparkLLMTextEmbeddings` with a correct
app_id, api_key and api_secret, it still failed to run with the current
default URL.

    ```python
    # example
    from langchain_community.embeddings import SparkLLMTextEmbeddings

    embedding = SparkLLMTextEmbeddings(
        spark_app_id="my-app-id",
        spark_api_key="my-api-key",
        spark_api_secret="my-api-secret",
    )
    # Embed a single query string with the configured client.
    text = "hello"
    print(embedding.embed_query(text))
    ```

![sparkembedding](https://github.com/langchain-ai/langchain/assets/55082429/11daa853-4f67-45b2-aae2-c95caa14e38c)
   
So I updated the URL and request body parameters according to
[Embedding_api](https://www.xfyun.cn/doc/spark/Embedding_api.html), and
it now runs.
2024-06-20 13:52:20 -07:00
Yuwen Hu
fbc1d01ac1 community[minor]: Add IPEX-LLM BGE embedding support on both Intel CPU and GPU (#22226)
**Description:** [IPEX-LLM](https://github.com/intel-analytics/ipex-llm)
is a PyTorch library for running LLM on Intel CPU and GPU (e.g., local
PC with iGPU, discrete GPU such as Arc, Flex and Max) with very low
latency. This PR adds ipex-llm integrations to langchain for BGE
embedding support on both Intel CPU and GPU.
**Dependencies:** `ipex-llm`, `sentence-transformers`
**Contribution maintainer**: @Oscilloscope98 
**tests and docs**: 
- langchain/docs/docs/integrations/text_embedding/ipex_llm.ipynb
- langchain/docs/docs/integrations/text_embedding/ipex_llm_gpu.ipynb
-
langchain/libs/community/tests/integration_tests/embeddings/test_ipex_llm.py

---------

Co-authored-by: Shengsheng Huang <shannie.huang@gmail.com>
2024-06-20 13:52:20 -07:00
Jacob Lee
8e7e3c452a core[patch]: RFC: Allow concatenation of messages with multi part content (#22002)
Anthropic's streaming treats tool calls as different content parts
(streamed back with a different index) from normal content in the
`content`.

This means that we need to update our chunk-merging logic to handle
chunks with multi-part content. The alternative is coercing Anthropic's
responses into a string, but we generally like to preserve model
provider responses faithfully when we can. This will also likely be
useful for multimodal outputs in the future.

This current PR does unfortunately make `index` a magic field within
content parts, but Anthropic and OpenAI both use it at the moment to
determine order anyway. To avoid cases where we have content arrays with
holes and to simplify the logic, I've also restricted merging to chunks
in order.
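
To make the idea concrete, here is a simplified sketch of merging content parts keyed by `index`. It is not the merging logic from this PR: the `merge_parts` helper, the part shapes, and the choice to concatenate only the `text` field are assumptions, and the in-order restriction mentioned above is not enforced.

```python
from typing import Any, Dict, List

Part = Dict[str, Any]


def merge_parts(left: List[Part], right: List[Part]) -> List[Part]:
    """Merge two lists of content parts, joining parts that share an index."""
    merged = [dict(part) for part in left]  # copy so inputs are not mutated
    for part in right:
        idx = part.get("index")
        target = None
        if idx is not None:
            target = next((m for m in merged if m.get("index") == idx), None)
        if target is None:
            # No existing part with this index: start a new one.
            merged.append(dict(part))
            continue
        for key, value in part.items():
            if key == "text" and isinstance(target.get("text"), str):
                target["text"] += value  # streamed text is concatenated
            else:
                target.setdefault(key, value)  # other fields keep first value
    return merged


first = [{"index": 0, "type": "text", "text": "Hel"}]
second = [
    {"index": 0, "type": "text", "text": "lo"},
    {"index": 1, "type": "tool_use", "name": "search"},
]
combined = merge_parts(first, second)
```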

TODO: tests

CC @baskaryan @ccurme @efriis
2024-06-20 13:52:20 -07:00
Dan
a1cd4a6f1e community: fix AzureSearch delete documents (#22315)
**Description**

Fix AzureSearch delete documents method by using FIELDS_ID variable
instead of the hard coded "id" value
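
As an illustration only, the difference between a hard-coded key and a configurable one might look like the sketch below. The `build_delete_payload` helper and the `"doc_key"` value are hypothetical and do not come from the AzureSearch integration.

```python
from typing import Dict, List

# Hypothetical configured name of the index's key field.
FIELDS_ID = "doc_key"


def build_delete_payload(
    ids: List[str], key_field: str = FIELDS_ID
) -> List[Dict[str, str]]:
    # Use the configured key field rather than a literal "id", so indexes
    # with a differently named key field are addressed correctly.
    return [{key_field: doc_id} for doc_id in ids]


payload = build_delete_payload(["a1", "b2"])
```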

**Issue:** 

This is linked to this issue:
https://github.com/langchain-ai/langchain/issues/22314

Co-authored-by: dseban <dan.seban@neoxia.com>
2024-06-20 13:52:20 -07:00
Harrison Chase
d3d0109cee fix error message (#22437)
Was confusing when language is in Enum but not implemented
2024-06-20 13:52:20 -07:00
Bagatur
322bb9cc03 infra: bump anthropic mypy 1 (#22373) 2024-06-20 13:52:20 -07:00
Nuno Campos
c64dd4caea core: In BaseRetriever make get_relevant_docs delegate to invoke (#22434)
- This fixes all the tracing issues with people still using
get_relevant_docs, and a change we need for 0.3 anyway
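
A toy sketch of the delegation idea, assuming a simplified retriever. `MiniRetriever` and its `traced_calls` list are stand-ins invented for illustration; they are not the `BaseRetriever` implementation or its tracing machinery.

```python
from typing import List


class MiniRetriever:
    """Toy retriever where every public entry point goes through invoke()."""

    def __init__(self) -> None:
        self.traced_calls: List[str] = []

    def _get_relevant_documents(self, query: str) -> List[str]:
        return [f"doc about {query}"]

    def invoke(self, query: str) -> List[str]:
        # Single entry point: record the call (standing in for tracing),
        # then run the actual retrieval.
        self.traced_calls.append(query)
        return self._get_relevant_documents(query)

    def get_relevant_documents(self, query: str) -> List[str]:
        # Legacy method delegates, so it is recorded the same way as invoke().
        return self.invoke(query)


retriever = MiniRetriever()
legacy_docs = retriever.get_relevant_documents("cats")
```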

2024-06-20 13:52:20 -07:00
Zheng Robert Jia
5147e23abe docs: resolve minor syntax error. (#22375)
Used the correct magic command. 
Changed from `% pip...` to `%pip`

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:20 -07:00
Charles John
0004a40654 community: fix missing apify_api_token field in ApifyWrapper (#22421)
- **Description:** The `ApifyWrapper` class expects `apify_api_token` to
be passed as a named parameter or set as an environment variable. But
the corresponding field was missing in the class definition causing the
argument to be ignored when passed as a named param. This patch fixes
that.
2024-06-20 13:52:20 -07:00
Klaudia Lemiec
eb88968dc7 docs: notebook loader: change .html to .ipynb (#22407)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:20 -07:00
Joan Fontanals
c6a3f251eb add embed_image API to JinaEmbedding (#22416)
- **Description:** Add `embed_image` to JinaEmbedding to embed images
 - **Twitter handle:** https://x.com/JinaAI_
2024-06-20 13:52:20 -07:00
Qingchuan Hao
8f4925eeec docs: add Microsoft Azure to ChatModelTabs (#22367)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-20 13:52:19 -07:00
Nuno Campos
a62ffac8c7 core: In RunnableSequence pass kwargs to the first step (#22393)
- This is a pattern that shows up occasionally in langgraph questions:
people chain a graph to something else afterwards and want to pass the graph
some kwargs (e.g. `stream_mode`).
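
A minimal sketch of the behaviour described above, using a toy sequence class. `MiniSequence`, `graph`, and `shout` are hypothetical names and this is not the `RunnableSequence` code; it only shows extra keyword arguments reaching the first step and no later step.

```python
from typing import Any, Callable, List


class MiniSequence:
    """Toy stand-in for a runnable sequence (not the langchain_core class)."""

    def __init__(self, *steps: Callable[..., Any]) -> None:
        self.steps: List[Callable[..., Any]] = list(steps)

    def invoke(self, value: Any, **kwargs: Any) -> Any:
        for position, step in enumerate(self.steps):
            # Only the first step receives the extra keyword arguments.
            value = step(value, **kwargs) if position == 0 else step(value)
        return value


def graph(x: int, stream_mode: str = "values") -> str:
    # Pretend graph step that accepts an optional keyword argument.
    return f"{x}:{stream_mode}"


def shout(text: str) -> str:
    # Downstream step that accepts no extra keyword arguments.
    return text.upper()


chain = MiniSequence(graph, shout)
result = chain.invoke(1, stream_mode="updates")
```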
2024-06-20 13:52:19 -07:00
Jeffrey Morgan
6a28580ffd Update Ollama instructions (#22394) 2024-06-20 13:52:19 -07:00
Harrison Chase
5371d707b9 update agent docs (#22370)
to use create_react_agent

---------

Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
2024-06-20 13:52:19 -07:00
Jacob Lee
7a36bb1fb6 👥 Update LangChain people data (#22388)
👥 Update LangChain people data

Co-authored-by: github-actions <github-actions@github.com>
2024-06-20 13:52:19 -07:00
Jacob Lee
44b6a23c19 docs[patch]: Fix typo (#22377) 2024-06-20 13:52:19 -07:00
Bagatur
51519f9cf3 docs: fix llm caches redirect (#22371) 2024-06-20 13:52:19 -07:00
Bagatur
6c5f8f5a9b anthropic[patch]: Release 0.1.15, fix sdk tools break (#22369) 2024-06-20 13:52:19 -07:00
Erick Friis
76811a4b8c ai21: fix text-splitters version (#22366) 2024-06-20 13:52:19 -07:00
Erick Friis
87ca667387 docs: redirect integration links to 0.2 (#22326) 2024-06-20 13:52:19 -07:00
ccurme
83de34339b docs: update retriever how-to content (#22362)
- [x] How to: use a vector store to retrieve data
- [ ] How to: generate multiple queries to retrieve data for
- [x] How to: use contextual compression to compress the data retrieved
- [x] How to: write a custom retriever class
- [x] How to: add similarity scores to retriever results
^ done last month
- [x] How to: combine the results from multiple retrievers
- [x] How to: reorder retrieved results to mitigate the "lost in the
middle" effect
- [x] How to: generate multiple embeddings per document
^ this PR
- [ ] How to: retrieve the whole document for a chunk
- [ ] How to: generate metadata filters
- [ ] How to: create a time-weighted retriever
- [ ] How to: use hybrid vector and keyword retrieval
^ todo
2024-06-20 13:52:19 -07:00
Jacob Lee
078c5f7a38 docs: Fix Solar and OCI integration page typos (#22343)
@efriis @baskaryan
2024-06-20 13:52:19 -07:00
Bagatur
0491c0d1f2 docs: list tool calling models (#22334) 2024-06-20 13:52:19 -07:00
Bagatur
08b3c5a75c infra: run scheduled tests on aws, google, cohere, nvidia (#22328)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:19 -07:00
Harrison Chase
5549a9117f add simpler agent tutorial (#22249)
1/ added section at start with full code
2/ removed retriever tool (was just distracting)
3/ added section on starting a new conversation

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:19 -07:00
Bagatur
76ff6d6f89 core[patch]: Release 0.2.3 (#22329) 2024-06-20 13:52:19 -07:00
Harrison Chase
b181779fa7 core[patch]: fix runnable history and add docs (#22283) 2024-06-20 13:52:19 -07:00
William FH
0446b0e191 [Core] Update Tracing Interops (#22318)
LangSmith and LangChain context var handling evolved in parallel since
originally we didn't expect people to want to interweave the decorator
and langchain code.

Once we get a new langsmith release, this PR will let you seamlessly
hand off between @traceable context and runnable config context so you
can arbitrarily nest code.

It's expected that this fails right now until we get another release of
the SDK
2024-06-20 13:52:19 -07:00
ccurme
0e21889488 openai: update ChatOpenAI api ref (#22324)
Update to reflect that token usage is no longer default in streaming
mode.

Add detail for streaming context under Token Usage section.
2024-06-20 13:52:19 -07:00
ChengZi
ea0a45ffe3 docs: fix milvus import and update template (#22306)
docs: fix milvus import problem
update milvus-rag template with milvus-lite

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
2024-06-20 13:52:19 -07:00
WU LIFU
b073f92fc6 doc: fix wrong documentation on FAISS load_local function (#22310)
### Issue: #22299 

### descriptions
The documentation appears to be wrong. When the user actually sets the
`asynchronous` parameter to `True`, it fails because the `__init__`
function of the FAISS class doesn't accept this parameter. In fact, most of
the class/instance functions of this class have both sync and async
versions, so it looks like what we need is just to remove this parameter
from the doc.

Co-authored-by: Lifu Wu <lifu@nextbillion.ai>
2024-06-20 13:52:19 -07:00
maang-h
44efc0392d community[patch]: Standardize qianfan model init args name (#22322)
- **Description:**  
    - Standardize qianfan chat model initialization argument names
        - qianfan_ak (qianfan api key)  -> api_key
        - qianfan_sk (qianfan secret key)  ->  secret_key

    - Delete unused variable
- **Issue:** #20085
2024-06-20 13:52:19 -07:00
KhoPhi
5f20dd6203 Docs: Ollama (LLM, Chat Model & Text Embedding) (#22321)
- [x] Docs Update: Ollama
  - llm/ollama
      - Switched to using llama3 as model with reference to templating and
        prompting
      - Added concurrency notes to llm/ollama docs
  - chat_models/ollama
      - Added concurrency notes to llm/ollama docs
  - text_embedding/ollama
      - include example for specific embedding models from Ollama
2024-06-20 13:52:19 -07:00
Dobiichi-Origami
f85f56927b community: adding tool_call_id for every ToolCall (#22323)
- **Description:** This PR fixes a bug that caused multi-turn
conversations to malfunction in QianfanChatEndpoint, and adapts the
endpoint to ToolCall and ToolMessage
2024-06-20 13:52:19 -07:00
Bagatur
7b9cd616fe docs: link GH org (#22308) 2024-06-20 13:52:19 -07:00
Bagatur
0bb35e8e8f docs: make llm cache its own section (#22301) 2024-06-20 13:52:19 -07:00
Bagatur
cb43e32041 docs: add v0.2 links to README (#22300) 2024-06-20 13:52:19 -07:00
ccurme
4c53b1d5c4 community, docs: update token usage tracking callback + how-to guides (#22145) 2024-06-20 13:52:19 -07:00
Bagatur
1ed742d5ca docs, cli[patch]: chat model template nit (#22294) 2024-06-20 13:52:19 -07:00
Bagatur
9fa7e53f53 cli[patch]: Release 0.0.24 (#22293) 2024-06-20 13:52:19 -07:00
Bagatur
9e3d7f9c81 docs, cli[patch]: chat model doc template (#22290)
Updates the ChatModel integration doc template and integration docstring, and
adds a langchain-cli command to easily create just the doc (for updating
existing integrations):

```bash
langchain-cli integration create-doc --name "foo-bar"
```
2024-06-20 13:52:19 -07:00
Wu Enze
2be9efcee6 docs : Added integrations for memory with langchain_community (#22265)
PR title: Integration Docs enhancement

Description: Adding installation instructions for integrations requiring
langchain-community package since 0.2
Issue: [#22005](https://github.com/langchain-ai/langchain/issues/22005)
2024-06-20 13:52:19 -07:00
ccurme
021ae5fb58 openai[patch]: Release 0.1.8 (#22291) 2024-06-20 13:52:19 -07:00
ccurme
f7ea72374f core[patch]: Release 0.2.2 (#22289) 2024-06-20 13:52:19 -07:00
William FH
2a3e3ad64a Update sequence.ipynb (#22288) 2024-06-20 13:52:19 -07:00
Daniel Glogowski
c48bf03387 docs: updating NIM documentation (#22258)
Updating NVIDIA NIM notebooks and readme file.

Thanks!
Daniel
2024-06-20 13:52:19 -07:00
Bagatur
918d3e134a docs: revamp ChatOpenAI (#22253)
Can build API ref docs by running
```bash
make api_docs_clean; make api_docs_quick_preview API_PKG=openai
```
only builds openai ref, takes ~20 sec
2024-06-20 13:52:19 -07:00
Erick Friis
e95f7ed8e4 robocorp: release 0.0.9 (#22282) 2024-06-20 13:52:19 -07:00
Mikko Korpela
efd0378a2e langchain-robocorp: Fix parsing of Union types (such as Optional). (#22277) 2024-06-20 13:52:19 -07:00
ccurme
619a2c9f8c openai: don't override stream_options default (#22242)
ChatOpenAI supports a kwarg `stream_options` which can take values
`{"include_usage": True}` and `{"include_usage": False}`.

Setting include_usage to True adds a message chunk to the end of the
stream with usage_metadata populated. In this case the final chunk no
longer includes `"finish_reason"` in the `response_metadata`. This is
the current default and is not yet released. Because this could be
disruptive to workflows, here we remove this default. The default will
now be consistent with OpenAI's API (see parameter
[here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options)).

Examples:
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()

for chunk in llm.stream("hi"):
    print(chunk)
```
```
content='' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92'
content='Hello' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92'
content='!' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92'
content='' response_metadata={'finish_reason': 'stop'} id='run-8cff4721-2acd-4551-9bf7-1911dae46b92'
```

```python
for chunk in llm.stream("hi", stream_options={"include_usage": True}):
    print(chunk)
```
```
content='' id='run-39ab349b-f954-464d-af6e-72a0927daa27'
content='Hello' id='run-39ab349b-f954-464d-af6e-72a0927daa27'
content='!' id='run-39ab349b-f954-464d-af6e-72a0927daa27'
content='' response_metadata={'finish_reason': 'stop'} id='run-39ab349b-f954-464d-af6e-72a0927daa27'
content='' id='run-39ab349b-f954-464d-af6e-72a0927daa27' usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17}
```

```python
llm = ChatOpenAI().bind(stream_options={"include_usage": True})

for chunk in llm.stream("hi"):
    print(chunk)
```
```
content='' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d'
content='Hello' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d'
content='!' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d'
content='' response_metadata={'finish_reason': 'stop'} id='run-59918845-04b2-41a6-8d90-f75fb4506e0d'
content='' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17}
```
2024-06-20 13:52:19 -07:00
Karim Lalani
aec459e4a1 [experimental][llms][ollama_functions] Update OllamaFunctions to send tool_calls attribute (#21625)
Update OllamaFunctions to return `tool_calls` for AIMessages when used
for tool calling.
2024-06-20 13:52:19 -07:00
Bagatur
79b503723a core[patch]: allow access RunnableWithFallbacks.runnable attrs (#22139)
RFC, candidate fix for #13095 #22134
2024-06-20 13:52:19 -07:00
SteveLiao
f2461049e3 Update parent_document_retriever.py about **kwargs (#22219)
Add `**kwargs` to the `add_documents` function in
`parent_document_retriever.py`.


2024-06-20 13:52:19 -07:00
Mark Cusack
81a7cbd2ed Update/fix docs to list Yellowbrick as a supported indexed vectorstore (#22235)
Update/fix docs to list Yellowbrick as a supported indexed vectorstore
and fix the Jupyter notebook.
2024-06-20 13:52:19 -07:00
Erick Friis
31f300dcc5 milvus: fix core dep (#22239) 2024-06-20 13:52:19 -07:00
Erick Friis
e530feffa0 infra: allow first releases 2 (#22237) 2024-06-20 13:52:19 -07:00
Erick Friis
bb34baaa47 infra: allow first releases (#22236) 2024-06-20 13:52:19 -07:00
ChengZi
15d4e1ad76 milvus: New langchain_milvus package and new milvus features (#21077)
New features:

- New langchain_milvus package in partner
- Milvus collection hybrid search retriever
- Zilliz cloud pipeline retriever
- Milvus Local guide
- Rag-milvus template

---------

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Signed-off-by: Jael Gu <mengjia.gu@zilliz.com>
Co-authored-by: Jael Gu <mengjia.gu@zilliz.com>
Co-authored-by: Jackson <jacksonxie612@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-06-20 13:52:19 -07:00
Leonid Ganeline
4384c07320 docs: arxiv page, added cookbooks (#22215)
Issue: The `arXiv` page is missing the arxiv paper references from the
`langchain/cookbook`.
PR: Added the cookbook references.
Result: `Found 29 arXiv references in the 3 docs, 21 API Refs, 5
Templates, and 18 Cookbooks.` - many more references are visible now.
2024-06-20 13:52:18 -07:00
Leonid Ganeline
ea6a95bcbd ai21[patch]: added license (#22153)
The `pyproject.toml` was missing the `license` parameter. I've added it as
`MIT`.
2024-06-20 13:52:18 -07:00
Maddy Adams
18c4c730bb infra: update langchainhub and add integration test (#22154)
**Description:** Update langchainhub integration test dependency and add
an integration test for pulling private prompt
**Dependencies:** langchainhub 0.1.16
2024-06-20 13:52:18 -07:00
Will Higgins
cc6c3b3156 community[patch]: Update firecrawl api key name (#22183)
Change 'FIREWALL' to 'FIRECRAWL' as I believe this may have been in
error. Other docs refer to 'FIRECRAWL_API_KEY'.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:18 -07:00
hmasdev
3d869fc343 core[patch]: Add TypeError handler into get_graph of Runnable (#19856)
# Description

## Problem

`Runnable.get_graph` fails when `InputType` or `OutputType` property
raises `TypeError`.

-
003c98e5b4/libs/core/langchain_core/runnables/base.py (L250-L274)
-
003c98e5b4/libs/core/langchain_core/runnables/base.py (L394-L396)

This problem prevents getting a graph of `Runnable` objects whose
`InputType` or `OutputType` property raises `TypeError` but whose
`invoke` works well, such as `langchain.output_parsers.RegexParser`,
for which I already reported the `TypeError` in #19792.

## Solution

- Add `try-except` blocks that handle `TypeError` to the code that gets
`input_node` and `output_node`.
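
The approach can be sketched roughly as follows. `BrokenTypes` and `safe_input_type` are hypothetical names used for illustration; the real change lives in `Runnable.get_graph` and may differ in detail, including what it falls back to.

```python
from typing import Any, Optional


class BrokenTypes:
    """Toy object whose type property raises although invoke() works."""

    @property
    def InputType(self) -> Any:
        raise TypeError("cannot infer input type")

    def invoke(self, value: str) -> str:
        return value.strip()


class WorkingTypes:
    InputType = str


def safe_input_type(runnable: Any) -> Optional[Any]:
    # Fall back to None instead of letting TypeError abort graph building.
    try:
        return runnable.InputType
    except TypeError:
        return None


broken_result = safe_input_type(BrokenTypes())
working_result = safe_input_type(WorkingTypes())
```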

# Issue
- #19801 

# Twitter Handle
- [hmdev3](https://twitter.com/hmdev3)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:18 -07:00
acho98
150e14096a docs: Fix Clova embeddings example document (#22181)
- [ ] **PR title**: "Fix list handling in Clova embeddings example
documentation"
  - Description:
Fixes a bug in the Clova Embeddings example documentation where
document_text was incorrectly wrapped in an additional list.
   - Rationale
The embed_documents method expects a list, but the previous example
wrapped document_text in an unnecessary additional list, causing an
error. The updated example correctly passes document_text directly to
the method, ensuring it functions as intended.
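
A small sketch of why the extra wrapping fails. The `embed_documents` function below is a hypothetical stand-in that returns text lengths instead of vectors; it is not the Clova embeddings client, whose exact error may differ.

```python
from typing import List


def embed_documents(texts: List[str]) -> List[int]:
    # Hypothetical stand-in: expects a flat list of strings.
    for text in texts:
        if not isinstance(text, str):
            raise TypeError(f"expected str, got {type(text).__name__}")
    return [len(text) for text in texts]


document_text = ["First document.", "Second document."]

# Correct: pass the list directly.
lengths = embed_documents(document_text)

# Incorrect: wrapping the list again produces a list of lists.
try:
    embed_documents([document_text])
    wrapped_failed = False
except TypeError:
    wrapped_failed = True
```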
2024-06-20 13:52:18 -07:00
Mohammad Mohtashim
7d69ddcbdb mistralai[patch]: Added Json Mode for ChatMistralAI (#22213)
- **Description:** Powered
[ChatMistralAI.with_structured_output](fbfed65fb1/libs/partners/mistralai/langchain_mistralai/chat_models.py (L609))
via json mode
 

-  **Issue:** #22081

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:18 -07:00
Pranith
1c69742f93 docs : Added integrations for tools with langchain_community (#22188)
PR title: Docs enhancement

Description: Adding installation instructions for integrations requiring
langchain-community package since 0.2
Issue: https://github.com/langchain-ai/langchain/issues/22005
2024-06-20 13:52:18 -07:00
Ibrahim
37249e3619 Update llm_chain.ipynb text (#22198)
Added the missing verb "is" and a comma to the text in the Prompt
Templates description within the Build a Simple LLM Application tutorial
for more clarity.
2024-06-20 13:52:18 -07:00
Aditya
d2c5d74fe5 docs:updated documentation for llama, falcon and gemma on Vertex AI Model garden (#22201)
- **Description:** updated documentation for llama, falcon and gemma on
Vertex AI Model garden
    - **Issue:** NA
    - **Dependencies:** NA
    - **Twitter handle:** NA

@lkuligin for review

---------

Co-authored-by: adityarane@google.com <adityarane@google.com>
2024-06-20 13:52:18 -07:00
Pavlo Paliychuk
8258d31f46 community[minor]: Add Zep Cloud components + docs + examples (#21671)
Thank you for contributing to LangChain!

- [x] **PR title**: community: Add Zep Cloud components + docs +
examples

- [x] **PR message**: 
We have recently released our new zep-cloud sdks that are compatible
with Zep Cloud (not Zep Open Source). We have also maintained our Cloud
version of langchain components (ChatMessageHistory, VectorStore) as
part of our sdks. This PR's goal is to port these components to the langchain
community repo, and close the gap with the existing Zep Open Source
components already present in the community repo (added
ZepCloudMemory, ZepCloudVectorStore, ZepCloudRetriever).
Also added a ZepCloudChatMessageHistory component together with an
expression language example ported from our repo. We have left the
original open source components intact on purpose so as not to introduce
any breaking changes.
    - **Issue:** -
- **Dependencies:** Added optional dependency of our new cloud sdk
`zep-cloud`
    - **Twitter handle:** @paulpaliychuk51


- [x] **Add tests and docs**


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-06-20 13:52:18 -07:00
Jan Soubusta
7bb2098fda community[patch]: DuckDB VS - expose similarity, improve performance of from_texts (#20971)
3 fixes of DuckDB vector store:
- unify defaults in constructor and from_texts (users no longer have to
specify `vector_key`).
- include search similarity into output metadata (fixes #20969)
- significantly improve performance of `from_documents`

Dependencies: added Pandas to speed up `from_documents`.
I was thinking about CSV and JSON options, but I expect trouble loading
JSON values this way and also CSV and JSON options require storing data
to disk.
Anyway, the poetry file for langchain-community already contains a
dependency on Pandas.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-20 13:52:18 -07:00
Surya Pratap Singh Shekhawat
07fb011fcc Update agent_executor.ipynb (#22104)
fixed typos in the doc.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-20 13:52:18 -07:00
Erick Friis
a83794ba42 docs: edit links, direct for notebooks (#22051) 2024-06-20 13:52:18 -07:00
Erick Friis
f822c583a6 anthropic: release 0.1.14rc2, test release note gen (#22147) 2024-06-20 13:52:18 -07:00
Erick Friis
8d4dd5427b infra: auto-generated release notes based on git log (#22141)
Generates release notes based on a `git log` command with title names

Aiming to improve this by splitting out features vs. bugfixes using
conventional commits in the coming weeks.

Will work for any monorepo packages
2024-06-20 13:52:18 -07:00
Ameya Shenoy
4571e35334 community[minor]: clickhouse -- ability to use secure connection (#22108)
- **Description:** this PR gives the clickhouse client the ability to use a
secure connection to the clickhouse server
- **Issue:** fixes #22082
- **Dependencies:** -
- **Twitter handle:** `_codingcoffee_`

Signed-off-by: Ameya Shenoy <shenoy.ameya@gmail.com>
Co-authored-by: Shresth Rana <shresth@grapevine.in>
2024-06-20 13:52:18 -07:00
ccurme
a3c6c2b02d openai: read stream_options (#21548)
OpenAI recently added a `stream_options` parameter to its chat
completions API (see [release
notes](https://platform.openai.com/docs/changelog/added-chat-completions-stream-usage)).
When this parameter is set to `{"include_usage": True}`, an extra "empty"
message is added to the end of a stream containing token usage. Here we
propagate token usage to `AIMessage.usage_metadata`.

We enable this feature by default. Streams would now include an extra
chunk at the end, **after** the chunk with
`response_metadata={'finish_reason': 'stop'}`.

New behavior:
```
[AIMessageChunk(content='', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'),
 AIMessageChunk(content='Hello', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'),
 AIMessageChunk(content='!', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'),
 AIMessageChunk(content='', response_metadata={'finish_reason': 'stop'}, id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'),
 AIMessageChunk(content='', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde', usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17})]
```

Old behavior (accessible by passing `stream_options={"include_usage":
False}` into `(a)stream`):
```
[AIMessageChunk(content='', id='run-1312b971-c5ea-4d92-9015-e6604535f339'),
 AIMessageChunk(content='Hello', id='run-1312b971-c5ea-4d92-9015-e6604535f339'),
 AIMessageChunk(content='!', id='run-1312b971-c5ea-4d92-9015-e6604535f339'),
 AIMessageChunk(content='', response_metadata={'finish_reason': 'stop'}, id='run-1312b971-c5ea-4d92-9015-e6604535f339')]
```

From what I can tell this is not yet implemented in Azure, so we enable
only for ChatOpenAI.
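
A minimal sketch of how a consumer might pick up the trailing usage chunk. `Chunk` and `collect` are hypothetical stand-ins (not `AIMessageChunk` or any LangChain API), so the snippet runs without LangChain or OpenAI installed:

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class Chunk:
    """Hypothetical stand-in for AIMessageChunk (illustration only)."""

    content: str = ""
    usage_metadata: Optional[Dict[str, int]] = None


def collect(stream: Iterable[Chunk]) -> Tuple[str, Optional[Dict[str, int]]]:
    """Join the streamed text and keep the last usage payload seen, if any."""
    parts = []
    usage = None
    for chunk in stream:
        parts.append(chunk.content)
        if chunk.usage_metadata is not None:
            usage = chunk.usage_metadata
    return "".join(parts), usage


# Mirrors the "new behavior" above: the final chunk is empty but carries usage.
new_stream = [
    Chunk(""),
    Chunk("Hello"),
    Chunk("!"),
    Chunk(""),
    Chunk("", usage_metadata={"input_tokens": 8, "output_tokens": 9, "total_tokens": 17}),
]
text, usage = collect(new_stream)
```

Code that assumes the last chunk is the one carrying `finish_reason` may need a similar guard, since the usage chunk now arrives after it.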
2024-06-20 13:52:18 -07:00
Patrick Zhang
99b58c1439 docs: update the name of the tool passio_nutrition_ai (#22116)
Updating the name of the Passio Nutrition AI tool so that the name of
the tool is correctly displayed in the sidebar menu.

Currently the name of the tool says "Quickstart" in the side bar.
The patch fixed the name to be Passio Nutrition AI.

<img width="681" alt="image"
src="https://github.com/langchain-ai/langchain/assets/4603110/9609975e-78ea-4032-9024-10c4f838170a">
2024-06-20 13:52:18 -07:00
Leonid Ganeline
6b176fb6f7 docs: integrations/platforms/microsoft update (#22100)
Added the `Azure Container Apps dynamic sessions` tool reference
2024-06-20 13:52:18 -07:00
Rahul Triptahi
e666c48302 community[patch]: Put authorized identities behind a feature flag in SharepointLoader (#22125)
Description: Put authorised identities behind a feature flag, load_auth.
Documentation: N/A
Unit tests: N/A

---------

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2024-06-20 13:52:18 -07:00
Anindyadeep
594fd6b6ee docs: Update PremAI Docs (#22114)
Thank you for contributing to LangChain!

- [X] **PR title**: community: Updated langchain-community PremAI
documentation

- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-06-20 13:52:18 -07:00
sasha
fad96a8cee community: add metadata to chain logging; (#22122)
Hey, I'm Sasha. The SDK engineer from [Comet](https://comet.com).
This PR updates the CometTracer class.
Added metadata to CometTracer. From now on, both chains and spans will
send it.
2024-06-20 13:52:18 -07:00
Jirka Lhotka
8db705d65a community: Update costs of openai finetuned models (#22124)
- **Description:** Update costs of finetuned models and add
gpt-3.5-turbo-0125. Source: https://openai.com/api/pricing/
  - **Issue:** N/A
  - **Dependencies:** None
2024-06-20 13:52:18 -07:00
Eugene Yurtsev
2c210e87a8 community[major]: lint for usage of xml library (#22132)
* Lint for usage of standard xml library
* Add forced opt-in for quip client
* Actual security issue is with underlying QuipClient not LangChain
integration (since the client is doing the parsing), but adding
enforcement at the LangChain level.
2024-06-20 13:52:18 -07:00
Tom Aarsen
b578add91c docs: Add explanation on how to use Hugging Face embeddings (#22118)
- **Description:** I've added a tab on embedding text with LangChain
using Hugging Face models to here:
https://python.langchain.com/v0.2/docs/how_to/embed_text/. HF was
mentioned in the running text, but not in the tabs, which I thought was
odd.
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** No need, this is tiny :) 

Also, I had a ton of issues with the poetry docs/lint install, so I
haven't linted this. Apologies for that.

cc @Jofthomas 

- Tom Aarsen
2024-06-20 13:52:18 -07:00
Bagatur
d9b5df8543 anthropic[patch]: allow tool call mutation (#22130)
If tool_use blocks and tool_calls with overlapping IDs are present,
prefer the values of the tool_calls. Allows for mutating AIMessages just
via tool_calls.
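
The precedence rule can be sketched roughly as follows. `merge_tool_use_blocks` is a hypothetical helper written for illustration, not the function in `langchain-anthropic`, and the dict shapes are assumptions based on the `tool_use` block and `tool_calls` formats:

```python
from typing import Any, Dict, List


def merge_tool_use_blocks(
    content_blocks: List[Dict[str, Any]], tool_calls: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Return content blocks where tool_use entries take values from tool_calls with the same id."""
    calls_by_id = {call["id"]: call for call in tool_calls}
    merged = []
    for block in content_blocks:
        call = calls_by_id.get(block.get("id"))
        if block.get("type") == "tool_use" and call is not None:
            # Overlapping ID: the tool_calls entry wins (a new dict is built,
            # so the caller's block is left untouched).
            block = {**block, "name": call["name"], "input": call["args"]}
        merged.append(block)
    return merged


blocks = [
    {"type": "text", "text": "Checking the weather."},
    {"type": "tool_use", "id": "toolu_1", "name": "get_weather", "input": {"city": "SF"}},
]
# The caller edited only tool_calls, not the content blocks.
calls = [{"id": "toolu_1", "name": "get_weather", "args": {"city": "NYC"}}]
merged = merge_tool_use_blocks(blocks, calls)
```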
2024-06-20 13:52:18 -07:00
Christophe Bornet
097b234b4b doc: Add doc for CassandraByteStore (#22126)
Preview:
https://langchain-git-fork-cbornet-doc-cassandrabytestore-langchain.vercel.app/v0.2/docs/integrations/stores/cassandra/
2024-06-20 13:52:18 -07:00
Vadym Barda
dc7e1acb26 docs: improve how-to docs for message history (#22072)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:18 -07:00
Artem
abd83e876e docs: update hub.pull("rlm/map-prompt") to hub.pull("rlm/reduce-prompt") for reduce prompt (#22088)
**PR message**: 
Update `hub.pull("rlm/map-prompt")` to `hub.pull("rlm/reduce-prompt")`
in summarization.ipynb

**Description:** 
Fix typo in prompt hub link from `reduce_prompt =
hub.pull("rlm/map-prompt")` to `reduce_prompt =
hub.pull("rlm/reduce-prompt")` following next issue

**Issue:** #22014

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:18 -07:00
Leonid Ganeline
56a5ef94d1 docs: compact the API Reference links (#21285)
This PR is opinionated. 
Issue: the `API Reference` sections in the examples hold too much
vertical space and make us scroll the page too much. See an
[example](https://python.langchain.com/docs/get_started/quickstart/#conversation-retrieval-chain).
These sections are **important**. So, the compacting should not make
these sections less noticeable.
Change: compacting the `API Reference` sections. See the [same example
after change
applied](https://langchain-j6nya46lf-langchain.vercel.app/docs/get_started/quickstart/#conversation-retrieval-chain).
It is more compact and now looks like references (footnotes).
Note: I would also change the section style, so it would be more
noticeable (maybe to look like the footnotes. Smaller wider font?)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:18 -07:00
ccurme
9cd8f73ec5 groq: read tool calls from .tool_calls attribute (#22096) 2024-06-20 13:52:18 -07:00
Bagatur
1e4ab6ac0f docs: hf feat table tool calling (#22091) 2024-06-20 13:52:18 -07:00
Eugene Yurtsev
54e7a85179 codespell ignore remaining issues (#22097) 2024-06-20 13:52:18 -07:00
Eugene Yurtsev
889f8a55a0 docs: fix some spelling mistakes caught by newest version of code spell (#22090)
Going to merge this even though it doesn't pass all tests, and open a
separate PR for the remaining spelling mistakes.
2024-06-20 13:52:18 -07:00
Bagatur
eb337d8047 infra: api docs quick preview (#22093) 2024-06-20 13:52:18 -07:00
Pavel Zloi
40b4fb9670 community[minor]: ManticoreSearch engine added to vectorstore (#19117)
**Description:** ManticoreSearch engine added to vectorstores
**Issue:** no issue, just a new feature
**Dependencies:** https://pypi.org/project/manticoresearch-dev/
**Twitter handle:** @EvilFreelancer

- Example notebook with test integration:

https://github.com/EvilFreelancer/langchain/blob/manticore-search-vectorstore/docs/docs/integrations/vectorstores/manticore_search.ipynb

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:18 -07:00
Erick Friis
067ecbf4de cli: model name substitution fix, release 0.0.23 (#22089) 2024-06-20 13:52:18 -07:00
Kartheek Yakkala
f675417afa docs : Added integrations for tools with langchain_community (#22056)
- **PR title**:  Docs enhancement

- **Description:** Adding installation instructions for integrations
requiring `langchain-community` package since 0.2
    - **Issue:** https://github.com/langchain-ai/langchain/issues/22005
2024-06-20 13:52:18 -07:00
ccurme
e624a991a3 anthropic, openai: cut pre-releases (#22083) 2024-06-20 13:52:18 -07:00
ccurme
98b617c3e0 core: bump to 0.2.1rc (#22080) 2024-06-20 13:52:18 -07:00
Harrison Chase
8d068b8800 docs: add multi-modal-docs (#21734)
We don't really have any abstractions around multi-modal... so add a
section explaining we don't have any abstractions and then how-to guides
for openai and anthropic (probably need to add for more)

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: junefish <junefish@users.noreply.github.com>
Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:18 -07:00
ccurme
974ab2b7ae core, partners: add token usage attribute to AIMessage (#21944)
```python
class UsageMetadata(TypedDict):
    """Usage metadata for a message, such as token counts.

    Attributes:
        input_tokens: (int) count of input (or prompt) tokens
        output_tokens: (int) count of output (or completion) tokens
        total_tokens: (int) total token count
    """

    input_tokens: int
    output_tokens: int
    total_tokens: int
```
```python
class AIMessage(BaseMessage):
    ...
    usage_metadata: Optional[UsageMetadata] = None
    """If provided, token usage information associated with the message."""
    ...
```
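
As an illustration of how the new attribute might be consumed, the sketch below totals usage across several messages. `add_usage` is a hypothetical helper, and the `TypedDict` is redeclared locally so the snippet runs without `langchain-core`:

```python
from typing import List, Optional, TypedDict


class UsageMetadata(TypedDict):
    """Local copy of the shape shown above (illustration only)."""

    input_tokens: int
    output_tokens: int
    total_tokens: int


def add_usage(usages: List[Optional[UsageMetadata]]) -> UsageMetadata:
    """Sum token counts, skipping messages that carry no usage information."""
    total: UsageMetadata = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for usage in usages:
        if usage is None:  # usage_metadata is Optional on AIMessage
            continue
        total["input_tokens"] += usage["input_tokens"]
        total["output_tokens"] += usage["output_tokens"]
        total["total_tokens"] += usage["total_tokens"]
    return total


combined = add_usage(
    [
        {"input_tokens": 8, "output_tokens": 9, "total_tokens": 17},
        None,
        {"input_tokens": 20, "output_tokens": 5, "total_tokens": 25},
    ]
)
```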
2024-06-20 13:52:18 -07:00
Bagatur
eac2a8ab38 community[patch]: Release. 0.2.1 (#22073) 2024-06-20 13:52:18 -07:00
Bagatur
f182af8f0a langchain[patch]: Release 0.2.1 (#22074) 2024-06-20 13:52:18 -07:00
maang-h
03e0f8756b community[patch]: Update the default “API URL” and “MODEL” of sparkllm (#22070)
- **Description:** When I was running the sparkllm, I found that the
default parameters currently used could no longer run correctly.
    - original parameters & values:
         - spark_api_url: "wss://spark-api.xf-yun.com/v3.1/chat"
         - spark_llm_domain: "generalv3"
    ```python
    # example
    
    from langchain_community.chat_models import ChatSparkLLM
    
    spark = ChatSparkLLM(
        spark_app_id="my_app_id",
        spark_api_key="my_api_key",
        spark_api_secret="my_api_secret",
    )
    spark.invoke("hello")
    ```

![sparkllm](https://github.com/langchain-ai/langchain/assets/55082429/5369bfdf-4305-496a-bcf5-2d3f59d39414)

So I updated them to 3.5 (same as sparkllm official website). After the
update, they can be used normally.
    - new parameters & values:
         - spark_api_url: "wss://spark-api.xf-yun.com/v3.5/chat"
         - spark_llm_domain: "generalv3.5"
2024-06-20 13:52:18 -07:00
junkeon
5eed8fe6f8 upstage[patch] : fix error handling in Layout Analysis parser (#22054)
This pull request addresses and fixes exception handling in the
UpstageLayoutAnalysisParser and enhances the test coverage by adding
error exception tests for the document loader. These improvements ensure
robust error handling and increase the reliability of the system when
dealing with external API calls and JSON responses.

### Changes Made
1. Fix Request Exception Handling:

- Issue: The existing implementation of UpstageLayoutAnalysisParser did
not properly handle exceptions thrown by the requests library, which
could lead to unhandled exceptions and potential crashes.
- Solution: Added comprehensive exception handling for
requests.RequestException to catch any request-related errors. This
includes logging the error details and raising a ValueError with a
meaningful error message.

2. Add Error Exception Tests for Document Loader:

- New Tests: Introduced new test cases to verify the robustness of the
UpstageLayoutAnalysisLoader against various error scenarios. The tests
ensure that the loader gracefully handles:
- RequestException: Simulates network issues or invalid API requests to
ensure appropriate error handling and user feedback.
- JSONDecodeError: Simulates scenarios where the API response is not a
valid JSON, ensuring the system does not crash and provides clear error
messaging.
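
The JSON half of that handling can be sketched with the standard library alone. `parse_api_response` is a hypothetical helper, not the parser's actual method; the `requests.RequestException` case would presumably follow the same catch-and-re-raise shape:

```python
import json
from typing import Any


def parse_api_response(body: str) -> Any:
    """Decode a JSON response body, surfacing malformed payloads as ValueError."""
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        # Re-raise with a clearer message while keeping the original cause.
        raise ValueError(f"Failed to decode JSON response: {exc}") from exc


ok = parse_api_response('{"elements": []}')

try:
    parse_api_response("<html>Bad Gateway</html>")
    message = ""
except ValueError as exc:
    message = str(exc)
```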
2024-06-20 13:52:18 -07:00
JuHyung Son
623e200bca partner-upstage[patch]: embeddings empty list bug (#22057)
Fixed an error in `embed_documents` when the input was given as an empty
list. Also revised the documentation.
2024-06-20 13:52:18 -07:00
Martin Triska
22ec744b59 community[minor]: Added propagation of document metadata from O365BaseLoader (#20663)
**Description:**
- Added propagation of document metadata from O365BaseLoader to
FileSystemBlobLoader (O365BaseLoader uses FileSystemBlobLoader under the
hood).
- This is done by passing dictionary `metadata_dict`: key=filename and
value=dictionary containing document's metadata
- Modified `FileSystemBlobLoader` to accept the `metadata_dict`, use
`mimetype` from it (if available) and pass metadata further into blob
loader.

**Issue:**
- `O365BaseLoader` under the hood downloads documents to temp folder and
then uses `FileSystemBlobLoader` on it.
- However metadata about the document in question is lost in this
process. In particular:
- `mime_type`: `FileSystemBlobLoader` guesses `mime_type` from the file
extension, but that does not work 100% of the time.
- `web_url`: this is useful to keep around since in RAG LLM we might
want to provide link to the source document. In order to work well with
document parsers, we pass the `web_url` as `source` (`web_url` is
ignored by parsers, `source` is preserved)
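
A rough sketch of the lookup described above, using only the standard library. `resolve_mimetype` and the `"mimetype"` key are assumptions made for illustration; the real loader's field names may differ:

```python
import mimetypes
from typing import Any, Dict, Optional


def resolve_mimetype(
    filename: str, metadata_dict: Dict[str, Dict[str, Any]]
) -> Optional[str]:
    """Prefer the mimetype recorded for the file; fall back to guessing from the extension."""
    recorded = metadata_dict.get(filename, {}).get("mimetype")
    if recorded:
        return recorded
    return mimetypes.guess_type(filename)[0]


# key = filename, value = that document's metadata (hypothetical values)
metadata_dict = {
    "report.bin": {
        "mimetype": "application/pdf",
        "source": "https://example.com/report",
    }
}

from_metadata = resolve_mimetype("report.bin", metadata_dict)
guessed = resolve_mimetype("notes.txt", metadata_dict)
```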

**Dependencies:**
None

**Twitter handle:**
@martintriska1

Please review @baskaryan

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-20 13:52:17 -07:00
Eugene Yurtsev
6c3918aea1 community[patch]: Update doc-string in CloudBlobLoader (#22069)
Update doc-string
2024-06-20 13:52:17 -07:00
Maxime Perrin
f5603676ed docs : Adding correct imports to the integrations callbacks doc (#22059)
- **Description:** Adding correct imports to the integrations callbacks
doc (langchain-community package)
  - **Issue:** #22005

---------

Co-authored-by: Maxime Perrin <mperrin@doing.fr>
2024-06-20 13:52:17 -07:00
Philippe PRADOS
d93b231fac community[minor]: Add CloudBlobLoader that supports loading data from cloud buckets (#21957)
Thank you for contributing to LangChain!

- [ ] **PR title**: "Add CloudBlobLoader"
  - community: Add CloudBlobLoader

- [ ] **PR message**: Add cloud blob loader
    - **Description:** 
 Langchain provides several approaches to read different file formats:

Specific loaders (`CSVLoader`) or blob-compatible loaders
(`FileSystemBlobLoader`). The only implementation proposed for
BlobLoader is `FileSystemBlobLoader`.
      
Many projects retrieve files from cloud storage. We propose a new
implementation of `BlobLoader` to read files from the three cloud
storage systems. The interface is strictly identical to
`FileSystemBlobLoader`. The only difference is the constructor, which
takes a cloud "url" object such as `s3://my-bucket`, `az://my-bucket`,
or `gs://my-bucket`.
      
By streamlining the process, this novel implementation eliminates the
requirement to pre-download files from cloud storage to local temporary
files (which are seldom removed).
      
The code relies on the
[CloudPathLib](https://cloudpathlib.drivendata.org/stable/) library to
interpret cloud URLs. This has been added as an optional dependency.

```Python
loader = CloudBlobLoader("s3://mybucket/id")
for blob in loader.yield_blobs():
    print(blob)
```

- [X] **Dependencies:** CloudPathLib
- [X] **Twitter handle:** pprados


- [X] **Add tests and docs**: Add unit test, but it's easy to convert to
integration test, with some files in a cloud storage (see
`test_cloud_blob_loader.py`)

- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified.

Hello from Paris @hwchase17. Can you review this PR?

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-06-20 13:52:17 -07:00
Christophe Bornet
95d981ee3e community[minor]: Add Cassandra ByteStore (#22064) 2024-06-20 13:52:17 -07:00
Christophe Bornet
654920e205 community[minor]: Add async methods to CassandraChatMessageHistory (#21975) 2024-06-20 13:52:17 -07:00
Eugene Yurtsev
596b83357e docs: concepts callbacks fix admonition (#22048)
Correct the admonition text
2024-06-20 13:52:17 -07:00
Erick Friis
f45336bc62 docs: version increases (#22050) 2024-06-20 13:52:17 -07:00
Sky
28eb6b0cd3 community[patch]: surrealdb provide functions for MMR (Maximal Marginal Relevance) (#21185)
This PR contains 4 added functions:

- max_marginal_relevance_search_by_vector
- amax_marginal_relevance_search_by_vector
- max_marginal_relevance_search
- amax_marginal_relevance_search

I'm no langchain expert, but tried to inspect other vectorstore sources
like chroma to build these functions for SurrealDB. If someone has some
changes for me, please let me know. Otherwise I would be happy if these
changes are added to the repository, so that I can use the original repo
and not my local monkey patched version.
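
For readers unfamiliar with MMR, here is a generic pure-Python sketch of the technique. It is not the SurrealDB implementation; `mmr_select`, its defaults, and the toy vectors are assumptions made for illustration:

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
    k: int = 4,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Greedily pick up to k candidate indices, balancing relevance and diversity."""
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best, best_score = remaining[0], -math.inf
        for i in remaining:
            relevance = cosine(query, candidates[i])
            # Similarity to the closest already-selected item (0.0 when none yet).
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected), default=0.0
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
docs = [[1.0, 0.1], [1.0, 0.2], [1.0, -1.0]]  # docs[1] is a near-duplicate of docs[0]
diverse = mmr_select(query, docs, k=2, lambda_mult=0.5)
relevant_only = mmr_select(query, docs, k=2, lambda_mult=1.0)
```

With these toy vectors, `lambda_mult=0.5` skips the near-duplicate in favour of the more distinct third vector, while `lambda_mult=1.0` ranks purely by relevance.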

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:17 -07:00
Erick Friis
df669ba03a docs: add astream v2 migration guide links (#21845)
- docs: v0.2 version sidebar
- x
- x
2024-06-20 13:52:17 -07:00
Bruno Alvisio
5fe338f5dd community[patch]: Adding HEADER to the list of supported locations (#21946)
**Description:** adds headers to the list of supported locations when
generating the openai function schema
2024-06-20 13:52:17 -07:00
Bagatur
3e3b0e9721 infra: rm unused # noqa violations (#22049)
Updating #21137
2024-06-20 13:52:17 -07:00
acho98
bd88ae8255 community[minor]: Add Clova Embeddings for LangChain Community (#21890)
- [ ] **PR title**: "Add Naver ClovaX embedding to LangChain community"
- HyperClovaX is a large language model developed by
[Naver](https://clova-x.naver.com/welcome).
It's a powerful and purpose-trained LLM.

- You can visit the embedding service provided by
[ClovaX](https://www.ncloud.com/product/aiService/clovaStudio)

- You may get CLOVA_EMB_API_KEY, CLOVA_EMB_APIGW_API_KEY,
CLOVA_EMB_APP_ID From
https://www.ncloud.com/product/aiService/clovaStudio

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:17 -07:00
arpitkumar980
c5975a41e7 community[patch]: sharepoint loader identity enabled (#21176)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/


---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:17 -07:00
Eugene Yurtsev
4d2a1b7a5f docs: add admonitions to how-to callbacks (#22046)
Add admonitions with more information.
2024-06-20 13:52:17 -07:00
HuiyuanYan
7379ae8b66 community[patch]: Update tongyi.py to support MultimodalConversation in dashscope. (#21249)
Add support for multimodal conversation in dashscope. We can now use the
multimodal language models "qwen-vl-v1", "qwen-vl-chat-v1", and
"qwen-audio-turbo" to process pictures and audio. :)

- [ ] **PR title**: "community: add multimodal conversation support in
dashscope"



- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** add multimodal conversation support in dashscope
    - **Issue:** 
    - **Dependencies:** dashscope≥1.18.0
    - **Twitter handle:** none :)


- [ ] **How to use it?**:
   - ```python
     from langchain_community.chat_models import ChatTongyi

     # Assumes `api_key` holds a DashScope API key.
     tongyi_chat = ChatTongyi(
         top_p=0.5,
         dashscope_api_key=api_key,
         model="qwen-vl-v1",
     )
     response = tongyi_chat.invoke(
         input=[
             {
                 "role": "user",
                 "content": [
                     {
                         "image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"
                     },
                     {"text": "What is this?"},
                 ],
             }
         ]
     )
      ```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:17 -07:00
mochi
31918a1523 experimental[patch], docs: refine notebook for MyScale SelfQueryRetriever (#22016)
- **Description:** upgrade model to `gpt-4o`
2024-06-20 13:52:17 -07:00
MSubik
75a8cdab58 community[patch]: standardize init args, update for javelin sdk release. (#21980)
Related to
[20085](https://github.com/langchain-ai/langchain/issues/20085). Updated
the Javelin chat model to standardize the initialization arguments. Also
fixed an existing bug where the code made an incorrect call to
the JavelinClient defined in the javelin_sdk, resulting in an
initialization error. See related [Javelin
Documentation](https://docs.getjavelin.io/docs/javelin-python/quickstart).
2024-06-20 13:52:17 -07:00
Mohammad Mohtashim
a02cca4ba5 community[patch]: AzureSearchVectorStoreRetriever Fixed to account for search_kwargs (#21572)
- **Description:** Fixed `AzureSearchVectorStoreRetriever` to account
for search_kwargs. More explanation is in the mentioned issue.
- **Issue:** #21492

---------

Co-authored-by: MAC <mac@MACs-MacBook-Pro.local>
Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:17 -07:00
Klaudia Lemiec
40c81ce13f docs: Chroma docstrings update (#22001)
Thank you for contributing to LangChain!

- [X] **PR title**: "docs: Chroma docstrings update"


- [X] **PR message**: 
    - **Description:** Added and updated Chroma docstrings
    - **Issue:** https://github.com/langchain-ai/langchain/issues/21983


- [X] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
  - only docs


- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

2024-06-20 13:52:17 -07:00
Jerron Lim
a4a59f7c08 community[patch]: add args_schema to WikipediaQueryRun (#22019)
Description: This change adds args_schema (pydantic BaseModel) to
WikipediaQueryRun for correct schema formatting on LLM function calls

Issue: currently using WikipediaQueryRun with OpenAI function calling
returns the following error "TypeError: WikipediaQueryRun._run() got an
unexpected keyword argument '__arg1' ". This happens because the schema
sent to the LLM is "input: '{"__arg1":"Hunter x Hunter"}'" while the
method should be called with the "query" parameter.
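
The mismatch can be reproduced without LangChain. In this sketch `_run` and `call_tool` are hypothetical stand-ins for the tool method and the dispatch step, not the library's code:

```python
def _run(query: str) -> str:
    """Stand-in for WikipediaQueryRun._run, which expects a `query` argument."""
    return f"searching for {query}"


def call_tool(arguments: dict) -> str:
    """Call the tool with the keyword arguments produced by the model."""
    return _run(**arguments)


# Without an explicit schema, the single argument arrives as "__arg1" ...
error = ""
try:
    call_tool({"__arg1": "Hunter x Hunter"})
except TypeError as exc:
    error = str(exc)

# ... whereas a schema that names the field `query` matches the signature.
result = call_tool({"query": "Hunter x Hunter"})
```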

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:17 -07:00
Mazen Ramadan
ef511c54f7 community[minor]: Add Scrapfly Loader community integration (#22036)
Added [Scrapfly](https://scrapfly.io/) Web Loader integration. Scrapfly
is a web scraping API that allows extracting web page data into
accessible markdown or text datasets.

- __Description__: Added Scrapfly web loader for retrieving web page
data as markdown or text.
- Dependencies: scrapfly-sdk
- Twitter: @thealchemi1st

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:17 -07:00
Chad Juliano
37eb5334e7 docs: Use Kinetica Sql context API (#21993)
Update python notebook to use new Kinetica SQL context API.
2024-06-20 13:52:17 -07:00
ccurme
9ddd08c700 langchain, community: move OpenAIAssistantV2Runnable to community (#22044) 2024-06-20 13:52:17 -07:00
Mirna Wong
0a08aa582d docs: updates code examples in neo4j_cypher.ipynb (#21973)
Resolves #19134

Thank you for contributing to LangChain!

- [x ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** this pr replaces `title` with `name` in the [add
examples in cypher generation
prompt](https://python.langchain.com/v0.1/docs/integrations/graphs/neo4j_cypher/#add-examples-in-the-cypher-generation-prompt)
section.
    - **Issue:** 19134
    - **Dependencies:** any dependencies required for this change
    - **Twitter handle:** @mirna_wong
2024-06-20 13:52:17 -07:00
CaroFG
21b8c28d6c community[patch]: update for compatibility with Meilisearch v1.8 (#21979)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Updates Meilisearch vectorstore for compatibility
with v1.8. Adds [`"showRankingScore":
true`](https://www.meilisearch.com/docs/reference/api/search#ranking-score)
to the search parameters and replaces the `_semanticScore` field with
`_rankingScore`.


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

2024-06-20 13:52:17 -07:00
Oleksii Pokotylo
d0576a05c4 community[patch]: Extend AzureSearch with maximal_marginal_relevance, from_embeddings (#21065)
**Description:**
- Extend AzureSearch with `maximal_marginal_relevance` (for vector and
hybrid search)
- Add construction `from_embeddings` - if the user has already embedded
the texts
- Add `add_embeddings` 
- Refactor common parts (`_simple_search`, `_results_to_documents`,
`_reorder_results_with_maximal_marginal_relevance`)
- Add `vector_search_dimensions` as a parameter to the constructor to
avoid extra calls to `embed_query` (most of the time the user applies
the same model and knows the dimension)
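
For readers unfamiliar with maximal marginal relevance, here is a generic, dependency-free sketch of the re-ranking idea. It is not the AzureSearch implementation from this PR; the function names, the `lambda_mult` parameter name, and the toy vectors are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(
    query: Sequence[float],
    candidates: List[Sequence[float]],
    k: int = 2,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Greedily pick k candidate indices, trading relevance against redundancy."""
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(query, candidates[i])
            # Similarity to the closest already-selected candidate.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
print(mmr(query, candidates, k=2, lambda_mult=1.0))  # pure relevance
print(mmr(query, candidates, k=2, lambda_mult=0.3))  # favors diversity
```

With `lambda_mult=1.0` the ranking reduces to plain similarity search; lower values penalize candidates that are close to ones already selected.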

**Issue:** none
**Dependencies:** none

- [x] **Add tests and docs**: The docstrings have been added to the new
functions, and unified for the existing ones. The example notebook is
great in illustrating the main usage of AzureSearch, adding the new
methods would only dilute the main content.
- [x] **Lint and test**

---------

Co-authored-by: Oleksii Pokotylo <oleksii.pokotylo@pwc.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:17 -07:00
Erick Friis
31a35c991b docs: move feedback into paginator from content (#22041)
we only index what's in the `<article>` tags for search. We should not
have the feedback in the article.
2024-06-20 13:52:17 -07:00
SaschaStoll
2658eb8f17 community[patch]: Performant filter columns option for Hanavector (#21971)
**Description:** Backwards compatible extension of the initialisation
interface of HanaDB to allow the user to specify
specific_metadata_columns that are used for metadata storage of selected
keys which yields increased filter performance. Any not-mentioned
metadata remains in the general metadata column as part of a JSON
string. Furthermore switched to executemany for batch inserts into
HanaDB.
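
The idea can be sketched as follows, using the standard-library `sqlite3` module as a stand-in for HANA. The table layout, the `split_metadata` helper, and the choice to drop promoted keys from the JSON column are assumptions made for illustration, not the HanaDB implementation.

```python
import json
import sqlite3
from typing import Any, Dict, List, Tuple


def split_metadata(
    metadata: Dict[str, Any], specific_columns: List[str]
) -> Tuple[List[Any], str]:
    """Separate keys stored in dedicated columns from the rest (kept as JSON)."""
    dedicated = [metadata.get(col) for col in specific_columns]
    rest = {k: v for k, v in metadata.items() if k not in specific_columns}
    return dedicated, json.dumps(rest)


specific_columns = ["owner"]  # hypothetical key promoted to its own column
docs = [
    ("first text", {"owner": "alice", "page": 1}),
    ("second text", {"owner": "bob", "page": 2}),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (content TEXT, metadata TEXT, owner TEXT)")

rows = []
for text, metadata in docs:
    dedicated, rest_json = split_metadata(metadata, specific_columns)
    rows.append((text, rest_json, *dedicated))

# One batched call instead of one INSERT per document.
conn.executemany(
    "INSERT INTO docs (content, metadata, owner) VALUES (?, ?, ?)", rows
)

# Filtering on a real column avoids parsing the JSON string for each row.
filtered = conn.execute(
    "SELECT content FROM docs WHERE owner = ?", ("bob",)
).fetchall()
print(filtered)
```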

**Issue:** N/A

**Dependencies:** no new dependencies added

**Twitter handle:** @sapopensource

---------

Co-authored-by: Martin Kolb <martin.kolb@sap.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:17 -07:00
Bagatur
ecc91cb243 langchain[patch]: remove dataclasses-json dep (#22042)
vestigial dep afaict
2024-06-20 13:52:17 -07:00
Christos Boulmpasakos
bca214240b text-splitters[patch]: Extend TextSplitter:keep_separator functionality (#21130)
**Description:** Added extra functionality to the `CharacterTextSplitter`
and `TextSplitter` classes.
The user can choose to append the separator to the previous chunk with
`keep_separator='end'`, or else to prepend it to the next chunk.
Previously, the separator was prepended to the next chunk by default.
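
The two behaviours can be illustrated with a small standalone sketch. This is not the `TextSplitter` code from the PR; the function name and the use of `"start"` as the name of the prepend mode are assumptions.

```python
import re
from typing import List


def split_keep_separator(
    text: str, separator: str, keep_separator: str = "start"
) -> List[str]:
    """Split `text`, attaching each separator to the previous or next piece."""
    # A capturing group makes re.split return the separators as well:
    # "a.b.c" -> ["a", ".", "b", ".", "c"]
    parts = re.split(f"({re.escape(separator)})", text)
    if keep_separator == "end":
        # Pair each piece with the separator that follows it.
        chunks = [parts[i] + parts[i + 1] for i in range(0, len(parts) - 1, 2)]
        chunks.append(parts[-1])
    else:
        # Pair each separator with the piece that follows it.
        chunks = [parts[0]]
        chunks += [parts[i] + parts[i + 1] for i in range(1, len(parts) - 1, 2)]
    return [chunk for chunk in chunks if chunk]


print(split_keep_separator("a.b.c", ".", keep_separator="end"))
print(split_keep_separator("a.b.c", ".", keep_separator="start"))
```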
  
**Issue:** Fixes #20908

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:17 -07:00
Bagatur
a116fa163f docs: fix partner api ref build (#22007) 2024-06-20 13:52:17 -07:00
Eric Zhang
621e926b80 langchain: add RankLLM Reranker (#21171)
Integrate RankLLM reranker (https://github.com/castorini/rank_llm) into
LangChain

An example notebook is given in
`docs/docs/integrations/retrievers/rankllm-reranker.ipynb`

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-06-20 13:52:17 -07:00
Eugene Yurtsev
34815937eb concepts: update callback concepts (#22040)
Update callback concepts
2024-06-20 13:52:17 -07:00
maang-h
351cee2bd3 community: Fix CSVLoader columns is None (#20701)
- **Bug code**: In
langchain_community/document_loaders/csv_loader.py:100

- **Description**: Currently, when `CSVLoader` reads a column as `None`
in the CSV file, it raises an error because it does not verify that the
column is of type `str` and does not handle the corresponding `row_data`
when the column is `None`. This PR provides a solution.

- **Issue:**  Fix #20699 

- **Thinking:**

1. Follow the handling at
'langchain_community/document_loaders/csv_loader.py:100' for the case where
**`v`** equals `None`, and apply the same method to **`k`**.
(Per `csv.DictReader`, **`k`** is `None` only when
`len(columns) < len(number_row_data)`.)
2. **`k`** equals `None` only for the last column, and its corresponding
**`v`** is a list. Therefore, I followed the data format in `Document` and
used `,` to concatenate the elements of the list. (I'm not sure whether you
accept this form; if you have other ideas, let me know.)
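
The `csv.DictReader` behaviour referred to above can be reproduced with the standard library alone. The `format_row` helper below is a hypothetical sketch of the proposed handling (joining the surplus values with commas); it is not the `CSVLoader` code itself.

```python
import csv
import io

# The data row has more fields than the header, so DictReader stores the
# surplus values as a list under the key None (its default `restkey`).
data = "name,age\nalice,30,extra1,extra2\n"
row = next(csv.DictReader(io.StringIO(data)))
print(row)


def format_row(row: dict) -> str:
    """Render a row as 'key: value' lines, tolerating a None key."""
    lines = []
    for k, v in row.items():
        if k is None:
            # Surplus values arrive as a list; join them into one string.
            v = ",".join(v)
        key = k.strip() if k is not None else k
        value = v.strip() if isinstance(v, str) else v
        lines.append(f"{key}: {value}")
    return "\n".join(lines)


print(format_row(row))
```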

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:17 -07:00
Nithin James Padayatti
523eaa7701 langchain: added revision_example prompt template (#20916)
**Description:** Added a `revision_example` prompt template to include the
revision request and revision examples in the revision chain.
**Issue:** Not Applicable
**Dependencies:** Not Applicable
**Twitter handle:** @nithinjp09
2024-06-20 13:52:17 -07:00
Sihan Chen
36b1638d4f community[minor]: allow enabling proxy in aiohttp session in AsyncHTML (#19499)
Allow enabling proxy in aiohttp session async html
2024-06-20 13:52:17 -07:00
Eugene Yurtsev
f1d17b22ee community[patch]: Fix remaining __inits__ in community (#22037)
Fixes the __init__ files in community to use __all__ which is statically
defined.
2024-06-20 13:52:16 -07:00
Eugene Yurtsev
da7e201ab8 docs: update doc feedback to populate URL (#22033)
Update docfeedback to populate URL
2024-06-20 13:52:16 -07:00
Eugene Yurtsev
73d835f53f community[patch]: Add unit test to verify that init is correctly defined (#22030)
Fix some __init__ files and add a unit test
2024-06-20 13:52:16 -07:00
Erick Friis
3572e61526 robocorp: release 0.0.8 (#22034) 2024-06-20 13:52:16 -07:00
Eugene Yurtsev
0b2764d1eb ci: update documentation template to include URL (#22032)
update documentation template to include URL
2024-06-20 13:52:16 -07:00
Matthew Hoffman
610122057a community[patch]: fix public interface for embeddings module (#21650)
## Description

The existing public interface for `langchain_community.embeddings` is
broken. In this file, `__all__` is statically defined, but is
subsequently overwritten with a dynamic expression, which type checkers
like pyright do not support. pyright actually gives the following
diagnostic on the line I am requesting we remove:


[reportUnsupportedDunderAll](https://github.com/microsoft/pyright/blob/main/docs/configuration.md#reportUnsupportedDunderAll):

```
Operation on "__all__" is not supported, so exported symbol list may be incorrect
```

Currently, I get the following errors when attempting to use publicly
exported classes in `langchain_community.embeddings`:

```python
import langchain_community.embeddings

langchain_community.embeddings.HuggingFaceEmbeddings(...)  #  error: "HuggingFaceEmbeddings" is not exported from module "langchain_community.embeddings" (reportPrivateImportUsage)
```

This is solved easily by removing the dynamic expression.
2024-06-20 13:52:16 -07:00
Maxime Perrin
336cabb25a docs : Integrations vector stores with langchain-community install (#22028)
- **Description:** Adding installation instruction for integrations
requiring `langchain-community` package since 0.2
  - **Issue:** #22005

---------

Co-authored-by: Maxime Perrin <mperrin@doing.fr>
2024-06-20 13:52:16 -07:00
Eugene Yurtsev
85c2d5d189 community[patch]: Clean up logic in import checking unit test (#22026)
Clean up unit test
2024-06-20 13:52:16 -07:00
Tomaz Bratanic
036f1f8923 community[patch]: Handle exceptions where node props aren't consistent in neo4j schema (#22027) 2024-06-20 13:52:16 -07:00
WeichenXu
ac6d99ac72 community[patch]: Fix ChatDatabricsk in case that streaming response doesn't have role field in delta chunk (#21897)

**Description:**
Fix ChatDatabricks for the case where a streaming response has no `role`
field in the delta chunk.



---------

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2024-06-20 13:52:16 -07:00
Eugene Yurtsev
5836c67d11 community[patch]: Add unit test to catch bad __all__ definitions (#21996)
This will catch all dynamic __all__ definitions.
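
One way such a check could work is to inspect a module's source with `ast` and accept `__all__` only when it is a literal list or tuple of strings. The sketch below is an assumption about the general approach, not the unit test added in this PR; `has_static_all` and the sample sources are hypothetical.

```python
import ast


def has_static_all(source: str) -> bool:
    """Return True if `__all__` is defined only as a literal list/tuple of strings."""
    found = False
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            targets, value = node.targets, node.value
        elif isinstance(node, ast.AugAssign):
            # `__all__ += ...` is treated as dynamic here.
            targets, value = [node.target], None
        else:
            continue
        if not any(isinstance(t, ast.Name) and t.id == "__all__" for t in targets):
            continue
        found = True
        is_literal = isinstance(value, (ast.List, ast.Tuple)) and all(
            isinstance(elt, ast.Constant) and isinstance(elt.value, str)
            for elt in value.elts
        )
        if not is_literal:
            return False
    return found


STATIC = '__all__ = ["A", "B"]\n'
DYNAMIC = '__all__ = ["A"]\n__all__ = list(_module_lookup.keys())\n'
print(has_static_all(STATIC), has_static_all(DYNAMIC))
```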
2024-06-20 13:52:16 -07:00
Brian Thorne
9a5357770a docs: Update import in wikipedia tool documentation (#21565)
Updates docs so the example doesn't lead to a warning:
```
LangChainDeprecationWarning: Importing tools from langchain is deprecated. Importing from langchain will no longer be supported as of langchain==0.2.0. Please import from langchain-community instead:

`from langchain_community.tools import WikipediaQueryRun`.

To install langchain-community run `pip install -U langchain-community`.
```
2024-06-20 13:52:16 -07:00
Bagatur
de3a9cb3c9 core[patch]: Release 0.2.1 (#22003) 2024-06-20 13:52:16 -07:00
Kefan You
7ae0152f12 community[patch]: raise_for_status logic missing in async _fetch of WebBaseLoader (#21948)
## The `raise_for_status` parameter of WebBaseLoader works in sync load but not in async load
In `WebBaseLoader`:

Sync load calls `_scrape`, which handles `raise_for_status` properly.
```
    def _scrape(
        self,
        url: str,
        parser: Union[str, None] = None,
        bs_kwargs: Optional[dict] = None,
    ) -> Any:
        from bs4 import BeautifulSoup

        if parser is None:
            if url.endswith(".xml"):
                parser = "xml"
            else:
                parser = self.default_parser

        self._check_parser(parser)

        html_doc = self.session.get(url, **self.requests_kwargs)
        if self.raise_for_status:
            html_doc.raise_for_status()

        if self.encoding is not None:
            html_doc.encoding = self.encoding
        elif self.autoset_encoding:
            html_doc.encoding = html_doc.apparent_encoding
        return BeautifulSoup(html_doc.text, parser, **(bs_kwargs or {}))
```
Async load calls `_fetch`, which is missing the `raise_for_status` logic.
```
    async def _fetch(
        self, url: str, retries: int = 3, cooldown: int = 2, backoff: float = 1.5
    ) -> str:
        async with aiohttp.ClientSession() as session:
            for i in range(retries):
                try:
                    async with session.get(
                        url,
                        headers=self.session.headers,
                        ssl=None if self.session.verify else False,
                        cookies=self.session.cookies.get_dict(),
                    ) as response:
                        return await response.text()
```
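
A minimal sketch of the missing check, assuming the response object exposes a synchronous `raise_for_status()` and an awaitable `text()`, as `aiohttp` responses do. The `FakeResponse` class and the `read_body` helper are hypothetical stand-ins so the example runs without network access; this is not the patch applied to `_fetch`.

```python
import asyncio


class FakeResponse:
    """Stand-in for an aiohttp response (hypothetical, for illustration only)."""

    def __init__(self, status: int, body: str) -> None:
        self.status = status
        self.body = body

    def raise_for_status(self) -> None:
        if self.status >= 400:
            raise RuntimeError(f"HTTP {self.status}")

    async def text(self) -> str:
        return self.body


async def read_body(response, raise_for_status: bool) -> str:
    """Mirror the sync path: optionally raise on HTTP errors before reading."""
    if raise_for_status:
        response.raise_for_status()
    return await response.text()


print(asyncio.run(read_body(FakeResponse(200, "ok"), raise_for_status=True)))
```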

Co-authored-by: kefan.you <darkfss@sina.com>
2024-06-20 13:52:16 -07:00
Mateusz Szewczyk
9ad6ba7a24 docs: update IBM WatsonxLLM docs with deprecated LLMChain (#21960)
- **Description:** update IBM WatsonxLLM docs with deprecated LLMChain

2024-06-20 13:52:16 -07:00
Surya Rath
f418094534 OpenAI Assistants v2 api support for OpenAIAssistantRunnable (#21484)
**Title**: "langchain: OpenAI Assistants v2 api support"

***Descriptions*** 
- [x] "attachments" support added along with backward compatibility of
"file_ids"
- [x]  "tool_resources" support added while creating new assistant

- [ ] "tool_choice" parameter support
- [ ]  Streaming support


- **Dependencies:** OpenAI v2 API (openai>=1.23.0)
- **Twitter handle:** @skanta_rath

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-20 13:52:16 -07:00
Eugene Yurtsev
cd034d6378 langchain[patch]: Add unit test to detect changes to community imports (#21998)
Add unit tests for community imports
2024-06-20 13:52:16 -07:00
Eugene Yurtsev
d2647fd012 langchain[patch]: Turn on all deprecations for 0.2 (#21999)
- Turn on all 0.2 import deprecations.
- Update error message with URL to upgrade instructions.
2024-06-20 13:52:16 -07:00
Asaf Joseph Gardin
9d69350f46 ai21: AI21 Jamba docs (#21978)
- Updated docs to have an example to use Jamba instead of J2

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:16 -07:00
Pengcheng Liu
2383345a52 community[patch]: Update model client to support vision model in Tong… (#21474)
- **Description:** Tongyi uses different clients for its chat model and
vision model. This PR chooses the proper client based on the model name to
support both. See the [Tongyi
documentation](https://help.aliyun.com/zh/dashscope/developer-reference/tongyi-qianwen-vl-plus-api?spm=a2c4g.11186623.0.0.27404c9a7upm11)
for details.

```python
from langchain_core.messages import HumanMessage
from langchain_community.chat_models import ChatTongyi

llm = ChatTongyi(model_name='qwen-vl-max')
image_message = {
    "image": "https://lilianweng.github.io/posts/2023-06-23-agent/agent-overview.png"
}
text_message = {
    "text": "summarize this picture",
}
message = HumanMessage(content=[text_message, image_message])
llm.invoke([message])
```

- **Issue:** None
- **Dependencies:** None
- **Twitter handle:** None
2024-06-20 13:52:16 -07:00
Erick Friis
e34d98101c infra: only tag core releases as github latest (#21991) 2024-06-20 13:52:16 -07:00
Sevin F. Varoglu
eb2d9956ae community[patch]: update OctoAIEmbeddings to subclass OpenAIEmbeddings (#21805) 2024-06-20 13:52:16 -07:00
Eugene Yurtsev
74a5c92c22 core[patch]: Add unit test for RunnableGenerator for eventstream v2 (#21990)
No unit tests with runnable generator
2024-06-20 13:52:16 -07:00
Nuno Campos
d8253d2aec core[patch]: In astream_events(version=v2) tap output of root run (#21977)
- if tap_output_iter/aiter is called multiple times for the same run
issue events only once
- if chat model run is tapped don't issue duplicate on_llm_new_token
events
- if first chunk arrives after run has ended do not emit it as a stream
event

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-20 13:52:16 -07:00
Bagatur
4b7601c836 community[patch]: AzureSearch dont overwrite default async (#21989) 2024-06-20 13:52:16 -07:00
ccurme
3d214c4aba docs: set default anthropic model (#21988)
`ChatAnthropic()` raises ValidationError.
2024-06-20 13:52:16 -07:00
Muhammed Al-Dulaimi
9accf00ea4 Fix grammar error (#21985)
2024-06-20 13:52:16 -07:00
ccurme
d8fad897e3 Revert "anthropic: set default model" (#21987)
Reverts langchain-ai/langchain#21986
2024-06-20 13:52:16 -07:00
ccurme
f2576ee717 anthropic: set default model (#21986)
Various docs reference `ChatAnthropic()`, but this currently raises
ValidationError.
2024-06-20 13:52:16 -07:00
ccurme
acfbaf78be langchain: default to Runnable in MultiQueryRetriever (#21770)
- `llm_chain` becomes `Union[LLMChain, Runnable]`
- `.from_llm` creates a runnable

tested by verifying that docs/how_to/MultiQueryRetriever.ipynb runs
unchanged with sync/async invoke (and that it runs if we specifically
instantiate with LLMChain).
2024-06-20 13:52:16 -07:00
Yulong Wang
d34887543b community[patch]: Fix typo in arxiv tool's doc (#21970)
Fix typo in arxiv tool's doc
2024-06-20 13:52:16 -07:00
Robert Caulk
75b19393a7 community[minor]: add AskNews retriever and AskNews tool (#21581)
We add a tool and retriever for the [AskNews](https://asknews.app)
platform with example notebooks.

The retriever can be invoked with:

```py
from langchain_community.retrievers import AskNewsRetriever

retriever = AskNewsRetriever(k=3)

retriever.invoke("impact of fed policy on the tech sector")
```

This retrieves 3 news documents related to Fed policy impacts on the tech
sector. The included notebook also goes into more detail on controlling
filters such as category and time, and on including the retriever in a
chain.

The tool is quite interesting, as it allows the agent to decide how to
obtain the news by forming a query and deciding how far back in time to
look for the news:

```py
from langchain_community.tools.asknews import AskNewsSearch
from langchain import hub
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_openai import ChatOpenAI

instructions = """You are an assistant."""
base_prompt = hub.pull("langchain-ai/openai-functions-template")
prompt = base_prompt.partial(instructions=instructions)
llm = ChatOpenAI(temperature=0)
asknews_tool = AskNewsSearch()
tools = [asknews_tool]
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
)

agent_executor.invoke({"input": "How is the tech sector being affected by fed policy?"})
```

---------

Co-authored-by: Emre <e@emre.pm>
2024-06-20 13:52:16 -07:00
Jesse S
1022029dc1 community[minor]: add aerospike vectorstore integration (#21735)
Please let me know if you see any possible areas of improvement. I would
very much appreciate your constructive criticism if time allows.

**Description:**
- Added an Aerospike vector store integration that uses the
[Aerospike-Vector-Search](https://aerospike.com/products/vector-database-search-llm/)
add-on.
- Added both unit tests and integration tests
- Added a docker compose file for spinning up a test environment
- Added a notebook

**Dependencies:**
- aerospike-vector-search

 **Twitter handle:** 
- No twitter, you can use my GitHub handle or LinkedIn if you'd like

Thanks!

---------

Co-authored-by: Jesse Schumacher <jschumacher@aerospike.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:16 -07:00
Prince Canuma
ad48f77e57 community[patch]: Fix MLX LLM Stream (#20575)
Closes #20561

This PR fixes MLX LLM stream `AttributeError`. 

Recently, `mlx-lm` changed the token decoding logic, which affected the
LC+MLX integration.

Additionally, I made minor fixes, such as repairing a broken link in the
docs example and enforcing pipeline arguments (`max_tokens`, `temp`, etc.)
for invoke.
   
- **Issue:** #20561
    
- **Twitter handle:** @Prince_Canuma
2024-06-20 13:52:16 -07:00
Rahul Triptahi
da329c58a9 community[patch]: Remove redundant pebblo cloud api call (#21589)
Description: removed redundant pebblo cloud api call. Changed classified
`doc` key to `ai_apps_data`.
Documentation: N/A
Unit tests: N/A
2024-06-20 13:52:16 -07:00
Param Singh
f650d92a8f community[patch]: standardized sparkllm init args (#21633)
Related to #20085 
@baskaryan 

community:sparkllm[patch]: standardized init args

Updated `spark_api_key` so that it is aliased to `api_key`. Added an
integration test for `sparkllm` to check that it continues to set the same
underlying attribute.

Updated `temperature` to use a Pydantic `Field` and added it to the
integration test.

Ran `make format`,`make test`, `make lint`, `make spell_check`
2024-06-20 13:52:16 -07:00
Dhruv Chawla
3ccf045076 community[patch]: Update UpTrain Callback Handler to support the new UpTrain evaluation schema (#21656)
UpTrain has a new dashboard now that makes it easier to view projects
and evaluations. Using this requires specifying both project_name and
evaluation_name when performing evaluations. I have updated the code to
support it.
2024-06-20 13:52:16 -07:00
Alex Riina
6bb6a82925 openai[patch], community[patch]: add pricing and max context window for GPT-4o (#21673)
# Add pricing and max context window for GPT-4o
- community: add cost per 1k tokens and max context window
- partners: add max context window

**Description:** adds static information about GPT-4o based on
https://openai.com/api/pricing/ and
https://platform.openai.com/docs/models/gpt-4o so that GPT-4o reporting
is accurate.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:16 -07:00
缨缨
d58ca68342 community: enable SupabaseVectorStore to support extended table fields (#21762)
- Added extension fields to the `_add_vectors` function so that users can
add other custom fields when inserting a record into the database, e.g.:
    

![image](https://github.com/langchain-ai/langchain/assets/10885578/e1d5ca20-936e-4cab-ba69-8fdd23b8ce8f)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:16 -07:00
Jerome Choo
4739728622 docs: Clean up Diffbot docs (#21781)
The Diffbot DocumentLoader page doesn't actually run for a number of
reasons. This PR fixes it along with some light details on the Graph
Transformer and Provider pages.

## Full Changelog

[Document Loader
Page](https://python.langchain.com/v0.1/docs/integrations/document_loaders/diffbot/)
* Fixed the notebook so that it actually runs (missing required modules,
env variables, etc.)
* Added "open in colab" button like the Graph Transformer page

[Graph Transformer
Page](https://python.langchain.com/v0.2/docs/integrations/graphs/diffbot/)
* Fixed broken colab link
* Moved "open in colab" button to below description so the description
in the [Graphs category
page](https://python.langchain.com/v0.2/docs/integrations/graphs/) shows
up correctly

[Provider
Page](https://python.langchain.com/v0.2/docs/integrations/providers/diffbot/)
* Clarified explanations of Diffbot products
* Added section and link to LangChain Graph Transformer page

---------

Co-authored-by: jeromechoo <hello@jeromechoo.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:16 -07:00
Rohan Aggarwal
0585816e9d docs: updates for OracleDB (#21745)
Documentation change for OracleDB

Fixed several things in Oracle Documentation.
2024-06-20 13:52:16 -07:00
Leonid Ganeline
fc94d85108 docs: YouTube page update (#21780)
Greatly simplified to get a cleaner look.
Only the YouTube pages with 40K+ views.
2024-06-20 13:52:16 -07:00
Leonid Ganeline
d5f2dedb46 ai21[patch]: configuration fix (#21790)
Added the "repository" and "Source Code" parameters (these parameters were
missing only in this partner package configuration).
2024-06-20 13:52:16 -07:00
Trayan Azarov
6887ae85f2 chroma[patch]: Chroma - remove reference to collection upon delete_collection (#21817)
**Description**:

- The reference to the `Collection` object is set to `None` when deleting a
collection with `delete_collection()`
- Added utility method `reset_collection()` to allow recreating the
collection
- Moved collection creation out of `__init__` into
`__ensure_collection()` to be reused by object init and
`reset_collection()`
- `_collection` is now a property to avoid breaking changes
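
The pattern described in this list can be sketched with an in-memory stand-in for the Chroma client. Everything below (`FakeClient`, `Store`, the single-underscore `_ensure_collection`, and raising `ValueError` after deletion) is a hypothetical illustration and not the `langchain-chroma` code.

```python
class FakeClient:
    """Tiny in-memory stand-in for a Chroma client (hypothetical)."""

    def __init__(self) -> None:
        self.collections = {}

    def get_or_create_collection(self, name: str) -> dict:
        return self.collections.setdefault(name, {"name": name, "items": []})

    def delete_collection(self, name: str) -> None:
        self.collections.pop(name, None)


class Store:
    def __init__(self, client: FakeClient, collection_name: str) -> None:
        self._client = client
        self._collection_name = collection_name
        self._chroma_collection = None
        self._ensure_collection()

    def _ensure_collection(self) -> None:
        # Shared by __init__ and reset_collection().
        self._chroma_collection = self._client.get_or_create_collection(
            self._collection_name
        )

    @property
    def _collection(self) -> dict:
        # Keeping `_collection` as a property preserves the old attribute name.
        if self._chroma_collection is None:
            raise ValueError("Collection was deleted; call reset_collection().")
        return self._chroma_collection

    def delete_collection(self) -> None:
        self._client.delete_collection(self._collection_name)
        self._chroma_collection = None  # drop the stale reference

    def reset_collection(self) -> None:
        self.delete_collection()
        self._ensure_collection()


store = Store(FakeClient(), "demo")
store._collection["items"].append("doc")
store.reset_collection()
print(store._collection)
```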

**Issues**: 

- chroma-core/chroma#2213

**Twitter**: @t_azarov
2024-06-20 13:52:16 -07:00
Jens
37b545b216 community[patch]: fixed aleph alpha default emedding request (#21826)
- **Description:** In the Aleph Alpha client, the parameter `normalize`
is *not* optional. Setting it to `None` raises an error.
- **Dependencies:** None

Co-authored-by: Jens Lücke <jens.luecke@tngtech.com>
Co-authored-by: Jens <jens.luecke@hu-berlin.de>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:16 -07:00
Leonid Ganeline
8841174ca0 docs: added template to arxiv page (#21846)
Updated the `arXiv` page with the arXiv references from Templates
(previously it had references from Docs and API Refs, but not Templates).
Re #21450 
CC @eyurtsev
2024-06-20 13:52:16 -07:00
Jorge Piedrahita Ortiz
4602528337 community[patch]: Sambanova integration api update (#21848)
- **Description:**
    - Added SambaStudio generic endpoint compatibility
    - Improved error descriptions and handling
    - Added streaming examples
2024-06-20 13:52:16 -07:00
Bagatur
ce7ad06029 docs: correct langserve link (#21940) 2024-06-20 13:52:16 -07:00
Michael Reed
0c8a225f99 core[patch]: Fix NPE in function_calling._get_python_function_required_args (#21863)
Example error message:
line 206, in _get_python_function_required_args
    if is_function_type and required[0] == "self":
                            ~~~~~~~~^^^
IndexError: list index out of range
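
The error above arises when `required` is empty, for example for a function that takes no arguments. A minimal sketch of a guarded version follows; `required_args` is a hypothetical reconstruction for illustration and may differ from the actual `_get_python_function_required_args`.

```python
import inspect
from typing import Callable, List


def required_args(function: Callable) -> List[str]:
    """Return names of arguments without defaults, skipping a leading `self`."""
    spec = inspect.getfullargspec(function)
    required = spec.args[: -len(spec.defaults)] if spec.defaults else list(spec.args)
    required += [k for k in spec.kwonlyargs if k not in (spec.kwonlydefaults or {})]

    is_function_type = inspect.isfunction(function)
    # Guard: check that `required` is non-empty before indexing into it.
    if is_function_type and required and required[0] == "self":
        required = required[1:]
    return required


def no_args():
    return None


def mixed(a, b=1, *, c, d=2):
    return None


class Greeter:
    def greet(self, name):
        return name


print(required_args(no_args))
print(required_args(mixed))
print(required_args(Greeter.greet))
```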


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:16 -07:00
Liuww
f1c2e56ef0 community[patch]: Adopting the lighter-weight xinference_client (#21900)
While integrating the xinference_embedding, we observed that the
downloaded dependency package is quite substantial in size. With a focus
on resource optimization and efficiency, if the project requirements are
limited to its vector processing capabilities, we recommend migrating to
the xinference_client package. This package is more streamlined,
significantly reducing the storage space requirements of the project and
maintaining a feature focus, making it particularly suitable for
scenarios that demand lightweight integration. Such an approach not only
boosts deployment efficiency but also enhances the application's
maintainability, rendering it an optimal choice for our current context.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:15 -07:00
Tomaz Bratanic
73b5d5e43f experimental[patch]: Pass enum only to openai in llm graph transformer (#21860)
Some providers, such as Groq, return a bad request error if you pass the
`enum` parameter in a tool definition.
2024-06-20 13:52:15 -07:00
Ozan Kaşıkçı
dfb3239114 docs: Update agents.ipynb, add missing word "see" (#21872)
- **Description:** Add the missing word "see" in the docs.
2024-06-20 13:52:15 -07:00
Jiří Spilka
759ecdf825 community[patch]: update apify integration to attribute API activity to langchain (#21909)
**Description:** Add `Origin/langchain` to Apify's client's user-agent
to attribute API activity to LangChain (at Apify, we aim to monitor our
integrations to evaluate whether we should invest more in the LangChain
integration regarding functionality and content)

**Issue:** None
**Dependencies:** None
**Twitter handle:** None
2024-06-20 13:52:15 -07:00
Mohammad Mohtashim
928c3c15c6 docs: HuggingFace Endpoint Documentation Fixed (#21914)
Fixed Documentation for HuggingFaceEndpoint as per the issue #21903

---------

Co-authored-by: keenborder786 <mohammad.mohtashim78@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:15 -07:00
Jared Van Bortel
e600560ddc nomic: implement local embeddings with the inference_mode parameter (#21934)
## Description

This PR implements local and dynamic mode in the Nomic Embed integration
using the inference_mode and device parameters. They work as documented
[here](https://docs.nomic.ai/reference/python-api/embeddings#local-inference).


---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-06-20 13:52:15 -07:00
ccurme
2341e5115c infra: fix CI on text-splitters (#21935) 2024-06-20 13:52:15 -07:00
Ozan Kaşıkçı
7eb001a5dc docs: how to: tool calling: Fix typo in sentence (#21877)
- **Description:** Fix grammar error.
2024-06-20 13:52:15 -07:00
Erick Friis
cb4c14721d docs: rewrite old home, fix v0.1 infinite redirect (#21936) 2024-06-20 13:52:15 -07:00
Bagatur
50aa0bd524 docs: link to langsmith+langgraph docs (#21930) 2024-06-20 13:52:15 -07:00
ccurme
f93e3e0892 update maintainers (#21305) 2024-06-20 13:52:15 -07:00
ccurme
6c80a33d42 partners: bump core in packages implementing ls_params (#21868)
These packages all import `LangSmithParams` which was released in
langchain-core==0.2.0.

N.B. we will need to release `openai` and then bump `langchain-openai`
in `together` and `upstage`.
2024-06-20 13:52:15 -07:00
junefish
8fd552de88 docs: update notebook for latest Pinecone API + serverless (#21921)
The published notebook is incompatible with the latest `pinecone-client`
and is not runnable. Updated it for use with the latest Pinecone Python
SDK and made it compatible with serverless indexes (the only index type
available on the Pinecone free tier).

Tested in Colab.


2024-06-20 13:52:15 -07:00
ccurme
e11c962122 mistral: implement ls_params (#21867) 2024-06-20 13:52:15 -07:00
junefish
e827d9048f docs: update notebook for new Pinecone API + serverless (#21923)
The published notebook is not runnable after `pinecone-client` v2, which
is deprecated. `langchain-pinecone` is not compatible with the latest
`pinecone-client` (v4), so the notebook pins it to the last v3 release.
Also updated for serverless indexes (the only index type available on
the Pinecone free plan).

Tested in Colab.


2024-06-20 13:52:15 -07:00
Eugene Yurtsev
3af3772667 docs: migrate integrations using langchain-cli (#21929)
Migrate integration docs
2024-06-20 13:52:15 -07:00
Eugene Yurtsev
22d436cf6d docs: migrate tutorials using langchain-cli migrate (#21928)
Migrate tutorials
2024-06-20 13:52:15 -07:00
Eugene Yurtsev
a0a5556c96 docs: run migration script against how-to docs (#21927)
Upgrade imports in how-to docs
2024-06-20 13:52:15 -07:00
Tomaz Bratanic
eac3f09430 community[patch]: Better error message for neo4j vector when text is null (#21861) 2024-06-20 13:52:15 -07:00
Stefano Lottini
a2294da9c1 cli[minor]: fix import path for two Astra DB classes in the migration json data (#21926)
This PR fixes two mistakes in the import paths from community for the
json data aiding the cli migration to 0.2.

It is intended as a quick follow-up to
https://github.com/langchain-ai/langchain/pull/21913 .

@nicoloboschi FYI
2024-06-20 13:52:15 -07:00
WilliamEspegren
5bc7e8e5a0 doc list not empty (#21208)
Make sure the doc list is not empty, and set Metadata: true in the
params so that the user can disable metadata for slightly faster crawls.
2024-06-20 13:52:15 -07:00
David Charles
db5404c365 langchain[minor]: add libs/partners to dev.Dockerfile (#21902)
Resolves #21886 by adding "COPY libs/partners ../partners/" to
libs/dev.Dockerfile

Twitter: @kabakongo
2024-06-20 13:52:15 -07:00
Eugene Yurtsev
80a11dad44 docs: update how to install (#21920)
Fix installation instructions in how-to install
2024-06-20 13:52:15 -07:00
TJ
48741e618a community[patch]: Update documentation string in databricks chat model (#21915)
Update typos in documentation string in databricks chat model
2024-06-20 13:52:15 -07:00
Maxime Perrin
ea80294730 docs: fix wrong langchain-cli migration commands (#21906)
Co-authored-by: Maxime Perrin <mperrin@doing.fr>
2024-06-20 13:52:15 -07:00
Nicolò Boschi
58f756471d cli[minor]: add astradb in the cli migration to 0.2 (#21913)
astradb has a new partner package, but the automatic migration cli tool
doesn't take care of migrating astradb integrations
2024-06-20 13:52:15 -07:00
Jacob Lee
f05092cef3 docs[patch]: Adds callback docs (#21889)
@efriis @hwchase17
2024-06-20 13:52:15 -07:00
Jacob Lee
b205b49aba docs[patch]: Update 0.2 banner copy (#21888)
@nfcampos
2024-06-20 13:52:15 -07:00
Coozywana
8fb415dcbd Fix base.py typo (#21862)
ChatOpenaAI --> ChatOpenAI

2024-06-20 13:52:15 -07:00
fzowl
5691abc614 partners: Remove unnecessary print from voyageai embeddings (#21865)
Remove unnecessary print from voyageai embeddings
2024-06-20 13:52:15 -07:00
Eugene Yurtsev
585bacdb99 docs: how to remove conversion to openai function from index (#21836)
- bind_tools interface is a better alternative.
- openai doesn't use functions but tools in its API now.
- the underlying content appears in some redirects, so will need to
investigate if we can remove.
2024-06-20 13:52:15 -07:00
Eugene Yurtsev
225d20bad7 docs: how to tools human in the loop (#21858)
Update information in how to guide tools human in the loop.
2024-06-20 13:52:15 -07:00
Eugene Yurtsev
936836a6f1 docs: how-to index page fix minor typo (#21859)
Fix typo
2024-06-20 13:52:15 -07:00
Bagatur
7798ab2590 docs: lcel how to and cheatsheet (#21851) 2024-06-20 13:52:15 -07:00
Erick Friis
95de87d4ac docs: update announcement bar (#21854) 2024-06-20 13:52:15 -07:00
Jacob Lee
c1b241a5ea docs[patch]: Remove padding from first sidebar link (#21852)
CC @efriis
2024-06-20 13:52:15 -07:00
Nuno Campos
c8cb4eade4 core: Tap output of sync iterators for astream_events (#21842)
2024-06-20 13:52:15 -07:00
Erick Friis
0d9edebd0a docs: v0.2 version sidebar (#21844)
![image](https://github.com/langchain-ai/langchain/assets/9557659/189f2e04-0c08-4395-b729-f48982c6f53b)
2024-06-20 13:52:15 -07:00
Max Jakob
ac5cf321e9 docs: update Elasticsearch strategy names (#21530)
Update documentation with the [new names for retrieval
strategies](https://github.com/langchain-ai/langchain-elastic/pull/22)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:15 -07:00
Erick Friis
78b58d736f docs: resolve local links script escape (#21840)
Fixing warnings. Needs to be propagated to 0.1 branch if this works.

![Screenshot 2024-05-17 at 2 34
15 PM](https://github.com/langchain-ai/langchain/assets/9557659/e6ac95a9-5686-4747-9ab8-4cb49942dc8d)
2024-06-20 13:52:14 -07:00
Erick Friis
d9ed03384a docs: remove postgres from docs build (#21847) 2024-06-20 13:52:14 -07:00
Eugene Yurtsev
17a2c21653 core[patch]: Check if event loop is closed in memory stream (#21841)
Check if the event loop is closed in the memory stream.

Using try/except here to avoid a race condition, but this may incur a
small overhead in versions prior to 3.11
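
The try/except approach can be sketched as follows. This is a minimal illustration using only `asyncio`, not the actual memory stream code; `safe_send` is a hypothetical helper name. Checking `loop.is_closed()` first and then sending would leave a window in which the loop can close between the check and the call, so the sketch attempts the call and handles the `RuntimeError` instead.

```python
import asyncio


def safe_send(loop: asyncio.AbstractEventLoop, queue: asyncio.Queue, item: object) -> bool:
    """Schedule queue.put_nowait(item) on the loop.

    Returns False instead of raising if the loop has already been closed.
    (Hypothetical helper, for illustration only.)
    """
    try:
        loop.call_soon_threadsafe(queue.put_nowait, item)
        return True
    except RuntimeError:
        # call_soon_threadsafe raises RuntimeError when the loop is closed.
        return False


loop = asyncio.new_event_loop()
queue: asyncio.Queue = asyncio.Queue()

sent_while_open = safe_send(loop, queue, "a")
loop.run_until_complete(asyncio.sleep(0))  # let the scheduled callback run
loop.close()
sent_after_close = safe_send(loop, queue, "b")
```

Only the first item reaches the queue; the second call reports failure rather than propagating the exception.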
2024-06-20 13:52:14 -07:00
Erick Friis
d9c3b6550c docs: fix vercel core dep 2 (#21839) 2024-06-20 13:52:14 -07:00
Erick Friis
fa844fb82e docs: fix vercel core dep (#21837) 2024-06-20 13:52:14 -07:00
Erick Friis
e71f24c254 experimental: release 0.0.59 (#21835) 2024-06-20 13:52:14 -07:00
Erick Friis
6eab69da76 community: release 0.2.0 (#21834) 2024-06-20 13:52:14 -07:00
Eugene Yurtsev
bb5b6e1bbc docs: how to guide tool calling using prompts (#21827)
Update tool calling using prompts.

- Add required concepts
- Update names of tool invoking function.
- Add doc-string to function, and add information about `config` (which
users often forget)
- Remove steps that show how to use single function only. This makes the
how-to guide a bit shorter and more to the point.
- Add diagram from another how-to guide that shows how the thing works
overall.
2024-06-20 13:52:14 -07:00
Erick Friis
ed20039f4f langchain: release 0.2.0, fix min deps (#21833) 2024-06-20 13:52:14 -07:00
Erick Friis
a497d0e29b text-splitters: release 0.2.0 (#21832) 2024-06-20 13:52:14 -07:00
Erick Friis
31c3919ce8 langchain: release 0.2.0 (#21831) 2024-06-20 13:52:14 -07:00
Eugene Yurtsev
2359fdab1a docs: update how-to for built in tools and toolkits (#21828)
Fix some typos
2024-06-20 13:52:14 -07:00
Erick Friis
8756b36afc core: release 0.2.0 (#21829) 2024-06-20 13:52:14 -07:00
Eugene Yurtsev
6c68d2553a docs: clean up link to bing search (#21825)
Documentation should be inlined, not linking to medium article.
2024-06-20 13:52:14 -07:00
Eugene Yurtsev
59f0a1aeaf docs: how to tools, merge built in tools and toolkits (#21824)
* Rename tools to built in tools
* Merge built in tools and toolkits
* Update links from providers
2024-06-20 13:52:14 -07:00
Leonid Ganeline
1cd075c68d docs: arXiv references page (#21450)
Since LangChain is based on many research papers, the LC documentation
has several references to arXiv papers. It would be beneficial to
create a single page with all referenced papers.
PR:
1. Developed code to search the arXiv references in the LangChain
Documentation and the LangChain code base. Those references are included
in a newly generated documentation page.
2. Page is linked to the Docs menu.
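
As a rough idea of what such a search might look like, here is a minimal sketch that extracts arXiv identifiers from text with a regular expression. The pattern and the `find_arxiv_ids` helper are assumptions made for illustration; the script in this PR may work differently (it depends on the `arxiv` package, which is not used here).

```python
import re
from typing import Dict, List

# Assumed pattern: new-style arXiv ids in abs/pdf URLs, e.g.
# arxiv.org/abs/2210.03629, optionally followed by a version suffix (v3).
ARXIV_URL = re.compile(r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})(?:v\d+)?")


def find_arxiv_ids(text: str) -> List[str]:
    """Return the unique arXiv ids referenced in text, in first-seen order."""
    seen: Dict[str, None] = {}
    for match in ARXIV_URL.finditer(text):
        seen.setdefault(match.group(1), None)
    return list(seen)


sample = (
    "See https://arxiv.org/abs/2210.03629 and "
    "https://arxiv.org/pdf/2210.03629v3 as well as arxiv.org/abs/2005.11401."
)
ids = find_arxiv_ids(sample)
```

Duplicate references to the same paper (including different versions) collapse into a single id.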

Controversial:
1. The `arxiv_references` page is automatically generated, but for now
generation is only started manually. It is not included in the doc
generation scripts. The reason is simple: I don't want to interfere
with the current documentation refactoring. If you think we need to
regenerate this page in each build, let me know. Note: This script
has a dependency on the `arxiv` package.
2. The link for this page in the menu is not obvious.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-20 13:52:14 -07:00
ccurme
f7f17c7081 core, standard tests, partner packages: add test for model params (#21677)
1. Adds `.get_ls_params` to BaseChatModel which returns
```python
class LangSmithParams(TypedDict, total=False):
    ls_provider: str
    ls_model_name: str
    ls_model_type: Literal["chat"]
    ls_temperature: Optional[float]
    ls_max_tokens: Optional[int]
    ls_stop: Optional[List[str]]
```
By default it will only return
```python
{"ls_model_type": "chat", "ls_stop": stop}
```

2. Add these params to inheritable metadata in
`CallbackManager.configure`

3. Implement `.get_ls_params` and populate all params for Anthropic +
all subclasses of BaseChatOpenAI
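
A minimal sketch of the default-plus-override pattern described in the steps above. `BaseModelSketch` and `ProviderModelSketch` are hypothetical stand-ins, not the real `BaseChatModel` or any provider class, and the provider and model names are placeholders.

```python
from typing import List, Literal, Optional, TypedDict


class LangSmithParams(TypedDict, total=False):
    ls_provider: str
    ls_model_name: str
    ls_model_type: Literal["chat"]
    ls_temperature: Optional[float]
    ls_max_tokens: Optional[int]
    ls_stop: Optional[List[str]]


class BaseModelSketch:
    """Hypothetical base class: only the model type and stop are known."""

    def get_ls_params(self, stop: Optional[List[str]] = None) -> LangSmithParams:
        return LangSmithParams(ls_model_type="chat", ls_stop=stop)


class ProviderModelSketch(BaseModelSketch):
    """Hypothetical provider subclass that populates the remaining params."""

    model_name = "example-model"  # placeholder value
    temperature = 0.2  # placeholder value

    def get_ls_params(self, stop: Optional[List[str]] = None) -> LangSmithParams:
        params = super().get_ls_params(stop=stop)
        params["ls_provider"] = "example-provider"
        params["ls_model_name"] = self.model_name
        params["ls_temperature"] = self.temperature
        return params


default_params = BaseModelSketch().get_ls_params(stop=["\n"])
provider_params = ProviderModelSketch().get_ls_params()
```

Because the `TypedDict` is declared with `total=False`, a model that cannot supply a value simply omits the key.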

Sample trace:
https://smith.langchain.com/public/d2962673-4c83-47c7-b51e-61d07aaffb1b/r

**OpenAI**:
<img width="984" alt="Screenshot 2024-05-17 at 10 03 35 AM"
src="https://github.com/langchain-ai/langchain/assets/26529506/2ef41f74-a9df-4e0e-905d-da74fa82a910">

**Anthropic**:
<img width="978" alt="Screenshot 2024-05-17 at 10 06 07 AM"
src="https://github.com/langchain-ai/langchain/assets/26529506/39701c9f-7da5-4f1a-ab14-84e9169d63e7">

**Mistral** (and all others for which params are not yet populated):
<img width="977" alt="Screenshot 2024-05-17 at 10 08 43 AM"
src="https://github.com/langchain-ai/langchain/assets/26529506/37d7d894-fec2-4300-986f-49a5f0191b03">
2024-06-20 13:52:14 -07:00
Eugene Yurtsev
d57d9cd3b4 docs: Remove duplicated content from how to tools (#21821)
Content is duplicated, and is covered in how to use chat models.
2024-06-20 13:52:14 -07:00
Matthew Koski
4be52757db langchain: Fixing import in docs per https://github.com/langchain-ai/langchain/issues/21814 (#21815)
Description: The example in the How-To guide had an import which did not
work. I changed it to use an import from langchain_core.

Issue: https://github.com/langchain-ai/langchain/issues/21814
2024-06-20 13:52:14 -07:00
Sen Lin
7960df6590 community[patch]: fix typo in ValueError message in load_local function (#21818)
**Description:**
Corrected an error in the `allow_dangerous_deserialization` message
within the `load_local` functions
2024-06-20 13:52:14 -07:00
Jorge Piedrahita Ortiz
a90ddd23f9 community: sambaverse api update (#21816)
- **Description:** fix sambaverse integration to make it compatible with
sambaverse API update / minor changes in docs
2024-06-20 13:52:14 -07:00
Erick Friis
408c4b802c docs: cookbook redirect (#21822) 2024-06-20 13:52:14 -07:00
maang-h
d5587cb3c5 community[patch]: Fix unintended newline in print statement in exception for BaichuanTextEmbeddings (#21820)
- **Code:** langchain_community/embeddings/baichuan.py:82
- **Description:** When an error occurs while using 'baichuan embeddings',
the printed error message contains a line break (there is no need for one)
```python
# example
from langchain_community.embeddings import BaichuanTextEmbeddings

# error key
BAICHUAN_API_KEY = "sk-xxxxxxxxxxxxx"
embeddings = BaichuanTextEmbeddings(baichuan_api_key=BAICHUAN_API_KEY)

text_1 = "今天天气不错"
query_result = embeddings.embed_query(text_1)
```



![unintended
newline](https://github.com/langchain-ai/langchain/assets/55082429/e1178ce8-62bb-405d-a4af-e3b28eabc158)
2024-06-20 13:52:14 -07:00
Eugene Yurtsev
50f00434bf docs: minor updates to migration docs (#21819)
Minor aesthetic updates to migration docs
2024-06-20 13:52:14 -07:00
Eugene Yurtsev
230bc67542 docs: Update v0.2 information (#21796)
Update information about v0.2 upgrade
2024-06-20 13:52:14 -07:00
Bakar Tavadze
31f8ac8140 langchain-robocorp[minor]: Enable passing additional headers to the action server. (#21809)
Actions can optionally receive secrets via request headers. This PR
enables this functionality.
2024-06-20 13:52:14 -07:00
Erick Friis
88740b64dc docs: version dropdown (#21784) 2024-06-20 13:52:14 -07:00
Chad Juliano
c9e8d851dc docs: fix errors and table formatting in notebook (#21696)
There are 2 issues fixed here:

* In the notebook pandas dataframes are formatted as HTML in the cells.
On the documentation site the renderer that converts notebooks
incorrectly displays the raw HTML. I can't find any examples of where
this is working and so I am formatting the dataframes as text.

* Some incorrect table names were referenced resulting in errors.
2024-06-20 13:52:14 -07:00
Asaf Joseph Gardin
78ea3fc025 partners: Revert AI21 Labs docs scan feature (#21699)
Description: Reverted commit #21614

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:14 -07:00
github-user-en
7a829a2bb8 Made a grammatical correction in streaming.ipynb (#21707)
The only change is replacing the word "operators" with "operates," to
make the sentence grammatically correct.

2024-06-20 13:52:14 -07:00
Brace Sproul
4a1f397f02 docs[minor]: Hide prev/next buttons on docs in how to / tutorials (#21789)
These buttons don't navigate to the proper prev/next page. Hide in those
pages
2024-06-20 13:52:14 -07:00
Eugene Yurtsev
13cf33287a langchain[patch],community[patch]: Move unit tests that depend on community to community (#21685) 2024-06-20 13:52:14 -07:00
Eugene Yurtsev
daec5e8564 How To: Custom tools (#21725)
- Remove double implementations of functions. The single input is just
taking up space.
- Added tool-specific information for `async`, showing `invoke` vs.
`ainvoke`.
- Added more general information about `async` (this should live
in a different place eventually since it's not specific to tools).
- Changed ordering of custom tools (StructuredTool is simpler and should
appear before the inheritance)
- Improved the error handling section (not convinced it should be here
though)
2024-06-20 13:52:14 -07:00
Bagatur
2764e0aa90 docs: link runnable api (#21783) 2024-06-20 13:52:14 -07:00
Bagatur
cd5a1f8371 docs: intro nit (#21785) 2024-06-20 13:52:14 -07:00
Marco Lamina
8b2b48b4c4 community: Add token cost for GPT-4o model (#21771)
Adding [token cost for the new GPT-4o
model](https://openai.com/api/pricing/):
* Input cost US$5.00 / 1M tokens
* Output cost US$15.00 / 1M tokens
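
For reference, the arithmetic implied by these rates can be sketched as below. The function name is hypothetical and this is not the code added to the community cost table; the rates are the ones quoted above.

```python
# Rates quoted in this PR description, in US dollars per 1M tokens.
INPUT_USD_PER_1M = 5.00
OUTPUT_USD_PER_1M = 15.00


def gpt4o_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost of one call from its token counts (hypothetical helper)."""
    total = prompt_tokens * INPUT_USD_PER_1M + completion_tokens * OUTPUT_USD_PER_1M
    return total / 1_000_000


# 1,000 prompt tokens and 500 completion tokens.
cost = gpt4o_cost_usd(1_000, 500)
```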
2024-06-20 13:52:14 -07:00
Bagatur
15e7ae00e2 docs: update chat feat table (#21778) 2024-06-20 13:52:14 -07:00
Massimiliano Pronesti
5b522ee1f6 feat(community): support semantic hybrid score threshold in Azure AI Search (#21527)
Support semantic hybrid search with a score threshold -- similar to what
we do for similarity search and for hybrid search (#20907).
2024-06-20 13:52:14 -07:00
Erick Friis
2a631826a5 docs: dont rewrite ipynb links that have double slash (#21775) 2024-06-20 13:52:14 -07:00
Eugene Yurtsev
728507ba76 docs: concepts -- add information about tool calling models, update tools section (#21760)
- Add information about native tool calling capabilities
- Add information about standard langchain interface for tool calling
- Update description for tools

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-20 13:52:14 -07:00
Bagatur
092340f61d anthropic[patch]: Release 0.1.13, tool_choice support (#21773) 2024-06-20 13:52:14 -07:00
Stefano Lottini
63108ebe25 community: init signature revision for Cassandra LLM cache classes + small maintenance (#17765)
This PR improves on the `CassandraCache` and `CassandraSemanticCache`
classes, mainly in the constructor signature, and also introduces
several minor improvements around these classes.

### Init signature

A (sigh) breaking change is tentatively introduced to the constructor.
To me, the advantages outweigh the possible discomfort: the new syntax
places the DB-connection objects `session` and `keyspace` later in the
param list, so that they can be given a default value. This is what
enables the pattern of _not_ specifying them, provided one has
previously initialized the Cassandra connection through the versatile
utility method `cassio.init(...)`.

In this way, a much less unwieldy instantiation can be done, such as
`CassandraCache()` and `CassandraSemanticCache(embedding=xyz)`,
everything else falling back to defaults.

A downside is that, compared to the earlier signature, this might turn
out to be breaking for those doing positional instantiation. As a way to
mitigate this problem, this PR typechecks its first argument trying to
detect the legacy usage.
(And to make this point less tricky in the future, most arguments are
left to be keyword-only).
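
The general pattern (keyword-only connection arguments with defaults, plus a type check on the first positional argument to catch legacy calls) can be sketched as follows. `ExampleCache`, its parameters, and the decision to raise `TypeError` are assumptions for illustration only; they are not the actual `CassandraCache` signature or behavior.

```python
from typing import Any, Optional


class ExampleCache:
    """Hypothetical class illustrating the constructor pattern."""

    def __init__(
        self,
        table_name: str = "example_cache",
        *,
        session: Optional[Any] = None,
        keyspace: Optional[str] = None,
    ) -> None:
        # Legacy callers passed the session object first. A non-string first
        # argument is therefore treated as a sign of the old positional usage.
        if not isinstance(table_name, str):
            raise TypeError(
                "Legacy positional usage detected: pass session=... and "
                "keyspace=... as keyword arguments."
            )
        self.table_name = table_name
        self.session = session  # None means "use a globally initialized one"
        self.keyspace = keyspace


cache = ExampleCache()  # everything falls back to defaults

try:
    ExampleCache(object())  # legacy-style call with a session-like object
    legacy_detected = False
except TypeError:
    legacy_detected = True
```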

If this is considered too harsh, I'd like guidance on how to further
smoothen this transition. **Our plan is to make the pattern of optional
session/keyspace a standard across all Cassandra classes**, so that a
repeatable strategy would be ideal. A possibility would be to keep
positional arguments for legacy reasons but issue a deprecation warning
if any of them is actually used, to later remove them with 0.2 - please
advise on this point.

### Other changes

- class docstrings: enriched, completely moved to class level, added
note on `cassio.init(...)` pattern, added tiny sample usage code.
- semantic cache: revised terminology to never mention "distance" (it is
in fact a similarity!). Kept the legacy constructor param with a
deprecation warning if used.
- `llm_caching` notebook: uniform flow with the Cassandra and Astra DB
separate cases; better and Cassandra-first description; all imports made
explicit and from community where appropriate.
- cache integration tests moved to community (incl. the imported tools),
env var bugfix for `CASSANDRA_CONTACT_POINTS`.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:14 -07:00
fzowl
ef7353a138 docs: new voyageai text_embeddings model: voyage-large-2-instruct (#21706) 2024-06-20 13:52:14 -07:00
Bagatur
f087bcd0be docs: datacamp course (#21767) 2024-06-20 13:52:13 -07:00
Kyle Cassidy
476981022c Standardized openai init params (#21739)
## Patch Summary
community:openai[patch]: standardize init args

## Details
I made changes to the OpenAI Chat API wrapper test in the LangChain
open-source repository

- **File**: `libs/community/tests/unit_tests/chat_models/test_openai.py`
- **Changes**:
  - Updated `max_retries` with Pydantic Field
  - Updated the corresponding unit test
- **Related Issues**: #20085

---------

Co-authored-by: JuHyung Son <sonju0427@gmail.com>
2024-06-20 13:52:13 -07:00
laishzh
c2bdbfbcc4 docs: Remove unnecessary comment marks from the Makefile help section (#21749)
**Previous screenshot:**
<img width="758" alt="image"
src="https://github.com/langchain-ai/langchain/assets/1683919/7b90626e-35ab-4486-b41d-b664e69eec0b">

**Current:**
<img width="744" alt="image"
src="https://github.com/langchain-ai/langchain/assets/1683919/cdb69512-dc6c-4b7f-a466-4be92d94c076">
2024-06-20 13:52:13 -07:00
Ethan Yang
23647b44e0 community: update openvino doc with streaming support (#21519)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-20 13:52:13 -07:00
Eugene Yurtsev
f01b9225b6 How to: Streaming (#21715)
Update the how to guide on streaming

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-20 13:52:13 -07:00
ccurme
56cff57fe4 community: fix CI (#21766) 2024-06-20 13:52:13 -07:00
Michael Ozery
e1cf4225de docs: sql_qa.ipynb tutorial update (#21756)
1. Updated deprecated method usage.
2. Added LangGraph required installation in tutorial.

X: MichaelOzery
2024-06-20 13:52:13 -07:00
Mish Ushakov
54a56b91a1 community: updated Browserbase loader (#21757)
Updates the Browserbase loader with more options and improved docs.
2024-06-20 13:52:13 -07:00
Ikko Eltociear Ashimine
b229468836 docs: update sql_large_db.ipynb (#21765)
mispelling -> misspelling
2024-06-20 13:52:13 -07:00
Eugene Yurtsev
a26476f866 core[major]: only use function description (#21622)
Do not prefix function signature

---

* The reason is that this information is already present with tool
calling models.
* This will save on tokens for those models, and makes it more obvious
what the description is!
* The @tool can get more parameters to allow a user to re-introduce
the signature if we want
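
To make the difference concrete, here is a small sketch of the two styles of description. The helper names are hypothetical, and the exact prefix format previously used by core is an assumption here.

```python
import inspect
from typing import Callable


def multiply(first_int: int, second_int: int) -> int:
    """Multiply two integers together."""
    return first_int * second_int


def description_with_signature(fn: Callable) -> str:
    """Older style (assumed format): signature prefixed to the docstring."""
    return f"{fn.__name__}{inspect.signature(fn)} - {inspect.getdoc(fn)}"


def description_only(fn: Callable) -> str:
    """Newer style: the docstring alone is the description."""
    return inspect.getdoc(fn) or ""


old_style = description_with_signature(multiply)
new_style = description_only(multiply)
```

The shorter form avoids repeating argument names and types that a tool calling model already receives through the tool schema.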
2024-06-20 13:52:13 -07:00
William FH
46c1b56b09 Finish agent migration doc (#21731) 2024-06-20 13:52:13 -07:00
Cheese
732a4bc329 community: Implement bind_tools for ChatTongyi (#20725)
## Description

Implement `bind_tools` in ChatTongyi. Usage example:

```py
from langchain_core.tools import tool
from langchain_community.chat_models.tongyi import ChatTongyi

@tool
def multiply(first_int: int, second_int: int) -> int:
    """Multiply two integers together."""
    return first_int * second_int

llm = ChatTongyi(model="qwen-turbo")

llm_with_tools = llm.bind_tools([multiply])

msg = llm_with_tools.invoke("What's 5 times forty two")

print(msg)
```

Streaming is also supported.

## Dependencies

No dependencies are required for this change.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-20 13:52:13 -07:00
yoogle
2a43e4ba78 docs: fix monorepo typo (#21761)
### Description
fix monorepo typo. `monorep` -> `monorepo`
2024-06-20 13:52:13 -07:00
Bagatur
1d62cba406 docs: aca-ds nit (#21759) 2024-06-20 13:52:13 -07:00
Bagatur
658a87c50e docs: add aca-ds (#21746) 2024-06-20 13:52:13 -07:00
Bagatur
a7b9bc4b7c docs: aza-ds cookbook (#21747) 2024-06-20 13:52:13 -07:00
Erick Friis
2cd418f54a fireworks: add secret (#21744) 2024-06-20 13:52:13 -07:00
Erick Friis
119e11acd9 pinecone: bump min core version (#21742) 2024-06-20 13:52:13 -07:00
Erick Friis
d0f0db2256 fireworks: bump min core version (#21741) 2024-06-20 13:52:13 -07:00
Erick Friis
7c9cbee4b0 infra: release min version dont clobber current lib (#21740) 2024-06-20 13:52:13 -07:00
Erick Friis
592af9f33d airbyte[patch]: airbyte-cdk compatible pydantic versions (#21738) 2024-06-20 13:52:13 -07:00
Erick Friis
7169b25d8a ibm[patch]: release 0.1.7 (#21737) 2024-06-20 13:52:13 -07:00
Erick Friis
83dcb567dd openai[patch]: fix embedding float precision issue (#21736)
also clean up + comment some of the embedding batching code
2024-06-20 13:52:13 -07:00
JuHyung Son
9636c0f7e3 upstage: Support batch input in embedding request. (#21730)
**Description:** upstage embedding now supports batch input.
2024-06-20 13:52:13 -07:00
junefish
85dc31f169 docs: Update Pinecone example notebook with embedded widget (#21719)
---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:13 -07:00
Erick Friis
a122dc462e docs: fix installation link (#21728) 2024-06-20 13:52:13 -07:00
Harrison Chase
16a07bc743 Harrison/move flashrank rerank (#21448)
third party integration, should be in community
2024-06-20 13:52:13 -07:00
Harrison Chase
b845f495cf move installation (#21711) 2024-06-20 13:52:13 -07:00
Erick Friis
b12c2fb0bf multiple: releases with relaxed core dep (#21724) 2024-06-20 13:52:13 -07:00
Bagatur
b317b90af5 openai[patch]: Release 0.1.7, bump tiktoken 0.7.0 (#21723) 2024-06-20 13:52:13 -07:00
Bagatur
a68ba28157 docs: add feedback link to 0.2 banner (#21600) 2024-06-20 13:52:13 -07:00
William FH
9cda8b7ee8 [Core] Check is async callable (#21714)
To permit proper coercion of objects like the following:


```python
class MyAsyncCallable:
    async def __call__(self, foo):
        return await ...

class MyAsyncGenerator:
    async def __call__(self, foo):
        await ...
        yield 
```
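
A minimal sketch of how such objects might be detected with the standard `inspect` module. The helper names are hypothetical and this is not necessarily how core implements the check.

```python
import inspect
from typing import Any


def is_async_callable(obj: Any) -> bool:
    """True for coroutine functions and for objects whose __call__ is one."""
    return inspect.iscoroutinefunction(obj) or inspect.iscoroutinefunction(
        getattr(obj, "__call__", None)
    )


def is_async_generator(obj: Any) -> bool:
    """True for async generator functions and objects whose __call__ is one."""
    return inspect.isasyncgenfunction(obj) or inspect.isasyncgenfunction(
        getattr(obj, "__call__", None)
    )


class EchoAsyncCallable:
    async def __call__(self, foo):
        return foo


class EchoAsyncGenerator:
    async def __call__(self, foo):
        yield foo


callable_detected = is_async_callable(EchoAsyncCallable())
generator_detected = is_async_generator(EchoAsyncGenerator())
```

Checking only the object itself would miss both instances, since neither is a function; looking at `__call__` is what makes the coercion possible.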
2024-06-20 13:52:13 -07:00
ccurme
d439174bad docs: add tutorial for vector stores and retrievers (#21683)
also update how-to guide for parent document retriever
2024-06-20 13:52:13 -07:00
Eugene Yurtsev
8ac6f8648f core[minor]: Add v2 implementation of astream events (#21638)
This PR introduces a v2 implementation of astream events that removes
intermediate abstractions and fixes some issues with v1 implementation.

The v2 implementation significantly reduces both the amount of code
associated with the astream events implementation and its overhead.

After this PR, the astream events implementation:

- Uses an async callback handler
- No longer relies on BaseTracer
- No longer relies on json patch

As a result of this re-write, a number of issues were discovered with
the existing implementation.

## Changes in V2 vs. V1

### on_chat_model_end `output`

The outputs associated with `on_chat_model_end` changed depending on
whether it was within a chain or not.

As a root level runnable the output was: 

```python
"data": {"output": AIMessageChunk(content="hello world!", id='some id')}
```

As part of a chain the output was:

```python
"data": {
    "output": {
        "generations": [
            [
                {
                    "generation_info": None,
                    "message": AIMessageChunk(
                        content="hello world!", id=AnyStr()
                    ),
                    "text": "hello world!",
                    "type": "ChatGenerationChunk",
                }
            ]
        ],
        "llm_output": None,
    }
},
```

After this PR, we will always use the simpler representation:

```python
"data": {"output": AIMessageChunk(content="hello world!", id='some id')}
```

**NOTE** Non-chat models (i.e., regular LLMs) still use the more verbose
format.
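
For code that has to consume both shapes (for example while migrating from v1), a small normalizer along these lines may help. It is a sketch using plain dictionaries: `FakeMessageChunk` is a stand-in for `AIMessageChunk` so the example runs without langchain installed, and `normalize_chat_model_output` is a hypothetical helper, not part of the library.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeMessageChunk:
    """Stand-in for AIMessageChunk, used only to keep the sketch self-contained."""

    content: str


def normalize_chat_model_output(output: Any) -> Any:
    """Return the message itself, whichever of the two shapes was received."""
    if isinstance(output, dict) and "generations" in output:
        # Verbose shape: the message is nested under generations[0][0]["message"].
        return output["generations"][0][0]["message"]
    # Simple shape: the output already is the message.
    return output


verbose = {
    "generations": [
        [
            {
                "generation_info": None,
                "message": FakeMessageChunk(content="hello world!"),
                "text": "hello world!",
                "type": "ChatGenerationChunk",
            }
        ]
    ],
    "llm_output": None,
}
simple = FakeMessageChunk(content="hello world!")

print(normalize_chat_model_output(verbose) == normalize_chat_model_output(simple))  # True
```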

### Remove some `_stream` events

`on_retriever_stream` and `on_tool_stream` events were removed. They
were not real events, only an artifact of implementing on top of
astream_log.

The same information is already available in the corresponding `on_x_end`
events (`on_retriever_end` and `on_tool_end`).

### Propagating Names

Names of runnables have been updated to be more consistent. For example, given:

```python
model = GenericFakeChatModel(messages=infinite_cycle).configurable_fields(
    messages=ConfigurableField(
        id="messages",
        name="Messages",
        description="Messages return by the LLM",
    )
)
```

Before:
```python
"name": "RunnableConfigurableFields",
```

After:
```python
"name": "GenericFakeChatModel",
```

### on_retriever_end

`on_retriever_end` will always return `output`, which is a list of
documents (rather than a dict containing a key called "documents").

### Retry events

Removed the `on_retry` callback handler. It incorrectly showed that
the failed function being retried had invoked `on_chain_end`.


https://github.com/langchain-ai/langchain/pull/21638/files#diff-e512e3f84daf23029ebcceb11460f1c82056314653673e450a5831147d8cb84dL1394
2024-06-20 13:52:12 -07:00
Rajendra Kadam
fb08b03801 langchain[minor]: Add PebbloRetrievalQA chain with Identity & Semantic Enforcement support (#20641)
- **Description:** PebbloRetrievalQA chain introduces identity
enforcement using vector-db metadata filtering
- **Dependencies:** None
- **Issue:** None
- **Documentation:** Adding documentation for PebbloRetrievalQA chain in
a separate PR (https://github.com/langchain-ai/langchain/pull/20746)
- **Unit tests:** New unit-tests added

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-06-20 13:52:12 -07:00
Bagatur
92ed9790d4 docs: openai bind tools nit (#21692) 2024-06-20 13:52:12 -07:00
Erick Friis
f97bbf4794 docs: disable contextual search (#21691) 2024-06-20 13:52:12 -07:00
Erick Friis
d5a9e91bdf infra: remove prints from notebook build (#21688) 2024-06-20 13:52:12 -07:00
Bagatur
3fa1219d2c fmt 2024-05-14 16:02:50 -07:00
Bagatur
9b7aedcb96 fmt 2024-05-14 16:00:35 -07:00
Bagatur
e083c641ef fmt 2024-05-14 15:57:51 -07:00
Bagatur
23bacfc222 fmt 2024-05-14 15:57:38 -07:00
William Fu-Hinthorn
e4d8ef659c merge 2024-04-25 18:01:07 -07:00
2 changed files with 156 additions and 4 deletions

View File

@@ -21,14 +21,27 @@ from __future__ import annotations
import asyncio
import inspect
import logging
import textwrap
import typing
import uuid
import warnings
from abc import ABC, abstractmethod
from contextvars import copy_context
from functools import partial
from inspect import signature
from typing import Any, Awaitable, Callable, Dict, List, Optional, Tuple, Type, Union
from typing import (
Any,
Awaitable,
Callable,
Dict,
List,
Mapping,
Optional,
Tuple,
Type,
Union,
)
from langchain_core._api import deprecated
from langchain_core.callbacks import (
@@ -71,13 +84,18 @@ from langchain_core.runnables.config import (
)
from langchain_core.runnables.utils import accepts_context
logger = logging.getLogger(__name__)
class SchemaAnnotationError(TypeError):
"""Raised when 'args_schema' is missing or has an incorrect type annotation."""
def _create_subset_model(
name: str, model: Type[BaseModel], field_names: list
name: str,
model: Type[BaseModel],
field_names: list,
descriptions: Optional[Mapping[str, str]] = None,
) -> Type[BaseModel]:
"""Create a pydantic model with only a subset of model's fields."""
fields = {}
@@ -89,6 +107,10 @@ def _create_subset_model(
if field.required and not field.allow_none
else Optional[field.outer_type_]
)
# Inject the description into the field_info
description = descriptions.get(field_name) if descriptions else None
if description:
field.field_info.description = description
fields[field_name] = (t, field.field_info)
rtn = create_model(name, **fields) # type: ignore
return rtn
@@ -104,6 +126,31 @@ def _get_filtered_args(
return {k: schema[k] for k in valid_keys if k not in ("run_manager", "callbacks")}
def _get_description_from_annotation(ann: Any) -> Optional[str]:
    possible_descriptions = [
        arg for arg in typing.get_args(ann) if isinstance(arg, str)
    ]
    return "\n".join(possible_descriptions) if possible_descriptions else None


def _get_descriptions(func: Callable) -> Dict[str, str]:
    """Get the descriptions from a function's signature."""
    descriptions = {}
    for param in inspect.signature(func).parameters.values():
        if param.annotation is not inspect.Parameter.empty:
            try:
                description = _get_description_from_annotation(param.annotation)
            except Exception as e:
                logger.warning(
                    "Could not infer tool parameter description"
                    f" from annotation : {repr(e)}"
                )
                description = None
            if description:
                descriptions[param.name] = description
    return descriptions


class _SchemaConfig:
"""Configuration for the pydantic model."""
@@ -131,8 +178,13 @@ def create_schema_from_function(
del inferred_model.__fields__["callbacks"]
# Pydantic adds placeholder virtual fields we need to strip
valid_properties = _get_filtered_args(inferred_model, func)
# TODO: we could pass through additional metadata here
descriptions = _get_descriptions(func)
return _create_subset_model(
f"{model_name}Schema", inferred_model, list(valid_properties)
f"{model_name}Schema",
inferred_model,
list(valid_properties),
descriptions=descriptions,
)
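
The annotation-to-description step in this file can be tried in isolation. The sketch below is a standalone approximation that assumes Python 3.9+ (it imports `Annotated` from `typing`, whereas the tests import it from `typing_extensions`); `describe_parameters` is a hypothetical name, and the pydantic side of the change is deliberately left out.

```python
import inspect
import typing
from typing import Annotated, Callable, Dict, Literal


def describe_parameters(func: Callable) -> Dict[str, str]:
    """Collect the string metadata of each annotated parameter, joined by newlines."""
    descriptions: Dict[str, str] = {}
    for param in inspect.signature(func).parameters.values():
        if param.annotation is inspect.Parameter.empty:
            continue
        # For Annotated[T, ...], get_args returns (T, *metadata); for a plain
        # type such as int it returns (), so nothing is recorded.
        strings = [
            arg for arg in typing.get_args(param.annotation) if isinstance(arg, str)
        ]
        if strings:
            descriptions[param.name] = "\n".join(strings)
    return descriptions


def foo(
    bar: Annotated[int, "This is bar", {"gte": 5}, "it's useful"],
    baz: str,
) -> str:
    raise NotImplementedError


def pick(mode: Literal["fast", "slow"]) -> str:
    raise NotImplementedError


print(describe_parameters(foo))   # {'bar': "This is bar\nit's useful"}
print(describe_parameters(pick))  # {'mode': 'fast\nslow'}
```

The second call shows a property of `typing.get_args` worth keeping in mind with this style of filter: a `Literal` of strings also exposes string arguments, so in this sketch its values are picked up as if they were descriptions.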

View File

@@ -10,6 +10,7 @@ from functools import partial
from typing import Any, Callable, Dict, List, Optional, Type, Union
import pytest
from typing_extensions import Annotated
from langchain_core.callbacks import (
AsyncCallbackManagerForToolRun,
@@ -24,6 +25,7 @@ from langchain_core.tools import (
Tool,
ToolException,
_create_subset_model,
create_schema_from_function,
tool,
)
from tests.unit_tests.fake.callbacks import FakeCallbackHandler
@@ -54,7 +56,12 @@ class _MockStructuredTool(BaseTool):
args_schema: Type[BaseModel] = _MockSchema
description: str = "A Structured Tool"
def _run(self, arg1: int, arg2: bool, arg3: Optional[dict] = None) -> str:
def _run(
self,
arg1: int,
arg2: bool,
arg3: Optional[dict] = None,
) -> str:
return f"{arg1} {arg2} {arg3}"
async def _arun(self, arg1: int, arg2: bool, arg3: Optional[dict] = None) -> str:
@@ -71,6 +78,33 @@ def test_structured_args() -> None:
assert structured_api.run(args) == expected_result
@pytest.mark.skipif(sys.version_info < (3, 10), reason="Requires Python 3.10 or above")
def test_structured_args_description() -> None:
class _AnnotatedTool(BaseTool):
name: str = "structured_api"
description: str = "A Structured Tool"
def _run(
self,
arg1: int,
arg2: Annotated[bool, "V important"],
arg3: Optional[dict] = None,
) -> str:
return f"{arg1} {arg2} {arg3}"
async def _arun(
self, arg1: int, arg2: bool, arg3: Optional[dict] = None
) -> str:
raise NotImplementedError
expected = {
"arg1": {"title": "Arg1", "type": "integer"},
"arg2": {"title": "Arg2", "type": "boolean", "description": "V important"},
"arg3": {"title": "Arg3", "type": "object"},
}
assert _AnnotatedTool().args == expected
def test_misannotated_base_tool_raises_error() -> None:
"""Test that a BaseTool with the incorrect typehint raises an exception.""" ""
with pytest.raises(SchemaAnnotationError):
@@ -874,6 +908,72 @@ def test_tool_invoke_optional_args(inputs: dict, expected: Optional[dict]) -> No
foo.invoke(inputs) # type: ignore
@pytest.mark.skipif(sys.version_info < (3, 10), reason="Requires Python 3.10 or above")
def test_create_schema_from_function_with_descriptions() -> None:
def foo(bar: int, baz: str) -> str:
"""Docstring
Args:
bar: int
baz: str
"""
raise NotImplementedError()
async def foo_async(bar: int, baz: str) -> str:
"""Docstring
Args:
bar: int
baz: str
"""
raise NotImplementedError()
for func in [foo, foo_async]:
schema = create_schema_from_function("foo", func)
expected = {
"title": "fooSchema",
"type": "object",
"properties": {
"bar": {"title": "Bar", "type": "integer"},
"baz": {"title": "Baz", "type": "string"},
},
"required": ["bar", "baz"],
}
assert schema.schema() == expected
def foo_annotated(
bar: Annotated[int, "This is bar", {"gte": 5}, "it's useful"],
) -> str:
"""Docstring
Args:
bar: int
"""
raise NotImplementedError
async def foo_async_annotated(
bar: Annotated[int, "This is bar", {"gte": 5}, "it's useful"],
) -> str:
"""Docstring
Args:
bar: int
"""
raise NotImplementedError
for func in [foo_annotated, foo_async_annotated]:
schema = create_schema_from_function("foo_annotated", func)
annotated_expected = {
"title": "foo_annotatedSchema",
"type": "object",
"properties": {
"bar": {
"title": "Bar",
"type": "integer",
"description": "This is bar\nit's useful",
},
},
"required": ["bar"],
}
assert schema.schema() == annotated_expected
def test_tool_pass_context() -> None:
@tool
def foo(bar: str) -> str: