Commit Graph

4636 Commits

Author SHA1 Message Date
William Fu-Hinthorn
5a2f2bd615 Merge 2024-06-20 13:52:28 -07:00
Bagatur
07990b437d docs: fix chat model methods table (#23233)
rst table not md
![Screenshot 2024-06-20 at 12 37 46
PM](https://github.com/langchain-ai/langchain/assets/22008038/7a03b869-c1f4-45d0-8d27-3e16f4c6eb19)
2024-06-20 13:52:28 -07:00
Zheng Robert Jia
55003c5ada docs[minor],community[patch]: Minor tutorial docs improvement, minor import error quick fix. (#22725)
minor changes to module import error handling and minor issues in
tutorial documents.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-06-20 13:52:28 -07:00
Eugene Yurtsev
2ed41433da core[patch]: Fix doc-strings for code blocks (#23232)
Code blocks need extra space around them to be rendered properly by
sphinx
2024-06-20 13:52:28 -07:00
Luis Moros
3a44ec62e4 community[patch]: Fix sql_databse.from_databricks issue when ran from Job (#23224)
**Desscription**: When the ``sql_database.from_databricks`` is executed
from a Workflow Job, the ``context`` object does not have a
"browserHostName" property, resulting in an error. This change manages
the error so the "DATABRICKS_HOST" env variable value is used instead of
stoping the flow

Co-authored-by: lmorosdb <lmorosdb>
2024-06-20 13:52:28 -07:00
Cory Waddingham
1bb15e3ce4 pinecone[patch]: Update Poetry requirements for pinecone-client >=3.2.2 (#22094)
This change updates the requirements in
`libs/partners/pinecone/pyproject.toml` to allow all versions of
`pinecone-client` greater than or equal to 3.2.2.

This change resolves issue
[21955](https://github.com/langchain-ai/langchain/issues/21955).

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 13:52:28 -07:00
Eugene Yurtsev
aa10e39f12 core[patch]: Add clarification about streaming to RunnableLambda (#23227)
Add streaming clarification to runnable lambda docstring.
2024-06-20 13:52:28 -07:00
maang-h
9375705162 community[patch]: Update root_validators ChatModels: ChatBaichuan, QianfanChatEndpoint, MiniMaxChat, ChatSparkLLM, ChatZhipuAI (#22853)
This PR updates root validators for:

- ChatModels: ChatBaichuan, QianfanChatEndpoint, MiniMaxChat,
ChatSparkLLM, ChatZhipuAI

Issues #22819

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-20 13:52:28 -07:00
ChrisDEV
45db600671 Fix return value type of dumpd (#20123)
The return type of `json.loads` is `Any`.

In fact, the return type of `dumpd` must be based on `json.loads`, so
the correction here is understandable.

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-06-20 13:52:28 -07:00
Guangdong Liu
a98f7df1b9 core(patch): Fix encoding problem of load_prompt method (#21559)
- description: Add encoding parameters.
- @baskaryan, @efriis, @eyurtsev, @hwchase17.


![54d25ac7b1d5c2e47741a56fe8ed8ba](https://github.com/langchain-ai/langchain/assets/48236177/ffea9596-2001-4e19-b245-f8a6e231b9f9)
2024-06-20 13:52:28 -07:00
Philippe PRADOS
3cb9c4c9bd core[minor]: Adds an in-memory implementation of RecordManager (#13200)
**Description:**
langchain offers three technologies to save data:
-
[vectorstore](https://python.langchain.com/docs/modules/data_connection/vectorstores/)
- [docstore](https://js.langchain.com/docs/api/schema/classes/Docstore)
- [record
manager](https://python.langchain.com/docs/modules/data_connection/indexing)

If you want to combine these technologies in a sample persistence
stategy you need a common implementation for each. `DocStore` propose
`InMemoryDocstore`.

We propose the class `MemoryRecordManager` to complete the system.

This is the prelude to another full-request, which needs a consistent
combination of persistence components.

**Tag maintainer:**
@baskaryan

**Twitter handle:**
@pprados

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-20 13:52:28 -07:00
Leonid Ganeline
d9eafd42ea community: docstrings (#23202)
Added missed docstrings. Format docstrings to the consistent format
(used in the API Reference)
2024-06-20 13:52:28 -07:00
Julian Weng
5f7a1ea140 partners[minor]: Fix value error message for with_structured_output (#22877)
Currently, calling `with_structured_output()` with an invalid method
argument raises `Unrecognized method argument. Expected one of
'function_calling' or 'json_format'`, but the JSON mode option [is now
referred
to](https://python.langchain.com/v0.2/docs/how_to/structured_output/#the-with_structured_output-method)
by `'json_mode'`. This fixes that.

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-06-20 13:52:28 -07:00
Leonid Ganeline
e864baaf40 huggingface: docstrings (#23148)
Added missed docstrings. Format docstrings to the consistent format
(used in the API Reference)

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-20 13:52:28 -07:00
ccurme
32953f42ba huggingface[patch]: fix CI for python 3.12 (#23197) 2024-06-20 13:52:28 -07:00
xyd
a6158c09f6 fix https://github.com/langchain-ai/langchain/issues/23215 (#23216)
fix bug 
The ZhipuAIEmbeddings class is not working.

Co-authored-by: xu yandong <shaonian@acsx1.onexmail.com>
2024-06-20 13:52:28 -07:00
Bagatur
38bf2ca14a standard-tests[patch]: test stop not stop_sequences (#23200) 2024-06-20 13:52:28 -07:00
David DeCaprio
4c91188a6e core:Add optional max_messages to MessagePlaceholder (#16098)
- **Description:** Add optional max_messages to MessagePlaceholder
- **Issue:**
[16096](https://github.com/langchain-ai/langchain/issues/16096)
- **Dependencies:** None
- **Twitter handle:** @davedecaprio

Sometimes it's better to limit the history in the prompt itself rather
than the memory. This is needed if you want different prompts in the
chain to have different history lengths.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-06-20 13:52:28 -07:00
shaunakgodbole
039bd6b373 fireworks[patch]: fix api_key alias in Fireworks LLM (#23118)
Thank you for contributing to LangChain!

**Description**
The current code snippet for `Fireworks` had incorrect parameters. This
PR fixes those parameters.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-20 13:52:28 -07:00
Eugene Yurtsev
c37adacfb7 core[patch]: Document agent schema (#23194)
* Document agent schema
* Refer folks to langgraph for more information on how to create agents.
2024-06-20 13:52:28 -07:00
Eugene Yurtsev
6f994dbca3 core[patch]: Document messages namespace (#23154)
- Moved doc-strings below attribtues in TypedDicts -- seems to render
better on APIReference pages.
* Provided more description and some simple code examples
2024-06-20 13:52:28 -07:00
Eugene Yurtsev
7c2ca1ef0d core[patch]: Add doc-strings to outputs, fix @root_validator (#23190)
- Document outputs namespace
- Update a vanilla @root_validator that was missed
2024-06-20 13:52:28 -07:00
Bagatur
02836772ce infra: add more formatter rules to openai (#23189)
Turns on
https://docs.astral.sh/ruff/settings/#format_docstring-code-format and
https://docs.astral.sh/ruff/settings/#format_skip-magic-trailing-comma

```toml
[tool.ruff.format]
docstring-code-format = true
skip-magic-trailing-comma = true
```
2024-06-20 13:52:27 -07:00
Michał Krassowski
7aa67d9dac community[patch]: restore compatibility with SQLAlchemy 1.x (#22546)
- **Description:** Restores compatibility with SQLAlchemy 1.4.x that was
broken since #18992 and adds a test run for this version on CI (only for
Python 3.11)
- **Issue:** fixes #19681
- **Dependencies:** None
- **Twitter handle:** `@krassowski_m`

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-20 13:52:27 -07:00
Erick Friis
49441cdfeb upstage: move to external repo (#22506) 2024-06-20 13:52:27 -07:00
Bagatur
bd16729412 openai[patch]: image token counting (#23147)
Resolves #23000

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-20 13:52:27 -07:00
Jorge Piedrahita Ortiz
eeb5c1ba71 community[patch]: sambanova llm integration improvement (#23137)
- **Description:** sambanova sambaverse integration improvement: removed
input parsing that was changing raw user input, and was making to use
process prompt parameter as true mandatory
2024-06-20 13:52:27 -07:00
Jorge Piedrahita Ortiz
03503ec44a community[patch]: update sambastudio embeddings (#23133)
Description: update sambastudio embeddings integration, now compatible
with generic endpoints and CoE endpoints
2024-06-20 13:52:27 -07:00
Philippe PRADOS
ca8717afae langchain[small]: Change type to BasePromptTemplate (#23083)
```python
Change from_llm(
 prompt: PromptTemplate 
 ...
 )
```
 to
```python
Change from_llm(
 prompt: BasePromptTemplate 
 ...
 )
```
2024-06-20 13:52:27 -07:00
Sergey Kozlov
7cd38097f9 core[patch[: add exceptions propagation test for astream_events v2 (#23159)
**Description:** `astream_events(version="v2")` didn't propagate
exceptions in `langchain-core<=0.2.6`, fixed in the #22916. This PR adds
a unit test to check that exceptions are propagated upwards.

Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>
2024-06-20 13:52:27 -07:00
Leonid Ganeline
54cc5dd1a2 prompty: docstring (#23152)
Added missed docstrings. Format docstrings to the consistent format
(used in the API Reference)

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-20 13:52:27 -07:00
chenxi
4fa60db884 fix: MoonshotChat fails when setting the moonshot_api_key through the OS environment. (#23176)
Close #23174

Co-authored-by: tianming <tianming@bytenew.com>
2024-06-20 13:52:27 -07:00
Bagatur
0b453f8112 core[patch]: fix chat history circular import (#23182) 2024-06-20 13:52:27 -07:00
Eugene Yurtsev
3debf68cac core[patch]: Add an example to the Document schema doc-string (#23131)
Add an example to the document schema
2024-06-20 13:52:27 -07:00
ccurme
63c8caabc6 core[patch]: update test to catch circular imports (#23172)
This raises ImportError due to a circular import:
```python
from langchain_core import chat_history
```

This does not:
```python
from langchain_core import runnables
from langchain_core import chat_history
```

Here we update `test_imports` to run each import in a separate
subprocess. Open to other ways of doing this!
2024-06-20 13:52:27 -07:00
Eugene Yurtsev
e3b6ca35ce core[patch]: Add documentation to load namespace (#23143)
Document some of the modules within the load namespace
2024-06-20 13:52:27 -07:00
Eugene Yurtsev
c09c8b79ae core[patch]: Add doc-string to document compressor (#23085) 2024-06-20 13:52:27 -07:00
Eugene Yurtsev
1843d85fce community[patch]: Prevent unit tests from making network requests (#23180)
* Prevent unit tests from making network requests
2024-06-20 13:52:27 -07:00
ccurme
3676f642dc community: move test to integration tests (#23178)
Tests failing on master with

> FAILED
tests/unit_tests/embeddings/test_ovhcloud.py::test_ovhcloud_embed_documents
- ValueError: Request failed with status code: 401, {"message":"Bad
token; invalid JSON"}
2024-06-20 13:52:27 -07:00
Eugene Yurtsev
8f42570f23 core[patch]: Expand documentation in the indexing namespace (#23134) 2024-06-20 13:52:27 -07:00
Eugene Yurtsev
93179c26ef core[patch]: Document embeddings namespace (#23132)
Document embeddings namespace
2024-06-20 13:52:27 -07:00
Eugene Yurtsev
eab3159589 core[patch]: Update documentation in LLM namespace (#23138)
Update documentation in lllm namespace.
2024-06-20 13:52:27 -07:00
Leonid Ganeline
53183c166e ai21: docstrings (#23142)
Added missed docstrings. Format docstrings to the consistent format
(used in the API Reference)
2024-06-20 13:52:27 -07:00
bilk0h
3eed107f3a text-splitters: Fix/recursive json splitter data persistence issue (#21529)
Thank you for contributing to LangChain!

**Description:** Noticed an issue with when I was calling
`RecursiveJsonSplitter().split_json()` multiple times that I was getting
weird results. I found an issue where `chunks` list in the `_json_split`
method. If chunks is not provided when _json_split (which is the case
when split_json calls _json_split) then the same list is used for
subsequent calls to `_json_split`.


You can see this in the test case i also added to this commit.

Output should be: 
```
[{'a': 1, 'b': 2}]
[{'c': 3, 'd': 4}]
```

Instead you get:
```
[{'a': 1, 'b': 2}]
[{'a': 1, 'b': 2, 'c': 3, 'd': 4}]
```

---------

Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-06-20 13:52:27 -07:00
鹿鹿鹿鲨
32628ec5f5 community: add **request_kwargs and expect TimeError AsyncHtmlLoader (#23068)
- **Description:** add `**request_kwargs` and expect `TimeError` in
`_fetch` function for AsyncHtmlLoader. This allows you to fill in the
kwargs parameter when using the `load()` method of the `AsyncHtmlLoader`
class.

Co-authored-by: Yucolu <yucolu@tencent.com>
2024-06-20 13:52:27 -07:00
Leonid Ganeline
8225f75102 ibm: docstrings (#23149)
Added missed docstrings. Format docstrings to the consistent format
(used in the API Reference)
2024-06-20 13:52:27 -07:00
Ryan Elston
f9b99411c0 text-splitters: Introduce Experimental Markdown Syntax Splitter (#22257)
#### Description
This MR defines a `ExperimentalMarkdownSyntaxTextSplitter` class. The
main goal is to replicate the functionality of the original
`MarkdownHeaderTextSplitter` which extracts the header stack as metadata
but with one critical difference: it keeps the whitespace of the
original text intact.

This draft reimplements the `MarkdownHeaderTextSplitter` with a very
different algorithmic approach. Instead of marking up each line of the
text individually and aggregating them back together into chunks, this
method builds each chunk sequentially and applies the metadata to each
chunk. This makes the implementation simpler. However, since it's
designed to keep white space intact its not a full drop in replacement
for the original. Since it is a radical implementation change to the
original code and I would like to get feedback to see if this is a
worthwhile replacement, should be it's own class, or is not a good idea
at all.

Note: I implemented the `return_each_line` parameter but I don't think
it's a necessary feature. I'd prefer to remove it.

This implementation also adds the following additional features:
- Splits out code blocks and includes the language in the `"Code"`
metadata key
- Splits text on the horizontal rule `---` as well
- The `headers_to_split_on` parameter is now optional - with sensible
defaults that can be overridden.

#### Issue
Keeping the whitespace keeps the paragraphs structure and the formatting
of the code blocks intact which allows the caller much more flexibility
in how they want to further split the individuals sections of the
resulting documents. This addresses the issues brought up by the
community in the following issues:
- https://github.com/langchain-ai/langchain/issues/20823
- https://github.com/langchain-ai/langchain/issues/19436
- https://github.com/langchain-ai/langchain/issues/22256

#### Dependencies
N/A

#### Twitter handle
@RyanElston

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-06-20 13:52:27 -07:00
Bagatur
38da833159 anthropic[patch]: test image input (#23155) 2024-06-20 13:52:27 -07:00
Leonid Ganeline
5fa5a976ba anthropic: docstrings (#23145)
Added missed docstrings. Format docstrings to the consistent format
(used in the API Reference)
2024-06-20 13:52:27 -07:00
Bagatur
86bea8a671 openai[patch], standard-tests[patch]: don't pass in falsey stop vals (#23153)
adds an image input test to standard-tests as well
2024-06-20 13:52:27 -07:00