Commit Graph

10001 Commits

Author SHA1 Message Date
Philippe PRADOS
b61de9728e community[minor]: Fix long_context_reorder.py async (#22839)
Implement `async def atransform_documents( self, documents:
Sequence[Document], **kwargs: Any ) -> Sequence[Document]` for
`LongContextReorder`
2024-06-14 13:55:18 -04:00
Eugene Yurtsev
c72bcda4f2 community[major], experimental[patch]: Remove Python REPL from community (#22904)
Remove the REPL from community, and suggest an alternative import from
langchain_experimental.

Fix for this issue:
https://github.com/langchain-ai/langchain/issues/14345

This is not a bug in the code or an actual security risk. The python
REPL itself is behaving as expected.

The PR is done to appease blanket security policies that are just
looking for the presence of exec in the code.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-14 17:53:29 +00:00
Eugene Yurtsev
9a877c7adb community[patch]: SitemapLoader restrict depth of parsing sitemap (CVE-2024-2965) (#22903)
This PR restricts the depth to which the sitemap can be parsed.

Fix for: CVE-2024-2965
2024-06-14 13:04:40 -04:00
Eugene Yurtsev
4a77a3ab19 core[patch]: fix validation of @deprecated decorator (#22513)
This PR moves the validation of the decorator to a better place to avoid
creating bugs while deprecating code.

Prevent issues like this from arising:
https://github.com/langchain-ai/langchain/issues/22510

we should replace with a linter at some point that just does static
analysis
2024-06-14 16:52:30 +00:00
Jacob Lee
181a61982f anthropic[minor]: Adds streaming tool call support for Anthropic (#22687)
Preserves string content chunks for non tool call requests for
convenience.

One thing - Anthropic events look like this:

```
RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
RawContentBlockDeltaEvent(delta=TextDelta(text='<thinking>\nThe', type='text_delta'), index=0, type='content_block_delta')
RawContentBlockDeltaEvent(delta=TextDelta(text=' provide', type='text_delta'), index=0, type='content_block_delta')
...
RawContentBlockStartEvent(content_block=ToolUseBlock(id='toolu_01GJ6x2ddcMG3psDNNe4eDqb', input={}, name='get_weather', type='tool_use'), index=1, type='content_block_start')
RawContentBlockDeltaEvent(delta=InputJsonDelta(partial_json='', type='input_json_delta'), index=1, type='content_block_delta')
```

Note that `delta` has a `type` field. With this implementation, I'm
dropping it because `merge_list` behavior will concatenate strings.

We currently have `index` as a special field when merging lists, would
it be worth adding `type` too?

If so, what do we set as a context block chunk? `text` vs.
`text_delta`/`tool_use` vs `input_json_delta`?

CC @ccurme @efriis @baskaryan
2024-06-14 09:14:43 -07:00
ccurme
f40b2c6f9d fireworks[patch]: add usage_metadata to (a)invoke and (a)stream (#22906) 2024-06-14 12:07:19 -04:00
Mohammad Mohtashim
d1b7a934aa [Community]: HuggingFaceCrossEncoder score accounting for <not-relevant score,relevant score> pairs. (#22578)
- **Description:** Some of the Cross-Encoder models provide scores in
pairs, i.e., <not-relevant score (higher means the document is less
relevant to the query), relevant score (higher means the document is
more relevant to the query)>. However, the `HuggingFaceCrossEncoder`
`score` method does not currently take into account the pair situation.
This PR addresses this issue by modifying the method to consider only
the relevant score if score is being provided in pair. The reason for
focusing on the relevant score is that the compressors select the top-n
documents based on relevance.
    - **Issue:** #22556 
- Please also refer to this
[comment](https://github.com/UKPLab/sentence-transformers/issues/568#issuecomment-729153075)
2024-06-14 08:28:24 -07:00
Baskar Gopinath
83643cbdfe docs: Fix typo in tutorial about structured data extraction (#22888)
[Fixed typo](docs: Fix typo in tutorial about structured data
extraction)
2024-06-14 15:19:55 +00:00
Thanh Nguyen
b5e2ba3a47 community[minor]: add chat model llamacpp (#22589)
- **PR title**: [community] add chat model llamacpp


- **PR message**:
- **Description:** This PR introduces a new chat model integration with
llamacpp_python, designed to work similarly to the existing ChatOpenAI
model.
      + Work well with instructed chat, chain and function/tool calling.
+ Work with LangGraph (persistent memory, tool calling), will update
soon

- **Dependencies:** This change requires the llamacpp_python library to
be installed.
    
@baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-14 14:51:43 +00:00
Bagatur
e4279f80cd docs: doc loader feat table alignment (#22900) 2024-06-14 14:25:01 +00:00
Isaac Francisco
984c7a9d42 docs: generate table for document loaders (#22871)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-14 07:03:27 -07:00
Jacob Lee
8e89178047 docs[patch]: Expand embeddings docs (#22881) 2024-06-13 23:06:07 -07:00
ccurme
73c76b9628 anthropic[patch]: always add tool_result type to ToolMessage content (#22721)
Anthropic tool results can contain image data, which are typically
represented with content blocks having `"type": "image"`. Currently,
these content blocks are passed as-is as human/user messages to
Anthropic, which raises BadRequestError as it expects a tool_result
block to follow a tool_use.

Here we update ChatAnthropic to nest the content blocks inside a
tool_result content block.

Example:
```python
import base64

import httpx
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import AIMessage, HumanMessage, ToolMessage
from langchain_core.pydantic_v1 import BaseModel, Field


# Fetch image
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
image_data = base64.b64encode(httpx.get(image_url).content).decode("utf-8")


class FetchImage(BaseModel):
    should_fetch: bool = Field(..., description="Whether an image is requested.")


llm = ChatAnthropic(model="claude-3-sonnet-20240229").bind_tools([FetchImage])

messages = [
    HumanMessage(content="Could you summon a beautiful image please?"),
    AIMessage(
        content=[
            {
                "type": "tool_use",
                "id": "toolu_01Rn6Qvj5m7955x9m9Pfxbcx",
                "name": "FetchImage",
                "input": {"should_fetch": True},
            },
        ],
        tool_calls=[
            {
                "name": "FetchImage",
                "args": {"should_fetch": True},
                "id": "toolu_01Rn6Qvj5m7955x9m9Pfxbcx",
            },
        ],
    ),
    ToolMessage(
        name="FetchImage",
        content=[
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/jpeg",
                    "data": image_data,
                },
            },
        ],
        tool_call_id="toolu_01Rn6Qvj5m7955x9m9Pfxbcx",
    ),
]

llm.invoke(messages)
```

Trace:
https://smith.langchain.com/public/d27e4fc1-a96d-41e1-9f52-54f5004122db/r
2024-06-13 20:14:23 -07:00
Lucas Tucker
7114aed78f docs: Standardize ChatGroq (#22751)
Updated ChatGroq doc string as per issue
https://github.com/langchain-ai/langchain/issues/22296:"langchain_groq:
updated docstring for ChatGroq in langchain_groq to match that of the
description (in the appendix) provided in issue
https://github.com/langchain-ai/langchain/issues/22296. "

Issue: This PR is in response to issue
https://github.com/langchain-ai/langchain/issues/22296, and more
specifically the ChatGroq model. In particular, this PR updates the
docstring for langchain/libs/partners/groq/langchain_groq/chat_model.py
by adding the following sections: Instantiate, Invoke, Stream, Async,
Tool calling, Structured Output, and Response metadata. I used the
template from the Anthropic implementation and referenced the Appendix
of the original issue post. I also noted that: `usage_metadata `returns
none for all ChatGroq models I tested; there is no mention of image
input in the ChatGroq documentation; unlike that of ChatHuggingFace,
`.stream(messages)` for ChatGroq returned blocks of output.

---------

Co-authored-by: lucast2021 <lucast2021@headroyce.org>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-14 03:08:36 +00:00
Anush
e002c855bd qdrant[patch]: Use collection_exists API instead of exceptions (#22764)
## Description

Currently, the Qdrant integration relies on exceptions raised by
[`get_collection`
](https://qdrant.tech/documentation/concepts/collections/#collection-info)
to check if a collection exists.

Using
[`collection_exists`](https://qdrant.tech/documentation/concepts/collections/#check-collection-existence)
is recommended to avoid missing any unhandled exceptions. This PR
addresses this.

## Testing
All integration and unit tests pass. No user-facing changes.
langchain-qdrant==0.1.1
2024-06-13 20:01:32 -07:00
Anindyadeep
c417803908 community[minor]: Prem Templates (#22783)
This PR adds the feature add Prem Template feature in ChatPremAI.
Additionally it fixes a minor bug for API auth error when API passed
through arguments.
2024-06-13 19:59:28 -07:00
Stefano Lottini
4160b700e6 docs: Astra DB vectorstore, adjust syntax for automatic-embedding example (#22833)
Description: Adjusting the syntax for creating the vectorstore
collection (in the case of automatic embedding computation) for the most
idiomatic way to submit the stored secret name.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-14 02:52:32 +00:00
maang-h
1055b9a309 community[minor]: Implement ZhipuAIEmbeddings interface (#22821)
- **Description:** Implement ZhipuAIEmbeddings interface, include:
     - The `embed_query` method
     - The `embed_documents` method

refer to [ZhipuAI
Embedding-2](https://open.bigmodel.cn/dev/api#text_embedding)

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-06-13 19:45:11 -07:00
Leonid Ganeline
46c9784127 docs: ReAct reference (#22830)
The `ReAct` is used all across LangChain but it is not referenced
properly.
Added references to the original paper.
2024-06-13 19:39:28 -07:00
Giacomo Berardi
712aa0c529 docs: fixes for Elasticsearch integrations, cache doc and providers list (#22817)
Some minor fixes in the documentation:
 - ElasticsearchCache initilization is now correct
 - List of integrations for ES updated
2024-06-13 19:39:10 -07:00
Isaac Francisco
f9a6d5c845 infra: lint new docs to match doc loader template (#22867) 2024-06-13 19:34:50 -07:00
Bagatur
8bd368d07e cli[patch]: Release 0.0.25 (#22876) langchain-cli==0.0.25 2024-06-14 02:31:04 +00:00
Isaac Francisco
75e966a2fa docs, cli[patch]: document loaders doc template (#22862)
From: https://github.com/langchain-ai/langchain/pull/22290

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-13 19:28:57 -07:00
Hayden Wolff
d1cdde267a docs: update NVIDIA Riva tool to use NVIDIA NIM for LLM (#22873)
**Description:**
Update the NVIDIA Riva tool documentation to use NVIDIA NIM for the LLM.
Show how to use NVIDIA NIMs and link to documentation for LangChain with
NIM.

---------

Co-authored-by: Hayden Wolff <hwolff@nvidia.com>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-06-13 19:26:05 -07:00
Zeeshan Qureshi
ada1e5cc64 docs: s/path_images/images/ for ImageCaptionLoader keyword arguments (#22857)
Quick update to `ImageCaptionLoader` documentation to reflect what's in
code.
2024-06-13 18:37:12 -07:00
liuzc9
41e232cb82 Fix typo in vearch.md (#22840)
Fix typo
2024-06-13 18:24:51 -07:00
Kagura Chen
57783c5e55 Fix: lint errors and update Field alias in models.py and AutoSelectionScorer initialization (#22846)
This PR addresses several lint errors in the core package of LangChain.
Specifically, the following issues were fixed:

1.Unexpected keyword argument "required" for "Field"  [call-arg]
2.tests/integration_tests/chains/test_cpal.py:263: error: Unexpected
keyword argument "narrative_input" for "QueryModel" [call-arg]
2024-06-13 18:18:00 -07:00
Erick Friis
5bc774827b langchain: release 0.2.4 (#22872) langchain==0.2.4 2024-06-14 00:14:48 +00:00
Erick Friis
7234fd0f51 core: release 0.2.6 (#22868) langchain-core==0.2.6 2024-06-13 22:22:34 +00:00
Jacob Lee
bcbb43480c core[patch]: Treat type as a special field when merging lists (#22750)
Should we even log a warning? At least for Anthropic, it's expected to
get e.g. `text_block` followed by `text_delta`.

@ccurme @baskaryan @efriis
2024-06-13 15:08:24 -07:00
Nuno Campos
bae82e966a core: In astream_events v2 propagate cancel/break to the inner astream call (#22865)
- previous behavior was for the inner astream to continue running with
no interruption
- also propagate break in core runnable methods
2024-06-13 15:02:48 -07:00
Eugene Yurtsev
a766815a99 experimental[patch]/docs[patch]: Update links to security docs (#22864)
Minor update to newest version of security docs (content should be
identical).
2024-06-13 20:29:34 +00:00
Eugene Yurtsev
8f7cc73817 ci: Add script to check for pickle usage in community (#22863)
Add script to check for pickle usage in community.
2024-06-13 16:13:15 -04:00
Eugene Yurtsev
77209f315e community[patch]: FAISS VectorStore deserializer should be opt-in (#22861)
FAISS deserializer uses pickle module. Users have to opt-in to
de-serialize.
2024-06-13 15:48:13 -04:00
Eugene Yurtsev
ce0b0f22a1 experimental[major]: Force users to opt-in into code that relies on the python repl (#22860)
This should make it obvious that a few of the agents in langchain
experimental rely on the python REPL as a tool under the hood, and will
force users to opt-in.
2024-06-13 15:41:24 -04:00
Isaac Francisco
869523ad72 [docs]: added info for TavilySearchResults (#22765) 2024-06-13 12:14:11 -07:00
ccurme
42257b120f partners: fix numpy dep (#22858)
Following https://github.com/langchain-ai/langchain/pull/22813, which
added python 3.12 to CI, here we update numpy accordingly in partner
packages.
2024-06-13 14:46:42 -04:00
Isaac Francisco
345fd3a556 minor functionality change: adding API functionality to tavilysearch (#22761) 2024-06-13 11:10:28 -07:00
Isaac Francisco
034257e9bf docs: improved recursive url loader docs (#22648)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-13 11:09:35 -07:00
Isaac Francisco
e832bbb486 [docs]: bind tools (#22831) 2024-06-13 09:50:43 -07:00
ccurme
b626c3ca23 groq[patch]: add usage_metadata to (a)invoke and (a)stream (#22834) 2024-06-13 10:26:27 -04:00
Jacob Lee
e01e5d5a91 docs[patch]: Improve Groq integration page (#22844)
Was bare bones and got marked by folks as unhelpful.

CC @efriis @colemccracken
2024-06-13 03:40:29 -07:00
Jacob Lee
12eff6a130 docs[patch]: Readd Pydantic compatibility docs (#22836)
As a how-to guide.

CC @eyurtsev @hwchase17
2024-06-13 02:56:10 -07:00
Jacob Lee
cb654a3245 docs[patch]: Adds multimodal column to chat models table, move up in concepts (#22837)
CC @hwchase17 @baskaryan
2024-06-13 02:26:55 -07:00
James Braza
45b394268c core[patch]: allowing latest packaging versions (#22792)
Allowing version 24 of https://github.com/pypa/packaging

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-12 23:22:20 +00:00
Jacob Lee
00ad197502 docs[patch]: Add structured output to conceptual docs (#22791)
This downgrades `Function/tool calling` from a h3 to an h4 which means
it'll no longer show up in the right sidebar, but any direct links will
still work. I think that is ok, but LMK if you disapprove.

CC @hwchase17 @eyurtsev @rlancemartin
2024-06-12 15:30:51 -07:00
Karim Lalani
276be6cdd4 [experimental][llms][OllamaFunctions] tool calling related fixes (#22339)
Fixes issues with tool calling to handle tool objects correctly. Added
support to handle ToolMessage correctly.
Added additional checks for error conditions.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-12 16:34:43 -04:00
Christophe Bornet
d04e899b56 ci: add testing with Python 3.12 (#22813)
We need to use a different version of numpy for py3.8 and py3.12 in
pyproject.
And so do projects that use that Python version range and import
langchain.

    - **Twitter handle:** _cbornet
2024-06-12 16:31:36 -04:00
HyoJin Kang
b6bf2bb234 community[patch]: fix database uri type in SQLDatabase (#22661)
**Description**
sqlalchemy uses "sqlalchemy.engine.URL" type for db uri argument.
Added 'URL' type for compatibility.

**Issue**: None

**Dependencies:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-12 15:11:00 -04:00
Eugene Yurtsev
5dbbdcbf8e core[patch]: Update remaining root_validators (#22829)
This PR updates the remaining root_validators in core to either be explicit pre-init or post-init validators.
2024-06-12 14:47:40 -04:00