Compare commits


94 Commits

Author SHA1 Message Date
Eugene Yurtsev
8e5074d82d core: release 0.3.36 (#29869)
Release 0.3.36
2025-02-18 19:51:43 +00:00
Vadym Barda
d04fa1ae50 core[patch]: allow passing JSON schema as args_schema to tools (#29812) 2025-02-18 14:44:31 -05:00
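A minimal sketch of what this enables, assuming `StructuredTool.from_function` now accepts a plain JSON schema dict where a Pydantic model was previously required (the schema and function here are illustrative):
```python
from langchain_core.tools import StructuredTool

# Illustrative JSON schema for the tool's arguments.
multiply_schema = {
    "title": "multiply",
    "description": "Multiply two integers.",
    "type": "object",
    "properties": {
        "a": {"type": "integer"},
        "b": {"type": "integer"},
    },
    "required": ["a", "b"],
}


def multiply(a: int, b: int) -> int:
    return a * b


# After this change, the JSON schema dict can be passed as args_schema
# in place of a Pydantic model.
tool = StructuredTool.from_function(
    func=multiply, name="multiply", args_schema=multiply_schema
)
```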
ccurme
5034a8dc5c xai[patch]: release 0.2.1 (#29854) 2025-02-17 14:30:41 -05:00
ccurme
83dcef234d xai[patch]: support dedicated structured output feature (#29853)
https://docs.x.ai/docs/guides/structured-outputs

Interface appears identical to OpenAI's.
```python
from langchain.chat_models import init_chat_model
from pydantic import BaseModel

class Joke(BaseModel):
    setup: str
    punchline: str

llm = init_chat_model("xai:grok-2").with_structured_output(
    Joke, method="json_schema"
)
llm.invoke("Tell me a joke about cats.")
```
2025-02-17 14:19:51 -05:00
ccurme
9d6fcd0bfb infra: add xai to scheduled testing (#29852) 2025-02-17 18:59:45 +00:00
ccurme
8a3b05ae69 langchain[patch]: release 0.3.19 (#29851) 2025-02-17 13:36:23 -05:00
ccurme
c9061162a1 langchain[patch]: add xai to extras (#29850) 2025-02-17 17:49:34 +00:00
Bagatur
1acf57e9bd langchain[patch]: init_chat_model xai support (#29849) 2025-02-17 09:45:39 -08:00
Paul Nikonowicz
1a55da9ff4 docs: Update gemini vector docs (#29841)
# Description

Two changes:
1. Removes `getpass` from the code example, since it reads from stdin and
causes the notebook to freeze.
2. Updates the example to the latest Gemini model.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-17 07:54:23 -05:00
hsm207
037b129b86 weaviate: Add-deprecation-warning (#29757)
- **Description:** add deprecation warning when using weaviate from
langchain_community
  - **Issue:** NA
  - **Dependencies:** NA
  - **Twitter handle:** NA

---------

Signed-off-by: hsm207 <hsm207@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-16 21:42:18 -05:00
Đỗ Quang Minh
cd198ac9ed community: add custom model for OpenAIWhisperParser (#29831)
Add a `model` parameter to `OpenAIWhisperParser`, defaulting to `whisper-1`
(the previous hard-coded value).
Please help me update the docs and other related components of this
repo.
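A quick sketch of the new parameter; the API key value is a placeholder, and `model` is the newly added keyword:
```python
from langchain_community.document_loaders.parsers.audio import OpenAIWhisperParser

# "whisper-1" preserves the previous behavior.
parser = OpenAIWhisperParser(api_key="sk-...", model="whisper-1")
```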
2025-02-16 21:26:07 -05:00
Cole McIntosh
6874c9c1d0 docs: add notebook for langchain-salesforce package (#29800)
**Description:**  
This PR adds a Jupyter notebook that explains the features,
installation, and usage of the
[`langchain-salesforce`](https://github.com/colesmcintosh/langchain-salesforce)
package. The notebook includes:
- Setup instructions for configuring Salesforce credentials  
- Example code demonstrating common operations such as querying,
describing objects, creating, updating, and deleting records

**Issue:**  
N/A

**Dependencies:**  
No new dependencies are required.

**Tests and Docs:**  
- Added an example notebook demonstrating the usage of the
`langchain-salesforce` package, located in `docs/docs/integrations`.

**Lint and Test:**  
- Ran `make format`, `make lint`, and `make test` successfully.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-16 08:34:57 -05:00
Jan Heimes
60f58df5b3 community: add top_k as param to Needle Retriever (#29821)
This PR adds `top_k` as a parameter to the Needle Retriever. By default we use
top 10. See the sketch below.
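A hedged sketch of the new parameter; apart from `top_k`, the constructor arguments below are assumptions for illustration:
```python
from langchain_community.retrievers.needle import NeedleRetriever

retriever = NeedleRetriever(
    needle_api_key="...",         # assumed credential parameter
    collection_ids=["coll-123"],  # assumed collection selector
    top_k=5,                      # new: overrides the default of 10
)
docs = retriever.invoke("What does Needle index?")
```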
2025-02-16 08:30:52 -05:00
Mateusz Szewczyk
8147679169 docs: Rename IBM product name to IBM watsonx (#29802)
Rename IBM product name to `IBM watsonx`.
2025-02-15 21:48:02 -05:00
Jesus Fernandez Bes
1dfac909d8 community: Adding IN Operator to AzureCosmosDBNoSQLVectorStore (#29805)
- **Description:** I have added a new operator to the operator map, with key
`$in` and value `IN`, so that you can define filters using lists as values.
This was already contemplated, but because the IN operator was missing from
the map, such filters could not be used (see the sketch below).
- **Issue**: Fixes #29804.
- **Dependencies**: No extra.
2025-02-15 21:44:54 -05:00
Wahed Hemati
8901b113c3 docs: add Discord integration docs (#29822)
This PR adds documentation for the `langchain-discord-shikenso`
integration, including an example notebook at
`docs/docs/integrations/tools/discord.ipynb` and updates to
`libs/packages.yml` to track the new package.

  **Issue:**  
  N/A

  **Dependencies:**  
  None

  **Twitter handle:**  
  N/A

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-15 21:43:45 -05:00
Akmal Ali Jasmin
f1792e486e fix: Correct getpass usage in Google Generative AI Embedding docs (#29809) (#29810)
**fix: Correct getpass usage in Google Generative AI Embedding docs
(#29809)**

- **Description:** Corrected the `getpass` usage in the Google
Generative AI Embedding documentation by replacing `getpass()` with
`getpass.getpass()` to fix the `TypeError`.
- **Issue:** #29809  
- **Dependencies:** None  

**Additional Notes:**  
The change ensures compatibility with Google Colab and follows Python's
`getpass` module usage standards.
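For reference, the corrected pattern looks like this:
```python
import getpass
import os

# getpass is the module; getpass.getpass() is the function that prompts.
os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter your Google API key: ")
```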
2025-02-15 21:41:00 -05:00
HackHuang
80ca310c15 langchain : Add the full code snippet in rag.ipynb (#29820)
docs(rag.ipynb): Add `full code` snippets; they are necessary and useful for
beginners who want to run the tutorial end to end.

Preview the change :
https://langchain-git-fork-googtech-patch-3-langchain.vercel.app/docs/tutorials/rag/

Two `full code` snippets are added below:
<details>
<summary>Full Code:</summary>

```python
import bs4
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.chat_models import init_chat_model
from langchain_openai import OpenAIEmbeddings
from langchain_core.vectorstores import InMemoryVectorStore
from google.colab import userdata
from langchain_core.prompts import PromptTemplate
from langchain_core.documents import Document
from typing_extensions import List, TypedDict
from langgraph.graph import START, StateGraph

#################################################
# 1.Initialize the ChatModel and EmbeddingModel #
#################################################
llm = init_chat_model(
    model="gpt-4o-mini",
    model_provider="openai",
    openai_api_key=userdata.get('OPENAI_API_KEY'),
    base_url=userdata.get('BASE_URL'),
)
embeddings = OpenAIEmbeddings(
    model="text-embedding-3-large",
    openai_api_key=userdata.get('OPENAI_API_KEY'),
    base_url=userdata.get('BASE_URL'),
)

#######################
# 2.Loading documents #
#######################
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        # Only keep post title, headers, and content from the full HTML.
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

#########################
# 3.Splitting documents #
#########################
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # chunk size (characters)
    chunk_overlap=200,  # chunk overlap (characters)
    add_start_index=True,  # track index in original document
)
all_splits = text_splitter.split_documents(docs)

###########################################################
# 4.Embedding documents and storing them in a vectorstore #
###########################################################
vector_store = InMemoryVectorStore(embeddings)
_ = vector_store.add_documents(documents=all_splits)

##########################################################
# 5.Customizing the prompt or loading it from Prompt Hub #
##########################################################
# prompt = hub.pull("rlm/rag-prompt") # load the prompt from the prompt-hub
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
Always say "thanks for asking!" at the end of the answer.

{context}

Question: {question}

Helpful Answer:"""
prompt = PromptTemplate.from_template(template)

##################################################################################################
# 5.Using LangGraph to tie together the retrieval and generation steps into a single application #
##################################################################################################
# 5.1.Define the state of the application, which holds the application data
class State(TypedDict):
    question: str
    context: List[Document]
    answer: str

# 5.2.1.Define a node of the application, representing one application step
def retrieve(state: State):
    retrieved_docs = vector_store.similarity_search(state["question"])
    return {"context": retrieved_docs}

# 5.2.2.Define a node of the application, representing one application step
def generate(state: State):
    docs_content = "\n\n".join(doc.page_content for doc in state["context"])
    messages = prompt.invoke({"question": state["question"], "context": docs_content})
    response = llm.invoke(messages)
    return {"answer": response.content}

# 6.Define the "control flow" of application, which signifies the ordering of the application steps
graph_builder = StateGraph(State).add_sequence([retrieve, generate])
graph_builder.add_edge(START, "retrieve")
graph = graph_builder.compile()
```

</details>

<details>
<summary>Full Code:</summary>

```python
import bs4
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.chat_models import init_chat_model
from langchain_openai import OpenAIEmbeddings
from langchain_core.vectorstores import InMemoryVectorStore
from google.colab import userdata
from langchain_core.prompts import PromptTemplate
from langchain_core.documents import Document
from typing_extensions import List, TypedDict
from langgraph.graph import START, StateGraph
from typing import Literal
from typing_extensions import Annotated

#################################################
# 1.Initialize the ChatModel and EmbeddingModel #
#################################################
llm = init_chat_model(
    model="gpt-4o-mini",
    model_provider="openai",
    openai_api_key=userdata.get('OPENAI_API_KEY'),
    base_url=userdata.get('BASE_URL'),
)
embeddings = OpenAIEmbeddings(
    model="text-embedding-3-large",
    openai_api_key=userdata.get('OPENAI_API_KEY'),
    base_url=userdata.get('BASE_URL'),
)

#######################
# 2.Loading documents #
#######################
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        # Only keep post title, headers, and content from the full HTML.
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

#########################
# 3.Splitting documents #
#########################
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # chunk size (characters)
    chunk_overlap=200,  # chunk overlap (characters)
    add_start_index=True,  # track index in original document
)
all_splits = text_splitter.split_documents(docs)

# Search analysis: Add some metadata to the documents in our vector store,
# so that we can filter on section later. 
total_documents = len(all_splits)
third = total_documents // 3
for i, document in enumerate(all_splits):
    if i < third:
        document.metadata["section"] = "beginning"
    elif i < 2 * third:
        document.metadata["section"] = "middle"
    else:
        document.metadata["section"] = "end"

# Search analysis: Define the schema for our search query
class Search(TypedDict):
    query: Annotated[str, ..., "Search query to run."]
    section: Annotated[
        Literal["beginning", "middle", "end"], ..., "Section to query."]

###########################################################
# 4.Embedding documents and storing them in a vectorstore #
###########################################################
vector_store = InMemoryVectorStore(embeddings)
_ = vector_store.add_documents(documents=all_splits)

##########################################################
# 5.Customizing the prompt or loading it from Prompt Hub #
##########################################################
# prompt = hub.pull("rlm/rag-prompt") # load the prompt from the prompt-hub
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
Always say "thanks for asking!" at the end of the answer.

{context}

Question: {question}

Helpful Answer:"""
prompt = PromptTemplate.from_template(template)

###################################################################
# 5.Using LangGraph to tie together the analyze_query, retrieval  #
# and generation steps into a single application                  #
###################################################################
# 5.1.Define the state of the application, which holds the application data
class State(TypedDict):
    question: str
    query: Search
    context: List[Document]
    answer: str

# Search analysis: Define a node of the application
# that generates a structured query from the user's raw input
def analyze_query(state: State):
    structured_llm = llm.with_structured_output(Search)
    query = structured_llm.invoke(state["question"])
    return {"query": query}

# 5.2.1.Define a node of the application, representing one application step
def retrieve(state: State):
    query = state["query"]
    retrieved_docs = vector_store.similarity_search(
        query["query"],
        filter=lambda doc: doc.metadata.get("section") == query["section"],
    )
    return {"context": retrieved_docs}

# 5.2.2.Define a node of the application, representing one application step
def generate(state: State):
    docs_content = "\n\n".join(doc.page_content for doc in state["context"])
    messages = prompt.invoke({"question": state["question"], "context": docs_content})
    response = llm.invoke(messages)
    return {"answer": response.content}

# 6.Define the "control flow" of application, which signifies the ordering of the application steps
graph_builder = StateGraph(State).add_sequence([analyze_query, retrieve, generate]) 
graph_builder.add_edge(START, "analyze_query")
graph = graph_builder.compile()
```

</details>

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-15 21:37:58 -05:00
Michael Chin
b2c21f3e57 docs: Update SagemakerEndpoint examples (#29814)
Related issue: https://github.com/langchain-ai/langchain-aws/issues/361

Updated the AWS `SagemakerEndpoint` LLM documentation to import from
`langchain-aws`.
2025-02-15 21:34:56 -05:00
Krishna Kulkarni
a98c5f1c4b langchain_community: add image support to DuckDuckGoSearchAPIWrapper (#29816)

- **Description:** This PR enhances the DuckDuckGoSearchAPIWrapper
within the langchain_community package by introducing support for image
searches. The enhancement includes:
  - Adding a new method _ddgs_images to handle image search queries.
- Updating the run and results methods to process and return image
search results appropriately.
- Modifying the source parameter to accept "images" as a valid option,
alongside "text" and "news".
- **Dependencies:** No additional dependencies are required for this
change.
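A short sketch of the new option; the wrapper is constructed as before, with `source` switched to `"images"`:
```python
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper

# "images" joins "text" and "news" as a valid source after this change.
wrapper = DuckDuckGoSearchAPIWrapper(source="images")
results = wrapper.results("golden retriever puppies", max_results=5)
for result in results:
    print(result)
```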
2025-02-15 21:32:14 -05:00
Iris Liu
0d9f0b4215 docs: updates Chroma integration API ref docs (#29826)
- Description: updates Chroma integration API ref docs
- Issue: #29817
- Dependencies: N/A
- Twitter handle: @irieliu

Co-authored-by: Iris <liuirisny@gmail.com>
2025-02-15 21:05:21 -05:00
ccurme
3fe7c07394 openai[patch]: release 0.3.6 (#29824) 2025-02-15 13:53:35 -05:00
ccurme
65a6dce428 openai[patch]: enable streaming for o1 (#29823)
Verified streaming works for the `o1-2024-12-17` snapshot as well.
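Streaming can then be used as with other chat models, e.g.:
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="o1")
for chunk in llm.stream("Summarize the benefits of streaming output."):
    print(chunk.content, end="", flush=True)
```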
2025-02-15 12:42:05 -05:00
Christophe Bornet
3dffee3d0b all: Bump blockbuster version to 1.5.18 (#29806)
Has fixes for running on Windows and non-CPython runtimes.
2025-02-14 07:55:38 -08:00
ccurme
d9a069c414 tests[patch]: release 0.3.12 (#29797) 2025-02-13 23:57:44 +00:00
ccurme
e4f106ea62 groq[patch]: remove xfails (#29794)
These appear to pass.
2025-02-13 15:49:50 -08:00
Erick Friis
f34e62ef42 packages: add langchain-xai (#29795)
wasn't registered per the contribution guide:
https://python.langchain.com/docs/contributing/how_to/integrations/
2025-02-13 15:36:41 -08:00
ccurme
49cc6106f7 tests[patch]: fix query for test_tool_calling_with_no_arguments (#29793) 2025-02-13 23:15:52 +00:00
Erick Friis
1a225fad03 multiple: fix uv path deps (#29790)
The file:// format wasn't working with updates; it doesn't install as an
editable dep.

Moved to tool.uv.sources with path= instead.
2025-02-13 21:32:34 +00:00
Erick Friis
ff13384eb6 packages: update counts, add command (#29789) 2025-02-13 20:45:25 +00:00
Mateusz Szewczyk
8d0e31cbc5 docs: Fix model_id on EmbeddingTabs page (#29784)
Fix `model_id` in the IBM provider on the EmbeddingTabs page.
2025-02-13 09:41:51 -08:00
Mateusz Szewczyk
61f1be2152 docs: Added IBM to ChatModelTabs and EmbeddingTabs (#29774)
Added IBM to ChatModelTabs and EmbeddingTabs.
2025-02-13 08:43:42 -08:00
HackHuang
76d32754ff core : update the class docs of InMemoryVectorStore in in_memory.py (#29781)
- **Description:** Add a new note about checking `store` to the class docs in
in_memory.py; it's useful for beginners.
```python
Check Documents:
    .. code-block:: python
    
        for doc in vector_store.store.values():
            print(doc)
```

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-13 16:41:47 +00:00
Mateusz Szewczyk
b82cef36a5 docs: Update IBM WatsonxLLM and ChatWatsonx documentation (#29752)
Update the model presented in the `WatsonxLLM` and `ChatWatsonx` documentation.
2025-02-13 08:41:07 -08:00
Mohammad Mohtashim
96ad09fa2d (Community): Added API Key for Jina Search API Wrapper (#29622)
- **Description:** Simple change for adding the API Key for Jina Search
API Wrapper
- **Issue:** #29596
2025-02-12 20:12:07 -08:00
ccurme
f1c66a3040 docs: minor fix to provider table (#29771)
Langfair renders as LangfAIr
2025-02-13 04:06:58 +00:00
Jakub Kopecký
c8cb7c25bf docs: update apify integration (#29553)
**Description:** Fixed and updated Apify integration documentation to
use the new [langchain-apify](https://github.com/apify/langchain-apify)
package.
**Twitter handle:** @apify
2025-02-12 20:02:55 -08:00
ccurme
16fb1f5371 chroma[patch]: release 0.2.2 (#29769)
Resolves https://github.com/langchain-ai/langchain/issues/29765
2025-02-13 02:39:16 +00:00
Mohammad Mohtashim
2310847c0f (Chroma): Small Fix in add_texts when checking for embeddings (#29766)
- **Description:** Small fix in `add_texts` so that embedding nullability is
checked properly.
- **Issue:** #29765

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-13 02:26:13 +00:00
Eric Pinzur
716fd89d8e docs: contributed Graph RAG Retriever integration (#29744)
**Description:** 

This adds the `Graph RAG` Retriever integration documentation, per
https://python.langchain.com/docs/contributing/how_to/integrations/.

* The integration exists in this public repository:
https://github.com/datastax/graph-rag
* We've implemented the standard langchain tests for retrievers:
https://github.com/datastax/graph-rag/blob/main/packages/langchain-graph-retriever/tests/test_langchain.py
* Our integration is published to PyPi:
https://pypi.org/project/langchain-graph-retriever/
2025-02-12 18:25:48 -08:00
Sunish Sheth
f42dafa809 Deprecating sql_database access for creating UC functions for agent tools (#29745)
Co-authored-by: ccurme <chester.curme@gmail.com>
2025-02-13 02:24:44 +00:00
Thor 雷神 Schaeff
a0970d8d7e [WIP] chore: update ElevenLabs tool. (#29722)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-13 01:54:34 +00:00
Chaymae El Aattabi
4b08a7e8e8 Fix #29759: Use local chunk_size_ for looping in embed_documents (#29761)
This fix ensures that the chunk size is correctly determined when
processing text embeddings. Previously, the code did not properly handle
cases where chunk_size was None, potentially leading to incorrect
chunking behavior.

Now, chunk_size_ is explicitly set to either the provided chunk_size or
the default self.chunk_size, ensuring consistent chunking. This update
improves reliability when processing large text inputs in batches and
prevents unintended behavior when chunk_size is not specified.
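A standalone sketch of the pattern described (the embedding call is a placeholder, not the real API):
```python
from typing import List, Optional

DEFAULT_CHUNK_SIZE = 1000  # stand-in for self.chunk_size


def embed_documents(texts: List[str], chunk_size: Optional[int] = None) -> List[List[float]]:
    # Fall back to the configured default when the caller passes None.
    chunk_size_ = chunk_size or DEFAULT_CHUNK_SIZE
    embeddings: List[List[float]] = []
    for i in range(0, len(texts), chunk_size_):
        batch = texts[i : i + chunk_size_]
        embeddings.extend([[0.0]] * len(batch))  # placeholder embedding call
    return embeddings
```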

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-13 01:28:26 +00:00
Jorge Piedrahita Ortiz
1fbc01c350 docs: update sambanova integration api reference links (#29762)
- **Description:** update sambanova external package integration api
reference links in docs
2025-02-12 15:58:00 -08:00
Sunish Sheth
043d78d85d Deprecate langchain community UCFunctionToolkit in favor of databricks_langchain (#29746)
2025-02-12 15:50:35 -08:00
Hugues Chocart
e4eec9e9aa community: add langchain-abso documentation (#29739)
Add the documentation for the community package `langchain-abso`. It
provides a new Chat Model class that uses https://abso.ai

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2025-02-12 19:57:33 +00:00
ccurme
e61f463745 core[patch]: release 0.3.35 (#29764) 2025-02-12 18:13:10 +00:00
Nuno Campos
fe59f2cc88 core: Fix output of convert_messages when called with BaseMessage.model_dump() (#29763)
- additional_kwargs was being nested twice
- for example, response_metadata was placed inside additional_kwargs
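A round-trip sketch of the fixed behavior, using the public `convert_to_messages` helper (the assertion reflects the expectation stated above):
```python
from langchain_core.messages import AIMessage, convert_to_messages

msg = AIMessage(content="hi", response_metadata={"model_name": "test"})
[roundtrip] = convert_to_messages([msg.model_dump()])

# After the fix, response_metadata should come back as a top-level field
# rather than being nested inside additional_kwargs.
assert roundtrip.response_metadata == {"model_name": "test"}
```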
2025-02-12 10:05:33 -08:00
Jacob Lee
f4e3e86fbb feat(langchain): Infer o3 model strings passed to init_chat_model as OpenAI (#29743) 2025-02-11 16:51:41 -08:00
Mohammad Mohtashim
9f3bcee30a (Community): Adding Structured Support for ChatPerplexity (#29361)
- **Description:** Adding Structured Support for ChatPerplexity
- **Issue:** #29357
- This is implemented as per the Perplexity official docs:
https://docs.perplexity.ai/guides/structured-outputs

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-02-11 15:51:18 -08:00
Jawahar S
994c5465e0 feat: add support for IBM WatsonX AI chat models (#29688)
**Description:** Updated init_chat_model to support Granite models
deployed on IBM WatsonX
**Dependencies:**
[langchain-ibm](https://github.com/langchain-ai/langchain-ibm)

Tagging @baskaryan @efriis for review when you get a chance.
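A hedged sketch; the provider key and model id below are assumptions for illustration, and `langchain-ibm` plus WatsonX credentials are required:
```python
from langchain.chat_models import init_chat_model

# Model id and provider key are illustrative, not verified values.
llm = init_chat_model("ibm/granite-13b-chat-v2", model_provider="ibm")
llm.invoke("Hello, Granite!")
```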
2025-02-11 15:34:29 -08:00
Shailendra Mishra
c7d74eb7a3 Oraclevs integration (#29723)
community: langchain_community/vectorstore/oraclevs.py

- **Description:** Refactored code to allow either a connection or a
connection pool.
- **Issue:** Normally an idle connection is terminated by the server-side
listener at timeout, so a user has to re-instantiate the vector store; the
timeout for a plain connection is not configurable. The solution is to use a
connection pool, where a user can specify a user-defined timeout and the
connections are managed by the pool.
- **Dependencies:** None

This is not a new integration. A user can pass either a connection or a
connection pool; the determination of what was passed is made at run time.
Everything works as before. Lint and test: already done.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-11 14:56:55 -08:00
ccurme
42ebf6ae0c deepseek[patch]: release 0.1.2 (#29742) 2025-02-11 11:53:43 -08:00
ccurme
ec55553807 pinecone[patch]: release 0.2.3 (#29741) 2025-02-11 19:27:39 +00:00
ccurme
001cf99253 pinecone[patch]: add support for python 3.13 (#29737) 2025-02-11 11:20:21 -08:00
ccurme
ba8f752bf5 openai[patch]: release 0.3.5 (#29740) 2025-02-11 19:20:11 +00:00
ccurme
9477f49409 openai, deepseek: make _convert_chunk_to_generation_chunk an instance method (#29731)
1. Make `_convert_chunk_to_generation_chunk` an instance method on
BaseChatOpenAI
2. Override on ChatDeepSeek to add `"reasoning_content"` to message
additional_kwargs.

Resolves https://github.com/langchain-ai/langchain/issues/29513
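A sketch of the override pattern in point 2; the signature is simplified and the chunk shape is assumed:
```python
from langchain_openai.chat_models.base import BaseChatOpenAI


class ChatDeepSeekSketch(BaseChatOpenAI):
    def _convert_chunk_to_generation_chunk(self, chunk, *args, **kwargs):
        generation_chunk = super()._convert_chunk_to_generation_chunk(
            chunk, *args, **kwargs
        )
        # Surface DeepSeek's reasoning trace on the streamed message.
        delta = (chunk.get("choices") or [{}])[0].get("delta", {})
        if generation_chunk and (reasoning := delta.get("reasoning_content")):
            generation_chunk.message.additional_kwargs["reasoning_content"] = reasoning
        return generation_chunk
```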
2025-02-11 11:13:23 -08:00
Christopher Menon
1edd27d860 docs: fix SQL-based metadata filter syntax, add link to BigQuery docs (#29736)
Fix the syntax for SQL-based metadata filtering in the [Google BigQuery
Vector Search
docs](https://python.langchain.com/docs/integrations/vectorstores/google_bigquery_vector_search/#searching-documents-with-metadata-filters).
Also add a link to learn more about BigQuery operators that can be used
here.

I have been using this library, and have found that this is the correct
syntax to use for the SQL-based filters.

**Issue**: no open issue.
**Dependencies**: none.
**Twitter handle**: none.

No tests as this is only a change to the documentation.

2025-02-11 11:10:12 -08:00
ccurme
d0c2dc06d5 mongodb[patch]: fix link in readme (#29738) 2025-02-11 18:19:59 +00:00
zzaebok
3b3d52206f community: change wikidata rest api version from v0 to v1 (#29708)
**Description:**

According to the [wikidata
documentation](https://www.wikidata.org/wiki/Wikidata_talk:REST_API),
Wikibase REST API version 1 (stable) is released from November 11, 2024.
Their guide is to use the new v1 API and, it just requires replacing v0
in the routes with v1 in almost all cases.
So I replaced WIKIDATA_REST_API_URL from v0 to v1 for stable usage.

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-02-10 17:12:38 -08:00
ccurme
4a389ef4c6 community: fix extended testing (#29715)
v0.3.100 of premai sdk appears to break on import:
89d9276cbf/premai/api/__init__.py (L230)
2025-02-10 16:57:34 -08:00
Yoav Levy
af3f759073 docs: fixed nimble's provider page and retriever (#29695)
## **Description:**
- Added information about the retriever that Nimble's provider exposes.
- Fixed the authentication explanation on the retriever page.
2025-02-10 15:30:40 -08:00
Bhav Sardana
624216aa64 community:Fix for Pydantic model validator of GoogleApiYoutubeLoader (#29694)
- **Description:** Community: bugfix for the Pydantic model validator of
GoogleApiYoutubeLoader
- **Issue:** #29165, #27432 
Fix is similar to #29346
2025-02-10 08:57:58 -05:00
Changyong Um
60740c44c5 community: Add configurable text key for indexing and the retriever in Pinecone Hybrid Search (#29697)
**Issue:**

In Langchain, the original content is generally stored under the `text`
key. However, the `PineconeHybridSearchRetriever` searches the `context`
field in the metadata and cannot change this key. To address this, I
have modified the code to allow changing the key to something other than
context.

In my opinion, following Langchain's conventions, the `text` key seems
more appropriate than `context`. However, since I wasn't sure about the
author's intent, I have left the default value as `context`.
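A hedged sketch; the keyword name `text_key` and the other constructor values are assumptions, with `embeddings`, `sparse_encoder`, and `index` presumed already in scope:
```python
from langchain_community.retrievers import PineconeHybridSearchRetriever

retriever = PineconeHybridSearchRetriever(
    embeddings=embeddings,          # dense embedding model (in scope)
    sparse_encoder=sparse_encoder,  # e.g. a BM25 encoder (in scope)
    index=index,                    # existing Pinecone index (in scope)
    text_key="text",                # assumed new parameter; default stays "context"
)
```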
2025-02-10 08:56:37 -05:00
Jun He
894b0cac3c docs: Remove redundant line (#29698)
If I understand it correctly, chain1 is never used.
2025-02-10 08:53:21 -05:00
Tiest van Gool
6655246504 Classification Tutorial: Replaced .dict() with .model_dump() method (#29701)
The `.dict()` method is deprecated in Pydantic v2; use the `model_dump()`
method instead.

2025-02-10 08:38:15 -05:00
Edmond Wang
c36e6d4371 docs: Add Comments and Supplementary Example Code to Vearch Vector Dat… (#29706)
- **Description:** Added some comments to the example code in the Vearch
vector database documentation and included commonly used sample code.
- **Issue:** None
- **Dependencies:** None

---------

Co-authored-by: wangchuxiong <wangchuxiong@jd.com>
2025-02-10 08:35:38 -05:00
Akmal Ali Jasmin
bc5fafa20e [DOC] Fix #29685: HuggingFaceEndpoint missing task argument in documentation (#29686)
## **Description**
This PR updates the LangChain documentation to address an issue where
the `HuggingFaceEndpoint` example **does not specify the required `task`
argument**. Without this argument, users on `huggingface_hub == 0.28.1`
encounter the following error:

```
ValueError: Task unknown has no recommended model. Please specify a model explicitly.
```

---

## **Issue**
Fixes #29685

---

## **Changes Made**
 **Updated `HuggingFaceEndpoint` documentation** to explicitly define
`task="text-generation"`:
```python
llm = HuggingFaceEndpoint(
    repo_id=GEN_MODEL_ID,
    huggingfacehub_api_token=HF_TOKEN,
    task="text-generation"  # Explicitly specify task
)
```

 **Added a deprecation warning note** and recommended using
`InferenceClient`:
```python
from huggingface_hub import InferenceClient
from langchain.llms.huggingface_hub import HuggingFaceHub

client = InferenceClient(model=GEN_MODEL_ID, token=HF_TOKEN)

llm = HuggingFaceHub(
    repo_id=GEN_MODEL_ID,
    huggingfacehub_api_token=HF_TOKEN,
    client=client,
)
```

---

## **Dependencies**
- No new dependencies introduced.
- Change only affects **documentation**.

---

## **Testing**
-  Verified that adding `task="text-generation"` resolves the issue.
-  Tested the alternative approach with `InferenceClient` in Google
Colab.

---

## **Twitter Handle (Optional)**
If this PR gets announced, a shout-out to **@AkmalJasmin** would be
great! 🚀

---

## **Reviewers**
📌 **@langchain-maintainers** Please review this PR. Let me know if
further changes are needed.

🚀 This fix improves **developer onboarding** and ensures the **LangChain
documentation remains up to date**! 🚀
2025-02-08 14:41:02 -05:00
manukychen
3de445d521 using getattr and default value to prevent 'OpenSearchVectorSearch' has no attribute 'bulk_size' (#29682)
- Description: Use getattr with a default value of 500 when accessing
cls.bulk_size, preventing the error below:
Error: type object 'OpenSearchVectorSearch' has no attribute 'bulk_size'

- Issue: https://github.com/langchain-ai/langchain/issues/29071
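The defensive-access pattern in question, sketched:
```python
from langchain_community.vectorstores import OpenSearchVectorSearch

# Fall back to 500 when the class attribute is missing.
bulk_size = getattr(OpenSearchVectorSearch, "bulk_size", 500)
```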
2025-02-08 14:39:57 -05:00
Yao Tianjia
5d581ba22c langchain: support the situation when action_input is null in json output_parser (#29680)
Description:
This PR fixes handling of null action_input in
[langchain.agents.output_parser]. Previously, passing null to
action_input could cause OutputParserException with unclear error
message which cause LLM don't know how to modify the action. The changes
include:

Added null-check validation before processing action_input
Implemented proper fallback behavior with default values
Maintained backward compatibility with existing implementations

Error Examples:
```
{
  "action":"some action",
  "action_input":null
}
```

Issue:
None

Dependencies:
None
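A sketch of the null-check-plus-fallback idea (illustrative, not the exact parser code):
```python
import json


def parse_action(text: str) -> dict:
    parsed = json.loads(text)
    # Treat a null action_input as an empty input instead of failing.
    if parsed.get("action_input") is None:
        parsed["action_input"] = {}
    return parsed


parse_action('{"action": "some action", "action_input": null}')
```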
2025-02-07 22:01:01 -05:00
Philippe PRADOS
beb75b2150 community[minor]: 05 - Refactoring PyPDFium2 parser (#29625)
This is one part of a larger Pull Request (PR) that is too large to be
submitted all at once. This specific part focuses on updating the
PyPDFium2 parser.

For more details, see
https://github.com/langchain-ai/langchain/pull/28970.
2025-02-07 21:31:12 -05:00
Christophe Bornet
723031d548 community: Bump ruff version to 0.9 (#29206)
Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-08 01:21:10 +00:00
Christophe Bornet
30f6c9f5c8 community: Use Blockbuster to detect blocking calls in asyncio during tests (#29609)
Same as https://github.com/langchain-ai/langchain/pull/29043 for
langchain-community.

**Dependencies:**
- blockbuster (test)

**Twitter handle:** cbornet_

Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-08 01:10:39 +00:00
Christophe Bornet
3a57a28daa langchain: Use Blockbuster to detect blocking calls in asyncio during tests (#29616)
Same as https://github.com/langchain-ai/langchain/pull/29043 for the
langchain package.

**Dependencies:**
- blockbuster (test)

**Twitter handle:** cbornet_

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-08 01:08:15 +00:00
Keenan Pepper
c67d473397 core: Make abatch_as_completed respect max_concurrency (#29426)
- **Description:** Add tests for respecting max_concurrency and
implement it for abatch_as_completed so that test passes
- **Issue:** #29425
- **Dependencies:** none
- **Twitter handle:** keenanpepper
2025-02-07 16:51:22 -08:00
Aaron V
dcfaae85d2 Core: Fix __add__ for concatting two BaseMessageChunk's (#29531)
Description:

The change allows you to use the overloaded `+` operator correctly when
`+`ing two BaseMessageChunk subclasses. Without this you *must*
instantiate a subclass for it to work.

Which feels... wrong. Base classes should be decoupled from subclasses and
should in no way depend on them.

Issue:

You can't `+` a BaseMessageChunk with a BaseMessageChunk

e.g. this will explode

```py
from langchain_core.outputs import (
    ChatGenerationChunk,
)
from langchain_core.messages import BaseMessageChunk


chunk1 = ChatGenerationChunk(
    message=BaseMessageChunk(
        type="customChunk",
        content="HI",
    ),
)

chunk2 = ChatGenerationChunk(
    message=BaseMessageChunk(
        type="customChunk",
        content="HI",
    ),
)

# this will throw
new_chunk = chunk1 + chunk2
```

In case anyone ran into this issue themselves, it's probably best to use
the AIMessageChunk:

a la 

```py
from langchain_core.outputs import (
    ChatGenerationChunk,
)
from langchain_core.messages import AIMessageChunk


chunk1 = ChatGenerationChunk(
    message=AIMessageChunk(
        content="HI",
    ),
)

chunk2 = ChatGenerationChunk(
    message=AIMessageChunk(
        content="HI",
    ),
)

# No explosion!
new_chunk = chunk1 + chunk2
```

Dependencies:

None!

Twitter handle: 
`aaron_vogler`


Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-08 00:43:36 +00:00
Marlene
4fa3ef0d55 Community/Partner: Adding Azure community and partner user agent to better track usage in Python (#29561)
- This pull request includes various changes to add a `user_agent`
parameter to Azure OpenAI, Azure Search and Whisper in the Community and
Partner packages. This helps in identifying the source of API requests
so we can better track usage and support the community. I will also be
adding the user_agent to the new `langchain-azure` repo as well.

- No connected issue or updated dependencies.
- Utilises existing tests and docs

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-07 23:28:30 +00:00
Ella Charlaix
c401254770 huggingface: Add ipex support to HuggingFaceEmbeddings (#29386)
ONNX and OpenVINO models are available by specifying the `backend`
argument (the model is loaded using `optimum`
https://github.com/huggingface/optimum)

```python
from langchain_huggingface import HuggingFaceEmbeddings

embedding = HuggingFaceEmbeddings(
    model_name=model_id,
    model_kwargs={"backend": "onnx"},
)
```

With this PR we also enable the IPEX backend 



```python
from langchain_huggingface import HuggingFaceEmbeddings

embedding = HuggingFaceEmbeddings(
    model_name=model_id,
    model_kwargs={"backend": "ipex"},
)
```
2025-02-07 15:21:09 -08:00
Bruno Alvisio
3eaf561561 core: Handle unterminated escape character when parsing partial JSON (#29065)
**Description**
Currently, when parsing a partial JSON, if a string ends with the escape
character, the whole key/value is removed. For example:

```
>>> from langchain_core.utils.json import parse_partial_json
>>> my_str = '{"foo": "bar", "baz": "qux\\'
>>> 
>>> parse_partial_json(my_str)
{'foo': 'bar'}
```

My expectation (and with this fix) would be for `parse_partial_json()`
to return:
```
>>> from langchain_core.utils.json import parse_partial_json
>>> 
>>> my_str = '{"foo": "bar", "baz": "qux\\'
>>> parse_partial_json(my_str)
{'foo': 'bar', 'baz': 'qux'}
```

Notes:
1. It could be argued that current behavior is still desired.
2. I have experienced this issue when streaming output from an LLM and a
chunk happens to end with `\\`
3. I haven't included tests. Will do if change is accepted.
4. This is specially troublesome when this function is used by

187131c55c/libs/core/langchain_core/output_parsers/transform.py (L111)

since what happens is that, for example, if the received sequence of
chunks are: `{"foo": "b` , `ar\\` :

Then, the result of calling `self.parse_result()` is:
```
{"foo": "b"}
```
and the second time:
```
{}
```

Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-07 23:18:21 +00:00
ccurme
0040d93b09 docs: showcase extras in chat model tabs (#29677)
Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-07 18:16:44 -05:00
Viren
252cf0af10 docs: add LangFair as a provider (#29390)
**Description:**
- Add `docs/docs/providers/langfair.mdx`
- Register langfair in `libs/packages.yml`

**Twitter handle:** @LangFair

**Tests and docs**
1. Integration tests not needed as this PR only adds a .mdx file to
docs.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Dylan Bouchard <dylan.bouchard@cvshealth.com>
Co-authored-by: Dylan Bouchard <109233938+dylanbouchard@users.noreply.github.com>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-07 21:27:37 +00:00
Erick Friis
eb9eddae0c docs: use init_chat_model (#29623) 2025-02-07 12:39:27 -08:00
ccurme
bff25b552c community: release 0.3.17 (#29676) 2025-02-07 19:41:44 +00:00
ccurme
01314c51fa langchain: release 0.3.18 (#29654) 2025-02-07 13:40:26 -05:00
ccurme
92e2239414 openai[patch]: make parallel_tool_calls explicit kwarg of bind_tools (#29669)
Improves discoverability and documentation.

cc @vbarda
2025-02-07 13:34:32 -05:00
ccurme
2a243df7bb infra: add UV_NO_SYNC to monorepo makefile (#29670)
Helpful for running `api_docs_quick_preview` locally.
2025-02-07 17:17:05 +00:00
Marc Ammann
5690575f13 openai: Removed tool_calls from completion chunk after other chunks have already been sent. (#29649)
- **Description:** Before sending the final completion chunk at the end of an
OpenAI stream, remove the tool_calls, as those have already been sent as
chunks.
- **Issue:** -
- **Dependencies:** -
- **Twitter handle:** -

@ccurme as mentioned in another PR

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-07 10:15:52 -05:00
Ikko Eltociear Ashimine
0d45ad57c1 community: update base_o365.py (#29657)
extention -> extension
2025-02-07 08:43:29 -05:00
weeix
1b064e198f docs: Fix llama.cpp GPU Installation in llamacpp.ipynb (Deprecated Env Variable) (#29659)
- **Description:** The llamacpp.ipynb notebook used a deprecated
environment variable, LLAMA_CUBLAS, for llama.cpp installation with GPU
support. This commit updates the notebook to use the correct GGML_CUDA
variable, fixing the installation error.
- **Issue:** none
-  **Dependencies:** none
2025-02-07 08:43:09 -05:00
Vincent Emonet
3645181d0e qdrant: Add similarity_search_with_score_by_vector() function to the QdrantVectorStore (#29641)
Added `similarity_search_with_score_by_vector()` function to the
`QdrantVectorStore` class.

It is required when we want to query multiple times with the same
embedding. It was present in the now-deprecated original `Qdrant`
vectorstore implementation, but was absent from the new one. It is also
implemented in a number of other `VectorStore` implementations.

I have added tests for this new function.

Note that I also argued in this discussion that it should be part of the
general `VectorStore`
https://github.com/langchain-ai/langchain/discussions/29638
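Typical usage, assuming an existing `QdrantVectorStore` as `vector_store` and an embeddings model in scope:
```python
# Embed once, then reuse the same vector across multiple queries.
vector = embeddings.embed_query("What is Qdrant?")
for k in (4, 10):
    for doc, score in vector_store.similarity_search_with_score_by_vector(vector, k=k):
        print(k, round(score, 3), doc.page_content[:60])
```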

Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-07 00:55:58 +00:00
ccurme
488cb4a739 anthropic: release 0.3.7 (#29653) 2025-02-06 17:05:57 -05:00
ccurme
ab09490c20 openai: release 0.3.4 (#29652) 2025-02-06 17:02:21 -05:00
ccurme
29a0c38cc3 openai[patch]: add test for message.name (#29651) 2025-02-06 16:49:28 -05:00
ccurme
91cca827c0 tests: release 0.3.11 (#29648) 2025-02-06 21:48:09 +00:00
350 changed files with 12297 additions and 3222 deletions

View File

@@ -39,7 +39,6 @@ IGNORED_PARTNERS = [
 PY_312_MAX_PACKAGES = [
     "libs/partners/huggingface",  # https://github.com/pytorch/pytorch/issues/130249
-    "libs/partners/pinecone",
     "libs/partners/voyageai",
 ]

View File

@@ -63,12 +63,12 @@ jobs:
       if: ${{ ! startsWith(inputs.working-directory, 'libs/partners/') }}
       working-directory: ${{ inputs.working-directory }}
       run: |
-        uv sync --group test
+        uv sync --inexact --group test
     - name: Install unit+integration test dependencies
       if: ${{ startsWith(inputs.working-directory, 'libs/partners/') }}
       working-directory: ${{ inputs.working-directory }}
       run: |
-        uv sync --group test --group test_integration
+        uv sync --inexact --group test --group test_integration
     - name: Analysing the code with our lint
       working-directory: ${{ inputs.working-directory }}

View File

@@ -22,6 +22,7 @@ on:
 env:
   PYTHON_VERSION: "3.11"
   UV_FROZEN: "true"
+  UV_NO_SYNC: "true"
 jobs:
   build:

View File

@@ -14,6 +14,7 @@ on:
 env:
   UV_FROZEN: "true"
+  UV_NO_SYNC: "true"
 jobs:
   build:

View File

@@ -19,6 +19,7 @@ on:
 env:
   UV_FROZEN: "true"
+  UV_NO_SYNC: "true"
 jobs:
   build:

View File

@@ -19,6 +19,7 @@ concurrency:
 env:
   UV_FROZEN: "true"
+  UV_NO_SYNC: "true"
 jobs:
   build:

View File

@@ -15,7 +15,7 @@ on:
 env:
   POETRY_VERSION: "1.8.4"
   UV_FROZEN: "true"
-  DEFAULT_LIBS: '["libs/partners/openai", "libs/partners/anthropic", "libs/partners/fireworks", "libs/partners/groq", "libs/partners/mistralai", "libs/partners/google-vertexai", "libs/partners/google-genai", "libs/partners/aws"]'
+  DEFAULT_LIBS: '["libs/partners/openai", "libs/partners/anthropic", "libs/partners/fireworks", "libs/partners/groq", "libs/partners/mistralai", "libs/partners/xai", "libs/partners/google-vertexai", "libs/partners/google-genai", "libs/partners/aws"]'
   POETRY_LIBS: ("libs/partners/google-vertexai" "libs/partners/google-genai" "libs/partners/aws")
 jobs:
@@ -139,6 +139,7 @@ jobs:
       GROQ_API_KEY: ${{ secrets.GROQ_API_KEY }}
       HUGGINGFACEHUB_API_TOKEN: ${{ secrets.HUGGINGFACEHUB_API_TOKEN }}
       MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}
+      XAI_API_KEY: ${{ secrets.XAI_API_KEY }}
       COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}
       NVIDIA_API_KEY: ${{ secrets.NVIDIA_API_KEY }}
       GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}

View File

@@ -2,6 +2,7 @@
 .EXPORT_ALL_VARIABLES:
 UV_FROZEN = true
+UV_NO_SYNC = true
 ## help: Show this help info.
 help: Makefile
@@ -82,3 +83,6 @@ lint lint_package lint_tests:
 format format_diff:
 	uv run --group lint ruff format docs cookbook
 	uv run --group lint ruff check --select I --fix docs cookbook
+
+update-package-downloads:
+	uv run python docs/scripts/packages_yml_get_downloads.py

View File

@@ -21,7 +21,6 @@ Notebook | Description
 [code-analysis-deeplake.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/code-analysis-deeplake.ipynb) | Analyze its own code base with the help of gpt and activeloop's deep lake.
 [custom_agent_with_plugin_retri...](https://github.com/langchain-ai/langchain/tree/master/cookbook/custom_agent_with_plugin_retrieval.ipynb) | Build a custom agent that can interact with ai plugins by retrieving tools and creating natural language wrappers around openapi endpoints.
 [custom_agent_with_plugin_retri...](https://github.com/langchain-ai/langchain/tree/master/cookbook/custom_agent_with_plugin_retrieval_using_plugnplai.ipynb) | Build a custom agent with plugin retrieval functionality, utilizing ai plugins from the `plugnplai` directory.
-[databricks_sql_db.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/databricks_sql_db.ipynb) | Connect to databricks runtimes and databricks sql.
 [deeplake_semantic_search_over_...](https://github.com/langchain-ai/langchain/tree/master/cookbook/deeplake_semantic_search_over_chat.ipynb) | Perform semantic search and question-answering over a group chat using activeloop's deep lake with gpt4.
 [elasticsearch_db_qa.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/elasticsearch_db_qa.ipynb) | Interact with elasticsearch analytics databases in natural language and build search queries via the elasticsearch dsl API.
 [extraction_openai_tools.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/extraction_openai_tools.ipynb) | Structured Data Extraction with OpenAI Tools

View File

@@ -1,273 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "707d13a7",
"metadata": {},
"source": [
"# Databricks\n",
"\n",
"This notebook covers how to connect to the [Databricks runtimes](https://docs.databricks.com/runtime/index.html) and [Databricks SQL](https://www.databricks.com/product/databricks-sql) using the SQLDatabase wrapper of LangChain.\n",
"It is broken into 3 parts: installation and setup, connecting to Databricks, and examples."
]
},
{
"cell_type": "markdown",
"id": "0076d072",
"metadata": {},
"source": [
"## Installation and Setup"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "739b489b",
"metadata": {},
"outputs": [],
"source": [
"!pip install databricks-sql-connector"
]
},
{
"cell_type": "markdown",
"id": "73113163",
"metadata": {},
"source": [
"## Connecting to Databricks\n",
"\n",
"You can connect to [Databricks runtimes](https://docs.databricks.com/runtime/index.html) and [Databricks SQL](https://www.databricks.com/product/databricks-sql) using the `SQLDatabase.from_databricks()` method.\n",
"\n",
"### Syntax\n",
"```python\n",
"SQLDatabase.from_databricks(\n",
" catalog: str,\n",
" schema: str,\n",
" host: Optional[str] = None,\n",
" api_token: Optional[str] = None,\n",
" warehouse_id: Optional[str] = None,\n",
" cluster_id: Optional[str] = None,\n",
" engine_args: Optional[dict] = None,\n",
" **kwargs: Any)\n",
"```\n",
"### Required Parameters\n",
"* `catalog`: The catalog name in the Databricks database.\n",
"* `schema`: The schema name in the catalog.\n",
"\n",
"### Optional Parameters\n",
"There following parameters are optional. When executing the method in a Databricks notebook, you don't need to provide them in most of the cases.\n",
"* `host`: The Databricks workspace hostname, excluding 'https://' part. Defaults to 'DATABRICKS_HOST' environment variable or current workspace if in a Databricks notebook.\n",
"* `api_token`: The Databricks personal access token for accessing the Databricks SQL warehouse or the cluster. Defaults to 'DATABRICKS_TOKEN' environment variable or a temporary one is generated if in a Databricks notebook.\n",
"* `warehouse_id`: The warehouse ID in the Databricks SQL.\n",
"* `cluster_id`: The cluster ID in the Databricks Runtime. If running in a Databricks notebook and both 'warehouse_id' and 'cluster_id' are None, it uses the ID of the cluster the notebook is attached to.\n",
"* `engine_args`: The arguments to be used when connecting Databricks.\n",
"* `**kwargs`: Additional keyword arguments for the `SQLDatabase.from_uri` method."
]
},
{
"cell_type": "markdown",
"id": "b11c7e48",
"metadata": {},
"source": [
"## Examples"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "8102bca0",
"metadata": {},
"outputs": [],
"source": [
"# Connecting to Databricks with SQLDatabase wrapper\n",
"from langchain_community.utilities import SQLDatabase\n",
"\n",
"db = SQLDatabase.from_databricks(catalog=\"samples\", schema=\"nyctaxi\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "9dd36f58",
"metadata": {},
"outputs": [],
"source": [
"# Creating a OpenAI Chat LLM wrapper\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(temperature=0, model_name=\"gpt-4\")"
]
},
{
"cell_type": "markdown",
"id": "5b5c5f1a",
"metadata": {},
"source": [
"### SQL Chain example\n",
"\n",
"This example demonstrates the use of the [SQL Chain](https://python.langchain.com/en/latest/modules/chains/examples/sqlite.html) for answering a question over a Databricks database."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "36f2270b",
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.utilities import SQLDatabaseChain\n",
"\n",
"db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "4e2b5f25",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new SQLDatabaseChain chain...\u001b[0m\n",
"What is the average duration of taxi rides that start between midnight and 6am?\n",
"SQLQuery:\u001b[32;1m\u001b[1;3mSELECT AVG(UNIX_TIMESTAMP(tpep_dropoff_datetime) - UNIX_TIMESTAMP(tpep_pickup_datetime)) as avg_duration\n",
"FROM trips\n",
"WHERE HOUR(tpep_pickup_datetime) >= 0 AND HOUR(tpep_pickup_datetime) < 6\u001b[0m\n",
"SQLResult: \u001b[33;1m\u001b[1;3m[(987.8122786304605,)]\u001b[0m\n",
"Answer:\u001b[32;1m\u001b[1;3mThe average duration of taxi rides that start between midnight and 6am is 987.81 seconds.\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The average duration of taxi rides that start between midnight and 6am is 987.81 seconds.'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"db_chain.run(\n",
" \"What is the average duration of taxi rides that start between midnight and 6am?\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "e496d5e5",
"metadata": {},
"source": [
"### SQL Database Agent example\n",
"\n",
"This example demonstrates the use of the [SQL Database Agent](/docs/integrations/tools/sql_database) for answering questions over a Databricks database."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "9918e86a",
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import create_sql_agent\n",
"from langchain_community.agent_toolkits import SQLDatabaseToolkit\n",
"\n",
"toolkit = SQLDatabaseToolkit(db=db, llm=llm)\n",
"agent = create_sql_agent(llm=llm, toolkit=toolkit, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "c484a76e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mAction: list_tables_sql_db\n",
"Action Input: \u001b[0m\n",
"Observation: \u001b[38;5;200m\u001b[1;3mtrips\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mI should check the schema of the trips table to see if it has the necessary columns for trip distance and duration.\n",
"Action: schema_sql_db\n",
"Action Input: trips\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m\n",
"CREATE TABLE trips (\n",
"\ttpep_pickup_datetime TIMESTAMP, \n",
"\ttpep_dropoff_datetime TIMESTAMP, \n",
"\ttrip_distance FLOAT, \n",
"\tfare_amount FLOAT, \n",
"\tpickup_zip INT, \n",
"\tdropoff_zip INT\n",
") USING DELTA\n",
"\n",
"/*\n",
"3 rows from trips table:\n",
"tpep_pickup_datetime\ttpep_dropoff_datetime\ttrip_distance\tfare_amount\tpickup_zip\tdropoff_zip\n",
"2016-02-14 16:52:13+00:00\t2016-02-14 17:16:04+00:00\t4.94\t19.0\t10282\t10171\n",
"2016-02-04 18:44:19+00:00\t2016-02-04 18:46:00+00:00\t0.28\t3.5\t10110\t10110\n",
"2016-02-17 17:13:57+00:00\t2016-02-17 17:17:55+00:00\t0.7\t5.0\t10103\t10023\n",
"*/\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe trips table has the necessary columns for trip distance and duration. I will write a query to find the longest trip distance and its duration.\n",
"Action: query_checker_sql_db\n",
"Action Input: SELECT trip_distance, tpep_dropoff_datetime - tpep_pickup_datetime as duration FROM trips ORDER BY trip_distance DESC LIMIT 1\u001b[0m\n",
"Observation: \u001b[31;1m\u001b[1;3mSELECT trip_distance, tpep_dropoff_datetime - tpep_pickup_datetime as duration FROM trips ORDER BY trip_distance DESC LIMIT 1\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe query is correct. I will now execute it to find the longest trip distance and its duration.\n",
"Action: query_sql_db\n",
"Action Input: SELECT trip_distance, tpep_dropoff_datetime - tpep_pickup_datetime as duration FROM trips ORDER BY trip_distance DESC LIMIT 1\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m[(30.6, '0 00:43:31.000000000')]\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mI now know the final answer.\n",
"Final Answer: The longest trip distance is 30.6 miles and it took 43 minutes and 31 seconds.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The longest trip distance is 30.6 miles and it took 43 minutes and 31 seconds.'"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"What is the longest trip distance and how long did it take?\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -270,7 +270,7 @@
"\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs openaiParams={`model=\"gpt-4\"`} />\n"
"<ChatModelTabs overrideParams={{openai: {model: \"gpt-4\"}}} />\n"
]
},
{

View File

@@ -354,7 +354,7 @@
"\n",
"<ChatModelTabs\n",
" customVarName=\"llm\"\n",
" openaiParams={`model=\"gpt-4-0125-preview\", temperature=0`}\n",
" overrideParams={{openai: {model: \"gpt-4-0125-preview\", kwargs: \"temperature=0\"}}}\n",
"/>\n"
]
},

View File

@@ -179,7 +179,7 @@
"\n",
"<ChatModelTabs\n",
" customVarName=\"llm\"\n",
" openaiParams={`model=\"gpt-4o\", temperature=0`}\n",
" overrideParams={{openai: {model: \"gpt-4o\", kwargs: \"temperature=0\"}}}\n",
"/>\n"
]
},

View File

@@ -167,7 +167,7 @@
"\n",
"<ChatModelTabs\n",
" customVarName=\"llm\"\n",
" fireworksParams={`model=\"accounts/fireworks/models/firefunction-v1\", temperature=0`}\n",
" overrideParams={{fireworks: {model: \"accounts/fireworks/models/firefunction-v1\", kwargs: \"temperature=0\"}}}\n",
"/>\n",
"\n",
"We can use the `bind_tools()` method to handle converting\n",

View File

@@ -99,8 +99,6 @@
"\n",
"prompt = ChatPromptTemplate.from_template(\"what is {a} + {b}\")\n",
"\n",
"chain1 = prompt | model\n",
"\n",
"chain = (\n",
" {\n",
" \"a\": itemgetter(\"foo\") | RunnableLambda(length_function),\n",

View File

@@ -200,7 +200,12 @@
"\n",
"<ChatModelTabs\n",
" customVarName=\"llm\"\n",
" fireworksParams={`model=\"accounts/fireworks/models/firefunction-v1\", temperature=0`}\n",
" overrideParams={{\n",
" fireworks: {\n",
" model: \"accounts/fireworks/models/firefunction-v1\",\n",
" kwargs: \"temperature=0\",\n",
" }\n",
" }}\n",
"/>\n"
]
},

View File

@@ -33,7 +33,7 @@
"\n",
"<ChatModelTabs\n",
" customVarName=\"llm\"\n",
" fireworksParams={`model=\"accounts/fireworks/models/firefunction-v1\", temperature=0`}\n",
" overrideParams={{fireworks: {model: \"accounts/fireworks/models/firefunction-v1\", kwargs: \"temperature=0\"}}}\n",
"/>\n"
]
},

View File

@@ -46,7 +46,7 @@
"\n",
"<ChatModelTabs\n",
" customVarName=\"llm\"\n",
" fireworksParams={`model=\"accounts/fireworks/models/firefunction-v1\", temperature=0`}\n",
" overrideParams={{fireworks: {model: \"accounts/fireworks/models/firefunction-v1\", kwargs: \"temperature=0\"}}}\n",
"/>\n"
]
},

View File

@@ -91,7 +91,7 @@
"\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs openaiParams={`model=\"gpt-4\"`} />\n",
"<ChatModelTabs overrideParams={{openai: {model: \"gpt-4\"}}} />\n",
"\n",
"To illustrate the idea, we'll use `phi3` via Ollama, which does **NOT** have native support for tool calling. If you'd like to use `Ollama` as well follow [these instructions](/docs/integrations/chat/ollama/)."
]

View File

@@ -0,0 +1,206 @@
{
"cells": [
{
"cell_type": "raw",
"id": "afaf8039",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Abso\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "e49f1e0d",
"metadata": {},
"source": [
"# ChatAbso\n",
"\n",
"This will help you getting started with ChatAbso [chat models](https://python.langchain.com/docs/concepts/chat_models/). For detailed documentation of all ChatAbso features and configurations head to the [API reference](https://python.langchain.com/api_reference/en/latest/chat_models/langchain_abso.chat_models.ChatAbso.html).\n",
"\n",
"- You can find the full documentation for the Abso router [here] (https://abso.ai)\n",
"\n",
"## Overview\n",
"### Integration details\n",
"\n",
"| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/docs/integrations/chat/abso) | Package downloads | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
"| [ChatAbso](https://python.langchain.com/api_reference/en/latest/chat_models/langchain_abso.chat_models.ChatAbso.html) | [langchain-abso](https://python.langchain.com/api_reference/en/latest/abso_api_reference.html) | ❌ | ❌ | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-abso?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-abso?style=flat-square&label=%20) |\n",
"\n",
"## Setup\n",
"To access ChatAbso models you'll need to create an OpenAI account, get an API key, and install the `langchain-abso` integration package.\n",
"\n",
"### Credentials\n",
"\n",
"- TODO: Update with relevant info.\n",
"\n",
"Head to (TODO: link) to sign up to ChatAbso and generate an API key. Once you've done this set the ABSO_API_KEY environment variable:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "433e8d2b-9519-4b49-b2c4-7ab65b046c94",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"if not os.getenv(\"OPENAI_API_KEY\"):\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"Enter your OpenAI API key: \")"
]
},
{
"cell_type": "markdown",
"id": "0730d6a1-c893-4840-9817-5e5251676d5d",
"metadata": {},
"source": [
"### Installation\n",
"\n",
"The LangChain ChatAbso integration lives in the `langchain-abso` package:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "652d6238-1f87-422a-b135-f5abbb8652fc",
"metadata": {},
"outputs": [],
"source": [
"%pip install -qU langchain-abso"
]
},
{
"cell_type": "markdown",
"id": "a38cde65-254d-4219-a441-068766c0d4b5",
"metadata": {},
"source": [
"## Instantiation\n",
"\n",
"Now we can instantiate our model object and generate chat completions:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cb09c344-1836-4e0c-acf8-11d13ac1dbae",
"metadata": {},
"outputs": [],
"source": [
"from langchain_abso import ChatAbso\n",
"\n",
"llm = ChatAbso(fast_model=\"gpt-4o\", slow_model=\"o3-mini\")"
]
},
{
"cell_type": "markdown",
"id": "2b4f3e15",
"metadata": {},
"source": [
"## Invocation\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "62e0dbc3",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"messages = [\n",
" (\n",
" \"system\",\n",
" \"You are a helpful assistant that translates English to French. Translate the user sentence.\",\n",
" ),\n",
" (\"human\", \"I love programming.\"),\n",
"]\n",
"ai_msg = llm.invoke(messages)\n",
"ai_msg"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d86145b3-bfef-46e8-b227-4dda5c9c2705",
"metadata": {},
"outputs": [],
"source": [
"print(ai_msg.content)"
]
},
{
"cell_type": "markdown",
"id": "18e2bfc0-7e78-4528-a73f-499ac150dca8",
"metadata": {},
"source": [
"## Chaining\n",
"\n",
"We can [chain](/docs/how_to/sequence/) our model with a prompt template like so:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e197d1d7-a070-4c96-9f8a-a0e86d046e0b",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import ChatPromptTemplate\n",
"\n",
"prompt = ChatPromptTemplate(\n",
" [\n",
" (\n",
" \"system\",\n",
" \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n",
" ),\n",
" (\"human\", \"{input}\"),\n",
" ]\n",
")\n",
"\n",
"chain = prompt | llm\n",
"chain.invoke(\n",
" {\n",
" \"input_language\": \"English\",\n",
" \"output_language\": \"German\",\n",
" \"input\": \"I love programming.\",\n",
" }\n",
")"
]
},
{
"cell_type": "markdown",
"id": "3a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
"metadata": {},
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all ChatAbso features and configurations head to the API reference: https://python.langchain.com/api_reference/en/latest/chat_models/langchain_abso.chat_models.ChatAbso.html"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -210,7 +210,7 @@
"id": "96ed13d4",
"metadata": {},
"source": [
"Instead of `model_id`, you can also pass the `deployment_id` of the previously tuned model. The entire model tuning workflow is described in [Working with TuneExperiment and PromptTuner](https://ibm.github.io/watsonx-ai-python-sdk/pt_working_with_class_and_prompt_tuner.html)."
"Instead of `model_id`, you can also pass the `deployment_id` of the previously [deployed model with reference to a Prompt Template](https://cloud.ibm.com/apidocs/watsonx-ai#deployments-text-chat)."
]
},
{
@@ -228,6 +228,31 @@
")"
]
},
{
"cell_type": "markdown",
"id": "3d29767c",
"metadata": {},
"source": [
"For certain requirements, there is an option to pass the IBM's [`APIClient`](https://ibm.github.io/watsonx-ai-python-sdk/base.html#apiclient) object into the `ChatWatsonx` class."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0ae9531e",
"metadata": {},
"outputs": [],
"source": [
"from ibm_watsonx_ai import APIClient\n",
"\n",
"api_client = APIClient(...)\n",
"\n",
"chat = ChatWatsonx(\n",
" model_id=\"ibm/granite-34b-code-instruct\",\n",
" watsonx_client=api_client,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "f571001d",
@@ -448,9 +473,7 @@
"source": [
"## Tool calling\n",
"\n",
"### ChatWatsonx.bind_tools()\n",
"\n",
"Please note that `ChatWatsonx.bind_tools` is on beta state, so we recommend using `mistralai/mistral-large` model."
"### ChatWatsonx.bind_tools()"
]
},
{
@@ -563,7 +586,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "langchain_ibm",
"language": "python",
"name": "python3"
},

View File

@@ -17,7 +17,7 @@ If you'd like to contribute an integration, see [Contributing integrations](/doc
import ChatModelTabs from "@theme/ChatModelTabs";
<ChatModelTabs openaiParams={`model="gpt-4o-mini"`} />
<ChatModelTabs overrideParams={{openai: {model: "gpt-4o-mini"}}} />
```python
model.invoke("Hello, world!")

View File

@@ -19,7 +19,7 @@
"source": [
"# ChatSambaNovaCloud\n",
"\n",
"This will help you getting started with SambaNovaCloud [chat models](/docs/concepts/chat_models/). For detailed documentation of all ChatSambaNovaCloud features and configurations head to the [API reference](https://python.langchain.com/api_reference/sambanova/chat_models/langchain_sambanova.ChatSambaNovaCloud.html).\n",
"This will help you getting started with SambaNovaCloud [chat models](/docs/concepts/chat_models/). For detailed documentation of all ChatSambaNovaCloud features and configurations head to the [API reference](https://docs.sambanova.ai/cloud/docs/get-started/overview).\n",
"\n",
"**[SambaNova](https://sambanova.ai/)'s** [SambaNova Cloud](https://cloud.sambanova.ai/) is a platform for performing inference with open-source models\n",
"\n",
@@ -28,7 +28,7 @@
"\n",
"| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
"| [ChatSambaNovaCloud](https://python.langchain.com/api_reference/sambanova/chat_models/langchain_sambanova.ChatSambaNovaCloud.html) | [langchain-community](https://python.langchain.com/api_reference/community/index.html) | ❌ | ❌ | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain_sambanova?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain_sambanova?style=flat-square&label=%20) |\n",
"| [ChatSambaNovaCloud](https://docs.sambanova.ai/cloud/docs/get-started/overview) | [langchain-sambanova](https://python.langchain.com/docs/integrations/providers/sambanova/) | ❌ | ❌ | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain_sambanova?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain_sambanova?style=flat-square&label=%20) |\n",
"\n",
"### Model features\n",
"\n",
@@ -545,7 +545,7 @@
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all ChatSambaNovaCloud features and configurations head to the API reference: https://python.langchain.com/api_reference/sambanova/chat_models/langchain_sambanova.ChatSambaNovaCloud.html"
"For detailed documentation of all SambaNovaCloud features and configurations head to the API reference: https://docs.sambanova.ai/cloud/docs/get-started/overview"
]
}
],

View File

@@ -19,7 +19,7 @@
"source": [
"# ChatSambaStudio\n",
"\n",
"This will help you getting started with SambaStudio [chat models](/docs/concepts/chat_models). For detailed documentation of all ChatStudio features and configurations head to the [API reference](https://python.langchain.com/api_reference/sambanova/chat_models/langchain_sambanova.chat_models.sambanova.ChatSambaStudio.html).\n",
"This will help you getting started with SambaStudio [chat models](/docs/concepts/chat_models). For detailed documentation of all ChatStudio features and configurations head to the [API reference](https://docs.sambanova.ai/sambastudio/latest/index.html).\n",
"\n",
"**[SambaNova](https://sambanova.ai/)'s** [SambaStudio](https://docs.sambanova.ai/sambastudio/latest/sambastudio-intro.html) SambaStudio is a rich, GUI-based platform that provides the functionality to train, deploy, and manage models in SambaNova [DataScale](https://sambanova.ai/products/datascale) systems.\n",
"\n",
@@ -28,7 +28,7 @@
"\n",
"| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
"| [ChatSambaStudio](https://python.langchain.com/api_reference/sambanova/chat_models/langchain_sambanova.chat_models.sambanova.ChatSambaStudio.html) | [langchain-community](https://python.langchain.com/api_reference/community/index.html) | ❌ | ❌ | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain_sambanova?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain_sambanova?style=flat-square&label=%20) |\n",
"| [ChatSambaStudio](https://docs.sambanova.ai/sambastudio/latest/index.html) | [langchain-sambanova](https://python.langchain.com/docs/integrations/providers/sambanova/) | ❌ | ❌ | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain_sambanova?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain_sambanova?style=flat-square&label=%20) |\n",
"\n",
"### Model features\n",
"\n",
@@ -483,7 +483,7 @@
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all ChatSambaStudio features and configurations head to the API reference: https://python.langchain.com/api_reference/sambanova/chat_models/langchain_sambanova.sambanova.chat_models.ChatSambaStudio.html"
"For detailed documentation of all SambaStudio features and configurations head to the API reference: https://docs.sambanova.ai/sambastudio/latest/api-ref-landing.html"
]
}
],

View File

@@ -2,7 +2,9 @@
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "xwiDq5fOuoRn"
},
"source": [
"# Apify Dataset\n",
"\n",
@@ -20,33 +22,63 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "qRW2-mokuoRp",
"tags": []
},
"outputs": [],
"source": [
"%pip install --upgrade --quiet apify-client"
"%pip install --upgrade --quiet langchain langchain-apify langchain-openai"
]
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "8jRVq16LuoRq"
},
"source": [
"First, import `ApifyDatasetLoader` into your source code:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"execution_count": 2,
"metadata": {
"id": "umXQHqIJuoRq"
},
"outputs": [],
"source": [
"from langchain_community.document_loaders import ApifyDatasetLoader\n",
"from langchain_apify import ApifyDatasetLoader\n",
"from langchain_core.documents import Document"
]
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "NjGwKy59vz1X"
},
"source": [
"Find your [Apify API token](https://console.apify.com/account/integrations) and [OpenAI API key](https://platform.openai.com/account/api-keys) and initialize these into environment variable:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"id": "AvzNtyCxwDdr"
},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"APIFY_API_TOKEN\"] = \"your-apify-api-token\"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"your-openai-api-key\""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "d1O-KL48uoRr"
},
"source": [
"Then provide a function that maps Apify dataset record fields to LangChain `Document` format.\n",
"\n",
@@ -64,8 +96,10 @@
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"execution_count": 8,
"metadata": {
"id": "m1SpA7XZuoRr"
},
"outputs": [],
"source": [
"loader = ApifyDatasetLoader(\n",
@@ -78,8 +112,10 @@
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"execution_count": 9,
"metadata": {
"id": "0hWX7ABsuoRs"
},
"outputs": [],
"source": [
"data = loader.load()"
@@ -87,7 +123,9 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "EJCVFVKNuoRs"
},
"source": [
"## An example with question answering\n",
"\n",
@@ -96,21 +134,26 @@
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"execution_count": 14,
"metadata": {
"id": "sNisJKzZuoRt"
},
"outputs": [],
"source": [
"from langchain.indexes import VectorstoreIndexCreator\n",
"from langchain_community.utilities import ApifyWrapper\n",
"from langchain_apify import ApifyWrapper\n",
"from langchain_core.documents import Document\n",
"from langchain_openai import OpenAI\n",
"from langchain_core.vectorstores import InMemoryVectorStore\n",
"from langchain_openai import ChatOpenAI\n",
"from langchain_openai.embeddings import OpenAIEmbeddings"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"execution_count": 15,
"metadata": {
"id": "qcfmnbdDuoRu"
},
"outputs": [],
"source": [
"loader = ApifyDatasetLoader(\n",
@@ -123,27 +166,47 @@
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"execution_count": 16,
"metadata": {
"id": "8b0xzKJxuoRv"
},
"outputs": [],
"source": [
"index = VectorstoreIndexCreator(embedding=OpenAIEmbeddings()).from_loaders([loader])"
"index = VectorstoreIndexCreator(\n",
" vectorstore_cls=InMemoryVectorStore, embedding=OpenAIEmbeddings()\n",
").from_loaders([loader])"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"execution_count": 17,
"metadata": {
"id": "7zPXGsVFwUGA"
},
"outputs": [],
"source": [
"llm = ChatOpenAI(model=\"gpt-4o-mini\")"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"id": "ecWrdM4guoRv"
},
"outputs": [],
"source": [
"query = \"What is Apify?\"\n",
"result = index.query_with_sources(query, llm=OpenAI())"
"result = index.query_with_sources(query, llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"execution_count": null,
"metadata": {
"id": "QH8r44e9uoRv",
"outputId": "361fe050-f75d-4d5a-c327-5e7bd190fba5"
},
"outputs": [
{
"name": "stdout",
@@ -162,6 +225,9 @@
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
@@ -181,5 +247,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 4
}
"nbformat_minor": 0
}

View File

@@ -443,6 +443,7 @@
"llm = HuggingFaceEndpoint(\n",
" repo_id=GEN_MODEL_ID,\n",
" huggingfacehub_api_token=HF_TOKEN,\n",
" task=\"text-generation\",\n",
")"
]
},

File diff suppressed because it is too large

View File

@@ -195,7 +195,7 @@
"id": "96ed13d4",
"metadata": {},
"source": [
"Instead of `model_id`, you can also pass the `deployment_id` of the previously tuned model. The entire model tuning workflow is described [here](https://ibm.github.io/watsonx-ai-python-sdk/pt_working_with_class_and_prompt_tuner.html)."
"Instead of `model_id`, you can also pass the `deployment_id` of the previously tuned model. The entire model tuning workflow is described in [Working with TuneExperiment and PromptTuner](https://ibm.github.io/watsonx-ai-python-sdk/pt_tune_experiment_run.html)."
]
},
{
@@ -420,7 +420,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "langchain_ibm",
"language": "python",
"name": "python3"
},

View File

@@ -65,7 +65,7 @@
"metadata": {},
"outputs": [],
"source": [
"!CMAKE_ARGS=\"-DLLAMA_CUBLAS=on\" FORCE_CMAKE=1 pip install llama-cpp-python"
"!CMAKE_ARGS=\"-DGGML_CUDA=on\" FORCE_CMAKE=1 pip install llama-cpp-python"
]
},
{
@@ -81,7 +81,7 @@
"metadata": {},
"outputs": [],
"source": [
"!CMAKE_ARGS=\"-DLLAMA_CUBLAS=on\" FORCE_CMAKE=1 pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir"
"!CMAKE_ARGS=\"-DGGML_CUDA=on\" FORCE_CMAKE=1 pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir"
]
},
{
@@ -149,9 +149,9 @@
"\n",
"```\n",
"set FORCE_CMAKE=1\n",
"set CMAKE_ARGS=-DLLAMA_CUBLAS=OFF\n",
"set CMAKE_ARGS=-DGGML_CUDA=OFF\n",
"```\n",
"If you have an NVIDIA GPU make sure `DLLAMA_CUBLAS` is set to `ON`\n",
"If you have an NVIDIA GPU make sure `DGGML_CUDA` is set to `ON`\n",
"\n",
"#### Compiling and installing\n",
"\n",

View File

@@ -104,8 +104,8 @@
"\n",
"import boto3\n",
"from langchain.chains.question_answering import load_qa_chain\n",
"from langchain_community.llms import SagemakerEndpoint\n",
"from langchain_community.llms.sagemaker_endpoint import LLMContentHandler\n",
"from langchain_aws.llms import SagemakerEndpoint\n",
"from langchain_aws.llms.sagemaker_endpoint import LLMContentHandler\n",
"from langchain_core.prompts import PromptTemplate\n",
"\n",
"query = \"\"\"How long was Elizabeth hospitalized?\n",
@@ -174,8 +174,8 @@
"from typing import Dict\n",
"\n",
"from langchain.chains.question_answering import load_qa_chain\n",
"from langchain_community.llms import SagemakerEndpoint\n",
"from langchain_community.llms.sagemaker_endpoint import LLMContentHandler\n",
"from langchain_aws.llms import SagemakerEndpoint\n",
"from langchain_aws.llms.sagemaker_endpoint import LLMContentHandler\n",
"from langchain_core.prompts import PromptTemplate\n",
"\n",
"query = \"\"\"How long was Elizabeth hospitalized?\n",

View File

@@ -0,0 +1,14 @@
# Abso
[Abso](https://abso.ai/#router) is an open-source LLM proxy that automatically routes requests between fast and slow models based on prompt complexity. It uses various heuristics to choose the proper model, and the proxy itself is fast and adds little latency.
## Installation and setup
```bash
pip install langchain-abso
```
## Chat Model
See usage details [here](/docs/integrations/chat/abso)
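A minimal usage sketch (assuming an OpenAI API key is configured; the model names mirror the chat integration notebook):
```python
from langchain_abso import ChatAbso

# Abso routes each request to the fast or slow model based on prompt complexity
llm = ChatAbso(fast_model="gpt-4o", slow_model="o3-mini")
llm.invoke("Explain the difference between a list and a tuple in Python.")
```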

View File

@@ -14,20 +14,34 @@ blogs, or knowledge bases.
## Installation and Setup
- Install the Apify API client for Python with `pip install apify-client`
- Install the LangChain Apify package for Python with:
```bash
pip install langchain-apify
```
- Get your [Apify API token](https://console.apify.com/account/integrations) and either set it as
an environment variable (`APIFY_API_TOKEN`) or pass it to the `ApifyWrapper` as `apify_api_token` in the constructor.
an environment variable (`APIFY_API_TOKEN`) or pass it as `apify_api_token` in the constructor.
## Tool
## Utility
You can use the `ApifyActorsTool` to call Apify Actors from agents.
```python
from langchain_apify import ApifyActorsTool
```
See [this notebook](/docs/integrations/tools/apify_actors) for example usage.
For more information on how to use this tool, visit [the Apify integration documentation](https://docs.apify.com/platform/integrations/langgraph).
## Wrapper
You can use the `ApifyWrapper` to run Actors on the Apify platform.
```python
from langchain_community.utilities import ApifyWrapper
from langchain_apify import ApifyWrapper
```
For more information on this wrapper, see [the API reference](https://python.langchain.com/api_reference/community/utilities/langchain_community.utilities.apify.ApifyWrapper.html).
For more information on how to use this wrapper, see [the Apify integration documentation](https://docs.apify.com/platform/integrations/langchain).
## Document loader
@@ -35,7 +49,10 @@ For more information on this wrapper, see [the API reference](https://python.lan
You can also use our `ApifyDatasetLoader` to get data from an Apify dataset.
```python
from langchain_community.document_loaders import ApifyDatasetLoader
from langchain_apify import ApifyDatasetLoader
```
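A minimal sketch of loading a dataset (the dataset ID is a placeholder, and the `text`/`url` record fields are an assumption about your Actor's output):
```python
from langchain_apify import ApifyDatasetLoader
from langchain_core.documents import Document

loader = ApifyDatasetLoader(
    dataset_id="<your-dataset-id>",  # placeholder
    dataset_mapping_function=lambda item: Document(
        page_content=item["text"] or "", metadata={"source": item["url"]}
    ),
)
docs = loader.load()
```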
For a more detailed walkthrough of this loader, see [this notebook](/docs/integrations/document_loaders/apify_dataset).
Source code for this integration can be found in the [LangChain Apify repository](https://github.com/apify/langchain-apify).

View File

@@ -103,14 +103,7 @@ See [MLflow LangChain Integration](/docs/integrations/providers/mlflow_tracking)
SQLDatabase
-----------
You can connect to Databricks SQL using the SQLDatabase wrapper of LangChain.
```
from langchain.sql_database import SQLDatabase
db = SQLDatabase.from_databricks(catalog="samples", schema="nyctaxi")
```
See [Databricks SQL Agent](https://docs.databricks.com/en/large-language-models/langchain.html#databricks-sql-agent) for how to connect Databricks SQL with your LangChain Agent as a powerful querying tool.
To connect to Databricks SQL or query structured data, see the [Databricks structured retriever tool documentation](https://docs.databricks.com/en/generative-ai/agent-framework/structured-retrieval-tools.html#table-query-tool). To create an agent using a SQL UDF created as above, see the [Databricks UC Integration](https://docs.unitycatalog.io/ai/integrations/langchain/).
Open Models
-----------

View File

@@ -0,0 +1,65 @@
# Discord
> [Discord](https://discord.com/) is an instant messaging, voice, and video communication platform widely used by communities of all types.
## Installation and Setup
Install the `langchain-discord-shikenso` package:
```bash
pip install langchain-discord-shikenso
```
You must provide a bot token via an environment variable so the tools can authenticate with the Discord API:
```bash
export DISCORD_BOT_TOKEN="your-discord-bot-token"
```
If `DISCORD_BOT_TOKEN` is not set, the tools will raise a `ValueError` when instantiated.
---
## Tools
Below is a snippet showing how you can read and send messages in Discord. For more details, see the [documentation for Discord tools](/docs/integrations/tools/discord).
```python
from langchain_discord.tools.discord_read_messages import DiscordReadMessages
from langchain_discord.tools.discord_send_messages import DiscordSendMessage
# Create tool instances
read_tool = DiscordReadMessages()
send_tool = DiscordSendMessage()
# Example: Read the last 3 messages from channel 1234567890
read_result = read_tool({"channel_id": "1234567890", "limit": 3})
print(read_result)
# Example: Send a message to channel 1234567890
send_result = send_tool({"channel_id": "1234567890", "message": "Hello from Markdown example!"})
print(send_result)
```
---
## Toolkit
`DiscordToolkit` groups multiple Discord-related tools into a single interface. For a usage example, see [the Discord toolkit docs](/docs/integrations/tools/discord).
```python
from langchain_discord.toolkits import DiscordToolkit
toolkit = DiscordToolkit()
tools = toolkit.get_tools()
read_tool = tools[0] # DiscordReadMessages
send_tool = tools[1] # DiscordSendMessage
```
---
## Future Integrations
Additional integrations (e.g., document loaders, chat loaders) could be added for Discord.
Check the [Discord Developer Docs](https://discord.com/developers/docs/intro) for more information, and watch for updates or advanced usage examples in the [langchain_discord GitHub repo](https://github.com/Shikenso-Analytics/langchain-discord).

View File

@@ -1,4 +1,4 @@
# Discord
# Discord (community loader)
>[Discord](https://discord.com/) is a VoIP and instant messaging social platform. Users have the ability to communicate
> with voice calls, video calls, text messaging, media and files in private chats or as part of communities called

View File

@@ -1,34 +0,0 @@
# FalkorDB
>[FalkorDB](https://www.falkordb.com/) is the creator of [FalkorDB](https://docs.falkordb.com/),
> a low-latency graph database that delivers knowledge to GenAI.
## Installation and Setup
See [installation instructions here](/docs/integrations/graphs/falkordb/).
## Graphs
See a [usage example](/docs/integrations/graphs/falkordb).
```python
from langchain_community.graphs import FalkorDBGraph
```
## Chains
See a [usage example](/docs/integrations/graphs/falkordb).
```python
from langchain_community.chains.graph_qa.falkordb import FalkorDBQAChain
```
## Memory
See a [usage example](/docs/integrations/memory/falkordb_chat_message_history).
```python
from langchain_falkordb import FalkorDBChatMessageHistory
```

View File

@@ -0,0 +1,22 @@
# Graph RAG
## Overview
[Graph RAG](https://datastax.github.io/graph-rag/) provides a retriever interface
that combines **unstructured** similarity search on vectors with **structured**
traversal of metadata properties. This enables graph-based retrieval over **existing**
vector stores.
## Installation and setup
```bash
pip install langchain-graph-retriever
```
## Retrievers
```python
from langchain_graph_retriever import GraphRetriever
```
For more information, see the [Graph RAG Integration Guide](/docs/integrations/retrievers/graph_rag).
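A minimal instantiation sketch, following the integration guide (it assumes an existing `vector_store` whose documents carry `habitat` and `origin` metadata):
```python
from graph_retriever.strategies import Eager
from langchain_graph_retriever import GraphRetriever

# Start from the closest match, then traverse to documents sharing metadata values
retriever = GraphRetriever(
    store=vector_store,
    edges=[("habitat", "habitat"), ("origin", "origin")],
    strategy=Eager(k=5, start_k=1, max_depth=2),
)
```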

View File

@@ -0,0 +1,129 @@
# LangFair: Use-Case Level LLM Bias and Fairness Assessments
LangFair is a comprehensive Python library designed for conducting bias and fairness assessments of large language model (LLM) use cases. The LangFair [repository](https://github.com/cvs-health/langfair) includes a framework for [choosing bias and fairness metrics](https://github.com/cvs-health/langfair/tree/main#-choosing-bias-and-fairness-metrics-for-an-llm-use-case), along with [demo notebooks](https://github.com/cvs-health/langfair/tree/main/examples) and a [technical playbook](https://arxiv.org/abs/2407.10853) that discusses LLM bias and fairness risks, evaluation metrics, and best practices.
Explore our [documentation site](https://cvs-health.github.io/langfair/) for detailed instructions on using LangFair.
## ⚡ Quickstart Guide
### (Optional) Create a virtual environment for using LangFair
We recommend creating a new virtual environment using venv before installing LangFair. To do so, please follow instructions [here](https://docs.python.org/3/library/venv.html).
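For example, on Linux or macOS:
```bash
python -m venv .venv
source .venv/bin/activate
```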
### Installing LangFair
The latest version can be installed from PyPI:
```bash
pip install langfair
```
### Usage Examples
Below are code samples illustrating how to use LangFair to assess bias and fairness risks in text generation and summarization use cases. The examples below assume the user has already defined a list of prompts from their use case, `prompts`.
##### Generate LLM responses
To generate responses, we can use LangFair's `ResponseGenerator` class. First, we must create a `langchain` LLM object. Below we use `ChatVertexAI`, but **any of [LangChain's LLM classes](https://js.langchain.com/docs/integrations/chat/) may be used instead**. Note that `InMemoryRateLimiter` is used to avoid rate limit errors.
```python
from langchain_google_vertexai import ChatVertexAI
from langchain_core.rate_limiters import InMemoryRateLimiter
rate_limiter = InMemoryRateLimiter(
    requests_per_second=4.5, check_every_n_seconds=0.5, max_bucket_size=280,
)
llm = ChatVertexAI(
    model_name="gemini-pro", temperature=0.3, rate_limiter=rate_limiter
)
```
We can use `ResponseGenerator.generate_responses` to generate 25 responses for each prompt, as is the convention for toxicity evaluation.
```python
from langfair.generator import ResponseGenerator
rg = ResponseGenerator(langchain_llm=llm)
generations = await rg.generate_responses(prompts=prompts, count=25)
responses = generations["data"]["response"]
duplicated_prompts = generations["data"]["prompt"] # so prompts correspond to responses
```
##### Compute toxicity metrics
Toxicity metrics can be computed with `ToxicityMetrics`. Note that use of `torch.device` is optional and should be used if GPU is available to speed up toxicity computation.
```python
# import torch # uncomment if GPU is available
# device = torch.device("cuda") # uncomment if GPU is available
from langfair.metrics.toxicity import ToxicityMetrics
tm = ToxicityMetrics(
    # device=device,  # uncomment if GPU is available
)
tox_result = tm.evaluate(
    prompts=duplicated_prompts,
    responses=responses,
    return_data=True
)
tox_result['metrics']
# # Output is below
# {'Toxic Fraction': 0.0004,
# 'Expected Maximum Toxicity': 0.013845130120171235,
# 'Toxicity Probability': 0.01}
```
##### Compute stereotype metrics
Stereotype metrics can be computed with `StereotypeMetrics`.
```python
from langfair.metrics.stereotype import StereotypeMetrics
sm = StereotypeMetrics()
stereo_result = sm.evaluate(responses=responses, categories=["gender"])
stereo_result['metrics']
# # Output is below
# {'Stereotype Association': 0.3172750176745329,
# 'Cooccurrence Bias': 0.44766333654278373,
# 'Stereotype Fraction - gender': 0.08}
```
##### Generate counterfactual responses and compute metrics
We can generate counterfactual responses with `CounterfactualGenerator`.
```python
from langfair.generator.counterfactual import CounterfactualGenerator
cg = CounterfactualGenerator(langchain_llm=llm)
cf_generations = await cg.generate_responses(
    prompts=prompts, attribute='gender', count=25
)
male_responses = cf_generations['data']['male_response']
female_responses = cf_generations['data']['female_response']
```
Counterfactual metrics can be easily computed with `CounterfactualMetrics`.
```python
from langfair.metrics.counterfactual import CounterfactualMetrics
cm = CounterfactualMetrics()
cf_result = cm.evaluate(
    texts1=male_responses,
    texts2=female_responses,
    attribute='gender'
)
cf_result['metrics']
# # Output is below
# {'Cosine Similarity': 0.8318708,
# 'RougeL Similarity': 0.5195852482361165,
# 'Bleu Similarity': 0.3278433712872481,
# 'Sentiment Bias': 0.0009947145187601957}
```
##### Alternative approach: Semi-automated evaluation with `AutoEval`
To streamline assessments for text generation and summarization use cases, the `AutoEval` class conducts a multi-step process that completes all of the aforementioned steps with two lines of code.
```python
from langfair.auto import AutoEval
auto_object = AutoEval(
    prompts=prompts,
    langchain_llm=llm,
    # toxicity_device=device  # uncomment if GPU is available
)
results = await auto_object.evaluate()
results['metrics']
# # Output is below
# {'Toxicity': {'Toxic Fraction': 0.0004,
# 'Expected Maximum Toxicity': 0.013845130120171235,
# 'Toxicity Probability': 0.01},
# 'Stereotype': {'Stereotype Association': 0.3172750176745329,
# 'Cooccurrence Bias': 0.44766333654278373,
# 'Stereotype Fraction - gender': 0.08,
# 'Expected Maximum Stereotype - gender': 0.60355167388916,
# 'Stereotype Probability - gender': 0.27036},
# 'Counterfactual': {'male-female': {'Cosine Similarity': 0.8318708,
# 'RougeL Similarity': 0.5195852482361165,
# 'Bleu Similarity': 0.3278433712872481,
# 'Sentiment Bias': 0.0009947145187601957}}}
```

View File

@@ -27,18 +27,12 @@
"If you'd like to learn more about Nimble, visit us at [nimbleway.com](https://www.nimbleway.com/).\n",
"\n",
"\n",
"## Currently we expose the following components\n",
"\n",
"* **Retriever** - Allow us to query the internet and get parsed textual results utilizing several search engines.\n",
"\n",
"\n"
"## Retrievers:"
]
},
{
"cell_type": "markdown",
"source": [
"## Usage"
],
"source": "### NimbleSearchRetriever",
"metadata": {
"id": "AuMFgVFrKbNH"
},
@@ -47,7 +41,9 @@
{
"cell_type": "markdown",
"source": [
"In order to use our provider you have to provide an API key like so"
"Enables developers to build RAG applications and AI Agents that can search, access, and retrieve online information from anywhere on the web.\n",
"\n",
"We need to install the `langchain-nimble` python package."
],
"metadata": {
"id": "sFlPjZX9KdK6"
@@ -55,25 +51,32 @@
"id": "sFlPjZX9KdK6"
},
{
"metadata": {},
"cell_type": "code",
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"NIMBLE_API_KEY\"] = getpass.getpass()"
],
"metadata": {
"id": "eAqSHZ-Z8R3F"
},
"id": "eAqSHZ-Z8R3F",
"outputs": [],
"execution_count": null,
"outputs": []
"source": "%pip install -U langchain-nimble",
"id": "65f237c852aa3885"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "See a [usage example](/docs/integrations/retrievers/nimble/).",
"id": "77bd7b9a6a8e381b"
},
{
"metadata": {},
"cell_type": "markdown",
"source": [
"```python\n",
"from langchain_nimble import NimbeSearchRetriever\n",
"```"
],
"id": "511f9d569c21a5d2"
},
{
"cell_type": "markdown",
"source": [
"For more information about the Authentication process, see [Nimble APIs Authentication Documentation](https://docs.nimbleway.com/nimble-sdk/web-api/nimble-web-api-quick-start-guide/nimble-apis-authentication)."
],
"source": "Note that authentication is required, please refer to the [Setup section in the documentation](/docs/integrations/retrievers/nimble/#setup).",
"metadata": {
"id": "WfwnI_RS8PO5"
},

View File

@@ -0,0 +1,19 @@
# Salesforce
[Salesforce](https://www.salesforce.com/) is a cloud-based software company that
provides customer relationship management (CRM) solutions and a suite of enterprise
applications focused on sales, customer service, marketing automation, and analytics.
[langchain-salesforce](https://pypi.org/project/langchain-salesforce/) implements
tools enabling LLMs to interact with Salesforce data.
## Installation and Setup
```bash
pip install langchain-salesforce
```
## Tools
See detail on available tools [here](/docs/integrations/tools/salesforce/).
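A purely illustrative sketch of querying Salesforce from the tool (the class name, constructor arguments, and input schema below are assumptions for illustration, not the package's documented API; consult the tools page above for actual usage):
```python
# Hypothetical sketch -- names and arguments are illustrative only,
# not the documented langchain-salesforce API.
from langchain_salesforce import SalesforceTool  # assumed entry point

tool = SalesforceTool(
    username="user@example.com",   # placeholder credentials
    password="...",                # placeholder
    security_token="...",          # placeholder
)

# Run a SOQL query against the connected org
result = tool.invoke({"operation": "query", "query": "SELECT Id, Name FROM Contact LIMIT 5"})
print(result)
```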

View File

@@ -81,6 +81,13 @@
"llm.invoke(\"Tell me a joke about artificial intelligence.\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For a more detailed walkthrough of the ChatSambaNovaCloud component, see [this notebook](https://python.langchain.com/docs/integrations/chat/sambanova/)"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -93,6 +100,13 @@
"llm.invoke(\"Tell me a joke about artificial intelligence.\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For a more detailed walkthrough of the ChatSambaStudio component, see [this notebook](https://python.langchain.com/docs/integrations/chat/sambastudio/)"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -116,7 +130,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"API Reference [langchain-sambanova](https://python.langchain.com/api_reference/sambanova/index.html)"
"For a more detailed walkthrough of the SambaStudioEmbeddings component, see [this notebook](https://python.langchain.com/docs/integrations/text_embedding/sambanova/)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"API Reference [langchain-sambanova](https://docs.sambanova.ai/cloud/api-reference)"
]
}
],

View File

@@ -0,0 +1,379 @@
---
sidebar_label: Graph RAG
description: Graph traversal over any Vector Store using document metadata.
---
import ChatModelTabs from "@theme/ChatModelTabs";
import EmbeddingTabs from "@theme/EmbeddingTabs";
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
# Graph RAG
This guide provides an introduction to Graph RAG. For detailed documentation of all
supported features and configurations, refer to the
[Graph RAG Project Page](https://datastax.github.io/graph-rag/).
## Overview
The `GraphRetriever` from the `langchain-graph-retriever` package provides a LangChain
[retriever](/docs/concepts/retrievers/) that combines **unstructured** similarity search
on vectors with **structured** traversal of metadata properties. This enables graph-based
retrieval over an **existing** vector store.
### Integration details
| Retriever | Source | PyPI Package | Latest | Project Page |
| :--- | :--- | :---: | :---: | :---: |
| GraphRetriever | [github.com/datastax/graph-rag](https://github.com/datastax/graph-rag/tree/main/packages/langchain-graph-retriever) | [langchain-graph-retriever](https://pypi.org/project/langchain-graph-retriever/) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-graph-retriever?style=flat-square&label=%20&color=orange) | [Graph RAG](https://datastax.github.io/graph-rag/) |
## Benefits
* [**Link based on existing metadata:**](https://datastax.github.io/graph-rag/get-started/)
Use existing metadata fields without additional processing. Retrieve more from an
existing vector store!
* [**Change links on demand:**](https://datastax.github.io/graph-rag/get-started/edges/)
Edges can be specified on-the-fly, allowing different relationships to be traversed
based on the question.
* [**Pluggable Traversal Strategies:**](https://datastax.github.io/graph-rag/get-started/strategies/)
Use built-in traversal strategies like Eager or MMR, or define custom logic to select
which nodes to explore.
* [**Broad compatibility:**](https://datastax.github.io/graph-rag/get-started/adapters/)
Adapters are available for a variety of vector stores with support for additional
stores easily added.
## Setup
### Installation
This retriever lives in the `langchain-graph-retriever` package.
```bash
pip install -qU langchain-graph-retriever
```
## Instantiation
The following examples will show how to perform graph traversal over some sample
Documents about animals.
### Prerequisites
<details>
<summary>Toggle for Details</summary>
<div>
1. Ensure you have Python 3.10+ installed
1. Install the following package that provides sample data.
```bash
pip install -qU graph_rag_example_helpers
```
1. Download the test documents:
```python
from graph_rag_example_helpers.datasets.animals import fetch_documents
animals = fetch_documents()
```
1. <EmbeddingTabs/>
</div>
</details>
### Populating the Vector store
This section shows how to populate a variety of vector stores with the sample data.
For help on choosing one of the vector stores below, or to add support for your
vector store, consult the documentation about
[Adapters and Supported Stores](https://datastax.github.io/graph-rag/guide/adapters/).
<Tabs groupId="vector-store" queryString>
<TabItem value="astra-db" label="AstraDB" default>
<div style={{ paddingLeft: '30px' }}>
Install the `langchain-graph-retriever` package with the `astra` extra:
```bash
pip install "langchain-graph-retriever[astra]"
```
Then create a vector store and load the test documents:
```python
from langchain_astradb import AstraDBVectorStore
vector_store = AstraDBVectorStore.from_documents(
    documents=animals,
    embedding=embeddings,
    collection_name="animals",
    api_endpoint=ASTRA_DB_API_ENDPOINT,
    token=ASTRA_DB_APPLICATION_TOKEN,
)
```
For the `ASTRA_DB_API_ENDPOINT` and `ASTRA_DB_APPLICATION_TOKEN` credentials,
consult the [AstraDB Vector Store Guide](/docs/integrations/vectorstores/astradb).
:::note
For faster initial testing, consider using the **InMemory** Vector Store.
:::
</div>
</TabItem>
<TabItem value="cassandra" label="Apache Cassandra">
<div style={{ paddingLeft: '30px' }}>
Install the `langchain-graph-retriever` package with the `cassandra` extra:
```bash
pip install "langchain-graph-retriever[cassandra]"
```
Then create a vector store and load the test documents:
```python
from langchain_community.vectorstores.cassandra import Cassandra
from langchain_graph_retriever.transformers import ShreddingTransformer
vector_store = Cassandra.from_documents(
    documents=list(ShreddingTransformer().transform_documents(animals)),
    embedding=embeddings,
    table_name="animals",
)
```
For help creating a Cassandra connection, consult the
[Apache Cassandra Vector Store Guide](/docs/integrations/vectorstores/cassandra#connection-parameters)
:::note
Apache Cassandra doesn't support searching in nested metadata. Because of this
it is necessary to use the [`ShreddingTransformer`](https://datastax.github.io/graph-rag/reference/langchain_graph_retriever/transformers/#langchain_graph_retriever.transformers.shredding.ShreddingTransformer)
when inserting documents.
:::
</div>
</TabItem>
<TabItem value="opensearch" label="OpenSearch">
<div style={{ paddingLeft: '30px' }}>
Install the `langchain-graph-retriever` package with the `opensearch` extra:
```bash
pip install "langchain-graph-retriever[opensearch]"
```
Then create a vector store and load the test documents:
```python
from langchain_community.vectorstores import OpenSearchVectorSearch
vector_store = OpenSearchVectorSearch.from_documents(
    documents=animals,
    embedding=embeddings,
    engine="faiss",
    index_name="animals",
    opensearch_url=OPEN_SEARCH_URL,
    bulk_size=500,
)
```
For help creating an OpenSearch connection, consult the
[OpenSearch Vector Store Guide](/docs/integrations/vectorstores/opensearch).
</div>
</TabItem>
<TabItem value="chroma" label="Chroma">
<div style={{ paddingLeft: '30px' }}>
Install the `langchain-graph-retriever` package with the `chroma` extra:
```bash
pip install "langchain-graph-retriever[chroma]"
```
Then create a vector store and load the test documents:
```python
from langchain_chroma.vectorstores import Chroma
from langchain_graph_retriever.transformers import ShreddingTransformer
vector_store = Chroma.from_documents(
    documents=list(ShreddingTransformer().transform_documents(animals)),
    embedding=embeddings,
    collection_name="animals",
)
```
For help creating a Chroma connection, consult the
[Chroma Vector Store Guide](/docs/integrations/vectorstores/chroma).
:::note
Chroma doesn't support searching in nested metadata. Because of this
it is necessary to use the [`ShreddingTransformer`](https://datastax.github.io/graph-rag/reference/langchain_graph_retriever/transformers/#langchain_graph_retriever.transformers.shredding.ShreddingTransformer)
when inserting documents.
:::
</div>
</TabItem>
<TabItem value="in-memory" label="InMemory" default>
<div style={{ paddingLeft: '30px' }}>
Install the `langchain-graph-retriever` package:
```bash
pip install "langchain-graph-retriever"
```
Then create a vector store and load the test documents:
```python
from langchain_core.vectorstores import InMemoryVectorStore
vector_store = InMemoryVectorStore.from_documents(
    documents=animals,
    embedding=embeddings,
)
```
:::tip
Using the `InMemoryVectorStore` is the fastest way to get started with Graph RAG,
but it isn't recommended for production use. Instead, use
**AstraDB** or **OpenSearch**.
:::
</div>
</TabItem>
</Tabs>
### Graph Traversal
This graph retriever starts with a single animal that best matches the query, then
traverses to other animals sharing the same `habitat` and/or `origin`.
```python
from graph_retriever.strategies import Eager
from langchain_graph_retriever import GraphRetriever
traversal_retriever = GraphRetriever(
    store=vector_store,
    edges=[("habitat", "habitat"), ("origin", "origin")],
    strategy=Eager(k=5, start_k=1, max_depth=2),
)
```
The above creates a graph traversing retriever that starts with the nearest
animal (`start_k=1`), retrieves 5 documents (`k=5`) and limits the search to documents
that are at most 2 steps away from the first animal (`max_depth=2`).
The `edges` define how metadata values can be used for traversal. In this case, every
animal is connected to other animals with the same `habitat` and/or `origin`.
```python
results = traversal_retriever.invoke("what animals could be found near a capybara?")
for doc in results:
    print(f"{doc.id}: {doc.page_content}")
```
```output
capybara: capybaras are the largest rodents in the world and are highly social animals.
heron: herons are wading birds known for their long legs and necks, often seen near water.
crocodile: crocodiles are large reptiles with powerful jaws and a long lifespan, often living over 70 years.
frog: frogs are amphibians known for their jumping ability and croaking sounds.
duck: ducks are waterfowl birds known for their webbed feet and quacking sounds.
```
Graph traversal improves retrieval quality by leveraging structured relationships in
the data. Unlike standard similarity search (see below), it provides a clear,
explainable rationale for why documents are selected.
In this case, the documents `capybara`, `heron`, `frog`, `crocodile`, and `duck` all
share the same `habitat=wetlands`, as defined by their metadata. This should increase
Document Relevance and the quality of the answer from the LLM.
### Comparison to Standard Retrieval
When `max_depth=0`, the graph traversing retriever behaves like a standard retriever:
```python
standard_retriever = GraphRetriever(
    store=vector_store,
    edges=[("habitat", "habitat"), ("origin", "origin")],
    strategy=Eager(k=5, start_k=5, max_depth=0),
)
```
This creates a retriever that starts with the nearest 5 animals (`start_k=5`),
and returns them without any traversal (`max_depth=0`). The edge definitions
are ignored in this case.
This is essentially the same as:
```python
standard_retriever = vector_store.as_retriever(search_kwargs={"k":5})
```
For either case, invoking the retriever returns:
```python
results = standard_retriever.invoke("what animals could be found near a capybara?")
for doc in results:
    print(f"{doc.id}: {doc.page_content}")
```
```output
capybara: capybaras are the largest rodents in the world and are highly social animals.
iguana: iguanas are large herbivorous lizards often found basking in trees and near water.
guinea pig: guinea pigs are small rodents often kept as pets due to their gentle and social nature.
hippopotamus: hippopotamuses are large semi-aquatic mammals known for their massive size and territorial behavior.
boar: boars are wild relatives of pigs, known for their tough hides and tusks.
```
These documents are selected based on similarity alone. Any structural data that existed
in the store is ignored. Compared to graph retrieval, this can decrease Document
Relevance because the returned results have a lower chance of being helpful for
answering the query.
## Usage
Following the examples above, `.invoke` is used to initiate retrieval on a query.
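For example, reusing the `traversal_retriever` defined above:
```python
results = traversal_retriever.invoke("which animals live in wetlands?")

for doc in results:
    print(f"{doc.id}: {doc.page_content}")
```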
## Use within a chain
Like other retrievers, `GraphRetriever` can be incorporated into LLM applications
via [chains](/docs/how_to/sequence/).
<ChatModelTabs customVarName="llm" />
```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
prompt = ChatPromptTemplate.from_template(
"""Answer the question based only on the context provided.
Context: {context}
Question: {question}"""
)
def format_docs(docs):
    return "\n\n".join(f"text: {doc.page_content} metadata: {doc.metadata}" for doc in docs)

chain = (
    {"context": traversal_retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
```
```python
chain.invoke("what animals could be found near a capybara?")
```
```output
Animals that could be found near a capybara include herons, crocodiles, frogs,
and ducks, as they all inhabit wetlands.
```
## API reference
To explore all available parameters and advanced configurations, refer to the
[Graph RAG API reference](https://datastax.github.io/graph-rag/reference/).

File diff suppressed because one or more lines are too long

View File

@@ -47,7 +47,7 @@
"import os\n",
"\n",
"if \"GOOGLE_API_KEY\" not in os.environ:\n",
" os.environ[\"GOOGLE_API_KEY\"] = getpass(\"Provide your Google API key here\")"
" os.environ[\"GOOGLE_API_KEY\"] = getpass.getpass(\"Provide your Google API key here\")"
]
},
{
@@ -78,7 +78,7 @@
"source": [
"from langchain_google_genai import GoogleGenerativeAIEmbeddings\n",
"\n",
"embeddings = GoogleGenerativeAIEmbeddings(model=\"models/embedding-001\")\n",
"embeddings = GoogleGenerativeAIEmbeddings(model=\"models/text-embedding-004\")\n",
"vector = embeddings.embed_query(\"hello, world!\")\n",
"vector[:5]"
]

View File

@@ -21,16 +21,16 @@
"source": [
"# SambaStudioEmbeddings\n",
"\n",
"This will help you get started with SambaNova's SambaStudio embedding models using LangChain. For detailed documentation on `SambaStudioEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/sambanova/embeddings/langchain_sambanova.embeddingsSambaStudioEmbeddings.html).\n",
"This will help you get started with SambaNova's SambaStudio embedding models using LangChain. For detailed documentation on `SambaStudioEmbeddings` features and configuration options, please refer to the [API reference](https://docs.sambanova.ai/sambastudio/latest/index.html).\n",
"\n",
"**[SambaNova](https://sambanova.ai/)'s** [Sambastudio](https://sambanova.ai/technology/full-stack-ai-platform) is a platform for running your own open-source models\n",
"**[SambaNova](https://sambanova.ai/)'s** [SambaStudio](https://sambanova.ai/technology/full-stack-ai-platform) is a platform for running your own open-source models\n",
"\n",
"## Overview\n",
"### Integration details\n",
"\n",
"| Provider | Package |\n",
"|:--------:|:-------:|\n",
"| [SambaNova](/docs/integrations/providers/sambanova/) | [langchain-sambanova](https://python.langchain.com/api_reference/langchain_sambanova/embeddings/langchain_sambanova.embeddings.SambaStudioEmbeddings.html) |\n",
"| [SambaNova](/docs/integrations/providers/sambanova/) | [langchain-sambanova](https://python.langchain.com/docs/integrations/providers/sambanova/) |\n",
"\n",
"## Setup\n",
"\n",
@@ -227,7 +227,7 @@
"source": [
"## API Reference\n",
"\n",
"For detailed documentation on `SambaNovaEmbeddings` features and configuration options, please refer to the [API reference](https://python.langchain.com/api_reference/langchain_sambanova/embeddings/langchain_sambanova.embeddings.SambaStudioEmbeddings.html).\n"
"For detailed documentation on `SambaStudio` features and configuration options, please refer to the [API reference](https://docs.sambanova.ai/sambastudio/latest/api-ref-landing.html).\n"
]
}
],

View File

@@ -0,0 +1,256 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "_9MNj58sIkGN"
},
"source": [
"# Apify Actor\n",
"\n",
"## Overview\n",
"\n",
">[Apify Actors](https://docs.apify.com/platform/actors) are cloud programs designed for a wide range of web scraping, crawling, and data extraction tasks. These actors facilitate automated data gathering from the web, enabling users to extract, process, and store information efficiently. Actors can be used to perform tasks like scraping e-commerce sites for product details, monitoring price changes, or gathering search engine results. They integrate seamlessly with [Apify Datasets](https://docs.apify.com/platform/storage/dataset), allowing the structured data collected by actors to be stored, managed, and exported in formats like JSON, CSV, or Excel for further analysis or use.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "OHLF9t9v9HCb"
},
"source": [
"## Setup\n",
"\n",
"This integration lives in the [langchain-apify](https://pypi.org/project/langchain-apify/) package. The package can be installed using pip.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "4DdGmBn5IbXz"
},
"outputs": [],
"source": [
"%pip install langchain-apify"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "rEAwonXqwggR"
},
"source": [
"### Prerequisites\n",
"\n",
"- **Apify account**: Register your free Apify account [here](https://console.apify.com/sign-up).\n",
"- **Apify API token**: Learn how to get your API token in the [Apify documentation](https://docs.apify.com/platform/integrations/api)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "9nJOl4MBMkcR"
},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"APIFY_API_TOKEN\"] = \"your-apify-api-token\"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"your-openai-api-key\""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "UfoQxAlCxR9q"
},
"source": [
"## Instantiation"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qG9KtXtLM8i7"
},
"source": [
"Here we instantiate the `ApifyActorsTool` to be able to call [RAG Web Browser](https://apify.com/apify/rag-web-browser) Apify Actor. This Actor provides web browsing functionality for AI and LLM applications, similar to the web browsing feature in ChatGPT. Any Actor from the [Apify Store](https://apify.com/store) can be used in this way."
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {
"id": "cyxeTlPnM4Ya"
},
"outputs": [],
"source": [
"from langchain_apify import ApifyActorsTool\n",
"\n",
"tool = ApifyActorsTool(\"apify/rag-web-browser\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fGDLvDCqyKWO"
},
"source": [
"## Invocation\n",
"\n",
"The `ApifyActorsTool` takes a single argument, which is `run_input` - a dictionary that is passed as a run input to the Actor. Run input schema documentation can be found in the input section of the Actor details page. See [RAG Web Browser input schema](https://apify.com/apify/rag-web-browser/input-schema).\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "nTWy6Hx1yk04"
},
"outputs": [],
"source": [
"tool.invoke({\"run_input\": {\"query\": \"what is apify?\", \"maxResults\": 2}})"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kQsa27hoO58S"
},
"source": [
"## Chaining\n",
"\n",
"We can provide the created tool to an [agent](https://python.langchain.com/docs/tutorials/agents/). When asked to search for information, the agent will call the Apify Actor, which will search the web, and then retrieve the search results.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "YySvLskW72Y8"
},
"outputs": [],
"source": [
"%pip install langgraph langchain-openai"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"id": "QEDz07btO5Gi"
},
"outputs": [],
"source": [
"from langchain_core.messages import ToolMessage\n",
"from langchain_openai import ChatOpenAI\n",
"from langgraph.prebuilt import create_react_agent\n",
"\n",
"model = ChatOpenAI(model=\"gpt-4o\")\n",
"tools = [tool]\n",
"graph = create_react_agent(model, tools=tools)"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "XS1GEyNkQxGu",
"outputId": "195273d7-034c-425b-f3f9-95c0a9fb0c9e"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"================================\u001b[1m Human Message \u001b[0m=================================\n",
"\n",
"search for what is Apify\n",
"==================================\u001b[1m Ai Message \u001b[0m==================================\n",
"Tool Calls:\n",
" apify_actor_apify_rag-web-browser (call_27mjHLzDzwa5ZaHWCMH510lm)\n",
" Call ID: call_27mjHLzDzwa5ZaHWCMH510lm\n",
" Args:\n",
" run_input: {\"run_input\":{\"query\":\"Apify\",\"maxResults\":3,\"outputFormats\":[\"markdown\"]}}\n",
"==================================\u001b[1m Ai Message \u001b[0m==================================\n",
"\n",
"Apify is a comprehensive platform for web scraping, browser automation, and data extraction. It offers a wide array of tools and services that cater to developers and businesses looking to extract data from websites efficiently and effectively. Here's an overview of Apify:\n",
"\n",
"1. **Ecosystem and Tools**:\n",
" - Apify provides an ecosystem where developers can build, deploy, and publish data extraction and web automation tools called Actors.\n",
" - The platform supports various use cases such as extracting data from social media platforms, conducting automated browser-based tasks, and more.\n",
"\n",
"2. **Offerings**:\n",
" - Apify offers over 3,000 ready-made scraping tools and code templates.\n",
" - Users can also build custom solutions or hire Apify's professional services for more tailored data extraction needs.\n",
"\n",
"3. **Technology and Integration**:\n",
" - The platform supports integration with popular tools and services like Zapier, GitHub, Google Sheets, Pinecone, and more.\n",
" - Apify supports open-source tools and technologies such as JavaScript, Python, Puppeteer, Playwright, Selenium, and its own Crawlee library for web crawling and browser automation.\n",
"\n",
"4. **Community and Learning**:\n",
" - Apify hosts a community on Discord where developers can get help and share expertise.\n",
" - It offers educational resources through the Web Scraping Academy to help users become proficient in data scraping and automation.\n",
"\n",
"5. **Enterprise Solutions**:\n",
" - Apify provides enterprise-grade web data extraction solutions with high reliability, 99.95% uptime, and compliance with SOC2, GDPR, and CCPA standards.\n",
"\n",
"For more information, you can visit [Apify's official website](https://apify.com/) or their [GitHub page](https://github.com/apify) which contains their code repositories and further details about their projects.\n"
]
}
],
"source": [
"inputs = {\"messages\": [(\"user\", \"search for what is Apify\")]}\n",
"for s in graph.stream(inputs, stream_mode=\"values\"):\n",
" message = s[\"messages\"][-1]\n",
" # skip tool messages\n",
" if isinstance(message, ToolMessage):\n",
" continue\n",
" message.pretty_print()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "WYXuQIQx8AvG"
},
"source": [
"## API reference\n",
"\n",
"For more information on how to use this integration, see the [git repository](https://github.com/apify/langchain-apify) or the [Apify integration documentation](https://docs.apify.com/platform/integrations/langgraph)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "f1NnMik78oib"
},
"outputs": [],
"source": []
}
],
"metadata": {
"colab": {
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 0
}

View File

@@ -66,21 +66,20 @@
"metadata": {},
"outputs": [],
"source": [
"from databricks.sdk import WorkspaceClient\n",
"from langchain_community.tools.databricks import UCFunctionToolkit\n",
"from databricks_langchain.uc_ai import (\n",
" DatabricksFunctionClient,\n",
" UCFunctionToolkit,\n",
" set_uc_function_client,\n",
")\n",
"\n",
"tools = (\n",
" UCFunctionToolkit(\n",
" # You can find the SQL warehouse ID in its UI after creation.\n",
" warehouse_id=\"xxxx123456789\"\n",
" )\n",
" .include(\n",
" # Include functions as tools using their qualified names.\n",
" # You can use \"{catalog_name}.{schema_name}.*\" to get all functions in a schema.\n",
" \"main.tools.python_exec\",\n",
" )\n",
" .get_tools()\n",
")"
"client = DatabricksFunctionClient()\n",
"set_uc_function_client(client)\n",
"\n",
"tools = UCFunctionToolkit(\n",
" # Include functions as tools using their qualified names.\n",
" # You can use \"{catalog_name}.{schema_name}.*\" to get all functions in a schema.\n",
" function_names=[\"main.tools.python_exec\"]\n",
").tools"
]
},
{

View File

@@ -0,0 +1,257 @@
{
"cells": [
{
"cell_type": "raw",
"id": "10238e62-3465-4973-9279-606cbb7ccf16",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Discord\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "a6f91f20",
"metadata": {},
"source": [
"# Discord\n",
"\n",
"This notebook provides a quick overview for getting started with Discord tooling in [langchain_discord](/docs/integrations/tools/). For more details on each tool and configuration, see the docstrings in your repository or relevant doc pages.\n",
"\n",
"## Overview\n",
"\n",
"### Integration details\n",
"\n",
"| Class | Package | Serializable | [JS support](https://js.langchain.com/docs/integrations/tools/langchain_discord) | Package latest |\n",
"| :--- |:------------------------------------------------------------------------| :---: | :---: |:-------------------------------------------------------------------------------------------------------:|\n",
"| `DiscordReadMessages`, `DiscordSendMessage` | [langchain-discord-shikenso](https://github.com/Shikenso-Analytics/langchain-discord) | N/A | TBD | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-discord-shikenso?style=flat-square&label=%20) |\n",
"\n",
"### Tool features\n",
"\n",
"- **`DiscordReadMessages`**: Reads messages from a specified channel.\n",
"- **`DiscordSendMessage`**: Sends messages to a specified channel.\n",
"\n",
"## Setup\n",
"\n",
"The integration is provided by the `langchain-discord-shikenso` package. Install it as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f85b4089",
"metadata": {},
"outputs": [],
"source": [
"%pip install --quiet -U langchain-discord-shikenso"
]
},
{
"cell_type": "markdown",
"id": "b15e9266",
"metadata": {},
"source": [
"### Credentials\n",
"\n",
"This integration requires you to set `DISCORD_BOT_TOKEN` as an environment variable to authenticate with the Discord API.\n",
"\n",
"```bash\n",
"export DISCORD_BOT_TOKEN=\"your-bot-token\"\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e0b178a2-8816-40ca-b57c-ccdd86dde9c9",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"# Example prompt to set your token if not already set:\n",
"# if not os.environ.get(\"DISCORD_BOT_TOKEN\"):\n",
"# os.environ[\"DISCORD_BOT_TOKEN\"] = getpass.getpass(\"DISCORD Bot Token:\\n\")"
]
},
{
"cell_type": "markdown",
"id": "bc5ab717-fd27-4c59-b912-bdd099541478",
"metadata": {},
"source": "You can optionally set up [LangSmith](https://smith.langchain.com/) for tracing or observability:"
},
{
"cell_type": "code",
"execution_count": null,
"id": "a6c2f136-6367-4f1f-825d-ae741e1bf281",
"metadata": {},
"outputs": [],
"source": [
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "1c97218f-f366-479d-8bf7-fe9f2f6df73f",
"metadata": {},
"source": [
"## Instantiation\n",
"\n",
"Below is an example showing how to instantiate the Discord tools in `langchain_discord`. Adjust as needed for your specific usage."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8b3ddfe9-ca79-494c-a7ab-1f56d9407a64",
"metadata": {},
"outputs": [],
"source": [
"from langchain_discord.tools.discord_read_messages import DiscordReadMessages\n",
"from langchain_discord.tools.discord_send_messages import DiscordSendMessage\n",
"\n",
"read_tool = DiscordReadMessages()\n",
"send_tool = DiscordSendMessage()\n",
"\n",
"# Example usage:\n",
"# response = read_tool({\"channel_id\": \"1234567890\", \"limit\": 5})\n",
"# print(response)\n",
"#\n",
"# send_result = send_tool({\"message\": \"Hello from notebook!\", \"channel_id\": \"1234567890\"})\n",
"# print(send_result)"
]
},
{
"cell_type": "markdown",
"id": "74147a1a",
"metadata": {},
"source": [
"## Invocation\n",
"\n",
"### Direct invocation with args\n",
"\n",
"Below is a simple example of calling the tool with keyword arguments in a dictionary."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "65310a8b-eb0c-4d9e-a618-4f4abe2414fc",
"metadata": {},
"outputs": [],
"source": [
"invocation_args = {\"channel_id\": \"1234567890\", \"limit\": 3}\n",
"response = read_tool(invocation_args)\n",
"response"
]
},
{
"cell_type": "markdown",
"id": "d6e73897",
"metadata": {},
"source": [
"### Invocation with ToolCall\n",
"\n",
"If you have a model-generated `ToolCall`, pass it to `tool.invoke()` in the format shown below."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f90e33a7",
"metadata": {},
"outputs": [],
"source": [
"tool_call = {\n",
" \"args\": {\"channel_id\": \"1234567890\", \"limit\": 2},\n",
" \"id\": \"1\",\n",
" \"name\": read_tool.name,\n",
" \"type\": \"tool_call\",\n",
"}\n",
"\n",
"tool.invoke(tool_call)"
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": [
"## Chaining\n",
"\n",
"Below is a more complete example showing how you might integrate the `DiscordReadMessages` and `DiscordSendMessage` tools in a chain or agent with an LLM. This example assumes you have a function (like `create_react_agent`) that sets up a LangChain-style agent capable of calling tools when appropriate.\n",
"\n",
"```python\n",
"# Example: Using Discord Tools in an Agent\n",
"\n",
"from langgraph.prebuilt import create_react_agent\n",
"from langchain_discord.tools.discord_read_messages import DiscordReadMessages\n",
"from langchain_discord.tools.discord_send_messages import DiscordSendMessage\n",
"\n",
"# 1. Instantiate or configure your language model\n",
"# (Replace with your actual LLM, e.g., ChatOpenAI(temperature=0))\n",
"llm = ...\n",
"\n",
"# 2. Create instances of the Discord tools\n",
"read_tool = DiscordReadMessages()\n",
"send_tool = DiscordSendMessage()\n",
"\n",
"# 3. Build an agent that has access to these tools\n",
"agent_executor = create_react_agent(llm, [read_tool, send_tool])\n",
"\n",
"# 4. Formulate a user query that may invoke one or both tools\n",
"example_query = \"Please read the last 5 messages in channel 1234567890\"\n",
"\n",
"# 5. Execute the agent in streaming mode (or however your code is structured)\n",
"events = agent_executor.stream(\n",
" {\"messages\": [(\"user\", example_query)]},\n",
" stream_mode=\"values\",\n",
")\n",
"\n",
"# 6. Print out the model's responses (and any tool outputs) as they arrive\n",
"for event in events:\n",
" event[\"messages\"][-1].pretty_print()\n",
"```"
],
"id": "659f9fbd-6fcf-445f-aa8c-72d8e60154bd"
},
{
"cell_type": "markdown",
"id": "4c01b53ad063d2c",
"metadata": {},
"source": [
"## API reference\n",
"\n",
"See the docstrings in:\n",
"- [discord_read_messages.py](https://github.com/Shikenso-Analytics/langchain-discord/blob/main/langchain_discord/tools/discord_read_messages.py)\n",
"- [discord_send_messages.py](https://github.com/Shikenso-Analytics/langchain-discord/blob/main/langchain_discord/tools/discord_send_messages.py)\n",
"- [toolkits.py](https://github.com/Shikenso-Analytics/langchain-discord/blob/main/langchain_discord/toolkits.py)\n",
"\n",
"for usage details, parameters, and advanced configurations."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "poetry-venv-311",
"language": "python",
"name": "poetry-venv-311"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -5,7 +5,7 @@
"id": "a991a6f8-1897-4f49-a191-ae3bdaeda856",
"metadata": {},
"source": [
"# Eleven Labs Text2Speech\n",
"# ElevenLabs Text2Speech\n",
"\n",
"This notebook shows how to interact with the `ElevenLabs API` to achieve text-to-speech capabilities."
]
@@ -37,7 +37,7 @@
"source": [
"import os\n",
"\n",
"os.environ[\"ELEVEN_API_KEY\"] = \"\""
"os.environ[\"ELEVENLABS_API_KEY\"] = \"\""
]
},
{

View File

@@ -64,7 +64,10 @@
"outputs": [],
"source": [
"import getpass\n",
"import os"
"import os\n",
"\n",
"if not os.environ.get(\"JINA_API_KEY\"):\n",
" os.environ[\"JINA_API_KEY\"] = getpass.getpass(\"Jina API key:\\n\")"
]
},
{

View File

@@ -0,0 +1,300 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "563f3174",
"metadata": {},
"source": [
"# Salesforce\n",
"\n",
"Tools for interacting with Salesforce.\n",
"\n",
"## Overview\n",
"\n",
"This notebook provides examples of interacting with Salesforce using LangChain.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"1. Install the required dependencies:\n",
"```bash\n",
" pip install langchain-salesforce\n",
"```\n",
"\n",
"2. Set up your Salesforce credentials as environment variables:\n",
"\n",
"```bash\n",
" export SALESFORCE_USERNAME=\"your-username\"\n",
" export SALESFORCE_PASSWORD=\"your-password\" \n",
" export SALESFORCE_SECURITY_TOKEN=\"your-security-token\"\n",
" export SALESFORCE_DOMAIN=\"test\" # Use 'test' for sandbox, remove for production\n",
"```\n",
"\n",
"These environment variables will be automatically picked up by the integration.\n",
"\n",
"## Getting Your Security Token\n",
"If you need a security token:\n",
"1. Log into Salesforce\n",
"2. Go to Settings\n",
"3. Click on \"Reset My Security Token\" under \"My Personal Information\"\n",
"4. Check your email for the new token"
]
},
{
"cell_type": "markdown",
"id": "dd32d0d8",
"metadata": {},
"source": [
"## Instantiation"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "117ecaf8",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"from langchain_salesforce import SalesforceTool\n",
"\n",
"username = os.getenv(\"SALESFORCE_USERNAME\", \"your-username\")\n",
"password = os.getenv(\"SALESFORCE_PASSWORD\", \"your-password\")\n",
"security_token = os.getenv(\"SALESFORCE_SECURITY_TOKEN\", \"your-security-token\")\n",
"domain = os.getenv(\"SALESFORCE_DOMAIN\", \"login\")\n",
"\n",
"tool = SalesforceTool(\n",
" username=username, password=password, security_token=security_token, domain=domain\n",
")"
]
},
{
"cell_type": "markdown",
"id": "28c1a13e",
"metadata": {},
"source": [
"## Invocation"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e75623af",
"metadata": {},
"outputs": [],
"source": [
"def execute_salesforce_operation(\n",
" operation, object_name=None, query=None, record_data=None, record_id=None\n",
"):\n",
" \"\"\"Executes a given Salesforce operation.\"\"\"\n",
" request = {\"operation\": operation}\n",
" if object_name:\n",
" request[\"object_name\"] = object_name\n",
" if query:\n",
" request[\"query\"] = query\n",
" if record_data:\n",
" request[\"record_data\"] = record_data\n",
" if record_id:\n",
" request[\"record_id\"] = record_id\n",
" result = tool.run(request)\n",
" return result"
]
},
{
"cell_type": "markdown",
"id": "d761883a",
"metadata": {},
"source": [
"## Query\n",
"This example queries Salesforce for 5 contacts."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5fb2e42b",
"metadata": {},
"outputs": [],
"source": [
"query_result = execute_salesforce_operation(\n",
" \"query\", query=\"SELECT Id, Name, Email FROM Contact LIMIT 5\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "b917c89e",
"metadata": {},
"source": [
"## Describe an Object\n",
"Fetches metadata for a specific Salesforce object."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ef6ca50c",
"metadata": {},
"outputs": [],
"source": [
"describe_result = execute_salesforce_operation(\"describe\", object_name=\"Account\")"
]
},
{
"cell_type": "markdown",
"id": "40ed4656",
"metadata": {},
"source": [
"## List Available Objects\n",
"Retrieves all objects available in the Salesforce instance."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a7114bbc",
"metadata": {},
"outputs": [],
"source": [
"list_objects_result = execute_salesforce_operation(\"list_objects\")"
]
},
{
"cell_type": "markdown",
"id": "6619fe12",
"metadata": {},
"source": [
"## Create a New Contact\n",
"Creates a new contact record in Salesforce."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1e15980d",
"metadata": {},
"outputs": [],
"source": [
"create_result = execute_salesforce_operation(\n",
" \"create\",\n",
" object_name=\"Contact\",\n",
" record_data={\"LastName\": \"Doe\", \"Email\": \"doe@example.com\"},\n",
")"
]
},
{
"cell_type": "markdown",
"id": "f8801882",
"metadata": {},
"source": [
"## Update a Contact\n",
"Updates an existing contact record."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2f4bd54c",
"metadata": {},
"outputs": [],
"source": [
"update_result = execute_salesforce_operation(\n",
" \"update\",\n",
" object_name=\"Contact\",\n",
" record_id=\"003XXXXXXXXXXXXXXX\",\n",
" record_data={\"Email\": \"updated@example.com\"},\n",
")"
]
},
{
"cell_type": "markdown",
"id": "46dd7178",
"metadata": {},
"source": [
"## Delete a Contact\n",
"Deletes a contact record from Salesforce."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "31830f80",
"metadata": {},
"outputs": [],
"source": [
"delete_result = execute_salesforce_operation(\n",
" \"delete\", object_name=\"Contact\", record_id=\"003XXXXXXXXXXXXXXX\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "7f094544",
"metadata": {},
"source": [
"## Chaining"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0e997f71",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain_openai import ChatOpenAI\n",
"from langchain_salesforce import SalesforceTool\n",
"\n",
"tool = SalesforceTool(\n",
" username=username, password=password, security_token=security_token, domain=domain\n",
")\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-4o-mini\")\n",
"\n",
"prompt = PromptTemplate.from_template(\n",
" \"What is the name of the contact with the id {contact_id}?\"\n",
")\n",
"\n",
"chain = prompt | tool.invoke | llm\n",
"\n",
"result = chain.invoke({\"contact_id\": \"003XXXXXXXXXXXXXXX\"})"
]
},
{
"cell_type": "markdown",
"id": "b8467ae7",
"metadata": {},
"source": [
"## API reference\n",
"[langchain-salesforce README](https://github.com/colesmcintosh/langchain-salesforce/blob/main/README.md)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -331,7 +331,7 @@
"- Dictionary-based Filters\n",
" - You can pass a dictionary (dict) where the keys represent metadata fields and the values specify the filter condition. This method applies an equality filter between the key and the corresponding value. When multiple key-value pairs are provided, they are combined using a logical AND operation.\n",
"- SQL-based Filters\n",
" - Alternatively, you can provide a string representing an SQL WHERE clause to define more complex filtering conditions. This allows for greater flexibility, supporting SQL expressions such as comparison operators and logical operators."
" - Alternatively, you can provide a string representing an SQL WHERE clause to define more complex filtering conditions. This allows for greater flexibility, supporting SQL expressions such as comparison operators and logical operators. Learn more about [BigQuery operators](https://cloud.google.com/bigquery/docs/reference/standard-sql/operators)."
]
},
{
@@ -356,7 +356,7 @@
"source": [
"# SQL-based Filters\n",
"# This should return \"Banana\", \"Apples and oranges\" and \"Cars and airplanes\" documents.\n",
"docs = store.similarity_search_by_vector(query_vector, filter={\"len = 6 AND len > 17\"})\n",
"docs = store.similarity_search_by_vector(query_vector, filter=\"len = 6 AND len > 17\")\n",
"print(docs)"
]
},

View File

@@ -156,6 +156,15 @@
" db_name=\"vearch_cluster_langchian\",\n",
" table_name=\"tobenumone\",\n",
" flag=1,\n",
")\n",
"\n",
"# The vector data is usually already initialized, so we dont need the document parameter and can directly create the object.\n",
"vearch_cluster_b = Vearch(\n",
" embeddings,\n",
" path_or_url=\"http://test-vearch-langchain-router.vectorbase.svc.ht1.n.jd.local\",\n",
" db_name=\"vearch_cluster_langchian\",\n",
" table_name=\"tobenumone\",\n",
" flag=1,\n",
")"
]
},
@@ -244,6 +253,7 @@
],
"source": [
"query = \"你知道凌波微步吗,你知道都有谁会凌波微步?\"\n",
"# The second parameter is the top-n to retrieve, and its default value is 4.\n",
"vearch_standalone_res = vearch_standalone.similarity_search(query, 3)\n",
"for idx, tmp in enumerate(vearch_standalone_res):\n",
" print(f\"{'#'*20}第{idx+1}段相关文档{'#'*20}\\n\\n{tmp.page_content}\\n\")\n",
@@ -261,6 +271,11 @@
"for idx, tmp in enumerate(cluster_res):\n",
" print(f\"{'#'*20}第{idx+1}段相关文档{'#'*20}\\n\\n{tmp.page_content}\\n\")\n",
"\n",
"# In practical applications, we usually limit the boundary value of similarity. The following method can set this value.\n",
"cluster_res_with_bound = vearch_cluster.similarity_search_with_score(\n",
" query=query_c, k=3, min_score=0.5\n",
")\n",
"\n",
"# combine your local knowleadge and query\n",
"context_c = \"\".join([tmp.page_content for tmp in cluster_res])\n",
"new_query_c = f\"基于以下信息,尽可能准确的来回答用户的问题。背景信息:\\n {context_c} \\n 回答用户这个问题:{query_c}\\n\\n\"\n",

View File

@@ -215,7 +215,7 @@
"\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs openaiParams={`model=\"gpt-4\"`} />\n"
"<ChatModelTabs overrideParams={{openai: {model: \"gpt-4\"}}} />\n"
]
},
{

View File

@@ -108,7 +108,7 @@
"\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs openaiParams={`model=\"gpt-4o-mini\"`} />\n"
"<ChatModelTabs overrideParams={{openai: {model: \"gpt-4o-mini\"}}} />\n"
]
},
{
@@ -935,7 +935,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": ".venv",
"language": "python",
"name": "python3"
},
@@ -949,7 +949,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
"version": "3.11.4"
}
},
"nbformat": 4,

View File

@@ -154,7 +154,7 @@
"id": "ff3cf30d",
"metadata": {},
"source": [
"If we want dictionary output, we can just call `.dict()`"
"If we want dictionary output, we can just call `.model_dump()`"
]
},
{
@@ -179,7 +179,7 @@
"prompt = tagging_prompt.invoke({\"input\": inp})\n",
"response = llm.invoke(prompt)\n",
"\n",
"response.dict()"
"response.model_dump()"
]
},
{

View File

@@ -91,7 +91,7 @@
"\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs openaiParams={`model=\"gpt-4o-mini\"`} />\n"
"<ChatModelTabs overrideParams={{openai: {model: \"gpt-4o-mini\"}}} />\n"
]
},
{

View File

@@ -1050,6 +1050,112 @@
"graph = graph_builder.compile()"
]
},
{
"cell_type": "markdown",
"id": "28a62d34",
"metadata": {},
"source": [
"<details>\n",
"<summary>Full Code:</summary>\n",
"\n",
"```python\n",
"from typing import Literal\n",
"\n",
"import bs4\n",
"from langchain import hub\n",
"from langchain_community.document_loaders import WebBaseLoader\n",
"from langchain_core.documents import Document\n",
"from langchain_core.vectorstores import InMemoryVectorStore\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
"from langgraph.graph import START, StateGraph\n",
"from typing_extensions import Annotated, List, TypedDict\n",
"\n",
"# Load and chunk contents of the blog\n",
"loader = WebBaseLoader(\n",
" web_paths=(\"https://lilianweng.github.io/posts/2023-06-23-agent/\",),\n",
" bs_kwargs=dict(\n",
" parse_only=bs4.SoupStrainer(\n",
" class_=(\"post-content\", \"post-title\", \"post-header\")\n",
" )\n",
" ),\n",
")\n",
"docs = loader.load()\n",
"\n",
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)\n",
"all_splits = text_splitter.split_documents(docs)\n",
"\n",
"\n",
"# Update metadata (illustration purposes)\n",
"total_documents = len(all_splits)\n",
"third = total_documents // 3\n",
"\n",
"for i, document in enumerate(all_splits):\n",
" if i < third:\n",
" document.metadata[\"section\"] = \"beginning\"\n",
" elif i < 2 * third:\n",
" document.metadata[\"section\"] = \"middle\"\n",
" else:\n",
" document.metadata[\"section\"] = \"end\"\n",
"\n",
"\n",
"# Index chunks\n",
"vector_store = InMemoryVectorStore(embeddings)\n",
"_ = vector_store.add_documents(all_splits)\n",
"\n",
"\n",
"# Define schema for search\n",
"class Search(TypedDict):\n",
" \"\"\"Search query.\"\"\"\n",
"\n",
" query: Annotated[str, ..., \"Search query to run.\"]\n",
" section: Annotated[\n",
" Literal[\"beginning\", \"middle\", \"end\"],\n",
" ...,\n",
" \"Section to query.\",\n",
" ]\n",
"\n",
"# Define prompt for question-answering\n",
"prompt = hub.pull(\"rlm/rag-prompt\")\n",
"\n",
"\n",
"# Define state for application\n",
"class State(TypedDict):\n",
" question: str\n",
" query: Search\n",
" context: List[Document]\n",
" answer: str\n",
"\n",
"\n",
"def analyze_query(state: State):\n",
" structured_llm = llm.with_structured_output(Search)\n",
" query = structured_llm.invoke(state[\"question\"])\n",
" return {\"query\": query}\n",
"\n",
"\n",
"def retrieve(state: State):\n",
" query = state[\"query\"]\n",
" retrieved_docs = vector_store.similarity_search(\n",
" query[\"query\"],\n",
" filter=lambda doc: doc.metadata.get(\"section\") == query[\"section\"],\n",
" )\n",
" return {\"context\": retrieved_docs}\n",
"\n",
"\n",
"def generate(state: State):\n",
" docs_content = \"\\n\\n\".join(doc.page_content for doc in state[\"context\"])\n",
" messages = prompt.invoke({\"question\": state[\"question\"], \"context\": docs_content})\n",
" response = llm.invoke(messages)\n",
" return {\"answer\": response.content}\n",
"\n",
"\n",
"graph_builder = StateGraph(State).add_sequence([analyze_query, retrieve, generate])\n",
"graph_builder.add_edge(START, \"analyze_query\")\n",
"graph = graph_builder.compile()\n",
"```\n",
"\n",
"</details>"
]
},
{
"cell_type": "code",
"execution_count": 25,

View File

@@ -194,7 +194,7 @@
"id": "4c0766af-a3b3-4293-b253-3a10f365ab5d",
"metadata": {},
"source": [
":::hint\n",
":::tip\n",
"\n",
"This also supports streaming LLM content token by token if using langgraph >= 0.2.28.\n",
":::"

View File

@@ -36,6 +36,7 @@ def _reorder_keys(p):
"js",
"downloads",
"downloads_updated_at",
"disabled",
]
if set(keys) - set(key_order):
raise ValueError(f"Unexpected keys: {set(keys) - set(key_order)}")

View File

@@ -91,29 +91,7 @@ export const CustomDropdown = ({ selectedOption, options, onSelect, modelType })
/**
* @typedef {Object} ChatModelTabsProps - Component props.
* @property {string} [openaiParams] - Parameters for OpenAI chat model. Defaults to `model="gpt-3.5-turbo-0125"`
* @property {string} [anthropicParams] - Parameters for Anthropic chat model. Defaults to `model="claude-3-sonnet-20240229"`
* @property {string} [cohereParams] - Parameters for Cohere chat model. Defaults to `model="command-r-plus"`
* @property {string} [fireworksParams] - Parameters for Fireworks chat model. Defaults to `model="accounts/fireworks/models/mixtral-8x7b-instruct"`
* @property {string} [groqParams] - Parameters for Groq chat model. Defaults to `model="llama3-8b-8192"`
* @property {string} [mistralParams] - Parameters for Mistral chat model. Defaults to `model="mistral-large-latest"`
* @property {string} [googleParams] - Parameters for Google chat model. Defaults to `model="gemini-pro"`
* @property {string} [togetherParams] - Parameters for Together chat model. Defaults to `model="mistralai/Mixtral-8x7B-Instruct-v0.1"`
* @property {string} [nvidiaParams] - Parameters for Nvidia NIM model. Defaults to `model="meta/llama3-70b-instruct"`
* @property {string} [databricksParams] - Parameters for Databricks model. Defaults to `endpoint="databricks-meta-llama-3-1-70b-instruct"`
* @property {string} [awsBedrockParams] - Parameters for AWS Bedrock chat model.
* @property {boolean} [hideOpenai] - Whether or not to hide OpenAI chat model.
* @property {boolean} [hideAnthropic] - Whether or not to hide Anthropic chat model.
* @property {boolean} [hideCohere] - Whether or not to hide Cohere chat model.
* @property {boolean} [hideFireworks] - Whether or not to hide Fireworks chat model.
* @property {boolean} [hideGroq] - Whether or not to hide Groq chat model.
* @property {boolean} [hideMistral] - Whether or not to hide Mistral chat model.
* @property {boolean} [hideGoogle] - Whether or not to hide Google VertexAI chat model.
* @property {boolean} [hideTogether] - Whether or not to hide Together chat model.
* @property {boolean} [hideAzure] - Whether or not to hide Microsoft Azure OpenAI chat model.
* @property {boolean} [hideNvidia] - Whether or not to hide NVIDIA NIM model.
* @property {boolean} [hideAWS] - Whether or not to hide AWS models.
* @property {boolean} [hideDatabricks] - Whether or not to hide Databricks models.
* @property {Object} [overrideParams] - An object for overriding the default parameters for each chat model, e.g. `{ openai: { model: "gpt-4o-mini" } }`
* @property {string} [customVarName] - Custom variable name for the model. Defaults to `model`.
*/
@@ -121,198 +99,163 @@ export const CustomDropdown = ({ selectedOption, options, onSelect, modelType })
* @param {ChatModelTabsProps} props - Component props.
*/
export default function ChatModelTabs(props) {
const [selectedModel, setSelectedModel] = useState("Groq");
const [selectedModel, setSelectedModel] = useState("groq");
const {
openaiParams,
anthropicParams,
cohereParams,
fireworksParams,
groqParams,
mistralParams,
googleParams,
togetherParams,
azureParams,
nvidiaParams,
awsBedrockParams,
databricksParams,
hideOpenai,
hideAnthropic,
hideCohere,
hideFireworks,
hideGroq,
hideMistral,
hideGoogle,
hideTogether,
hideAzure,
hideNvidia,
hideAWS,
hideDatabricks,
overrideParams,
customVarName,
} = props;
const openAIParamsOrDefault = openaiParams ?? `model="gpt-4o-mini"`;
const anthropicParamsOrDefault =
anthropicParams ?? `model="claude-3-5-sonnet-20240620"`;
const cohereParamsOrDefault = cohereParams ?? `model="command-r-plus"`;
const fireworksParamsOrDefault =
fireworksParams ??
`model="accounts/fireworks/models/llama-v3p1-70b-instruct"`;
const groqParamsOrDefault = groqParams ?? `model="llama3-8b-8192"`;
const mistralParamsOrDefault =
mistralParams ?? `model="mistral-large-latest"`;
const googleParamsOrDefault = googleParams ?? `model="gemini-1.5-flash"`;
const togetherParamsOrDefault =
togetherParams ??
`\n base_url="https://api.together.xyz/v1",\n api_key=os.environ["TOGETHER_API_KEY"],\n model="mistralai/Mixtral-8x7B-Instruct-v0.1",\n`;
const azureParamsOrDefault =
azureParams ??
`\n azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],\n azure_deployment=os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"],\n openai_api_version=os.environ["AZURE_OPENAI_API_VERSION"],\n`;
const nvidiaParamsOrDefault = nvidiaParams ?? `model="meta/llama3-70b-instruct"`
const awsBedrockParamsOrDefault = awsBedrockParams ?? `model="anthropic.claude-3-5-sonnet-20240620-v1:0",\n beta_use_converse_api=True`;
const databricksParamsOrDefault = databricksParams ?? `endpoint="databricks-meta-llama-3-1-70b-instruct"`
const llmVarName = customVarName ?? "model";
const tabItems = [
{
value: "Groq",
value: "groq",
label: "Groq",
text: `from langchain_groq import ChatGroq\n\n${llmVarName} = ChatGroq(${groqParamsOrDefault})`,
model: "llama3-8b-8192",
apiKeyName: "GROQ_API_KEY",
packageName: "langchain-groq",
shouldHide: hideGroq,
packageName: "langchain[groq]",
},
{
value: "OpenAI",
value: "openai",
label: "OpenAI",
text: `from langchain_openai import ChatOpenAI\n\n${llmVarName} = ChatOpenAI(${openAIParamsOrDefault})`,
model: "gpt-4o-mini",
apiKeyName: "OPENAI_API_KEY",
packageName: "langchain-openai",
shouldHide: hideOpenai,
packageName: "langchain[openai]",
},
{
value: "Anthropic",
value: "anthropic",
label: "Anthropic",
text: `from langchain_anthropic import ChatAnthropic\n\n${llmVarName} = ChatAnthropic(${anthropicParamsOrDefault})`,
model: "claude-3-5-sonnet-latest",
apiKeyName: "ANTHROPIC_API_KEY",
packageName: "langchain-anthropic",
shouldHide: hideAnthropic,
packageName: "langchain[anthropic]",
},
{
value: "Azure",
value: "azure",
label: "Azure",
text: `from langchain_openai import AzureChatOpenAI\n\n${llmVarName} = AzureChatOpenAI(${azureParamsOrDefault})`,
text: `from langchain_openai import AzureChatOpenAI
${llmVarName} = AzureChatOpenAI(
azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
azure_deployment=os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"],
openai_api_version=os.environ["AZURE_OPENAI_API_VERSION"],
)`,
apiKeyName: "AZURE_OPENAI_API_KEY",
packageName: "langchain-openai",
shouldHide: hideAzure,
packageName: "langchain[openai]",
},
{
value: "Google",
label: "Google",
text: `from langchain_google_vertexai import ChatVertexAI\n\n${llmVarName} = ChatVertexAI(${googleParamsOrDefault})`,
value: "google_vertexai",
label: "Google Vertex",
model: "gemini-2.0-flash-001",
apiKeyText: "# Ensure your VertexAI credentials are configured",
packageName: "langchain-google-vertexai",
shouldHide: hideGoogle,
packageName: "langchain[google-vertexai]",
},
{
value: "AWS",
value: "bedrock_converse",
label: "AWS",
text: `from langchain_aws import ChatBedrock\n\n${llmVarName} = ChatBedrock(${awsBedrockParamsOrDefault})`,
model: "anthropic.claude-3-5-sonnet-20240620-v1:0",
apiKeyText: "# Ensure your AWS credentials are configured",
packageName: "langchain-aws",
shouldHide: hideAWS,
packageName: "langchain[aws]",
},
{
value: "Cohere",
value: "cohere",
label: "Cohere",
text: `from langchain_cohere import ChatCohere\n\n${llmVarName} = ChatCohere(${cohereParamsOrDefault})`,
model: "command-r-plus",
apiKeyName: "COHERE_API_KEY",
packageName: "langchain-cohere",
shouldHide: hideCohere,
packageName: "langchain[cohere]",
},
{
value: "NVIDIA",
value: "nvidia",
label: "NVIDIA",
text: `from langchain_nvidia_ai_endpoints import ChatNVIDIA\n\n${llmVarName} = ChatNVIDIA(${nvidiaParamsOrDefault})`,
model: "meta/llama3-70b-instruct",
apiKeyName: "NVIDIA_API_KEY",
packageName: "langchain-nvidia-ai-endpoints",
shouldHide: hideNvidia,
},
{
value: "FireworksAI",
value: "fireworks",
label: "Fireworks AI",
text: `from langchain_fireworks import ChatFireworks\n\n${llmVarName} = ChatFireworks(${fireworksParamsOrDefault})`,
model: "accounts/fireworks/models/llama-v3p1-70b-instruct",
apiKeyName: "FIREWORKS_API_KEY",
packageName: "langchain-fireworks",
shouldHide: hideFireworks,
packageName: "langchain[fireworks]",
},
{
value: "MistralAI",
value: "mistralai",
label: "Mistral AI",
text: `from langchain_mistralai import ChatMistralAI\n\n${llmVarName} = ChatMistralAI(${mistralParamsOrDefault})`,
model: "mistral-large-latest",
apiKeyName: "MISTRAL_API_KEY",
packageName: "langchain-mistralai",
shouldHide: hideMistral,
packageName: "langchain[mistralai]",
},
{
value: "TogetherAI",
value: "together",
label: "Together AI",
text: `from langchain_openai import ChatOpenAI\n\n${llmVarName} = ChatOpenAI(${togetherParamsOrDefault})`,
model: "mistralai/Mixtral-8x7B-Instruct-v0.1",
apiKeyName: "TOGETHER_API_KEY",
packageName: "langchain-openai",
shouldHide: hideTogether,
packageName: "langchain[together]",
},
{
value: "Databricks",
value: "ibm",
label: "IBM watsonx",
text: `from langchain_ibm import ChatWatsonx
${llmVarName} = ChatWatsonx(
model_id="ibm/granite-34b-code-instruct",
url="https://us-south.ml.cloud.ibm.com",
project_id="<WATSONX PROJECT_ID>"
)`,
apiKeyName: "WATSONX_APIKEY",
packageName: "langchain-ibm",
},
{
value: "databricks",
label: "Databricks",
text: `from databricks_langchain import ChatDatabricks\n\nos.environ["DATABRICKS_HOST"] = "https://example.staging.cloud.databricks.com/serving-endpoints"\n\n${llmVarName} = ChatDatabricks(${databricksParamsOrDefault})`,
text: `from databricks_langchain import ChatDatabricks\n\nos.environ["DATABRICKS_HOST"] = "https://example.staging.cloud.databricks.com/serving-endpoints"\n\n${llmVarName} = ChatDatabricks(endpoint="databricks-meta-llama-3-1-70b-instruct")`,
apiKeyName: "DATABRICKS_TOKEN",
packageName: "databricks-langchain",
shouldHide: hideDatabricks,
},
];
].map((item) => ({
...item,
...overrideParams?.[item.value],
}));
const modelOptions = tabItems
.filter((item) => !item.shouldHide)
.map((item) => ({
value: item.value,
label: item.label,
text: item.text,
apiKeyName: item.apiKeyName,
apiKeyText: item.apiKeyText,
packageName: item.packageName,
}));
const selectedOption = modelOptions.find(
(option) => option.value === selectedModel
);
const selectedTabItem = tabItems.find(
(option) => option.value === selectedModel
);
let apiKeyText = "";
if (selectedOption.apiKeyName) {
if (selectedTabItem.apiKeyName) {
apiKeyText = `import getpass
import os
if not os.environ.get("${selectedOption.apiKeyName}"):
os.environ["${selectedOption.apiKeyName}"] = getpass.getpass("Enter API key for ${selectedOption.label}: ")`;
} else if (selectedOption.apiKeyText) {
apiKeyText = selectedOption.apiKeyText;
if not os.environ.get("${selectedTabItem.apiKeyName}"):
os.environ["${selectedTabItem.apiKeyName}"] = getpass.getpass("Enter API key for ${selectedTabItem.label}: ")`;
} else if (selectedTabItem.apiKeyText) {
apiKeyText = selectedTabItem.apiKeyText;
}
return (
<div>
<CustomDropdown
selectedOption={selectedOption}
options={modelOptions}
onSelect={setSelectedModel}
modelType="chat"
/>
const initModelText = selectedTabItem?.text || `from langchain.chat_models import init_chat_model
<CodeBlock language="bash">
{`pip install -qU ${selectedOption.packageName}`}
</CodeBlock>
<CodeBlock language="python">
{apiKeyText ? apiKeyText + "\n\n" + selectedOption.text : selectedOption.text}
</CodeBlock>
</div>
);
${llmVarName} = init_chat_model("${selectedTabItem.model}", model_provider="${selectedTabItem.value}"${selectedTabItem?.kwargs ? `, ${selectedTabItem.kwargs}` : ""})`;
return (
<div>
<CustomDropdown
selectedOption={selectedTabItem}
options={modelOptions}
onSelect={setSelectedModel}
modelType="chat"
/>
<CodeBlock language="bash">
{`pip install -qU "${selectedTabItem.packageName}"`}
</CodeBlock>
<CodeBlock language="python">
{apiKeyText ? apiKeyText + "\n\n" + initModelText : initModelText}
</CodeBlock>
</div>
);
}

View File

@@ -27,6 +27,8 @@ export default function EmbeddingTabs(props) {
hideNvidia,
voyageaiParams,
hideVoyageai,
ibmParams,
hideIBM,
fakeEmbeddingParams,
hideFakeEmbedding,
customVarName,
@@ -45,6 +47,8 @@ export default function EmbeddingTabs(props) {
const nomicsParamsOrDefault = nomicParams ?? `model="nomic-embed-text-v1.5"`;
const nvidiaParamsOrDefault = nvidiaParams ?? `model="NV-Embed-QA"`;
const voyageaiParamsOrDefault = voyageaiParams ?? `model="voyage-3"`;
const ibmParamsOrDefault = ibmParams ??
`\n model_id="ibm/slate-125m-english-rtrvr",\n url="https://us-south.ml.cloud.ibm.com",\n project_id="<WATSONX PROJECT_ID>",\n`;
const fakeEmbeddingParamsOrDefault = fakeEmbeddingParams ?? `size=4096`;
const embeddingVarName = customVarName ?? "embeddings";
@@ -149,6 +153,15 @@ export default function EmbeddingTabs(props) {
default: false,
shouldHide: hideVoyageai,
},
{
value: "IBM",
label: "IBM watsonx",
text: `from langchain_ibm import WatsonxEmbeddings\n\n${embeddingVarName} = WatsonxEmbeddings(${ibmParamsOrDefault})`,
apiKeyName: "WATSONX_APIKEY",
packageName: "langchain-ibm",
default: false,
shouldHide: hideIBM,
},
{
value: "Fake",
label: "Fake",

Binary file not shown.

Before

Width:  |  Height:  |  Size: 147 KiB

After

Width:  |  Height:  |  Size: 212 KiB

View File

@@ -3,10 +3,8 @@ requires = ["pdm-backend"]
build-backend = "pdm.backend"
[project]
authors = [
{name = "Erick Friis", email = "erick@langchain.dev"},
]
license = {text = "MIT"}
authors = [{ name = "Erick Friis", email = "erick@langchain.dev" }]
license = { text = "MIT" }
requires-python = "<4.0,>=3.9"
dependencies = [
"typer[all]<1.0.0,>=0.9.0",
@@ -31,33 +29,25 @@ langchain = "langchain_cli.cli:app"
langchain-cli = "langchain_cli.cli:app"
[dependency-groups]
dev = [
"pytest<8.0.0,>=7.4.2",
"pytest-watch<5.0.0,>=4.2.0",
]
lint = [
"ruff<1.0,>=0.5",
"mypy<2.0.0,>=1.13.0",
]
test = [
"langchain @ file:///${PROJECT_ROOT}/../langchain",
]
typing = [
"langchain @ file:///${PROJECT_ROOT}/../langchain",
]
dev = ["pytest<8.0.0,>=7.4.2", "pytest-watch<5.0.0,>=4.2.0"]
lint = ["ruff<1.0,>=0.5", "mypy<2.0.0,>=1.13.0"]
test = ["langchain"]
typing = ["langchain"]
test_integration = []
[tool.uv.sources]
langchain = { path = "../langchain", editable = true }
[tool.ruff.lint]
select = [
"E", # pycodestyle
"F", # pyflakes
"I", # isort
"T201", # print
"E", # pycodestyle
"F", # pyflakes
"I", # isort
"T201", # print
]
[tool.mypy]
exclude = [
"langchain_cli/integration_template",
"langchain_cli/package_template",
"langchain_cli/integration_template",
"langchain_cli/package_template",
]

127
libs/cli/uv.lock generated
View File

@@ -620,8 +620,8 @@ wheels = [
[[package]]
name = "langchain"
version = "0.3.18rc1"
source = { directory = "../langchain" }
version = "0.3.18"
source = { editable = "../langchain" }
dependencies = [
{ name = "aiohttp" },
{ name = "async-timeout", marker = "python_full_version < '3.11'" },
@@ -645,7 +645,7 @@ requires-dist = [
{ name = "langchain-aws", marker = "extra == 'aws'" },
{ name = "langchain-cohere", marker = "extra == 'cohere'" },
{ name = "langchain-community", marker = "extra == 'community'" },
{ name = "langchain-core", specifier = ">=0.3.33,<1.0.0" },
{ name = "langchain-core", editable = "../core" },
{ name = "langchain-deepseek", marker = "extra == 'deepseek'" },
{ name = "langchain-fireworks", marker = "extra == 'fireworks'" },
{ name = "langchain-google-genai", marker = "extra == 'google-genai'" },
@@ -654,8 +654,8 @@ requires-dist = [
{ name = "langchain-huggingface", marker = "extra == 'huggingface'" },
{ name = "langchain-mistralai", marker = "extra == 'mistralai'" },
{ name = "langchain-ollama", marker = "extra == 'ollama'" },
{ name = "langchain-openai", marker = "extra == 'openai'" },
{ name = "langchain-text-splitters", specifier = ">=0.3.3,<1.0.0" },
{ name = "langchain-openai", marker = "extra == 'openai'", editable = "../partners/openai" },
{ name = "langchain-text-splitters", editable = "../text-splitters" },
{ name = "langchain-together", marker = "extra == 'together'" },
{ name = "langsmith", specifier = ">=0.1.17,<0.4" },
{ name = "numpy", marker = "python_full_version < '3.12'", specifier = ">=1.26.4,<2" },
@@ -671,8 +671,8 @@ requires-dist = [
codespell = [{ name = "codespell", specifier = ">=2.2.0,<3.0.0" }]
dev = [
{ name = "jupyter", specifier = ">=1.0.0,<2.0.0" },
{ name = "langchain-core", directory = "../core" },
{ name = "langchain-text-splitters", directory = "../text-splitters" },
{ name = "langchain-core", editable = "../core" },
{ name = "langchain-text-splitters", editable = "../text-splitters" },
{ name = "playwright", specifier = ">=1.28.0,<2.0.0" },
{ name = "setuptools", specifier = ">=67.6.1,<68.0.0" },
]
@@ -682,14 +682,15 @@ lint = [
{ name = "ruff", specifier = ">=0.9.2,<1.0.0" },
]
test = [
{ name = "blockbuster", specifier = ">=1.5.14,<1.6" },
{ name = "cffi", marker = "python_full_version < '3.10'", specifier = "<1.17.1" },
{ name = "cffi", marker = "python_full_version >= '3.10'" },
{ name = "duckdb-engine", specifier = ">=0.9.2,<1.0.0" },
{ name = "freezegun", specifier = ">=1.2.2,<2.0.0" },
{ name = "langchain-core", directory = "../core" },
{ name = "langchain-openai", directory = "../partners/openai" },
{ name = "langchain-tests", directory = "../standard-tests" },
{ name = "langchain-text-splitters", directory = "../text-splitters" },
{ name = "langchain-core", editable = "../core" },
{ name = "langchain-openai", editable = "../partners/openai" },
{ name = "langchain-tests", editable = "../standard-tests" },
{ name = "langchain-text-splitters", editable = "../text-splitters" },
{ name = "lark", specifier = ">=1.1.5,<2.0.0" },
{ name = "packaging", specifier = ">=24.2" },
{ name = "pandas", specifier = ">=2.0.0,<3.0.0" },
@@ -708,8 +709,8 @@ test = [
]
test-integration = [
{ name = "cassio", specifier = ">=0.1.0,<1.0.0" },
{ name = "langchain-core", directory = "../core" },
{ name = "langchain-text-splitters", directory = "../text-splitters" },
{ name = "langchain-core", editable = "../core" },
{ name = "langchain-text-splitters", editable = "../text-splitters" },
{ name = "langchainhub", specifier = ">=0.1.16,<1.0.0" },
{ name = "pytest-vcr", specifier = ">=1.0.2,<2.0.0" },
{ name = "python-dotenv", specifier = ">=1.0.0,<2.0.0" },
@@ -717,8 +718,8 @@ test-integration = [
{ name = "wrapt", specifier = ">=1.15.0,<2.0.0" },
]
typing = [
{ name = "langchain-core", directory = "../core" },
{ name = "langchain-text-splitters", directory = "../text-splitters" },
{ name = "langchain-core", editable = "../core" },
{ name = "langchain-text-splitters", editable = "../text-splitters" },
{ name = "mypy", specifier = ">=1.10,<2.0" },
{ name = "mypy-protobuf", specifier = ">=3.0.0,<4.0.0" },
{ name = "types-chardet", specifier = ">=5.0.4.6,<6.0.0.0" },
@@ -777,14 +778,14 @@ lint = [
{ name = "mypy", specifier = ">=1.13.0,<2.0.0" },
{ name = "ruff", specifier = ">=0.5,<1.0" },
]
test = [{ name = "langchain", directory = "../langchain" }]
test = [{ name = "langchain", editable = "../langchain" }]
test-integration = []
typing = [{ name = "langchain", directory = "../langchain" }]
typing = [{ name = "langchain", editable = "../langchain" }]
[[package]]
name = "langchain-core"
version = "0.3.33"
source = { registry = "https://pypi.org/simple" }
version = "0.3.35"
source = { editable = "../core" }
dependencies = [
{ name = "jsonpatch" },
{ name = "langsmith" },
@@ -794,21 +795,93 @@ dependencies = [
{ name = "tenacity" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/57/b3/426268e07273c395affc6dd02cdf89803888121cfc59ce60922f363aeff8/langchain_core-0.3.33.tar.gz", hash = "sha256:b5dd93a4e7f8198d2fc6048723b0bfecf7aaf128b0d268cbac19c34c1579b953", size = 331492 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/98/78/463bc92174555cc04b3e234faa169bb8b58f36fff77892d7b8ae2b4f58e4/langchain_core-0.3.33-py3-none-any.whl", hash = "sha256:269706408a2223f863ff1f9616f31903a5712403199d828b50aadbc4c28b553a", size = 412656 },
[package.metadata]
requires-dist = [
{ name = "jsonpatch", specifier = ">=1.33,<2.0" },
{ name = "langsmith", specifier = ">=0.1.125,<0.4" },
{ name = "packaging", specifier = ">=23.2,<25" },
{ name = "pydantic", marker = "python_full_version < '3.12.4'", specifier = ">=2.5.2,<3.0.0" },
{ name = "pydantic", marker = "python_full_version >= '3.12.4'", specifier = ">=2.7.4,<3.0.0" },
{ name = "pyyaml", specifier = ">=5.3" },
{ name = "tenacity", specifier = ">=8.1.0,!=8.4.0,<10.0.0" },
{ name = "typing-extensions", specifier = ">=4.7" },
]
[package.metadata.requires-dev]
dev = [
{ name = "grandalf", specifier = ">=0.8,<1.0" },
{ name = "jupyter", specifier = ">=1.0.0,<2.0.0" },
{ name = "setuptools", specifier = ">=67.6.1,<68.0.0" },
]
lint = [{ name = "ruff", specifier = ">=0.9.2,<1.0.0" }]
test = [
{ name = "blockbuster", specifier = "~=1.5.11" },
{ name = "freezegun", specifier = ">=1.2.2,<2.0.0" },
{ name = "grandalf", specifier = ">=0.8,<1.0" },
{ name = "langchain-tests", directory = "../standard-tests" },
{ name = "numpy", marker = "python_full_version < '3.12'", specifier = ">=1.24.0,<2.0.0" },
{ name = "numpy", marker = "python_full_version >= '3.12'", specifier = ">=1.26.0,<3" },
{ name = "pytest", specifier = ">=8,<9" },
{ name = "pytest-asyncio", specifier = ">=0.21.1,<1.0.0" },
{ name = "pytest-mock", specifier = ">=3.10.0,<4.0.0" },
{ name = "pytest-socket", specifier = ">=0.7.0,<1.0.0" },
{ name = "pytest-watcher", specifier = ">=0.3.4,<1.0.0" },
{ name = "pytest-xdist", specifier = ">=3.6.1,<4.0.0" },
{ name = "responses", specifier = ">=0.25.0,<1.0.0" },
{ name = "syrupy", specifier = ">=4.0.2,<5.0.0" },
]
test-integration = []
typing = [
{ name = "langchain-text-splitters", directory = "../text-splitters" },
{ name = "mypy", specifier = ">=1.10,<1.11" },
{ name = "types-jinja2", specifier = ">=2.11.9,<3.0.0" },
{ name = "types-pyyaml", specifier = ">=6.0.12.2,<7.0.0.0" },
{ name = "types-requests", specifier = ">=2.28.11.5,<3.0.0.0" },
]
[[package]]
name = "langchain-text-splitters"
version = "0.3.5"
source = { registry = "https://pypi.org/simple" }
version = "0.3.6"
source = { editable = "../text-splitters" }
dependencies = [
{ name = "langchain-core" },
]
sdist = { url = "https://files.pythonhosted.org/packages/10/35/a6f8d6b1bb0e6e8c00b49bce4d1a115f8b68368b1899f65bb34dbbb44160/langchain_text_splitters-0.3.5.tar.gz", hash = "sha256:11cb7ca3694e5bdd342bc16d3875b7f7381651d4a53cbb91d34f22412ae16443", size = 26318 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/4b/83/f8081c3bea416bd9d9f0c26af795c74f42c24f9ad3c4fbf361b7d69de134/langchain_text_splitters-0.3.5-py3-none-any.whl", hash = "sha256:8c9b059827438c5fa8f327b4df857e307828a5ec815163c9b5c9569a3e82c8ee", size = 31620 },
[package.metadata]
requires-dist = [{ name = "langchain-core", editable = "../core" }]
[package.metadata.requires-dev]
dev = [
{ name = "jupyter", specifier = ">=1.0.0,<2.0.0" },
{ name = "langchain-core", editable = "../core" },
]
lint = [
{ name = "langchain-core", editable = "../core" },
{ name = "ruff", specifier = ">=0.9.2,<1.0.0" },
]
test = [
{ name = "freezegun", specifier = ">=1.2.2,<2.0.0" },
{ name = "langchain-core", editable = "../core" },
{ name = "pytest", specifier = ">=8,<9" },
{ name = "pytest-asyncio", specifier = ">=0.21.1,<1.0.0" },
{ name = "pytest-mock", specifier = ">=3.10.0,<4.0.0" },
{ name = "pytest-socket", specifier = ">=0.7.0,<1.0.0" },
{ name = "pytest-watcher", specifier = ">=0.3.4,<1.0.0" },
{ name = "pytest-xdist", specifier = ">=3.6.1,<4.0.0" },
]
test-integration = [
{ name = "nltk", specifier = ">=3.9.1,<4.0.0" },
{ name = "sentence-transformers", marker = "python_full_version < '3.13'", specifier = ">=2.6.0" },
{ name = "spacy", marker = "python_full_version < '3.10'", specifier = ">=3.0.0,<3.8.4" },
{ name = "spacy", marker = "python_full_version < '3.13'", specifier = ">=3.0.0,<4.0.0" },
{ name = "transformers", specifier = ">=4.47.0,<5.0.0" },
]
typing = [
{ name = "lxml-stubs", specifier = ">=0.5.1,<1.0.0" },
{ name = "mypy", specifier = ">=1.10,<2.0" },
{ name = "tiktoken", specifier = ">=0.8.0,<1.0.0" },
{ name = "types-requests", specifier = ">=2.31.0.20240218,<3.0.0.0" },
]
[[package]]

View File

@@ -64,7 +64,7 @@ pdfplumber>=0.11
pgvector>=0.1.6,<0.2
playwright>=1.48.0,<2
praw>=7.7.1,<8
premai>=0.3.25,<0.4
premai>=0.3.25,<0.4,!=0.3.100
psychicapi>=0.8.0,<0.9
pydantic>=2.7.4,<3
pytesseract>=0.3.13

View File
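For reference, the effect of the added `!=0.3.100` exclusion can be checked with the `packaging` library; a minimal sketch (the version numbers come from the requirements diff above):

```python
from packaging.specifiers import SpecifierSet

# The constraint from the requirements diff above.
spec = SpecifierSet(">=0.3.25,<0.4,!=0.3.100")

print("0.3.99" in spec)   # True: still allowed
print("0.3.100" in spec)  # False: the excluded release
```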

@@ -557,7 +557,7 @@ _EXTRA_OPTIONAL_TOOLS: Dict[str, Tuple[Callable[[KwArg(Any)], BaseTool], List[st
_get_dataforseo_api_search_json,
["api_login", "api_password", "aiosession"],
),
"eleven_labs_text2speech": (_get_eleven_labs_text2speech, ["eleven_api_key"]),
"eleven_labs_text2speech": (_get_eleven_labs_text2speech, ["elevenlabs_api_key"]),
"google_cloud_texttospeech": (_get_google_cloud_texttospeech, []),
"read_file": (_get_file_management_tool, []),
"reddit_search": (

View File
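A minimal sketch of loading the tool with the renamed kwarg, assuming the standard `load_tools` entry point from `langchain_community` (the key value is a placeholder):

```python
from langchain_community.agent_toolkits.load_tools import load_tools

# The kwarg is now `elevenlabs_api_key` (previously `eleven_api_key`).
tools = load_tools(
    ["eleven_labs_text2speech"],
    elevenlabs_api_key="your-elevenlabs-key",  # placeholder
)
```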

@@ -30,7 +30,7 @@ class NLATool(Tool): # type: ignore[override]
The API endpoint tool.
"""
expanded_name = (
f'{api_title.replace(" ", "_")}.{chain.api_operation.operation_id}'
f"{api_title.replace(' ', '_')}.{chain.api_operation.operation_id}"
)
description = (
f"I'm an AI from {api_title}. Instruct what you want,"

View File

@@ -100,7 +100,7 @@ class FiddlerCallbackHandler(BaseCallbackHandler):
if self.project not in self.fiddler_client.get_project_names():
print( # noqa: T201
f"adding project {self.project}." "This only has to be done once."
f"adding project {self.project}.This only has to be done once."
)
try:
self.fiddler_client.add_project(self.project)

View File

@@ -61,9 +61,9 @@ def get_openai_callback() -> Generator[OpenAICallbackHandler, None, None]:
@contextmanager
def get_bedrock_anthropic_callback() -> (
Generator[BedrockAnthropicTokenUsageCallbackHandler, None, None]
):
def get_bedrock_anthropic_callback() -> Generator[
BedrockAnthropicTokenUsageCallbackHandler, None, None
]:
"""Get the Bedrock anthropic callback handler in a context manager.
which conveniently exposes token and cost information.

View File
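The reformatted signature above does not change usage; a short sketch, assuming a Bedrock Anthropic chat model `llm` is already configured:

```python
from langchain_community.callbacks.manager import get_bedrock_anthropic_callback

# `llm` is assumed to be a configured Bedrock Anthropic chat model.
with get_bedrock_anthropic_callback() as cb:
    llm.invoke("Tell me a joke about cats.")
    print(cb.total_tokens, cb.total_cost)
```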

@@ -211,9 +211,9 @@ class LLMThought:
def complete(self, final_label: Optional[str] = None) -> None:
"""Finish the thought."""
if final_label is None and self._state == LLMThoughtState.RUNNING_TOOL:
assert (
self._last_tool is not None
), "_last_tool should never be null when _state == RUNNING_TOOL"
assert self._last_tool is not None, (
"_last_tool should never be null when _state == RUNNING_TOOL"
)
final_label = self._labeler.get_tool_label(
self._last_tool, is_complete=True
)

View File

@@ -467,7 +467,7 @@ class PebbloRetrievalAPIWrapper(BaseModel):
logger.warning(f"Pebblo Server: Error {response.status}")
elif response.status >= HTTPStatus.BAD_REQUEST:
logger.warning(
f"Pebblo received an invalid payload: " f"{response.text}"
f"Pebblo received an invalid payload: {response.text}"
)
elif response.status != HTTPStatus.OK:
logger.warning(

View File

@@ -37,7 +37,7 @@ class SingleFileFacebookMessengerChatLoader(BaseChatLoader):
if "content" not in m:
logger.info(
f"""Skipping Message No.
{index+1} as no content is present in the message"""
{index + 1} as no content is present in the message"""
)
continue
messages.append(

View File

@@ -87,7 +87,7 @@ class Neo4jChatMessageHistory(BaseChatMessageHistory):
query = (
f"MATCH (s:`{self._node_label}`)-[:LAST_MESSAGE]->(last_message) "
"WHERE s.id = $session_id MATCH p=(last_message)<-[:NEXT*0.."
f"{self._window*2}]-() WITH p, length(p) AS length "
f"{self._window * 2}]-() WITH p, length(p) AS length "
"ORDER BY length DESC LIMIT 1 UNWIND reverse(nodes(p)) AS node "
"RETURN {data:{content: node.content}, type:node.type} AS result"
)

View File
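For intuition, the reformatted `{self._window * 2}` expands the variable-length bound in the Cypher query; a standalone sketch with illustrative values:

```python
window = 3  # illustrative value for self._window
fragment = f"MATCH p=(last_message)<-[:NEXT*0..{window * 2}]-()"
print(fragment)  # MATCH p=(last_message)<-[:NEXT*0..6]-()
```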

@@ -177,9 +177,9 @@ class SQLChatMessageHistory(BaseChatMessageHistory):
engine_args: Additional configuration for creating database engines.
async_mode: Whether it is an asynchronous connection.
"""
assert not (
connection_string and connection
), "connection_string and connection are mutually exclusive"
assert not (connection_string and connection), (
"connection_string and connection are mutually exclusive"
)
if connection_string:
global _warned_once_already
if not _warned_once_already:

View File

@@ -209,7 +209,10 @@ class AzureChatOpenAI(ChatOpenAI):
"base_url": values["openai_api_base"],
"timeout": values["request_timeout"],
"max_retries": values["max_retries"],
"default_headers": values["default_headers"],
"default_headers": {
**(values["default_headers"] or {}),
"User-Agent": "langchain-comm-python-azure-openai",
},
"default_query": values["default_query"],
"http_client": values["http_client"],
}

View File
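The header merge above preserves any user-supplied headers while layering the library's User-Agent on top; a minimal sketch of the pattern:

```python
# values["default_headers"] may be None, hence the `or {}` guard.
user_headers = {"X-Custom": "abc"}  # illustrative
merged = {
    **(user_headers or {}),
    "User-Agent": "langchain-comm-python-azure-openai",
}
print(merged)
# {'X-Custom': 'abc', 'User-Agent': 'langchain-comm-python-azure-openai'}
```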

@@ -110,9 +110,9 @@ def _format_anthropic_messages(
if not isinstance(message.content, str):
# parse as dict
assert isinstance(
message.content, list
), "Anthropic message content must be str or list of dicts"
assert isinstance(message.content, list), (
"Anthropic message content must be str or list of dicts"
)
# populate content
content = []

View File

@@ -468,8 +468,7 @@ class ChatDeepInfra(BaseChatModel):
raise ValueError(f"DeepInfra received an invalid payload: {text}")
elif code != 200:
raise Exception(
f"DeepInfra returned an unexpected response with status "
f"{code}: {text}"
f"DeepInfra returned an unexpected response with status {code}: {text}"
)
def _url(self) -> str:

View File

@@ -179,8 +179,7 @@ class ChatKonko(ChatOpenAI): # type: ignore[override]
if models_response.status_code != 200:
raise ValueError(
f"Error getting models from {models_url}: "
f"{models_response.status_code}"
f"Error getting models from {models_url}: {models_response.status_code}"
)
return {model["id"] for model in models_response.json()["data"]}

View File

@@ -3,19 +3,23 @@
from __future__ import annotations
import logging
from operator import itemgetter
from typing import (
Any,
Dict,
Iterator,
List,
Literal,
Mapping,
Optional,
Tuple,
Type,
TypeVar,
Union,
)
from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models import LanguageModelInput
from langchain_core.language_models.chat_models import (
BaseChatModel,
generate_from_stream,
@@ -34,17 +38,27 @@ from langchain_core.messages import (
SystemMessageChunk,
ToolMessageChunk,
)
from langchain_core.output_parsers import JsonOutputParser, PydanticOutputParser
from langchain_core.outputs import ChatGeneration, ChatGenerationChunk, ChatResult
from langchain_core.utils import (
from_env,
get_pydantic_field_names,
from langchain_core.runnables import Runnable, RunnableMap, RunnablePassthrough
from langchain_core.utils import from_env, get_pydantic_field_names
from langchain_core.utils.pydantic import (
is_basemodel_subclass,
)
from pydantic import ConfigDict, Field, model_validator
from pydantic import BaseModel, ConfigDict, Field, TypeAdapter, model_validator
from typing_extensions import Self
_BM = TypeVar("_BM", bound=BaseModel)
_DictOrPydanticClass = Union[Dict[str, Any], Type[_BM], Type]
_DictOrPydantic = Union[Dict, _BM]
logger = logging.getLogger(__name__)
def _is_pydantic_class(obj: Any) -> bool:
return isinstance(obj, type) and is_basemodel_subclass(obj)
class ChatPerplexity(BaseChatModel):
"""`Perplexity AI` Chat models API.
@@ -282,3 +296,99 @@ class ChatPerplexity(BaseChatModel):
def _llm_type(self) -> str:
"""Return type of chat model."""
return "perplexitychat"
def with_structured_output(
self,
schema: Optional[_DictOrPydanticClass] = None,
*,
method: Literal["json_schema"] = "json_schema",
include_raw: bool = False,
strict: Optional[bool] = None,
**kwargs: Any,
) -> Runnable[LanguageModelInput, _DictOrPydantic]:
"""Model wrapper that returns outputs formatted to match the given schema for Preplexity.
Currently, Preplexity only supports "json_schema" method for structured output
as per their official documentation: https://docs.perplexity.ai/guides/structured-outputs
Args:
schema:
The output schema. Can be passed in as:
- a JSON Schema,
- a TypedDict class,
- or a Pydantic class
method: The method for steering model generation; currently only supports:
- "json_schema": Use the JSON Schema to parse the model output
include_raw:
If False then only the parsed structured output is returned. If
an error occurs during model output parsing it will be raised. If True
then both the raw model response (a BaseMessage) and the parsed model
response will be returned. If an error occurs during output parsing it
will be caught and returned as well. The final output is always a dict
with keys "raw", "parsed", and "parsing_error".
kwargs: Additional keyword args aren't supported.
Returns:
A Runnable that takes the same inputs as a :class:`langchain_core.language_models.chat.BaseChatModel`.
| If ``include_raw`` is False and ``schema`` is a Pydantic class, Runnable outputs an instance of ``schema`` (i.e., a Pydantic object). Otherwise, if ``include_raw`` is False then Runnable outputs a dict.
| If ``include_raw`` is True, then Runnable outputs a dict with keys:
- "raw": BaseMessage
- "parsed": None if there was a parsing error, otherwise the type depends on the ``schema`` as described above.
- "parsing_error": Optional[BaseException]
""" # noqa: E501
if method == "json_schema":
if schema is None:
raise ValueError(
"schema must be specified when method is not 'json_schema'. "
"Received None."
)
is_pydantic_schema = _is_pydantic_class(schema)
if is_pydantic_schema and hasattr(
schema, "model_json_schema"
): # accounting for pydantic v1 and v2
response_format = schema.model_json_schema() # type: ignore[union-attr]
elif is_pydantic_schema:
response_format = schema.schema() # type: ignore[union-attr]
elif isinstance(schema, dict):
response_format = schema
elif type(schema).__name__ == "_TypedDictMeta":
adapter = TypeAdapter(schema)  # if the user passes a TypedDict
response_format = adapter.json_schema()
llm = self.bind(
response_format={
"type": "json_schema",
"json_schema": {"schema": response_format},
}
)
output_parser = (
PydanticOutputParser(pydantic_object=schema) # type: ignore[arg-type]
if is_pydantic_schema
else JsonOutputParser()
)
else:
raise ValueError(
f"Unrecognized method argument. Expected 'json_schema' Received:\
'{method}'"
)
if include_raw:
parser_assign = RunnablePassthrough.assign(
parsed=itemgetter("raw") | output_parser, parsing_error=lambda _: None
)
parser_none = RunnablePassthrough.assign(parsed=lambda _: None)
parser_with_fallback = parser_assign.with_fallbacks(
[parser_none], exception_key="parsing_error"
)
return RunnableMap(raw=llm) | parser_with_fallback
else:
return llm | output_parser

View File
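A hedged usage sketch of the new method (the model name and API-key handling are illustrative; the interface follows the docstring above):

```python
from langchain_community.chat_models import ChatPerplexity
from pydantic import BaseModel

class Joke(BaseModel):
    setup: str
    punchline: str

# Assumes PPLX_API_KEY is set in the environment; the model name is illustrative.
llm = ChatPerplexity(model="sonar")
structured_llm = llm.with_structured_output(Joke)
result = structured_llm.invoke("Tell me a joke about cats.")
print(result.setup, result.punchline)
```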

@@ -196,7 +196,10 @@ def _messages_to_prompt_dict(
elif isinstance(input_msg, HumanMessage):
if template_id is None:
examples_and_messages.append(
{"role": "user", "content": str(input_msg.content)}
{
"role": "user",
"content": str(input_msg.content),
}
)
else:
params: Dict[str, str] = {}
@@ -206,12 +209,19 @@ def _messages_to_prompt_dict(
)
params[str(input_msg.id)] = str(input_msg.content)
examples_and_messages.append(
{"role": "user", "template_id": template_id, "params": params}
{
"role": "user",
"template_id": template_id,
"params": params,
}
)
elif isinstance(input_msg, AIMessage):
if input_msg.tool_calls is None or len(input_msg.tool_calls) == 0:
examples_and_messages.append(
{"role": "assistant", "content": str(input_msg.content)}
{
"role": "assistant",
"content": str(input_msg.content),
}
)
else:
ai_msg_to_json = {

View File

@@ -277,7 +277,10 @@ class ChatWriter(BaseChatModel):
if not delta or not delta.content:
continue
chunk = self._convert_writer_to_langchain(
{"role": "assistant", "content": delta.content}
{
"role": "assistant",
"content": delta.content,
}
)
chunk = ChatGenerationChunk(message=chunk)
@@ -303,7 +306,10 @@ class ChatWriter(BaseChatModel):
if not delta or not delta.content:
continue
chunk = self._convert_writer_to_langchain(
{"role": "assistant", "content": delta.content}
{
"role": "assistant",
"content": delta.content,
}
)
chunk = ChatGenerationChunk(message=chunk)

View File

@@ -121,7 +121,7 @@ def _get_jwt_token(api_key: str) -> str:
import jwt
except ImportError:
raise ImportError(
"jwt package not found, please install it with" "`pip install pyjwt`"
"jwt package not found, please install it with`pip install pyjwt`"
)
try:

View File

@@ -89,7 +89,10 @@ class DashScopeRerank(BaseDocumentCompressor):
result_dicts = []
for res in results.output.results:
result_dicts.append(
{"index": res.index, "relevance_score": res.relevance_score}
{
"index": res.index,
"relevance_score": res.relevance_score,
}
)
return result_dicts

View File

@@ -101,7 +101,10 @@ class InfinityRerank(BaseDocumentCompressor):
result_dicts = []
for res in results:
result_dicts.append(
{"index": res.index, "relevance_score": res.relevance_score}
{
"index": res.index,
"relevance_score": res.relevance_score,
}
)
result_dicts.sort(key=lambda x: x["relevance_score"], reverse=True)

View File

@@ -95,7 +95,10 @@ class JinaRerank(BaseDocumentCompressor):
result_dicts = []
for res in results:
result_dicts.append(
{"index": res["index"], "relevance_score": res["relevance_score"]}
{
"index": res["index"],
"relevance_score": res["relevance_score"],
}
)
return result_dicts

View File

@@ -1,11 +1,22 @@
from typing import Any, Callable, Dict, List
from langchain_core._api import deprecated
from langchain_core.documents import Document
from pydantic import BaseModel, model_validator
from langchain_community.document_loaders.base import BaseLoader
@deprecated(
since="0.3.18",
message=(
"This class is deprecated and will be removed in a future version. "
"You can swap to using the `ApifyDatasetLoader`"
" implementation in `langchain_apify` package. "
"See <https://github.com/apify/langchain-apify>"
),
alternative_import="langchain_apify.ApifyDatasetLoader",
)
class ApifyDatasetLoader(BaseLoader, BaseModel):
"""Load datasets from `Apify` web scraping, crawling, and data extraction platform.

View File
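Per the deprecation notice, the migration is an import swap; a sketch where the dataset id is a placeholder and the mapping function shape is assumed to match the community loader's interface:

```python
from langchain_apify import ApifyDatasetLoader
from langchain_core.documents import Document

loader = ApifyDatasetLoader(
    dataset_id="your-dataset-id",  # placeholder
    dataset_mapping_function=lambda item: Document(
        page_content=item.get("text", "")  # illustrative field mapping
    ),
)
docs = loader.load()
```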

@@ -60,7 +60,7 @@ def fetch_mime_types(file_types: Sequence[str]) -> Dict[str, str]:
if mime_type:
mime_types_mapping[ext] = mime_type
else:
raise ValueError(f"Unknown mimetype of extention {ext}")
raise ValueError(f"Unknown mimetype of extension {ext}")
return mime_types_mapping

View File

@@ -21,8 +21,7 @@ class YoutubeAudioLoader(BlobLoader):
import yt_dlp
except ImportError:
raise ImportError(
"yt_dlp package not found, please install it with "
"`pip install yt_dlp`"
"yt_dlp package not found, please install it with `pip install yt_dlp`"
)
# Use yt_dlp to download audio given a YouTube url

View File

@@ -117,6 +117,10 @@ class CHMParser(object):
for item in index:
content = self.load(item["local"])
res.append(
{"name": item["name"], "local": item["local"], "content": content}
{
"name": item["name"],
"local": item["local"],
"content": content,
}
)
return res

View File

@@ -652,7 +652,7 @@ class ConfluenceLoader(BaseLoader):
from PIL import Image # noqa: F401
except ImportError:
raise ImportError(
"`Pillow` package not found, " "please run `pip install Pillow`"
"`Pillow` package not found, please run `pip install Pillow`"
)
# depending on setup you may also need to set the correct path for

View File
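Several hunks in this diff collapse the same implicitly concatenated strings inside optional-import guards; the cleaned-up pattern in isolation (helper name is illustrative):

```python
def _require_pillow() -> None:
    # Optional-dependency guard: fail early with an actionable message.
    try:
        from PIL import Image  # noqa: F401
    except ImportError as e:
        raise ImportError(
            "`Pillow` package not found, please run `pip install Pillow`"
        ) from e
```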

@@ -164,9 +164,13 @@ class CSVLoader(BaseLoader):
f"Source column '{self.source_column}' not found in CSV file."
)
content = "\n".join(
f"""{k.strip() if k is not None else k}: {v.strip()
if isinstance(v, str) else ','.join(map(str.strip, v))
if isinstance(v, list) else v}"""
f"""{k.strip() if k is not None else k}: {
v.strip()
if isinstance(v, str)
else ",".join(map(str.strip, v))
if isinstance(v, list)
else v
}"""
for k, v in row.items()
if (
k in self.content_columns

View File
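The reformatted nested conditional is easier to follow pulled out as a helper; a behavior-equivalent sketch with illustrative row data:

```python
def _fmt(v):
    # Mirrors the conditional above: strip strings, join lists, pass through others.
    if isinstance(v, str):
        return v.strip()
    if isinstance(v, list):
        return ",".join(map(str.strip, v))
    return v

row = {"name": " Alice ", "tags": [" a ", " b "], "age": 30}
print("\n".join(f"{k.strip()}: {_fmt(v)}" for k, v in row.items()))
# name: Alice
# tags: a,b
# age: 30
```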

@@ -89,13 +89,13 @@ class AzureAIDocumentIntelligenceLoader(BaseLoader):
file_path is not None or url_path is not None or bytes_source is not None
), "file_path, url_path or bytes_source must be provided"
assert (
api_key is not None or azure_credential is not None
), "Either api_key or azure_credential must be provided."
assert api_key is not None or azure_credential is not None, (
"Either api_key or azure_credential must be provided."
)
assert (
api_key is None or azure_credential is None
), "Only one of api_key or azure_credential should be provided."
assert api_key is None or azure_credential is None, (
"Only one of api_key or azure_credential should be provided."
)
self.file_path = file_path
self.url_path = url_path

View File

@@ -54,7 +54,7 @@ class DropboxLoader(BaseLoader, BaseModel):
try:
from dropbox import Dropbox, exceptions
except ImportError:
raise ImportError("You must run " "`pip install dropbox")
raise ImportError("You must run `pip install dropbox")
try:
dbx = Dropbox(self.dropbox_access_token)
@@ -73,7 +73,7 @@ class DropboxLoader(BaseLoader, BaseModel):
from dropbox import exceptions
from dropbox.files import FileMetadata
except ImportError:
raise ImportError("You must run " "`pip install dropbox")
raise ImportError("You must run `pip install dropbox")
try:
results = dbx.files_list_folder(folder_path, recursive=self.recursive)
@@ -98,7 +98,7 @@ class DropboxLoader(BaseLoader, BaseModel):
try:
from dropbox import exceptions
except ImportError:
raise ImportError("You must run " "`pip install dropbox")
raise ImportError("You must run `pip install dropbox")
try:
file_metadata = dbx.files_get_metadata(file_path)

View File

@@ -65,7 +65,7 @@ class MWDumpLoader(BaseLoader):
import mwxml
except ImportError as e:
raise ImportError(
"Unable to import 'mwxml'. Please install with" " `pip install mwxml`."
"Unable to import 'mwxml'. Please install with `pip install mwxml`."
) from e
return mwxml.Dump.from_file(open(self.file_path, encoding=self.encoding))

View File

@@ -98,7 +98,10 @@ class MongodbLoader(BaseLoader):
# Optionally add database and collection names to metadata
if self.include_db_collection_in_metadata:
metadata.update(
{"database": self.db_name, "collection": self.collection_name}
{
"database": self.db_name,
"collection": self.collection_name,
}
)
# Extract text content from filtered fields or use the entire document

View File

@@ -126,7 +126,7 @@ class NotionDBLoader(BaseLoader):
value = prop_data["url"]
elif prop_type == "unique_id":
value = (
f'{prop_data["unique_id"]["prefix"]}-{prop_data["unique_id"]["number"]}'
f"{prop_data['unique_id']['prefix']}-{prop_data['unique_id']['number']}"
if prop_data["unique_id"]
else None
)

View File

@@ -19,7 +19,12 @@ class NucliaLoader(BaseLoader):
def load(self) -> List[Document]:
"""Load documents."""
data = self.nua.run(
{"action": "pull", "id": self.id, "path": None, "text": None}
{
"action": "pull",
"id": self.id,
"path": None,
"text": None,
}
)
if not data:
return []

View File

@@ -82,8 +82,7 @@ class OracleAutonomousDatabaseLoader(BaseLoader):
import oracledb
except ImportError as e:
raise ImportError(
"Could not import oracledb, "
"please install with 'pip install oracledb'"
"Could not import oracledb, please install with 'pip install oracledb'"
) from e
connect_param = {"user": self.user, "password": self.password, "dsn": self.dsn}
if self.dsn == self.tns_name:

View File

@@ -148,8 +148,7 @@ class AzureOpenAIWhisperParser(BaseBlobParser):
import openai
except ImportError:
raise ImportError(
"openai package not found, please install it with "
"`pip install openai`"
"openai package not found, please install it with `pip install openai`"
)
if is_openai_v1():
@@ -250,6 +249,7 @@ class OpenAIWhisperParser(BaseBlobParser):
Literal["json", "text", "srt", "verbose_json", "vtt"], None
] = None,
temperature: Union[float, None] = None,
model: str = "whisper-1",
):
self.api_key = api_key
self.chunk_duration_threshold = chunk_duration_threshold
@@ -260,6 +260,7 @@ class OpenAIWhisperParser(BaseBlobParser):
self.prompt = prompt
self.response_format = response_format
self.temperature = temperature
self.model = model
@property
def _create_params(self) -> Dict[str, Any]:
@@ -278,14 +279,13 @@ class OpenAIWhisperParser(BaseBlobParser):
import openai
except ImportError:
raise ImportError(
"openai package not found, please install it with "
"`pip install openai`"
"openai package not found, please install it with `pip install openai`"
)
try:
from pydub import AudioSegment
except ImportError:
raise ImportError(
"pydub package not found, please install it with " "`pip install pydub`"
"pydub package not found, please install it with `pip install pydub`"
)
if is_openai_v1():
@@ -326,10 +326,10 @@ class OpenAIWhisperParser(BaseBlobParser):
try:
if is_openai_v1():
transcript = client.audio.transcriptions.create(
model="whisper-1", file=file_obj, **self._create_params
model=self.model, file=file_obj, **self._create_params
)
else:
transcript = openai.Audio.transcribe("whisper-1", file_obj) # type: ignore[attr-defined]
transcript = openai.Audio.transcribe(self.model, file_obj) # type: ignore[attr-defined]
break
except Exception as e:
attempts += 1
@@ -402,7 +402,7 @@ class OpenAIWhisperParserLocal(BaseBlobParser):
import torch
except ImportError:
raise ImportError(
"torch package not found, please install it with " "`pip install torch`"
"torch package not found, please install it with `pip install torch`"
)
# Determine the device to use
@@ -533,7 +533,7 @@ class YandexSTTParser(BaseBlobParser):
from pydub import AudioSegment
except ImportError:
raise ImportError(
"pydub package not found, please install it with " "`pip install pydub`"
"pydub package not found, please install it with `pip install pydub`"
)
if self.api_key:

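A short sketch of the new constructor parameter (the API key is a placeholder; `model` defaults to `"whisper-1"`, matching prior behavior):

```python
from langchain_community.document_loaders.parsers.audio import OpenAIWhisperParser

# New in this diff: the transcription model is configurable.
parser = OpenAIWhisperParser(api_key="sk-...", model="whisper-1")
```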
Some files were not shown because too many files have changed in this diff.