- **Description:** Change instances of RunnableMap to RunnableParallel,
as that should be the one used going forward. This makes it consistent
across the codebase.
### Description:
Doc addition for LCEL introduction. Adds a more basic starter guide for
using LCEL.
---------
Co-authored-by: Alex Kira <akira@Alexs-MBP.local.tld>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** just a little change of ErnieChatBot class
description, sugguesting user to use more suitable class
- **Issue:** none,
- **Dependencies:** none,
- **Tag maintainer:** @baskaryan ,
- **Twitter handle:** none
**Description**
`embed_with_retry` is for sync operations and not for async operations.
Use `async_embed_with_retry` for appropriate async operations.
I'm using `OpenAIEmbedding(http_client=httpx.AsyncClient())` with only
async operations.
However, I got an error when I use `embedding.aembed_documents` because
`embed_with_retry` uses sync OpenAI client with async http client.
Description
when the desc of arg in python docstring contains ":", the
`_parse_python_function_docstring` will raise **ValueError: too many
values to unpack (expected 2)**.
A sample desc would be:
"""
Args:
error_arg: this is an arg with an additional ":" symbol
"""
So, set `maxsplit` parameter to fix it.
The number of times I try to format a string (especially in lcel) is
embarrassingly high. Think this may be more actionable than the default
error message. Now I get nice helpful errors
```
KeyError: "Input to ChatPromptTemplate is missing variable 'input'. Expected: ['input'] Received: ['dialogue']"
```
### Description
Now if `example` in Message is False, it will not be displayed. Update
the output in this document.
```python
In [22]: m = HumanMessage(content="Text")
In [23]: m
Out[23]: HumanMessage(content='Text')
In [24]: m = HumanMessage(content="Text", example=True)
In [25]: m
Out[25]: HumanMessage(content='Text', example=True)
```
### Twitter handle
[lin_bob57617](https://twitter.com/lin_bob57617)
**Description:** By combining the document timestamp refresh within a
single call to update(), this enables batching of multiple documents in
a single SQL statement. This is important for non-local databases where
tens of milliseconds has a huge impact on performance when doing
document-by-document SQL statements.
**Issue:** #11935
**Dependencies:** None
**Tag maintainer:** @eyurtsev
- **Description:** Touch up of the documentation page for Metaphor
Search Tool integration. Removes documentation for old built-in tool
wrapper.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
CC @baskaryan @hwchase17 @jmorganca
Having a bit of trouble importing `langchain_experimental` from a
notebook, will figure it out tomorrow
~Ah and also is blocked by #13226~
---------
Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
## Description
Related to https://github.com/mlflow/mlflow/pull/10420. MLflow AI
gateway will be deprecated and replaced by the `mlflow.deployments`
module. Happy to split this PR if it's too large.
```
pip install git+https://github.com/langchain-ai/langchain.git@refs/pull/13699/merge#subdirectory=libs/langchain
```
## Dependencies
Install mlflow from https://github.com/mlflow/mlflow/pull/10420:
```
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/10420/merge
```
## Testing plan
The following code works fine on local and databricks:
<details><summary>Click</summary>
<p>
```python
"""
Setup
-----
mlflow deployments start-server --config-path examples/gateway/openai/config.yaml
databricks secrets create-scope <scope>
databricks secrets put-secret <scope> openai-api-key --string-value $OPENAI_API_KEY
Run
---
python /path/to/this/file.py secrets/<scope>/openai-api-key
"""
from langchain.chat_models import ChatMlflow, ChatDatabricks
from langchain.embeddings import MlflowEmbeddings, DatabricksEmbeddings
from langchain.llms import Databricks, Mlflow
from langchain.schema.messages import HumanMessage
from langchain.chains.loading import load_chain
from mlflow.deployments import get_deploy_client
import uuid
import sys
import tempfile
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
###############################
# MLflow
###############################
chat = ChatMlflow(
target_uri="http://127.0.0.1:5000", endpoint="chat", params={"temperature": 0.1}
)
print(chat([HumanMessage(content="hello")]))
embeddings = MlflowEmbeddings(target_uri="http://127.0.0.1:5000", endpoint="embeddings")
print(embeddings.embed_query("hello")[:3])
print(embeddings.embed_documents(["hello", "world"])[0][:3])
llm = Mlflow(
target_uri="http://127.0.0.1:5000",
endpoint="completions",
params={"temperature": 0.1},
)
print(llm("I am"))
llm_chain = LLMChain(
llm=llm,
prompt=PromptTemplate(
input_variables=["adjective"],
template="Tell me a {adjective} joke",
),
)
print(llm_chain.run(adjective="funny"))
# serialization/deserialization
with tempfile.TemporaryDirectory() as tmpdir:
print(tmpdir)
path = f"{tmpdir}/llm.yaml"
llm_chain.save(path)
loaded_chain = load_chain(path)
print(loaded_chain("funny"))
###############################
# Databricks
###############################
secret = sys.argv[1]
client = get_deploy_client("databricks")
# External - chat
name = f"chat-{uuid.uuid4()}"
client.create_endpoint(
name=name,
config={
"served_entities": [
{
"name": "test",
"external_model": {
"name": "gpt-4",
"provider": "openai",
"task": "llm/v1/chat",
"openai_config": {
"openai_api_key": "{{" + secret + "}}",
},
},
}
],
},
)
try:
chat = ChatDatabricks(
target_uri="databricks", endpoint=name, params={"temperature": 0.1}
)
print(chat([HumanMessage(content="hello")]))
finally:
client.delete_endpoint(endpoint=name)
# External - embeddings
name = f"embeddings-{uuid.uuid4()}"
client.create_endpoint(
name=name,
config={
"served_entities": [
{
"name": "test",
"external_model": {
"name": "text-embedding-ada-002",
"provider": "openai",
"task": "llm/v1/embeddings",
"openai_config": {
"openai_api_key": "{{" + secret + "}}",
},
},
}
],
},
)
try:
embeddings = DatabricksEmbeddings(target_uri="databricks", endpoint=name)
print(embeddings.embed_query("hello")[:3])
print(embeddings.embed_documents(["hello", "world"])[0][:3])
finally:
client.delete_endpoint(endpoint=name)
# External - completions
name = f"completions-{uuid.uuid4()}"
client.create_endpoint(
name=name,
config={
"served_entities": [
{
"name": "test",
"external_model": {
"name": "gpt-3.5-turbo-instruct",
"provider": "openai",
"task": "llm/v1/completions",
"openai_config": {
"openai_api_key": "{{" + secret + "}}",
},
},
}
],
},
)
try:
llm = Databricks(
endpoint_name=name,
model_kwargs={"temperature": 0.1},
)
print(llm("I am"))
finally:
client.delete_endpoint(endpoint=name)
# Foundation model - chat
chat = ChatDatabricks(
endpoint="databricks-llama-2-70b-chat", params={"temperature": 0.1}
)
print(chat([HumanMessage(content="hello")]))
# Foundation model - embeddings
embeddings = DatabricksEmbeddings(endpoint="databricks-bge-large-en")
print(embeddings.embed_query("hello")[:3])
# Foundation model - completions
llm = Databricks(
endpoint_name="databricks-mpt-7b-instruct", model_kwargs={"temperature": 0.1}
)
print(llm("hello"))
llm_chain = LLMChain(
llm=llm,
prompt=PromptTemplate(
input_variables=["adjective"],
template="Tell me a {adjective} joke",
),
)
print(llm_chain.run(adjective="funny"))
# serialization/deserialization
with tempfile.TemporaryDirectory() as tmpdir:
print(tmpdir)
path = f"{tmpdir}/llm.yaml"
llm_chain.save(path)
loaded_chain = load_chain(path)
print(loaded_chain("funny"))
```
Output:
```
content='Hello! How can I assist you today?'
[-0.025058426, -0.01938856, -0.027781019]
[-0.025058426, -0.01938856, -0.027781019]
sorry, but I cannot continue the sentence as it is incomplete. Can you please provide more information or context?
Sure, here's a classic one for you:
Why don't scientists trust atoms?
Because they make up everything!
/var/folders/dz/cd_nvlf14g9g__n3ph0d_0pm0000gp/T/tmpx_4no6ad
{'adjective': 'funny', 'text': "Sure, here's a classic one for you:\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything!"}
content='Hello! How can I assist you today?'
[-0.025058426, -0.01938856, -0.027781019]
[-0.025058426, -0.01938856, -0.027781019]
a 23 year old female and I am currently studying for my master's degree
content="\nHello! It's nice to meet you. Is there something I can help you with or would you like to chat for a bit?"
[0.051055908203125, 0.007221221923828125, 0.003879547119140625]
[0.051055908203125, 0.007221221923828125, 0.003879547119140625]
hello back
Well, I don't really know many jokes, but I do know this funny story...
/var/folders/dz/cd_nvlf14g9g__n3ph0d_0pm0000gp/T/tmp7_ds72ex
{'adjective': 'funny', 'text': " Well, I don't really know many jokes, but I do know this funny story..."}
```
</p>
</details>
The existing workflow doesn't break:
<details><summary>click</summary>
<p>
```python
import uuid
import mlflow
from mlflow.models import ModelSignature
from mlflow.types.schema import ColSpec, Schema
class MyModel(mlflow.pyfunc.PythonModel):
def predict(self, context, model_input):
return str(uuid.uuid4())
with mlflow.start_run():
mlflow.pyfunc.log_model(
"model",
python_model=MyModel(),
pip_requirements=["mlflow==2.8.1", "cloudpickle<3"],
signature=ModelSignature(
inputs=Schema(
[
ColSpec("string", "prompt"),
ColSpec("string", "stop"),
]
),
outputs=Schema(
[
ColSpec(name=None, type="string"),
]
),
),
registered_model_name=f"lang-{uuid.uuid4()}",
)
# Manually create a serving endpoint with the registered model and run
from langchain.llms import Databricks
llm = Databricks(endpoint_name="<name>")
llm("hello") # 9d0b2491-3d13-487c-bc02-1287f06ecae7
```
</p>
</details>
## Follow-up tasks
(This PR is too large. I'll file a separate one for follow-up tasks.)
- Update `docs/docs/integrations/providers/mlflow_ai_gateway.mdx` and
`docs/docs/integrations/providers/databricks.md`.
---------
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
…parameters.
In Langchain's `dumps()` function, I've added a `**kwargs` parameter.
This allows users to pass additional parameters to the underlying
`json.dumps()` function, providing greater flexibility and control over
JSON serialization.
Many parameters available in `json.dumps()` can be useful or even
necessary in specific situations. For example, when using an Agent with
return_intermediate_steps set to true, the output is a list of
AgentAction objects. These objects can't be serialized without using
Langchain's `dumps()` function.
The issue arises when using the Agent with a language other than
English, which may contain non-ASCII characters like 'é'. The default
behavior of `json.dumps()` sets ensure_ascii to true, converting
`{"name": "José"}` into `{"name": "Jos\u00e9"}`. This can make the
output hard to read, especially in the case of intermediate steps in
agent logs.
By allowing users to pass additional parameters to `json.dumps()` via
Langchain's dumps(), we can solve this problem. For instance, users can
set `ensure_ascii=False` to maintain the original characters.
This update also enables users to pass other useful `json.dumps()`
parameters like `sort_keys`, providing even more flexibility.
The implementation takes into account edge cases where a user might pass
a "default" parameter, which is already defined by `dumps()`, or an
"indent" parameter, which is also predefined if `pretty=True` is set.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
### Description
Hello,
The [integration_test
README](https://github.com/langchain-ai/langchain/tree/master/libs/langchain/tests)
was indicating incorrect paths for the `.env.example` and `.env` files.
`tests/.env.example` ->`tests/integration_tests/.env.example`
While it’s a minor error, it could **potentially lead to confusion** for
the document’s readers, so I’ve made the necessary corrections.
Thank you! ☺️
### Related Issue
- https://github.com/langchain-ai/langchain/pull/2806
**Description:**
Added support for a Pandas DataFrame OutputParser with format
instructions, along with unit tests and a demo notebook. Namely, we've
added the ability to request data from a DataFrame, have the LLM parse
the request, and then use that request to retrieve a well-formatted
response.
Within LangChain, it seamlessly integrates with language models like
OpenAI's `text-davinci-003`, facilitating streamlined interaction using
the format instructions (just like the other output parsers).
This parser structures its requests as
`<operation/column/row>[<optional_array_params>]`. The instructions
detail permissible operations, valid columns, and array formats,
ensuring clarity and adherence to the required format.
For example:
- When the LLM receives the input: "Retrieve the mean of `num_legs` from
rows 1 to 3."
- The provided format instructions guide the LLM to structure the
request as: "mean:num_legs[1..3]".
The parser processes this formatted request, leveraging the LLM's
understanding to extract the mean of `num_legs` from rows 1 to 3 within
the Pandas DataFrame.
This integration allows users to communicate requests naturally, with
the LLM transforming these instructions into structured commands
understood by the `PandasDataFrameOutputParser`. The format instructions
act as a bridge between natural language queries and precise DataFrame
operations, optimizing communication and data retrieval.
**Issue:**
- https://github.com/langchain-ai/langchain/issues/11532
**Dependencies:**
No additional dependencies :)
**Tag maintainer:**
@baskaryan
**Twitter handle:**
No need. :)
---------
Co-authored-by: Wasee Alam <waseealam@protonmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
**Description:**
When using Vald, only insecure grpc connection was supported, so secure
connection is now supported.
In addition, grpc metadata can be added to Vald requests to enable
authentication with a token.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
grammar correction
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Response_if_no_docs_found is not implemented in
ConversationalRetrievalChain for async code paths. Implemented it and
added test cases
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
# Description
This PR implements Self-Query Retriever for MongoDB Atlas vector store.
I've implemented the comparators and operators that are supported by
MongoDB Atlas vector store according to the section titled "Atlas Vector
Search Pre-Filter" from
https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-stage/.
Namely:
```
allowed_comparators = [
Comparator.EQ,
Comparator.NE,
Comparator.GT,
Comparator.GTE,
Comparator.LT,
Comparator.LTE,
Comparator.IN,
Comparator.NIN,
]
"""Subset of allowed logical operators."""
allowed_operators = [
Operator.AND,
Operator.OR
]
```
Translations from comparators/operators to MongoDB Atlas filter
operators(you can find the syntax in the "Atlas Vector Search
Pre-Filter" section from the previous link) are done using the following
dictionary:
```
map_dict = {
Operator.AND: "$and",
Operator.OR: "$or",
Comparator.EQ: "$eq",
Comparator.NE: "$ne",
Comparator.GTE: "$gte",
Comparator.LTE: "$lte",
Comparator.LT: "$lt",
Comparator.GT: "$gt",
Comparator.IN: "$in",
Comparator.NIN: "$nin",
}
```
In visit_structured_query() the filters are passed as "pre_filter" and
not "filter" as in the MongoDB link above since langchain's
implementation of MongoDB atlas vector
store(libs\langchain\langchain\vectorstores\mongodb_atlas.py) in
_similarity_search_with_score() sets the "filter" key to have the value
of the "pre_filter" argument.
```
params["filter"] = pre_filter
```
Test cases and documentation have also been added.
# Issue
#11616
# Dependencies
No new dependencies have been added.
# Documentation
I have created the notebook mongodb_atlas_self_query.ipynb outlining the
steps to get the self-query mechanism working.
I worked closely with [@Farhan-Faisal](https://github.com/Farhan-Faisal)
on this PR.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
# Description
We implemented a simple tool for accessing the Merriam-Webster
Collegiate Dictionary API
(https://dictionaryapi.com/products/api-collegiate-dictionary).
Here's a simple usage example:
```py
from langchain.llms import OpenAI
from langchain.agents import load_tools, initialize_agent, AgentType
llm = OpenAI()
tools = load_tools(["serpapi", "merriam-webster"], llm=llm) # Serp API gives our agent access to Google
agent = initialize_agent(
tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
agent.run("What is the english word for the german word Himbeere? Define that word.")
```
Sample output:
```
> Entering new AgentExecutor chain...
I need to find the english word for Himbeere and then get the definition of that word.
Action: Search
Action Input: "English word for Himbeere"
Observation: {'type': 'translation_result'}
Thought: Now I have the english word, I can look up the definition.
Action: MerriamWebster
Action Input: raspberry
Observation: Definitions of 'raspberry':
1. rasp-ber-ry, noun: any of various usually black or red edible berries that are aggregate fruits consisting of numerous small drupes on a fleshy receptacle and that are usually rounder and smaller than the closely related blackberries
2. rasp-ber-ry, noun: a perennial plant (genus Rubus) of the rose family that bears raspberries
3. rasp-ber-ry, noun: a sound of contempt made by protruding the tongue between the lips and expelling air forcibly to produce a vibration; broadly : an expression of disapproval or contempt
4. black raspberry, noun: a raspberry (Rubus occidentalis) of eastern North America that has a purplish-black fruit and is the source of several cultivated varieties —called also blackcap
Thought: I now know the final answer.
Final Answer: Raspberry is an english word for Himbeere and it is defined as any of various usually black or red edible berries that are aggregate fruits consisting of numerous small drupes on a fleshy receptacle and that are usually rounder and smaller than the closely related blackberries.
> Finished chain.
```
# Issue
This closes#12039.
# Dependencies
We added no extra dependencies.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Lara <63805048+larkgz@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Update the document for drop box loader + made the
messages more verbose when loading pdf file since people were getting
confused
- **Issue:** #13952
- **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17,
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** Added a tool called RedditSearchRun and an
accompanying API wrapper, which searches Reddit for posts with support
for time filtering, post sorting, query string and subreddit filtering.
- **Issue:** #13891
- **Dependencies:** `praw` module is used to search Reddit
- **Tag maintainer:** @baskaryan , and any of the other maintainers if
needed
- **Twitter handle:** None.
Hello,
This is our first PR and we hope that our changes will be helpful to the
community. We have run `make format`, `make lint` and `make test`
locally before submitting the PR. To our knowledge, our changes do not
introduce any new errors.
Our PR integrates the `praw` package which is already used by
RedditPostsLoader in LangChain. Nonetheless, we have added integration
tests and edited unit tests to test our changes. An example notebook is
also provided. These changes were put together by me, @Anika2000,
@CharlesXu123, and @Jeremy-Cheng-stack
Thank you in advance to the maintainers for their time.
---------
Co-authored-by: What-Is-A-Username <49571870+What-Is-A-Username@users.noreply.github.com>
Co-authored-by: Anika2000 <anika.sultana@mail.utoronto.ca>
Co-authored-by: Jeremy Cheng <81793294+Jeremy-Cheng-stack@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Volc Engine MaaS serves as an enterprise-grade,
large-model service platform designed for developers. You can visit its
homepage at https://www.volcengine.com/docs/82379/1099455 for details.
This change will facilitate developers to integrate quickly with the
platform.
- **Issue:** None
- **Dependencies:** volcengine
- **Tag maintainer:** @baskaryan
- **Twitter handle:** @he1v3tica
---------
Co-authored-by: lvzhong <lvzhong@bytedance.com>
- **Description:** use post field validation for `CohereRerank`
- **Issue:** #12899 and #13058
- **Dependencies:**
- **Tag maintainer:** @baskaryan
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Update 5 pdf document loaders in
`langchain.document_loaders.pdf`, to store a url in the metadata
(instead of a temporary, local file path) if the user provides a web
path to a pdf: `PyPDFium2Loader`, `PDFMinerLoader`,
`PDFMinerPDFasHTMLLoader`, `PyMuPDFLoader`, and `PDFPlumberLoader` were
updated.
- The updates follow the approach used to update `PyPDFLoader` for the
same behavior in #12092
- The `PyMuPDFLoader` changes required additional work in updating
`langchain.document_loaders.parsers.pdf.PyMuPDFParser` to be able to
process either an `io.BufferedReader` (from local pdf) or `io.BytesIO`
(from online pdf)
- The `PDFMinerPDFasHTMLLoader` change used a simpler approach since the
metadata is assigned by the loader and not the parser
- **Issue:** Fixes#7034
- **Dependencies:** None
```python
# PyPDFium2Loader example:
# old behavior
>>> from langchain.document_loaders import PyPDFium2Loader
>>> loader = PyPDFium2Loader('https://arxiv.org/pdf/1706.03762.pdf')
>>> docs = loader.load()
>>> docs[0].metadata
{'source': '/var/folders/7z/d5dt407n673drh1f5cm8spj40000gn/T/tmpm5oqa92f/tmp.pdf', 'page': 0}
# new behavior
>>> from langchain.document_loaders import PyPDFium2Loader
>>> loader = PyPDFium2Loader('https://arxiv.org/pdf/1706.03762.pdf')
>>> docs = loader.load()
>>> docs[0].metadata
{'source': 'https://arxiv.org/pdf/1706.03762.pdf', 'page': 0}
```
- **Description:** Updated to remove deprecated parameter penalty_alpha,
and use string variation of prompt rather than json object for better
flexibility. - **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** N/A
- **Tag maintainer:** @eyurtsev
- **Twitter handle:** @symbldotai
---------
Co-authored-by: toshishjawale <toshish@symbl.ai>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Instead of using JSON-like syntax to describe node and relationship
properties we changed to a shorter and more concise schema description
Old:
```
Node properties are the following:
[{'properties': [{'property': 'name', 'type': 'STRING'}], 'labels': 'Movie'}, {'properties': [{'property': 'name', 'type': 'STRING'}], 'labels': 'Actor'}]
Relationship properties are the following:
[]
The relationships are the following:
['(:Actor)-[:ACTED_IN]->(:Movie)']
```
New:
```
Node properties are the following:
Movie {name: STRING},Actor {name: STRING}
Relationship properties are the following:
The relationships are the following:
(:Actor)-[:ACTED_IN]->(:Movie)
```
Implements
[#12115](https://github.com/langchain-ai/langchain/issues/12115)
Who can review?
@baskaryan , @eyurtsev , @hwchase17
Integrated Stack Exchange API into Langchain, enabling access to diverse
communities within the platform. This addition enhances Langchain's
capabilities by allowing users to query Stack Exchange for specialized
information and engage in discussions. The integration provides seamless
interaction with Stack Exchange content, offering content from varied
knowledge repositories.
A notebook example and test cases were included to demonstrate the
functionality and reliability of this integration.
- Add StackExchange as a tool.
- Add unit test for the StackExchange wrapper and tool.
- Add documentation for the StackExchange wrapper and tool.
If you have time, could you please review the code and provide any
feedback as necessary! My team is welcome to any suggestions.
---------
Co-authored-by: Yuval Kamani <yuvalkamani@gmail.com>
Co-authored-by: Aryan Thakur <aryanthakur@Aryans-MacBook-Pro.local>
Co-authored-by: Manas1818 <79381912+manas1818@users.noreply.github.com>
Co-authored-by: aryan-thakur <61063777+aryan-thakur@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** The class allows to only select between a few
predefined prompts from the paper. That is not ideal, since other use
cases might need a custom prompt. The changes made allow for this. To be
able to monitor those, I also added functionality to supply a custom
run_manager.
- **Issue:** no issue, but a new feature,
- **Dependencies:** none,
- **Tag maintainer:** @hwchase17,
- **Twitter handle:** @yvesloy
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Support providing whatever extra parameters you want
to the Mathpix PDF loader API request.
- **Issue:** #12773
- **Dependencies:** None
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Adds a tqdm progress bar to GooglePalmEmbeddings when
embedding a list.
- **Issue:** #13637
- **Dependencies:** TQDM as a main dependency (instead of extra)
Signed-off-by: ugm2 <unaigaraymaestre@gmail.com>
---------
Signed-off-by: ugm2 <unaigaraymaestre@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
langsmith is available on conda-forge as well and also a dependency of
the package so it gets installed either way by conda
306ed13308/recipe/meta.yaml (L43)
This PR is fixing an attributeError: object endpoint has no attribute
"_public_match_client" when using gcp matching engine with private VPC
network.
@baskaryan, @eyurtsev, @hwchase17.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** As of OpenAI's Python package 1.0, the existing
DallEAPIWrapper does not work correctly, so the example in the LangChain
Documentation link below does not work either.
https://python.langchain.com/docs/integrations/tools/dalle_image_generator
Also, since OpenAI only supports DALL-E version 2 or version 3, I
modified the DallEAPIWrapper to support it.
- **Issue:** #13825
- **Twitter handle:** ggeutzzang
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Small fix to _summarization_ example, `reduce_template` should use
`{docs}` variable.
Bug likely introduced as following code suggests using
`hub.pull("rlm/map-prompt")` instead of defined prompt.
- **Description:** According to the document
https://cloud.baidu.com/doc/WENXINWORKSHOP/s/6lp69is2a, add ERNIE-Bot-8K
model support for ErnieBotChat.
- **Dependencies:** Before using the ERNIE-Bot-8K, you should have the
model's access authority.
Replace this entire comment with:
- **Description:** updates `create_llm_result` function within
`openai.py` to consider latest `params`,
- **Issue:** #8928
- **Dependencies:** -,
- **Tag maintainer:** -
- **Twitter handle:** [burkomr](https://twitter.com/burkomr)
<!-- If no one reviews your PR within a few days, please @-mention one
of @baskaryan, @eyurtsev, @hwchase17. -->
---------
Co-authored-by: Burak Ömür <burakomur@retorio.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Replace this entire comment with:
- **Description:** VertexAI models are now GA, moved away from using
preview ones from the SDK
- **Issue:** #13606
---------
Co-authored-by: Nuno Campos <nuno@boringbits.io>
### Description:
Hey 👋🏽 this is a small docs example fix. Hoping it helps future developers who are working with Langchain.
### Problem:
Take a look at the original example code. You were not able to get the `dialogue_turn[0]` while it was a tuple.
Original code:
```python
def _format_chat_history(chat_history: List[Tuple]) -> str:
buffer = ""
for dialogue_turn in chat_history:
human = "Human: " + dialogue_turn[0]
ai = "Assistant: " + dialogue_turn[1]
buffer += "\n" + "\n".join([human, ai])
return buffer
```
In the original code you were getting this error:
```bash
human = "Human: " + dialogue_turn[0].content
~~~~~~~~~~~~~^^^
TypeError: 'HumanMessage' object is not subscriptable
```
### Solution:
The fix is to just for loop over the chat history and look to see if its a human or ai message and add it to the buffer.
**Description:**
Repair Wikipedia document loader `load_max_docs` and improve test
coverage.
**Issue:**
The Wikipedia document loader was not respecting the `load_max_docs`
paramater (not reported) and would always return a maximum of 10
documents. This is because the API wrapper (in `utilities/wikipedia.py`)
wasn't passing `top_k_results` to the underlying [Wikipedia
library](https://wikipedia.readthedocs.io/en/latest/code.html#module-wikipedia).
By default this library returns 10 results.
The default number of results for the document loader has been reduced
from 100 to 25. This is because loading 100 results takes a very long
time and is an inconvenient default. It should possibly be 10.
In addition, the documentation for the loader reported that there was a
hard limit (300) on the number of documents returned. In actuality 300
is the maximum Wikipedia query character length set by the API wrapper.
Tests have been added for the document loader (previously missing) and
to test the correct numbers of documents are being returned by each
class, both by default, and when overridden. Also repaired is the
`assert_docs` test which has been updated to correctly test for the
default metadata (which includes `source` in recent releases).
**Dependencies:**
nil
**Tag maintainer:**
@leo-gan
**Twitter handle:**
@queenvictoria
### **Description:**
Previously `python_repl` was a built-in tool, but now it has been moved
to `langchain_experimental`.
When I use `load_tools` I get an error:
```python
In [1]: from langchain.agents import load_tools
In [2]: load_tools(["python_repl"])
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
Cell In[2], line 1
----> 1 load_tools(["python_repl"])
File ~/workspace/langchain/libs/langchain/langchain/agents/load_tools.py:530, in load_tools(tool_names, llm, callbacks, **kwargs)
528 tool_names.extend(requests_method_tools)
529 elif name in _BASE_TOOLS:
--> 530 tools.append(_BASE_TOOLS[name]())
531 elif name in _LLM_TOOLS:
532 if llm is None:
File ~/workspace/langchain/libs/langchain/langchain/agents/load_tools.py:84, in _get_python_repl()
83 def _get_python_repl() -> BaseTool:
---> 84 raise ImportError(
85 "This tool has been moved to langchain experiment. "
86 "This tool has access to a python REPL. "
87 "For best practices make sure to sandbox this tool. "
88 "Read https://github.com/langchain-ai/langchain/blob/master/SECURITY.md "
89 "To keep using this code as is, install langchain experimental and "
90 "update relevant imports replacing 'langchain' with 'langchain_experimental'"
91 )
ImportError: This tool has been moved to langchain experiment. This tool has access to a python REPL. For best practices make sure to sandbox this tool. Read https://github.com/langchain-ai/langchain/blob/master/SECURITY.md To keep using this code as is, install langchain experimental and update relevant imports replacing 'langchain' with 'langchain_experimental'
```
In this case, it will be very confusing. I think it is no longer a
built-in tool now, so it can be removed from `_BASE_TOOLS`
### **Issue:**
https://github.com/langchain-ai/langchain/issues/13858,
https://github.com/langchain-ai/langchain/issues/13859,
https://github.com/langchain-ai/langchain/issues/13856
### **Twitter handle:**
[lin_bob57617](https://twitter.com/lin_bob57617)
The `integrations/vectorstores/matchingengine.ipynb` example has the
"Google Vertex AI Vector Search" title. This place this Title in the
wrong order in the ToC (it is sorted by the file name).
- Renamed `integrations/vectorstores/matchingengine.ipynb` into
`integrations/vectorstores/google_vertex_ai_vector_search.ipynb`.
- Updated a correspondent comment in docstring
- Rerouted old URL to a new URL
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
It was :
`from langchain.schema.prompts import BasePromptTemplate`
but because of the breaking change in the ns, it is now
`from langchain.schema.prompt_template import BasePromptTemplate`
This bug prevents building the API Reference for the langchain_experimental
Addressed this issue with the top menu: It allocates too much space. If the screen is small, then the top menu items are split into two lines and look unreadable.
Another issue is with several top menu items: "Chat our docs" and "Also by LangChain". They are compound of several words which also hurts readability. The top menu items should be 1-word size.
Updates:
- "Chat our docs" -> "Chat" (the meaning is clean after clicking/opening the item)
- "Also by LangChain" -> "🦜️🔗"
- "🦜️🔗" moved before "Chat" item. This new item is partially copied from the first left item, the "🦜️🔗 LangChain". This design (with two 🦜️🔗 elements, visually splits the top menu into two parts. The first item in each part holds the 🦜️🔗 symbols and, when we click the second 🦜️🔗 item, it opens the drop-down menu. So, we've got two visually similar parts, which visually split the top menu on the right side: the LangChain Docs (and Doc-related items) and the lift side: other LangChain.ai (company) products/docs.
There are the following main changes in this PR:
1. Rewrite of the DocugamiLoader to not do any XML parsing of the DGML
format internally, and instead use the `dgml-utils` library we are
separately working on. This is a very lightweight dependency.
2. Added MMR search type as an option to multi-vector retriever, similar
to other retrievers. MMR is especially useful when using Docugami for
RAG since we deal with large sets of documents within which a few might
be duplicates and straight similarity based search doesn't give great
results in many cases.
We are @docugami on twitter, and I am @tjaffri
---------
Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
- **Description:** Adds a retriever implementation for [Knowledge Bases
for Amazon Bedrock](https://aws.amazon.com/bedrock/knowledge-bases/), a
new service announced at AWS re:Invent, shortly before this PR was
opened. This depends on the `bedrock-agent-runtime` service, which will
be included in a future version of `boto3` and of `botocore`. We will
open a follow-up PR documenting the minimum required versions of `boto3`
and `botocore` after that information is available.
- **Issue:** N/A
- **Dependencies:** `boto3>=1.33.2, botocore>=1.33.2`
- **Tag maintainer:** @baskaryan
- **Twitter handles:** `@pjain7` `@dead_letter_q`
This PR includes a documentation notebook under
`docs/docs/integrations/retrievers`, which I (@dlqqq) have verified
independently.
EDIT: `bedrock-agent-runtime` service is now included in
`boto3>=1.33.2`:
5cf793f493
---------
Co-authored-by: Piyush Jain <piyushjain@duck.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Addressing incorrect order being sent to callbacks / tracers, due to the
nature of threading
---------
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Add arg to omit streamed_output list, in cases where final_output is
enough this saves bandwidth
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
This PR rearranges the docstring for the `AstraDB` vector store class so
as to have all useful information in the _class_ docstring for ease of
reading.
(incidentally, due to an oversight, the docstring that was in the
constructor ended up buried below some lines of code, thereby
disappearing altogether from accessibility. Apologies.)
- **Description:** Updates to `AnthropicFunctions` to be compatible with
the OpenAI `function_call` functionality.
- **Issue:** The functionality to indicate `auto`, `none` and a forced
function_call was not completely implemented in the existing code.
- **Dependencies:** None
- **Tag maintainer:** @baskaryan , and any of the other maintainers if
needed.
- **Twitter handle:** None
I have specifically tested this functionality via AWS Bedrock with the
Claude-2 and Claude-Instant models.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** dead link replacement
- **Issue:** no open issue
**Note:**
Hi langchain team,
Sorry to open a PR for this concern but we realized that one of the
links present in the documentation booklet was broken 😄
- **Description:** We are adding functionality to extract message
content from the `attributedBody` field of the database, in case the
content is not in the `text` field.
- **Issue:** Closes#13326 and #10680
- **Dependencies:** None.
- **Tag maintainer:** @eyurtsev, @hwchase17
---------
Co-authored-by: onotate <johnp.pham@mail.utoronto.ca>
- **Description:** Reduce image asset file size used in documentation by
running them via lossless image optimization
([tinypng](https://www.npmjs.com/package/tinypng-cli) was used in this
case). Images wider than 1916px (the maximum width of an image displayed
in documentation) where downsized.
- **Issue:** No issue is created for this, but the large image file
assets caused slow documentation load times
- **Dependencies:** No dependencies affected
- **Description:** Previously `MarkdownHeaderTextSplitter` did not
consider tilde-fenced code blocks
(https://spec.commonmark.org/0.30/#fenced-code-blocks). This PR fixes
that.
````md
# Bug caused by previous implementation:
~~~py
foo()
# This is a comment that would be considered header
bar()
~~~
````
- **Tag maintainer:** @baskaryan
Several bug fixes:
- emails: instead of `bcc` the `cc` is used.
- errors in the truncation descriptions
- no truncation of the `message_search`
Several updates:
- generalized UTC format
- truncation limit can be changed now in _call()
Fixes#13407.
This workaround consists in letting the RunnableLambda create its
self.afunc from its self.func when self.afunc is not provided; the
change has no dependency.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
```
---- chunk 1
{'actions': [AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})])],
'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]}
---- chunk 2
{'messages': [FunctionMessage(content="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”", name='Search')],
'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]), observation="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”")]}
---- chunk 3
{'actions': [AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})])],
'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]}
---- chunk 4
{'messages': [FunctionMessage(content='25 years', name='Search')],
'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]), observation='25 years')]}
---- chunk 5
{'actions': [AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})])],
'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})]}
---- chunk 6
{'messages': [FunctionMessage(content='Answer: 3.991298452658078', name='Calculator')],
'steps': [AgentStep(action=AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})]), observation='Answer: 3.991298452658078')]}
---- chunk 7
{'messages': [AIMessage(content="Leonardo DiCaprio's current girlfriend is the Italian model Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 power is approximately 3.99.")],
'output': "Leonardo DiCaprio's current girlfriend is the Italian model "
'Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 '
'power is approximately 3.99.'}
---- final
{'actions': [AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]),
AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]),
AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})])],
'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}}),
FunctionMessage(content="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”", name='Search'),
AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}}),
FunctionMessage(content='25 years', name='Search'),
AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}}),
FunctionMessage(content='Answer: 3.991298452658078', name='Calculator'),
AIMessage(content="Leonardo DiCaprio's current girlfriend is the Italian model Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 power is approximately 3.99.")],
'output': "Leonardo DiCaprio's current girlfriend is the Italian model "
'Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 '
'power is approximately 3.99.',
'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]), observation="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”"),
AgentStep(action=AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]), observation='25 years'),
AgentStep(action=AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})]), observation='Answer: 3.991298452658078')]}
```
- **Description:** Existing model used for Prompt Injection is quite
outdated but we fine-tuned and open-source a new model based on the same
model deberta-v3-base from Microsoft -
[laiyer/deberta-v3-base-prompt-injection](https://huggingface.co/laiyer/deberta-v3-base-prompt-injection).
It supports more up-to-date injections and less prone to
false-positives.
- **Dependencies:** No
- **Tag maintainer:** -
- **Twitter handle:** @alex_yaremchuk
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** The experimental package needs to be compatible with
the usage of importing agents
For example, if i use `from langchain.agents import
create_pandas_dataframe_agent`, running the program will prompt the
following information:
```
Traceback (most recent call last):
File "/Users/dongwm/test/main.py", line 1, in <module>
from langchain.agents import create_pandas_dataframe_agent
File "/Users/dongwm/test/venv/lib/python3.11/site-packages/langchain/agents/__init__.py", line 87, in __getattr__
raise ImportError(
ImportError: create_pandas_dataframe_agent has been moved to langchain experimental. See https://github.com/langchain-ai/langchain/discussions/11680 for more information.
Please update your import statement from: `langchain.agents.create_pandas_dataframe_agent` to `langchain_experimental.agents.create_pandas_dataframe_agent`.
```
But when I changed to `from langchain_experimental.agents import
create_pandas_dataframe_agent`, it was actually wrong:
```python
Traceback (most recent call last):
File "/Users/dongwm/test/main.py", line 2, in <module>
from langchain_experimental.agents import create_pandas_dataframe_agent
ImportError: cannot import name 'create_pandas_dataframe_agent' from 'langchain_experimental.agents' (/Users/dongwm/test/venv/lib/python3.11/site-packages/langchain_experimental/agents/__init__.py)
```
I should use `from langchain_experimental.agents.agent_toolkits import
create_pandas_dataframe_agent`. In order to solve the problem and make
it compatible, I added additional import code to the
langchain_experimental package. Now it can be like this Used `from
langchain_experimental.agents import create_pandas_dataframe_agent`
- **Twitter handle:** [lin_bob57617](https://twitter.com/lin_bob57617)
- **Description:** Adds a tqdm progress bar to OllamaEmbeddings when
embedding a list.
- **Issue:** Related to #13637, but extended to Ollama.
- **Dependencies:** `tqdm` made a necessary dependency.
Thanks to @ugm2 for helping identify a common problem. Embeddings take a
very long time to finish on local machines, and require a progress bar
to help identify if one should even attempt the workload.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Adding rag-opensearch template.
---------
Signed-off-by: kalyanr <kalyan.ben10@live.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Current docs for adapters are in the `Guides/Adapters which is not a
good place.
- moved Adapters into `Integratons/Components/Adapters/
- simplified the OpenAI adapter notebook
- rerouted the old OpenAI adapter page URL to a new one.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** Added a line to pass the tenant parameter to
add_data_object
- **Issue:** An extra line added from the fix for #9956
- **Dependencies:** n/a
- **Tag maintainer:** @baskaryan
Tested locally, works as expected with the line change.
---------
Co-authored-by: Simon Dai <simon6752@gmail.com>
Description: Some Elastic indexes do not return a 'metadata' field in
'_source'. However, prior to this PR, the code assumed there always is a
'metadata' field. This PR adds support for cases where the field is
missing by adding it manually.
Issue: #13869
**Description:**
This PR adds Databricks Vector Search as a new vector store in
LangChain.
- [x] Add `DatabricksVectorSearch` in `langchain/vectorstores/`
- [x] Unit tests
- [x] Add
[`databricks-vectorsearch`](https://pypi.org/project/databricks-vectorsearch/)
as a new optional dependency
We ran the following checks:
- `make format` passed ✅
- `make lint` failed but the failures were caused by other files
+ Files touched by this PR passed the linter ✅
- `make test` passed ✅
- `make coverage` failed but the failures were caused by other files.
Tests added by or related to this PR all passed
+ langchain/vectorstores/databricks_vector_search.py test coverage 94% ✅
- `make spell_check` passed ✅
The example notebook and updates to the [provider's documentation
page](https://github.com/langchain-ai/langchain/blob/master/docs/docs/integrations/providers/databricks.md)
will be added later in a separate PR.
**Dependencies:**
Optional dependency:
[`databricks-vectorsearch`](https://pypi.org/project/databricks-vectorsearch/)
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
The cookbook had some code to upload files, and wait for the processing
to finish.
This code is now moved to the `docugami` library so removing from the
cookbook to simplify.
Thanks @rlancemartin for suggesting this when working on evals.
---------
Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
This pull request addresses an issue found in the example code within
the docstring of `libs/core/langchain_core/runnables/passthrough.py`
The original code snippet caused a `NameError` due to the missing import
of `RunnableLambda`. The error was as follows:
```
12 return "completion"
13
---> 14 chain = RunnableLambda(fake_llm) | {
15 'original': RunnablePassthrough(), # Original LLM output
16 'parsed': lambda text: text[::-1] # Parsing logic
NameError: name 'RunnableLambda' is not defined
```
To resolve this, I have modified the example code to include the
necessary import statement for `RunnableLambda`. Additionally, I have
adjusted the indentation in the code snippet to ensure consistency and
readability.
The modified code now successfully defines and utilizes
`RunnableLambda`, ensuring that users referencing the docstring will
have a functional and clear example to follow.
There are no related GitHub issues for this particular change.
Modified Code:
```python
from langchain_core.runnables import RunnablePassthrough, RunnableParallel
from langchain_core.runnables import RunnableLambda
runnable = RunnableParallel(
origin=RunnablePassthrough(),
modified=lambda x: x+1
)
runnable.invoke(1) # {'origin': 1, 'modified': 2}
def fake_llm(prompt: str) -> str: # Fake LLM for the example
return "completion"
chain = RunnableLambda(fake_llm) | {
'original': RunnablePassthrough(), # Original LLM output
'parsed': lambda text: text[::-1] # Parsing logic
}
chain.invoke('hello') # {'original': 'completion', 'parsed': 'noitelpmoc'}
```
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Added a retriever for the Outline API to ask
questions on knowledge base
- **Issue:** resolves#11814
- **Dependencies:** None
- **Tag maintainer:** @baskaryan
- **Description:**
I encountered an issue while running the existing sample code on the
page https://python.langchain.com/docs/modules/agents/how_to/agent_iter
in an environment with Pydantic 2.0 installed. The following error was
triggered:
```python
ValidationError Traceback (most recent call last)
<ipython-input-12-2ffff2c87e76> in <cell line: 43>()
41
42 tools = [
---> 43 Tool(
44 name="GetPrime",
45 func=get_prime,
2 frames
/usr/local/lib/python3.10/dist-packages/pydantic/v1/main.py in __init__(__pydantic_self__, **data)
339 values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
340 if validation_error:
--> 341 raise validation_error
342 try:
343 object_setattr(__pydantic_self__, '__dict__', values)
ValidationError: 1 validation error for Tool
args_schema
subclass of BaseModel expected (type=type_error.subclass; expected_class=BaseModel)
```
I have made modifications to the example code to ensure it functions
correctly in environments with Pydantic 2.0.
- **Description:** Simple change, I just added title metadata to
GoogleDriveLoader for optional File Loaders
- **Dependencies:** no dependencies
- **Tag maintainer:** @hwchase17
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
This PR provides idiomatic implementations for the exact-match and the
semantic LLM caches using Astra DB as backend through the database's
HTTP JSON API. These caches require the `astrapy` library as dependency.
Comes with integration tests and example usage in the `llm_cache.ipynb`
in the docs.
@baskaryan this is the Astra DB counterpart for the Cassandra classes
you merged some time ago, tagging you for your familiarity with the
topic. Thank you!
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
This PR adds a chat message history component that uses Astra DB for
persistence through the JSON API.
The `astrapy` package is required for this class to work.
I have added tests and a small notebook, and updated the relevant
references in the other docs pages.
(@rlancemartin this is the counterpart of the Cassandra equivalent class
you so helpfully reviewed back at the end of June)
Thank you!
- **Description:** This commit fixed the problem that Redis vector store
will change the value of a metadata from 0 to empty when saving the
document, which should be an un-intended behavior.
- **Issue:** N/A
- **Dependencies:** N/A
**Description:** Currently, if we pass in a ToolMessage back to the
chain, it crashes with error
`Got unsupported message type: `
This fixes it.
Tested locally
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** BaseStringMessagePromptTemplate.from_template was
passing the value of partial_variables into cls(...) via **kwargs,
rather than passing it to PromptTemplate.from_template. Which resulted
in those *partial_variables being* lost and becoming required
*input_variables*.
Co-authored-by: Josep Pon Farreny <josep.pon-farreny@siemens.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Fix some circular deps:
- move PromptValue into top level module bc both PromptTemplates and
OutputParsers import
- move tracer context vars to `tracers.context` and import them in
functions in `callbacks.manager`
- add core import tests
Adds a cookbook for semi-structured RAG via Docugami. This follows the
same outline as the semi-structured RAG with Unstructured cookbook:
https://github.com/langchain-ai/langchain/blob/master/cookbook/Semi_Structured_RAG.ipynb
The main change is this cookbook uses Docugami instead of Unstructured
to find text and tables, and shows how XML markup in the output helps
with retrieval and generation.
We are \@docugami on twitter, I am \@tjaffri
---------
Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
- **Description:** We need to update the Dockerfile for templates to
also copy your README.md. This is because poetry requires that a readme
exists if it is specified in the pyproject.toml
Changes:
- remove langchain_core/schema since no clear distinction b/n schema and
non-schema modules
- make every module that doesn't end in -y plural
- where easy have 1-2 classes per file
- no more than one level of nesting in directories
- only import from top level core modules in langchain
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** fix a bug that prevented as_retriever() in Vectara to
use the desired input arguments
- **Issue:** as_retriever did not pass the arguments properly
- **Tag maintainer:** @baskaryan
- **Twitter handle:** @ofermend
I encountered this during summarization with VertexAI. I was receiving
an INVALID_ARGUMENT error, as it was trying to send a list of about
17000 single characters.
The [count_tokens
method](https://github.com/googleapis/python-aiplatform/blob/main/vertexai/language_models/_language_models.py#L658)
made available by Google takes in a list of prompts. It does not fail
for small texts, but it does for longer documents because the argument
list will be exceeding Googles allowed limit. Enforcing the list type
makes it work successfully.
This change will cast the input text to count to a list of that single
text so that the input format is always correct.
[Twitter](https://www.x.com/stijn_tratsaert)
- **Description:** ERNIE-Bot-Chat-4 Large Language Model adds the
ability of `Function Calling` by passing parameters through the
`functions` parameter in the request. To simplify function calling for
ERNIE-Bot-Chat-4, the `create_ernie_fn_chain()` function has been added.
The definition and usage of the `create_ernie_fn_chain()` function is
similar to that of the `create_openai_fn_chain()` function.
Examples as the follows:
```
import json
from langchain.chains.ernie_functions import (
create_ernie_fn_chain,
)
from langchain.chat_models import ErnieBotChat
from langchain.prompts import ChatPromptTemplate
def get_current_news(location: str) -> str:
"""Get the current news based on the location.'
Args:
location (str): The location to query.
Returs:
str: Current news based on the location.
"""
news_info = {
"location": location,
"news": [
"I have a Book.",
"It's a nice day, today."
]
}
return json.dumps(news_info)
def get_current_weather(location: str, unit: str="celsius") -> str:
"""Get the current weather in a given location
Args:
location (str): location of the weather.
unit (str): unit of the tempuature.
Returns:
str: weather in the given location.
"""
weather_info = {
"location": location,
"temperature": "27",
"unit": unit,
"forecast": ["sunny", "windy"],
}
return json.dumps(weather_info)
llm = ErnieBotChat(model_name="ERNIE-Bot-4")
prompt = ChatPromptTemplate.from_messages(
[
("human", "{query}"),
]
)
chain = create_ernie_fn_chain([get_current_weather, get_current_news], llm, prompt, verbose=True)
res = chain.run("北京今天的新闻是什么?")
print(res)
```
The running results of the above program are shown below:
```
> Entering new LLMChain chain...
Prompt after formatting:
Human: 北京今天的新闻是什么?
> Finished chain.
{'name': 'get_current_news', 'thoughts': '用户想要知道北京今天的新闻。我可以使用get_current_news工具来获取这些信息。', 'arguments': {'location': '北京'}}
```
- **Description:** during search with DeepLake some people are facing
backwards compatibility issues, this PR fixes it by making search
accessible for the older datasets
---------
Co-authored-by: adolkhan <adilkhan.sarsen@alumni.nu.edu.kz>
- **Description:**
- Fixes a `key_prefix` bug where passing it in on
`Redis.from_existing(...)` did not work properly. Updates doc strings
accordingly.
- Updates Redis filter classes logic with best practices on typing,
string formatting, and handling "empty" filters.
- Fixes a bug that would prevent multiple tag filters from being applied
together in some scenarios.
- Added a whole new filter unit testing module. Also updated code
formatting for a number of modules that were failing the `make`
commands.
- **Issue:** N/A
- **Dependencies:** N/A
- **Tag maintainer:** @baskaryan
- **Twitter handle:** @tchutch94
- **Description:** Fix typo in MongoDB memory docs
- **Tag maintainer:** @eyurtsev
<!-- Thank you for contributing to LangChain!
- **Description:** Fix typo in MongoDB memory docs
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** @baskaryan
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
In the `FORMAT_INSTRUCTIONS` template, 4 curly braces (escaping) are
used to get single curly brace after formatting:
```
"{{{ ... }}}}" -> format_instructions.format() -> "{{ ... }}" -> template.format() -> "{ ... }".
```
Tool's `args_schema` string contains single braces `{ ... }`, and is
also transformed to `{{{{ ... }}}}` form. But this is not really correct
since there is only one `format()` call:
```
"{{{{ ... }}}}" -> template.format() -> "{{ ... }}".
```
As a result we get double curly braces in the prompt:
````
Respond to the human as helpfully and accurately as possible. You have access to the following tools:
foo: Test tool FOO, args: {{'tool_input': {{'type': 'string'}}}} # <--- !!!
...
Provide only ONE action per $JSON_BLOB, as shown:
```
{
"action": $TOOL_NAME,
"action_input": $INPUT
}
```
````
This PR fixes curly braces escaping in the `args_schema` to have single
braces in the final prompt:
````
Respond to the human as helpfully and accurately as possible. You have access to the following tools:
foo: Test tool FOO, args: {'tool_input': {'type': 'string'}} # <--- !!!
...
Provide only ONE action per $JSON_BLOB, as shown:
```
{
"action": $TOOL_NAME,
"action_input": $INPUT
}
```
````
---------
Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>
Hi 👋 We are working with Llama2 on Bedrock, and would like to add it to
Langchain. We saw a [pull
request](https://github.com/langchain-ai/langchain/pull/13322) to add it
to the `llm.Bedrock` class, but since it concerns a chat model, we would
like to add it to `BedrockChat` as well.
- **Description:** Add support for Llama2 to `BedrockChat` in
`chat_models`
- **Issue:** the issue # it fixes (if applicable)
[#13316](https://github.com/langchain-ai/langchain/issues/13316)
- **Dependencies:** any dependencies required for this change `None`
- **Tag maintainer:** /
- **Twitter handle:** `@SimonBockaert @WouterDurnez`
---------
Co-authored-by: wouter.durnez <wouter.durnez@showpad.com>
Co-authored-by: Simon Bockaert <simon.bockaert@showpad.com>
- **Description:** This change adds an agent to the Azure Cognitive
Services toolkit for identifying healthcare entities
- **Dependencies:** azure-ai-textanalytics (Optional)
---------
Co-authored-by: James Beck <James.Beck@sa.gov.au>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Hi!
This short PR aims at:
* Fixing `OpenAIEmbeddings`' check on `chunk_size` when used with Azure
OpenAI (thus with openai < 1.0). Azure OpenAI embeddings support at most
16 chunks per batch, I believe we are supposed to take the min between
the passed value/default value and 16, not the max - which, I suppose,
was introduced by accident while refactoring the previous version of
this check from this other PR of mine: #10707
* Porting this fix to the newest class (`AzureOpenAIEmbeddings`) for
openai >= 1.0
This fixes#13539 (closed but the issue persists).
@baskaryan @hwchase17
- **Description:** There are several mistakes in the sample code in the
doc-string of `DashVector` class, and this pull request aims to correct
them.
The correction code has been tested against latest version (at the time
of creation of this pull request) of: `langchain==0.0.336`
`dashvector==1.0.6` .
- **Issue:** No issue is created for this.
- **Dependencies:** No dependency is required for this change,
<!-- - **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below), -->
- **Twitter handle:** `zeyanglin`
<!-- Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** AstraDB is going to deprecate the `$similarity`
projection property in favor of the ´includeSimilarity´ option flag. I
moved all the queries to the new format.
- **Tag maintainer:** @hemidactylus
- **Twitter handle:** nicoloboschi
Added a `search_kwargs` field to BingSearchAPIWrapper in
`bing_search.py,` enabling users to include extra keyword arguments in
Bing search queries. This update, like specifying language preferences,
adds more customization to searches. The `search_kwargs` seamlessly
merge with standard parameters in `_bing_search_results` method.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** Fix Astra integration tests that are failing. The
`delete` always return True as the deletion is successful if no errors
are thrown. I aligned the test to verify this behaviour
- **Tag maintainer:** @hemidactylus
- **Twitter handle:** nicoloboschi
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
The issue was accuring because of `openai` update in Completions. its
not accepting `api_key` and 'api_base' args.
The fix is we check for the openai version and if ats v1 then remove
these keys from args before passing them to `Compilation.create(...)`
when sending from `VLLMOpenAI`
Fixed: #13507
@eyu
@efriis
@hwchase17
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** In this pull request, we address an issue related to
assigning a schema to the SQLDatabase class when utilizing an Oracle
database. The current implementation encounters a bug where, upon
attempting to execute a query, the alter session parse is not
appropriately defined for Oracle, leading to an error,
- **Issue:** #7928,
- **Dependencies:** No dependencies,
- **Tag maintainer:** @baskaryan,
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** This change allows for the `MWDumpLoader` to load all
namespaces including custom by default instead of only loading the
[default
namespaces](https://www.mediawiki.org/wiki/Help:Namespaces#Localisation).
- **Tag maintainer:** @hwchase17
**Description:**
This commit adds embedchain retriever along with tests and docs.
Embedchain is a RAG framework to create data pipelines.
**Twitter handle:**
- [Taranjeet's twitter](https://twitter.com/taranjeetio) and
[Embedchain's twitter](https://twitter.com/embedchain)
**Reviewer**
@hwchase17
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:**
Enhance the functionality of YoutubeLoader to enable the translation of
available transcripts by refining the existing logic.
**Issue:**
Encountering a problem with YoutubeLoader (#13523) where the translation
feature is not functioning as expected.
Tag maintainers/contributors who might be interested:
@eyurtsev
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
BUG: langchain.agents.openai_assistant has a reference as
`from langchain_experimental.openai_assistant.base import
OpenAIAssistantRunnable`
should be
`from langchain.agents.openai_assistant.base import
OpenAIAssistantRunnable`
This prevents building of the API Reference docs
## Update 2023-09-08
This PR now supports further models in addition to Lllama-2 chat models.
See [this comment](#issuecomment-1668988543) for further details. The
title of this PR has been updated accordingly.
## Original PR description
This PR adds a generic `Llama2Chat` model, a wrapper for LLMs able to
serve Llama-2 chat models (like `LlamaCPP`,
`HuggingFaceTextGenInference`, ...). It implements `BaseChatModel`,
converts a list of chat messages into the [required Llama-2 chat prompt
format](https://huggingface.co/blog/llama2#how-to-prompt-llama-2) and
forwards the formatted prompt as `str` to the wrapped `LLM`. Usage
example:
```python
# uses a locally hosted Llama2 chat model
llm = HuggingFaceTextGenInference(
inference_server_url="http://127.0.0.1:8080/",
max_new_tokens=512,
top_k=50,
temperature=0.1,
repetition_penalty=1.03,
)
# Wrap llm to support Llama2 chat prompt format.
# Resulting model is a chat model
model = Llama2Chat(llm=llm)
messages = [
SystemMessage(content="You are a helpful assistant."),
MessagesPlaceholder(variable_name="chat_history"),
HumanMessagePromptTemplate.from_template("{text}"),
]
prompt = ChatPromptTemplate.from_messages(messages)
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chain = LLMChain(llm=model, prompt=prompt, memory=memory)
# use chat model in a conversation
# ...
```
Also part of this PR are tests and a demo notebook.
- Tag maintainer: @hwchase17
- Twitter handle: `@mrt1nz`
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** Added a method `fetch_valid_documents` to
`WebResearchRetriever` class that will test the connection for every url
in `new_urls` and remove those that raise a `ConnectionError`.
- **Issue:** [Previous
PR](https://github.com/langchain-ai/langchain/pull/13353),
- **Dependencies:** None,
- **Tag maintainer:** @efriis
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
## Description
This PR adds an option to allow unsigned requests to the Neptune
database when using the `NeptuneGraph` class.
```python
graph = NeptuneGraph(
host='<my-cluster>',
port=8182,
sign=False
)
```
Also, added is an option in the `NeptuneOpenCypherQAChain` to provide
additional domain instructions to the graph query generation prompt.
This will be injected in the prompt as-is, so you should include any
provider specific tags, for example `<instructions>` or `<INSTR>`.
```python
chain = NeptuneOpenCypherQAChain.from_llm(
llm=llm,
graph=graph,
extra_instructions="""
Follow these instructions to build the query:
1. Countries contain airports, not the other way around
2. Use the airport code for identifying airports
"""
)
```
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
**Description**
MongoDB drivers are used in various flavors and languages. Making sure
we exercise our due diligence in identifying the "origin" of the library
calls makes it best to understand how our Atlas servers get accessed.
The original notebook has the `faiss` title which is duplicated in
the`faiss.jpynb`. As a result, we have two `faiss` items in the
vectorstore ToC. And the first item breaks the searching order (it is
placed between `A...` items).
- I updated title to `Asynchronous Faiss`.
**Description/Issue:**
When OpenAI calls a function with no args, the args are `""` rather than
`"{}"`. Then `json.loads("")` blows up. This PR handles it correctly.
**Dependencies:** None
- Fixed titles for two notebooks. They were inconsistent with other
titles and clogged ToC.
- Added `Upstash` description and link
- Moved the authentication text up in the `Elasticsearch` nb, right
after package installation. It was on the end of the page which was a
wrong place.
This PR brings a few minor improvements to the docs, namely class/method
docstrings and the demo notebook.
- A note on how to control concurrency levels to tune performance in
bulk inserts, both in the class docstring and the demo notebook;
- Slightly increased concurrency defaults after careful experimentation
(still on the conservative side even for clients running on
less-than-typical network/hardware specs)
- renamed the DB token variable to the standardized
`ASTRA_DB_APPLICATION_TOKEN` name (used elsewhere, e.g. in the Astra DB
docs)
- added a note and a reference (add_text docstring, demo notebook) on
allowed metadata field names.
Thank you!
The current `integrations/document_loaders/` sidebar has the
`example_data` item, which is a menu with a single item: "Notebook".
It is happening because the `integrations/document_loaders/` folder has
the `example_data/notebook.md` file that is used to autogenerate the
above menu item.
- removed an example_data/notebook.md file. Docusaurus doesn't have
simple ways to fix this problem (to exclude folders/files from an
autogenerated sidebar). Removing this file didn't break any existing
examples, so this fix is safe.
Updated several notebooks:
- fixed titles which are inconsistent or break the ToC sorting order.
- added missed soruce descriptions and links
- fixed formatting
- the `SemaDB` notebook was placed in additional subfolder which breaks
the vectorstore ToC. I moved file up, removed this unnecessary
subfolder; updated the `vercel.json` with rerouting for the new URL
- Added SemaDB description and link
- improved text consistency
- Fixed the title of the notebook. It created an ugly ToC element as
`Activeloop DeepLake's DeepMemory + LangChain + ragas or how to get +27%
on RAG recall.`
- Added Activeloop description
- improved consistency in text
- fixed ToC (it was using HTML tagas that break left-side in-page ToC).
Now in-page ToC works
- Fixed headers (was more then 1 Titles)
- Removed security token value. It was OK to have it, because it is
temporary token, but the automatic security swippers raise warnings on
that.
- Added `ClickUp` service description and link.
…rnative LLMs until used
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
fix#13356
Add supports following properties for metadata to NotionDBLoader.
- `checkbox`
- `email`
- `number`
- `select`
There are no relevant tests for this code to be updated.
The `Integrations` site is hidden now.
I've added it into the `More` menu.
The name is `Integration Cards` otherwise, it is confused with the
`Integrations` menu.
---------
Co-authored-by: Erick Friis <erickfriis@gmail.com>
- **Description:** Adds `limit_to_domains` param to the APIChain based
tools (open_meteo, TMDB, podcast_docs, and news_api)
- **Issue:** I didn't open an issue, but after upgrading to 0.0.328
using these tools would throw an error.
- **Dependencies:** N/A
- **Tag maintainer:** @baskaryan
**Note**: I included the trailing / simply because the docs here did
fc886cc303/docs/docs/use_cases/apis.ipynb (L246)
, but I checked the code and it is using `urlparse`. SoI followed the
docs since it comes down to stylee.
The `langchain` repo was being flagged for using vulnerable
dependencies, some of which were in this template's lockfile. Updating
to newer versions should fix that.
Just `poetry lock` and moving `langchain` to the latest version, in case
folks copy this template.
This resolves some vulnerable dependency alerts GitHub code scanning was
flagging.
My thought is that the ==version would prevent pip from finding the
package on regular [pypi.org](http://pypi.org/), so it would look at
[test.pypi.org](http://test.pypi.org/) for that. Otherwise it'll pull
package from [pypi.org](http://pypi.org/) (e.g. sub deps)
Right now, the cli release is failing because it's going to
test.pypi.org by default, so it finds this incorrect FASTAPI package
instead of the real one: https://test.pypi.org/project/FASTAPI/
The new ruff version fixed the blocking bugs, and I was able to fairly
easily us to a passing state: ruff fixed some issues on its own, I fixed
a handful by hand, and I added a list of narrowly-targeted exclusions
for files that are currently failing ruff rules that we probably should
look into eventually.
I went pretty lenient on the docs / cookbooks rules, allowing dead code
and such things. Perhaps in the future we may want to tighten the rules
further, but this is already a good set of checks that found real issues
and will prevent them going forward.
Hi,
this PR adds support for OpenAI API v1 for Azure OpenAI completion API.
@baskaryan @hwchase17
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Bumps [pyarrow](https://github.com/apache/arrow) from 13.0.0 to 14.0.1.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="ba53748361"><code>ba53748</code></a>
MINOR: [Release] Update versions for 14.0.1</li>
<li><a
href="529f3768fa"><code>529f376</code></a>
MINOR: [Release] Update .deb/.rpm changelogs for 14.0.1</li>
<li><a
href="b84bbcac64"><code>b84bbca</code></a>
MINOR: [Release] Update CHANGELOG.md for 14.0.1</li>
<li><a
href="f141709763"><code>f141709</code></a>
<a
href="https://redirect.github.com/apache/arrow/issues/38607">GH-38607</a>:
[Python] Disable PyExtensionType autoload (<a
href="https://redirect.github.com/apache/arrow/issues/38608">#38608</a>)</li>
<li><a
href="5a37e74198"><code>5a37e74</code></a>
<a
href="https://redirect.github.com/apache/arrow/issues/38431">GH-38431</a>:
[Python][CI] Update fs.type_name checks for s3fs tests (<a
href="https://redirect.github.com/apache/arrow/issues/38455">#38455</a>)</li>
<li><a
href="2dcee3f82c"><code>2dcee3f</code></a>
MINOR: [Release] Update versions for 14.0.0</li>
<li><a
href="297428cbf2"><code>297428c</code></a>
MINOR: [Release] Update .deb/.rpm changelogs for 14.0.0</li>
<li><a
href="3e9734f883"><code>3e9734f</code></a>
MINOR: [Release] Update CHANGELOG.md for 14.0.0</li>
<li><a
href="9f90995c8c"><code>9f90995</code></a>
<a
href="https://redirect.github.com/apache/arrow/issues/38332">GH-38332</a>:
[CI][Release] Resolve symlinks in RAT lint (<a
href="https://redirect.github.com/apache/arrow/issues/38337">#38337</a>)</li>
<li><a
href="bd61239a32"><code>bd61239</code></a>
<a
href="https://redirect.github.com/apache/arrow/issues/35531">GH-35531</a>:
[Python] C Data Interface PyCapsule Protocol (<a
href="https://redirect.github.com/apache/arrow/issues/37797">#37797</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/apache/arrow/compare/go/v13.0.0...go/v14.0.1">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/langchain-ai/langchain/network/alerts).
</details>
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>
Hey @rlancemartin, @eyurtsev ,
I did some minimal changes to the `ElasticVectorSearch` client so that
it plays better with existing ES indices.
Main changes are as follows:
1. You can pass the dense vector field name into `_default_script_query`
2. You can pass a custom script query implementation and the respective
parameters to `similarity_search_with_score`
3. You can pass functions for building page content and metadata for the
resulting `Document`
<!-- Thank you for contributing to LangChain!
Replace this comment with:
- Description: a description of the change,
- Issue: the issue # it fixes (if applicable),
- Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
4. an example notebook showing its use.
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: @dev2049
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
- Models / Prompts: @hwchase17, @dev2049
- Memory: @hwchase17
- Agents / Tools / Toolkits: @vowelparrot
- Tracing / Callbacks: @agola11
- Async: @agola11
If no one reviews your PR within a few days, feel free to @-mention the
same people again.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
<!-- Thank you for contributing to LangChain!
Replace this comment with:
- Description: a description of the change,
- Issue: the issue # it fixes (if applicable),
- Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!
Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use.
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: @baskaryan
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
- Models / Prompts: @hwchase17, @baskaryan
- Memory: @hwchase17
- Agents / Tools / Toolkits: @hinthornw
- Tracing / Callbacks: @agola11
- Async: @agola11
If no one reviews your PR within a few days, feel free to @-mention the
same people again.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
hi!
This is pretty straight-forward: The sdist package does not contain the
license file (which is needed by e.g. conda) because the package is
built from the subdir and can't see the license.
I _copied_ the license but since I'm unfamiliar with the projects
direction, I'm not sure that's correct.
thanks!
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Fixes: #8207
Description:
Pinecone returns scores (not distances) with cosine similarity. The
values according to the docs are [-1, 1], although I could never
reproduce negative values.
This PR ensures that the score returned from Pinecone is preserved,
rather than inverted, so the most relevant documents can be filtered (eg
when using similarity thresholds)
I'll leave this as a draft PR as I couldn't run the tests (my pinecone
account might not be enough - some errors were being thrown around
namespaces) so hopefully someone who _can_ will pick this up.
Maintainers:
@rlancemartin, @eyurtsev
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** Fixed a serialization issue in the add_texts method
of the Matching Engine Vector Store caused by a typo, leading to an
attempt to serialize the json module itself.
- **Issue:** #12154
- **Dependencies:** ./.
- **Tag maintainer:**
- **Description:** Refine Weaviate tutorial and add an example for
Retrieval-Augmented Generation (RAG)
- **Issue:** (not applicable),
- **Dependencies:** none
- **Tag maintainer:** @baskaryan <!--
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Twitter handle:** @helloiamleonie
Co-authored-by: Leonie <leonie@Leonies-MBP-2.fritz.box>
Due to the possibility of external inputs including UUIDs, there may be
additional values in **kwargs, while Weaviate's `__init__` method does
not support passing extra **kwarg parameters.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
When calling max_marginal_relevance_search from PGVector the filter
param is not carried over to max_marginal_relevance_search_by_vector
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Uses `endpoint_url` if provided with a boto3 session.
When running dynamodb locally, credentials are required even if invalid.
With this change, it will be possible to pass a boto3 session with
credentials and specify an endpoint_url
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Replace this entire comment with:
- **Description:** Add MyScaleWithoutJSON which allows user to wrap
columns into Document's Metadata
- **Tag maintainer:** @baskaryan
**Description**
Bumps the Momento dependency to the latest version and refactors the
usage of `SearchHit` in the Momento Vector Index (MVI) vector store
integration. This change is a one liner where we use the preferred
attribute `score` to read the query-document similarity instead of
`distance`. The latest versions of Momento clients will use this
attribute going forward.
**Dependencies**
Updated the Momento dependency to latest version.
**Tests**
💚 I re-ran the existing MVI integration tests
(`tests/integration_tests/vectorstores/test_momento_vector_index.py`)
and they pass.
**Review**
cc @baskaryan @eyurtsev
On the [Defining Custom
Tools](https://python.langchain.com/docs/modules/agents/tools/custom_tools)
page, there's a 'Subclassing the BaseTool class' paragraph under the
'Completely New Tools - String Input and Output' header. Also there's
another 'Subclassing the BaseTool' paragraph under no header, which I
think may belong to the 'Custom Structured Tools' header.
Another thing is, there's a 'Using the tool decorator' and a 'Using the
decorator' paragraph, I think should belong to 'Completely New Tools -
String Input and Output' and 'Custom Structured Tools' separately.
This PR moves those paragraphs to corresponding headers.
**Description**
the ollama api now supports passing system prompt and template directly
instead of modifying the model file , but the ollama integration in
langchain did not have this change updated . The update just adds these
two parameters to it ( there are 2 more parameters that are pending to
be updated, I was not sure about their utility wrt to langchain )
Refer :
8713ac23a8
**Issue** : None Applicable
**Dependencies** : None Changed
**Twitter handle** : https://twitter.com/violetto96
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
added Parallel Function Calling for Structured Data Extraction notebook
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
`_get_kwarg_value` function is useless, one can rely on python builtin
functionalities to do the exact same thing.
- **Description:** Removed `_get_kwarg_value`. Helps with code
readability.
- **Issue:** the issue # it fixes (if applicable),
- **Twitter handle:** @Guillem_96
Improve CSV reader which can't call .strip() on NoneType if there are
less cells in the row compared to the header
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:**
I have a CSV file as followed
```
headerA,headerB,headerC
v1A,v1B,v1C,
v2A,v2B
v3A,v3B,v3C
```
In this case, row 2 is missing a value, which results in reading a None
type. The strip() method can not be called on None, hence raising. In
this PR I am making the change to only call strip if the value if not
None.
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
We updated MyScale free knowledge base, where you can try your RAG with
36 million paragraphs from wikipedia and 2 million paragraphs from
ArXiv.
The pod has two tables
```sql
CREATE TABLE default.ChatArXiv (
`abstract` String,
`id` String,
`vector` Array(Float32),
`metadata` Object('JSON'),
`pubdate` DateTime,
`title` String,
`categories` Array(String),
`authors` Array(String),
`comment` String,
`primary_category` String,
VECTOR INDEX vec_idx vector TYPE MSTG('metric_type=Cosine'),
CONSTRAINT vec_len CHECK length(vector) = 768)
ENGINE = ReplacingMergeTree ORDER BY id;
CREATE TABLE wiki.Wikipedia (
`id` String,
`title` String,
`text` String,
`url` String,
`wiki_id` UInt64,
`views` Float32,
`paragraph_id` UInt64,
`langs` UInt32,
`emb` Array(Float32),
VECTOR INDEX emb_idx emb TYPE MSTG('metric_type=Cosine'),
CONSTRAINT emb_len CHECK length(emb) = 768)
ENGINE = ReplacingMergeTree ORDER BY id;
```
You can connect those two tables using credentials below (just the same
to the old one)
URL: `msc-4a9e710a.us-east-1.aws.staging.myscale.cloud`
Port: `443`
Username: `chatdata`
Password: `myscale_rocks`
It's FREE and you can also use it with
ChatData: https://github.com/myscale/ChatData
Retrieval-QA-Benchmark:
https://github.com/myscale/Retrieval-QA-Benchmark
... and also LangChain!
Request for review @baskaryan
**Description:** This is like the rag-conversation template in many
ways. What's different is:
- support for a timescale vector store.
- support for time-based filters.
- support for metadata filters.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** Changed the fleet_context documentation to use
`context.download_embeddings()` from the latest release from our
package. More details here:
https://github.com/fleet-ai/context/tree/main#api
- **Issue:** n/a
- **Dependencies:** n/a
- **Tag maintainer:** @baskaryan
- **Twitter handle:** @andrewthezhou
Added a Docusaurus Loader
Issue: #6353
I had to implement this for working with the Ionic documentation, and
wanted to open this up as a draft to get some guidance on building this
out further. I wasn't sure if having it be a light extension of the
SitemapLoader was in the spirit of a proper feature for the library --
but I'm grateful for the opportunities Langchain has given me and I'd
love to build this out properly for the sake of the community.
Any feedback welcome!
- **Description:** `AzureMLChatOnlineEndpoint` object from
langchain/chat_models/azureml_endpoint.py safe to print
without having any secrets included in raw format in the string
representation.
- **Issue:** #12165,
- **Tag maintainer:** @eyurtsev
---------
Co-authored-by: Faysal Bougamale <faysal.bougamale@horiba.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Adding documentation to the runnable.
- Documentation is not organized in the best way for the runnable; i.e.,
in
terms of LCEL vs. other standard methods, will follow up with more
edits.
**Description:** Removing the single quote wrapper around the table
names in the SQL agent toolkit.py file as it misleads the LLM into
querying against tables with single quotes around their names.
**Issue:** #7457
**Dependencies:** None
**Tag maintainer:** @hwchase17
**Twitter handle:** None
- Implement config_specs to include session_id
- Remove Runnable method and update notebook
- Add more details to notebook, eg. show input schema and config schema
before and after adding message history
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
This commit fixes the issue that langchain.llms OpenAI completion
stopped working since the V1 openai client update.
Replace this entire comment with:
- **Description:** This PR fixes the issue [AttributeError: module
'openai' has no attribute
'Completion'](https://github.com/langchain-ai/langchain/issues/12967)
similar to
8e0cb2eb84
and https://github.com/langchain-ai/langchain/pull/12969,
- **Issue:** https://github.com/langchain-ai/langchain/issues/12967,
- **Dependencies:** `openai` v1.x.x client,
- **Tag maintainer:** @baskaryan,
- **Twitter handle:** @dosuken123
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
This adds the response message as a document to the rag retriever so
users can choose to use this. Also drops document limit.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
## Description
We need to centralize the API we use to get the project name for our
tracers. This PR makes it so we always get this from a shared function
in the langsmith sdk.
## Dependencies
Upgraded langsmith from 0.52 to 0.62 to include the new API
`get_tracer_project`
- **Description:**
Recently Chroma rolled out a breaking change on the way we handle
embedding functions, in order to support multi-modal collections.
This broke the way LangChain's `Chroma` objects get created, because we
were passing the EF down into the Chroma collection:
https://docs.trychroma.com/migration#migration-to-0416---november-7-2023
However, internally, we are never actually using embeddings on the
chroma collection - LangChain's `Chroma` object calls it instead. Thus
we just don't pass an `embedding_function` to Chroma itself, which fixes
the issue.
- **Description:** The issue was not listing the proper import error for
amazon textract loader.
- **Issue:** Time wasted trying to figure out what to install...
(langchain docs don't list the dependency either)
- **Dependencies:** N/A
- **Tag maintainer:** @sbusso
- **Twitter handle:** @h9ste
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
# Astra DB Vector store integration
- **Description:** This PR adds a `VectorStore` implementation for
DataStax Astra DB using its HTTP API
- **Issue:** (no related issue)
- **Dependencies:** A new required dependency is `astrapy` (`>=0.5.3`)
which was added to pyptoject.toml, optional, as per guidelines
- **Tag maintainer:** I recently mentioned to @baskaryan this
integration was coming
- **Twitter handle:** `@rsprrs` if you want to mention me
This PR introduces the `AstraDB` vector store class, extensive
integration test coverage, a reworking of the documentation which
conflates Cassandra and Astra DB on a single "provider" page and a new,
completely reworked vector-store example notebook (common to the
Cassandra store, since parts of the flow is shared by the two APIs). I
also took care in ensuring docs (and redirects therein) are behaving
correctly.
All style, linting, typechecks and tests pass as far as the `AstraDB`
integration is concerned.
I could build the documentation and check it all right (but ran into
trouble with the `api_docs_build` makefile target which I could not
verify: `Error: Unable to import module
'plan_and_execute.agent_executor' with error: No module named
'langchain_experimental'` was the first of many similar errors)
Thank you for a review!
Stefano
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** Correct naming for package in README
- **Issue:** README wasn't aligned with pyproject.toml, resulting in not
being able to install the rag-supabase package.
- **Tag maintainer:** @gregnr
Cohere released the new embedding API (Embed v3:
https://txt.cohere.com/introducing-embed-v3/) that treats document and
query embeddings differently. This PR updated the `CohereEmbeddings` to
use them appropriately. It also works with the old models.
Description: This PR masks API key secrets for the Nebula model from
Symbl.ai
Issue: #12165
Maintainer: @eyurtsev
---------
Co-authored-by: Praveen Venkateswaran <praveen.venkateswaran@ibm.com>
* ChatAnyscale was missing coercion to SecretStr for anyscale api key
* The model inherits from ChatOpenAI so it should not force the openai
api key to be secret str until openai model has the same changes
https://github.com/langchain-ai/langchain/issues/12841
- **Description:** Remove text "LangChain currently does not support"
which appears to be vestigial leftovers from a previous change.
- **Issue:** N/A
- **Dependencies:** N/A
- **Tag maintainer:** @baskaryan, @eyurtsev
- **Twitter handle:** thezanke
- **Description:** Noticed that the Hugging Face Pipeline documentation
was a bit out of date.
Updated with information about passing in a pipeline directly
(consistent with docstring) and a recent contribution of mine on adding
support for multi-gpu specifications with Accelerate in
21eeba075c
Qdrant was incorrectly calculating the cosine similarity and returning
`0.0` for the best match, instead of `1.0`. Internally Qdrant returns a
cosine score from `-1.0` (worst match) to `1.0` (best match), and the
current formula reflects it.
Possibility to pass on_artifacts to a conversation. It can be then
achieved by adding this way:
```python
result = agent.run(
input=message.text,
metadata={
"on_artifact": CALLBACK_FUNCTION
},
)
```
The line removed is not required as there are no other alternative
solutions above than that.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
@@ -17,13 +17,16 @@ For more info, check out the [GitHub documentation](https://docs.github.com/en/f
## VS Code Dev Containers
[](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain)
Note: If you click this link you will open the main repo and not your local cloned repo, you can use this link and replace with your username and cloned repo name:
Note: If you click the link above you will open the main repo (langchain-ai/langchain) and not your local cloned repo. This is fine if you only want to run and test the library, but if you want to contribute you can use the link below and replace with your username and cloned repo name:
Then you will have a local cloned repo where you can contribute and then create pull requests.
If you already have VS Code and Docker installed, you can use the button above to get started. This will cause VS Code to automatically install the Dev Containers extension if needed, clone the source code into a container volume, and spin up a dev container for use.
You can also follow these steps to open this repo in a container using the VS Code Dev Containers extension:
Alternatively you can also follow these steps to open this repo in a container using the VS Code Dev Containers extension:
1. If this is your first time using a development container, please ensure your system meets the pre-reqs (i.e. have Docker installed) in the [getting started steps](https://aka.ms/vscode-remote/containers/getting-started).
Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. However, using these LLMs in isolation is often insufficient for creating a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge.
## 🤔 What is LangChain?
This library aims to assist in the development of those types of applications. Common examples of these applications include:
**LangChain** is a framework for developing applications powered by language models. It enables applications that:
- **Are context-aware**: connect a language model to sources of context (prompt instructions, few shot examples, content to ground its response in, etc.)
- **Reason**: rely on a language model to reason (about how to answer based on provided context, what actions to take, etc.)
**❓ Question Answering over specific documents**
This framework consists of several parts.
- **LangChain Libraries**: The Python and JavaScript libraries. Contains interfaces and integrations for a myriad of components, a basic run time for combining these components into chains and agents, and off-the-shelf implementations of chains and agents.
- **[LangChain Templates](templates)**: A collection of easily deployable reference architectures for a wide variety of tasks.
- **[LangServe](https://github.com/langchain-ai/langserve)**: A library for deploying LangChain chains as a REST API.
- **[LangSmith](https://smith.langchain.com)**: A developer platform that lets you debug, test, evaluate, and monitor chains built on any LLM framework and seamlessly integrates with LangChain.
**This repo contains the `langchain` ([here](libs/langchain)), `langchain-experimental` ([here](libs/experimental)), and `langchain-cli` ([here](libs/cli)) Python packages, as well as [LangChain Templates](templates).**
- End-to-end Example: [Web LangChain (web researcher chatbot)](https://weblangchain.vercel.app) and [repo](https://github.com/langchain-ai/weblangchain)
## 📖 Documentation
And much more! Head to the [Use cases](https://python.langchain.com/docs/use_cases/) section of the docs for more.
Please see [here](https://python.langchain.com) for full documentation on:
## 🚀 How does LangChain help?
The main value props of the LangChain libraries are:
1.**Components**: composable tools and integrations for working with language models. Components are modular and easy-to-use, whether you are using the rest of the LangChain framework or not
2.**Off-the-shelf chains**: built-in assemblages of components for accomplishing higher-level tasks
- Getting started (installation, setting up the environment, simple examples)
- Resources (high-level explanation of core concepts)
Off-the-shelf chains make it easy to get started. Components make it easy to customize existing chains and build new ones.
## 🚀 What can this help with?
Components fall into the following **modules**:
There are six main areas that LangChain is designed to help with.
These are, in increasing order of complexity:
**📃 LLMs and Prompts:**
**📃 Model I/O:**
This includes prompt management, prompt optimization, a generic interface for all LLMs, and common utilities for working with LLMs.
**🔗 Chains:**
Chains go beyond a single LLM call and involve sequences of calls (whether to an LLM or a different utility). LangChain provides a standard interface for chains, lots of integrations with other tools, and end-to-end chains for common applications.
**📚 Data Augmented Generation:**
**📚 Retrieval:**
Data Augmented Generation involves specific types of chains that first interact with an external data source to fetch data for use in the generation step. Examples include summarization of long pieces of text and question/answering over specific data sources.
@@ -87,18 +88,23 @@ Data Augmented Generation involves specific types of chains that first interact
Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end-to-end agents.
**🧠 Memory:**
## 📖 Documentation
Memory refers to persisting state between calls of a chain/agent. LangChain provides a standard interface for memory, a collection of memory implementations, and examples of chains/agents that use memory.
Please see [here](https://python.langchain.com) for full documentation, which includes:
**🧐 Evaluation:**
- [Getting started](https://python.langchain.com/docs/get_started/introduction): installation, setting up the environment, simple examples
- Overview of the [interfaces](https://python.langchain.com/docs/expression_language/), [modules](https://python.langchain.com/docs/modules/) and [integrations](https://python.langchain.com/docs/integrations/providers)
- [Use case](https://python.langchain.com/docs/use_cases/qa_structured/sql) walkthroughs and best practice [guides](https://python.langchain.com/docs/guides/adapters/openai)
- [LangSmith](https://python.langchain.com/docs/langsmith/), [LangServe](https://python.langchain.com/docs/langserve), and [LangChain Template](https://python.langchain.com/docs/templates/) overviews
- [Reference](https://api.python.langchain.com): full API docs
[BETA] Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is by using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.
For more information on these concepts, please see our [full documentation](https://python.langchain.com).
## 💁 Contributing
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.
For detailed information on how to contribute, see [here](.github/CONTRIBUTING.md).
[Semi_Structured_RAG.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/Semi_Structured_RAG.ipynb) | Perform retrieval-augmented generation (rag) on documents with semi-structured data, including text and tables, using unstructured for parsing, multi-vector retriever for storing, and lcel for implementing chains.
[Semi_structured_and_multi_moda...](https://github.com/langchain-ai/langchain/tree/master/cookbook/Semi_structured_and_multi_modal_RAG.ipynb) | Perform retrieval-augmented generation (rag) on documents with semi-structured data and images, using unstructured for parsing, multi-vector retriever for storage and retrieval, and lcel for implementing chains.
[Semi_structured_multi_modal_RA...](https://github.com/langchain-ai/langchain/tree/master/cookbook/Semi_structured_multi_modal_RAG_LLaMA2.ipynb) | Perform retrieval-augmented generation (rag) on documents with semi-structured data and images, using various tools and methods such as unstructured for parsing, multi-vector retriever for storing, lcel for implementing chains, and open source language models like llama2, llava, and gpt4all.
[analyze_document.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/analyze_document.ipynb) | Analyze a single long document.
[autogpt/autogpt.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/autogpt/autogpt.ipynb) | Implement autogpt, a language model, with langchain primitives such as llms, prompttemplates, vectorstores, embeddings, and tools.
[autogpt/marathon_times.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/autogpt/marathon_times.ipynb) | Implement autogpt for finding winning marathon times.
[baby_agi.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/baby_agi.ipynb) | Implement babyagi, an ai agent that can generate and execute tasks based on a given objective, with the flexibility to swap out specific vectorstores/model providers.
@@ -20,6 +21,7 @@ Notebook | Description
[databricks_sql_db.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/databricks_sql_db.ipynb) | Connect to databricks runtimes and databricks sql.
[deeplake_semantic_search_over_...](https://github.com/langchain-ai/langchain/tree/master/cookbook/deeplake_semantic_search_over_chat.ipynb) | Perform semantic search and question-answering over a group chat using activeloop's deep lake with gpt4.
[elasticsearch_db_qa.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/elasticsearch_db_qa.ipynb) | Interact with elasticsearch analytics databases in natural language and build search queries via the elasticsearch dsl API.
[extraction_openai_tools.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/extraction_openai_tools.ipynb) | Structured Data Extraction with OpenAI Tools
[forward_looking_retrieval_augm...](https://github.com/langchain-ai/langchain/tree/master/cookbook/forward_looking_retrieval_augmented_generation.ipynb) | Implement the forward-looking active retrieval augmented generation (flare) method, which generates answers to questions, identifies uncertain tokens, generates hypothetical questions based on these tokens, and retrieves relevant documents to continue generating the answer.
[generative_agents_interactive_...](https://github.com/langchain-ai/langchain/tree/master/cookbook/generative_agents_interactive_simulacra_of_human_behavior.ipynb) | Implement a generative agent that simulates human behavior, based on a research paper, using a time-weighted memory object backed by a langchain retriever.
[gymnasium_agent_simulation.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/gymnasium_agent_simulation.ipynb) | Create a simple agent-environment interaction loop in simulated environments like text-based games with gymnasium.
@@ -38,10 +40,12 @@ Notebook | Description
[multiagent_bidding.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multiagent_bidding.ipynb) | Implement a multi-agent simulation where agents bid to speak, with the highest bidder speaking next, demonstrated through a fictitious presidential debate example.
[myscale_vector_sql.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/myscale_vector_sql.ipynb) | Access and interact with the myscale integrated vector database, which can enhance the performance of language model (llm) applications.
[openai_functions_retrieval_qa....](https://github.com/langchain-ai/langchain/tree/master/cookbook/openai_functions_retrieval_qa.ipynb) | Structure response output in a question-answering system by incorporating openai functions into a retrieval pipeline.
[openai_v1_cookbook.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/openai_v1_cookbook.ipynb) | Explore new functionality released alongside the V1 release of the OpenAI Python library.
[petting_zoo.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/petting_zoo.ipynb) | Create multi-agent simulations with simulated environments using the petting zoo library.
[plan_and_execute_agent.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/plan_and_execute_agent.ipynb) | Create plan-and-execute agents that accomplish objectives by planning tasks with a language model (llm) and executing them with a separate agent.
[press_releases.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/press_releases.ipynb) | Retrieve and query company press release data powered by [Kay.ai](https://kay.ai).
[program_aided_language_model.i...](https://github.com/langchain-ai/langchain/tree/master/cookbook/program_aided_language_model.ipynb) | Implement program-aided language models as described in the provided research paper.
[qa_citations.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/qa_citations.ipynb) | Different ways to get a model to cite its sources.
[retrieval_in_sql.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/retrieval_in_sql.ipynb) | Perform retrieval-augmented-generation (rag) on a PostgreSQL database using pgvector.
[sales_agent_with_context.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/sales_agent_with_context.ipynb) | Implement a context-aware ai sales agent, salesgpt, that can have natural sales conversations, interact with other systems, and use a product knowledge base to discuss a company's offerings.
[self_query_hotel_search.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/self_query_hotel_search.ipynb) | Build a hotel room search feature with self-querying retrieval, using a specific hotel recommendation dataset.
" # Execute the command and save the output to the defined output file\n",
" /Users/rlm/Desktop/Code/llama.cpp/bin/llava -m ../models/llava-7b/ggml-model-q5_k.gguf --mmproj ../models/llava-7b/mmproj-model-f16.gguf --temp 0.1 -p \"Describe the image in detail. Be specific about graphs, such as bar plots.\" --image \"$img\" > \"$output_file\"\n",
" # Execute the command and save the output to the defined output file\n",
" /Users/rlm/Desktop/Code/llama.cpp/bin/llava -m ../models/llava-7b/ggml-model-q5_k.gguf --mmproj ../models/llava-7b/mmproj-model-f16.gguf --temp 0.1 -p \"Describe the image in detail. Be specific about graphs, such as bar plots.\" --image \"$img\" > \"$output_file\"\n",
"'The President said, \"Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service.\"'"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"qa_document_chain.run(\n",
" input_document=state_of_the_union,\n",
" question=\"what did the president say about justice breyer?\",\n",
" \"You are a planner who is an expert at coming up with a todo list for a given objective. Come up with a todo list for this objective: {objective}\"\n",
"PROMPT_TEMPLATE = \"\"\"Given an input question, create a syntactically correct Elasticsearch query to run. Unless the user specifies in their question a specific number of examples they wish to obtain, always limit your query to at most {top_k} results. You can order the results by a relevant column to return the most interesting examples in the database.\n",
"Performing extraction has never been easier! OpenAI's tool calling ability is the perfect thing to use as it allows for extracting multiple different elements from text that are different types. \n",
"\n",
"Models after 1106 use tools and support \"parallel function calling\" which makes this super easy."
"_PROMPT_TEMPLATE = \"\"\"If someone asks you to perform a task, your job is to come up with a series of bash commands that will perform the task. There is no need to put \"#!/bin/bash\" in your answer. Make sure to reason step by step, using this format:\n",
"Question: \"copy the files in the directory named 'target' into a new directory at the same level as target called 'myNewDirectory'\"\n",
"_template = \"\"\"Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.\\\n",
"On 11.06.23 OpenAI released a number of new features, and along with it bumped their Python SDK to 1.0.0. This notebook shows off the new features and how to use them with LangChain."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ee897729-263a-4073-898f-bb4cf01ed829",
"metadata": {},
"outputs": [],
"source": [
"# need openai>=1.1.0, langchain>=0.0.335, langchain-experimental>=0.0.39\n",
"OpenAI released multi-modal models, which can take a sequence of text and images as input."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "1c8c3965-d3c9-4186-b5f3-5e67855ef916",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='The image appears to be a diagram representing the architecture or components of a software system or framework related to language processing, possibly named LangChain or associated with a project or product called LangChain, based on the prominent appearance of that term. The diagram is organized into several layers or aspects, each containing various elements or modules:\\n\\n1. **Protocol**: This may be the foundational layer, which includes \"LCEL\" and terms like parallelization, fallbacks, tracing, batching, streaming, async, and composition. These seem related to communication and execution protocols for the system.\\n\\n2. **Integrations Components**: This layer includes \"Model I/O\" with elements such as the model, output parser, prompt, and example selector. It also has a \"Retrieval\" section with a document loader, retriever, embedding model, vector store, and text splitter. Lastly, there\\'s an \"Agent Tooling\" section. These components likely deal with the interaction with external data, models, and tools.\\n\\n3. **Application**: The application layer features \"LangChain\" with chains, agents, agent executors, and common application logic. This suggests that the system uses a modular approach with chains and agents to process language tasks.\\n\\n4. **Deployment**: This contains \"Lang')"
"> The Assistants API allows you to build AI assistants within your own applications. An Assistant has instructions and can leverage models, tools, and knowledge to respond to user queries. The Assistants API currently supports three types of tools: Code Interpreter, Retrieval, and Function calling\n",
"\n",
"\n",
"You can interact with OpenAI Assistants using OpenAI tools or custom tools. When using exclusively OpenAI tools, you can just invoke the assistant directly and get final answers. When using custom tools, you can run the assistant and tool execution loop using the built-in AgentExecutor or easily write your own executor.\n",
"\n",
"Below we show the different ways to interact with Assistants. As a simple example, let's build a math tutor that can write and run code."
"[ThreadMessage(id='msg_g9OJv0rpPgnc3mHmocFv7OVd', assistant_id='asst_hTwZeNMMphxzSOqJ01uBMsJI', content=[MessageContentText(text=Text(annotations=[], value='The result of \\\\(10 - 4^{2.7}\\\\) is approximately \\\\(-32.224\\\\).'), type='text')], created_at=1699460600, file_ids=[], metadata={}, object='thread.message', role='assistant', run_id='run_nBIT7SiAwtUfSCTrQNSPLOfe', thread_id='thread_14n4GgXwxgNL0s30WJW5F6p0')]"
" instructions=\"You are a personal math tutor. Write and run code to answer math questions.\",\n",
" tools=[{\"type\": \"code_interpreter\"}],\n",
" model=\"gpt-4-1106-preview\",\n",
")\n",
"output = interpreter_assistant.invoke({\"content\": \"What's 10 - 4 raised to the 2.7\"})\n",
"output"
]
},
{
"cell_type": "markdown",
"id": "a8ddd181-ac63-4ab6-a40d-a236120379c1",
"metadata": {},
"source": [
"### As a LangChain agent with arbitrary tools\n",
"\n",
"Now let's recreate this functionality using our own tools. For this example we'll use the [E2B sandbox runtime tool](https://e2b.dev/docs?ref=landing-page-get-started)."
" instructions=\"You are a personal math tutor. Write and run code to answer math questions. You can also search the internet.\",\n",
" tools=tools,\n",
" model=\"gpt-4-1106-preview\",\n",
" as_agent=True,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "1ac71d8b-4b4b-4f98-b826-6b3c57a34166",
"metadata": {},
"source": [
"#### Using AgentExecutor"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "1f137f94-801f-4766-9ff5-2de9df5e8079",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'content': \"What's the weather in SF today divided by 2.7\",\n",
" 'output': \"The weather in San Francisco today is reported to have temperatures as high as 66 °F. To get the temperature divided by 2.7, we will calculate that:\\n\\n66 °F / 2.7 = 24.44 °F\\n\\nSo, when the high temperature of 66 °F is divided by 2.7, the result is approximately 24.44 °F. Please note that this doesn't have a meteorological meaning; it's purely a mathematical operation based on the given temperature.\"}"
"OpenAI sometimes changes model configurations in a way that impacts outputs. Whenever this happens, the system_fingerprint associated with a generation will change."
" content=\"Extract the 'name' and 'origin' of any companies mentioned in the following statement. Return a JSON list.\"\n",
" ),\n",
" HumanMessage(\n",
" content=\"Google was founded in the USA, while Deepmind was founded in the UK\"\n",
" ),\n",
" ]\n",
" ]\n",
")\n",
"print(output.llm_output)"
]
},
{
"cell_type": "markdown",
"id": "aa6565be-985d-4127-848e-c3bca9d7b434",
"metadata": {},
"source": [
"## Breaking changes to Azure classes\n",
"\n",
"OpenAI V1 rewrote their clients and separated Azure and OpenAI clients. This has led to some changes in LangChain interfaces when using OpenAI V1.\n",
"\n",
"BREAKING CHANGES:\n",
"- To use Azure embeddings with OpenAI V1, you'll need to use the new `AzureOpenAIEmbeddings` instead of the existing `OpenAIEmbeddings`. `OpenAIEmbeddings` continue to work when using Azure with `openai<1`.\n",
"- When using `AzureChatOpenAI` or `AzureOpenAI`, if passing in an Azure endpoint (eg https://example-resource.azure.openai.com/) this should be specified via the `azure_endpoint` parameter or the `AZURE_OPENAI_ENDPOINT`. We're maintaining backwards compatibility for now with specifying this via `openai_api_base`/`base_url` or env var `OPENAI_API_BASE` but this shouldn't be relied upon.\n",
"- When using Azure chat or embedding models, pass in API keys either via `openai_api_key` parameter or `AZURE_OPENAI_API_KEY` parameter. We're maintaining backwards compatibility for now with specifying this via `OPENAI_API_KEY` but this shouldn't be relied upon."
]
},
{
"cell_type": "markdown",
"id": "49944887-3972-497e-8da2-6d32d44345a9",
"metadata": {},
"source": [
"## Tools\n",
"\n",
"Use tools for parallel function calling."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "916292d8-0f89-40a6-af1c-5a1122327de8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[GetCurrentWeather(location='New York, NY', unit='fahrenheit'),\n",
"This notebook is an implementation of Retrieval augmented generation (RAG) using Baidu Qianfan Platform combined with Baidu ElasricSearch, where the original data is located on BOS.\n",
"## Baidu Qianfan\n",
"Baidu AI Cloud Qianfan Platform is a one-stop large model development and service operation platform for enterprise developers. Qianfan not only provides including the model of Wenxin Yiyan (ERNIE-Bot) and the third-party open-source models, but also provides various AI development tools and the whole set of development environment, which facilitates customers to use and develop large model applications easily.\n",
"\n",
"## Baidu ElasticSearch\n",
"[Baidu Cloud VectorSearch](https://cloud.baidu.com/doc/BES/index.html?from=productToDoc) is a fully managed, enterprise-level distributed search and analysis service which is 100% compatible to open source. Baidu Cloud VectorSearch provides low-cost, high-performance, and reliable retrieval and analysis platform level product services for structured/unstructured data. As a vector database , it supports multiple index types and similarity distance methods. "
Below are links to tutorials and courses on LangChain. For written guides on common use cases for LangChain, check out the [use cases guides](/docs/use_cases/qa_structured/sql).
Below are links to tutorials and courses on LangChain. For written guides on common use cases for LangChain, check out the [use cases guides](/docs/use_cases).
⛓ icon marks a new addition [last update 2023-09-21]
---------------------
### [LangChain on Wikipedia](https://en.wikipedia.org/wiki/LangChain)
### DeepLearning.AI courses
by [Harrison Chase](https://github.com/hwchase17) and [Andrew Ng](https://en.wikipedia.org/wiki/Andrew_Ng)
by [Harrison Chase](https://en.wikipedia.org/wiki/LangChain) and [Andrew Ng](https://en.wikipedia.org/wiki/Andrew_Ng)
- [LangChain for LLM Application Development](https://learn.deeplearning.ai/langchain)
- [LangChain Chat with Your Data](https://learn.deeplearning.ai/langchain-chat-with-your-data)
- ⛓ [Functions, Tools and Agents with LangChain](https://learn.deeplearning.ai/functions-tools-agents-langchain)
### Handbook
[LangChain AI Handbook](https://www.pinecone.io/learn/langchain/) By **James Briggs** and **Francisco Ingham**
"physics_template = \"\"\"You are a very smart physics professor. \\\n",
"You are great at answering questions about physics in a concise and easy to understand manner. \\\n",
"When you don't know the answer to a question you admit that you don't know.\n",
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.