The namespaces like `langchain.agents.format_scratchpad` clogging the
API Reference sidebar.
This change removes those 3-level namespaces from sidebar (this issue
was discussed with @efriis )
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
This reverts commit 38813d7090. This is a
temporary fix, as I don't see a clear way on how to use multiple keys
with `Qdrant.from_texts`.
Context: #14378
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Brace Sproul <braceasproul@gmail.com>
Keeping it simple for now.
Still iterating on our docs build in pursuit of making everything mdxv2
compatible for docusaurus 3, and the fewer custom scripts we're reliant
on through that, the less likely the docs will break again.
Other things to consider in future:
Quarto rewriting in ipynbs:
https://quarto.org/docs/extensions/nbfilter.html (but this won't do
md/mdx files)
Docusaurus plugins for rewriting these paths
- **Description:** In Qdrant allows to input list of keys as the
content_payload_key to retrieve multiple fields (the generated document
will contain the dictionary {field: value} in a string),
- **Issue:** Previously we were able to retrieve only one field from the
vector database when making a search
- **Dependencies:**
- **Tag maintainer:**
- **Twitter handle:** @jb_dlb
---------
Co-authored-by: Jean Baptiste De La Broise <jeanbaptiste.delabroise@mdpi.com>
Description: This PR masked baidu qianfan - Chat_Models API Key and
added unit tests.
Issue: the issue langchain-ai#12165.
Tag maintainer: @eyurtsev
---------
Co-authored-by: xiayi <xiayi@bytedance.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
We found a request with `max_tokens=None` results in the following error
in Anthropic:
```
HTTPError: 400 Client Error: Bad Request for url: https://oregon.staging.cloud.databricks.com/serving-endpoints/corey-anthropic/invocations.
Response text: {"error_code":"INVALID_PARAMETER_VALUE","message":"INVALID_PARAMETER_VALUE: max_tokens was not of type Integer: null"}
```
This PR excludes `max_tokens` if it's None.
- **Description:** new parameters in OpenAIEmbeddings() constructor
(retry_min_seconds and retry_max_seconds) that allow parametrization by
the user of the former min_seconds and max_seconds that were hidden in
_create_retry_decorator() and _async_retry_decorator()
- **Issue:** #9298, #12986
- **Dependencies:** none
- **Tag maintainer:** @hwchase17
- **Twitter handle:** @adumont
make format ✅
make lint ✅
make test ✅
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Description :
Updated the functions with new Clarifai python SDK.
Enabled initialisation of Clarifai class with model URL.
Updated docs with new functions examples.
Remove whitespaces from the input of the ListSQLDatabaseTool for better
support.
for example, the input "table1,table2,table3" will throw an exception
whiteout the change although it's a valid input.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** add gitlab url from env,
- **Issue:** no issue,
- **Dependencies:** no,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
This PR adds support for metadata filters of the form:
`{"filter": {"key": { "NIN" : ["list", "of", "values"]}}}`
"IN" is already supported, so this is a quick & related update to add
"NIN"
- **Description:**
1. Add system parameters to the ERNIE LLM API to set the role of the
LLM.
2. Add support for the ERNIE-Bot-turbo-AI model according from the
document https://cloud.baidu.com/doc/WENXINWORKSHOP/s/Alp0kdm0n.
3. For the function call of ErnieBotChat, align with the
QianfanChatEndpoint.
With this PR, the `QianfanChatEndpoint()` can use the `function calling`
ability with `create_ernie_fn_chain()`. The example is as the following:
```
from langchain.prompts import ChatPromptTemplate
import json
from langchain.prompts.chat import (
ChatPromptTemplate,
)
from langchain.chat_models import QianfanChatEndpoint
from langchain.chains.ernie_functions import (
create_ernie_fn_chain,
)
def get_current_news(location: str) -> str:
"""Get the current news based on the location.'
Args:
location (str): The location to query.
Returs:
str: Current news based on the location.
"""
news_info = {
"location": location,
"news": [
"I have a Book.",
"It's a nice day, today."
]
}
return json.dumps(news_info)
def get_current_weather(location: str, unit: str="celsius") -> str:
"""Get the current weather in a given location
Args:
location (str): location of the weather.
unit (str): unit of the tempuature.
Returns:
str: weather in the given location.
"""
weather_info = {
"location": location,
"temperature": "27",
"unit": unit,
"forecast": ["sunny", "windy"],
}
return json.dumps(weather_info)
template = ChatPromptTemplate.from_messages([
("user", "{user_input}"),
])
chat = QianfanChatEndpoint(model="ERNIE-Bot-4")
chain = create_ernie_fn_chain([get_current_weather, get_current_news], chat, template, verbose=True)
res = chain.run("北京今天的新闻是什么?")
print(res)
```
The result of the above code:
```
> Entering new LLMChain chain...
Prompt after formatting:
Human: 北京今天的新闻是什么?
> Finished chain.
{'name': 'get_current_news', 'arguments': {'location': '北京'}}
```
For the `ErnieBotChat`, now can use the `system` parameter to set the
role of the LLM.
```
from langchain.prompts import ChatPromptTemplate
from langchain.chains import LLMChain
from langchain.chat_models import ErnieBotChat
llm = ErnieBotChat(model_name="ERNIE-Bot-turbo-AI", system="你是一个能力很强的机器人,你的名字叫 小叮当。无论问你什么问题,你都可以给出答案。")
prompt = ChatPromptTemplate.from_messages(
[
("human", "{query}"),
]
)
chain = LLMChain(llm=llm, prompt=prompt, verbose=True)
res = chain.run(query="你是谁?")
print(res)
```
The result of the above code:
```
> Entering new LLMChain chain...
Prompt after formatting:
Human: 你是谁?
> Finished chain.
我是小叮当,一个智能机器人。我可以为你提供各种服务,包括回答问题、提供信息、进行计算等。如果你需要任何帮助,请随时告诉我,我会尽力为你提供最好的服务。
```
- **Description:** Added a notebook to illustrate how to use
`text-embeddings-inference` from huggingface. As
`HuggingFaceHubEmbeddings` was using a deprecated client, I made the
most of this PR updating that too.
- **Issue:** #13286
- **Dependencies**: None
- **Tag maintainer:** @baskaryan
- **Description:** Update code to correctly pass the kwargs
- **Issue:** #14295
- **Dependencies:** -
- **Tag maintainer:**
<--
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
#issue-14295
### Description
Fixed 3 doc issues:
1. `ConfigurableField ` needs to be imported in
`docs/docs/expression_language/how_to/configure.ipynb`
2. use `error` instead of `RateLimitError()` in
`docs/docs/expression_language/how_to/fallbacks.ipynb`
3. I think it might be better to output the fixed json data(when I
looked at this example, I didn't understand its purpose at first, but
then I suddenly realized):
<img width="1219" alt="Screenshot 2023-12-05 at 10 34 13 PM"
src="https://github.com/langchain-ai/langchain/assets/10000925/7623ba13-7b56-4964-8c98-b7430fabc6de">
- **Description:** allows not enforcing function usage when a single
function is passed to an openAI function executable (or corresponding
legacy chain). This is a desired feature in the case where the model
does not have enough information to call a function, and needs to get
back to the user.
- **Issue:** N/A
- **Dependencies:** N/A
- **Tag maintainer:** N/A
Add metadata to the blob object. This makes it easier
to make a pipeline that properly propagates metadata information
from raw content to the derived content.
- Fixes `input_variables=[""]` crashing validations with a template
`"{}"`
- Uses `__cause__` for proper `Exception` chaining in
`check_valid_template`
- **Description:** Fix#11737 issue (extra_tools option of
create_pandas_dataframe_agent is not working),
- **Issue:** #11737 ,
- **Dependencies:** no,
- **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17 I needed this
method at work, so I modified it myself and used it. There is a similar
issue(#11737) and PR(#13018) of @PyroGenesis, so I combined my code at
the original PR.
You may be busy, but it would be great help for me if you checked. Thank
you.
- **Twitter handle:** @lunara_x
If you need an .ipynb example about this, please tag me.
I will share what I am working on after removing any work-related
content.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Enhanced `create_sync_playwright_browser` and
`create_async_playwright_browser` functions to accept a list of
arguments. These arguments are now forwarded to
`browser.chromium.launch()` for customizable browser instantiation.
- **Issue:** #13143
- **Dependencies:** None
- **Tag maintainer:** @eyurtsev,
- **Twitter handle:** Dr_Bearden
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:**
Reference library azure-search-documents has been adapted in version
11.4.0:
1. Notebook explaining Azure AI Search updated with most recent info
2. HnswVectorSearchAlgorithmConfiguration --> HnswAlgorithmConfiguration
3. PrioritizedFields(prioritized_content_fields) -->
SemanticPrioritizedFields(content_fields)
4. SemanticSettings --> SemanticSearch
5. VectorSearch(algorithm_configurations) -->
VectorSearch(configurations)
--> Changes now reflected on Langchain: default vector search config
from langchain is now compatible with officially released library from
Azure.
- **Issue:**
Issue creating a new index (due to wrong class used for default vector
search configuration) if using latest version of azure-search-documents
with current langchain version
- **Dependencies:** azure-search-documents>=11.4.0,
- **Tag maintainer:** ,
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** This PR modifies the LLM validation in OpenAI
function agents to check whether the LLM supports OpenAI functions based
on a property (`supports_oia_functions`) instead of whether the LLM
passed to the agent `isinstance` of `ChatOpenAI`. This allows classes
that extend `BaseChatModel` to be passed to these agents as long as
they've been integrated with the OpenAI APIs and have this property set,
even if they don't extend `ChatOpenAI`.
- **Issue:** N/A
- **Dependencies:** none
for issue https://github.com/langchain-ai/langchain/issues/13162
migrate openai audio api, as [openai v1.0.0 Migration
Guide](https://github.com/openai/openai-python/discussions/742)
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Double Max <max@ground-map.com>
- **Description:** In openapi/planner deal with json in markdown output
cases
- **Issue:** In some cases LLMs could return json in markdown which
can't be loaded.
- **Dependencies:**
- **Tag maintainer:** @eyurtsev
- **Twitter handle:**
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Adds doc key to metadata field when adding document
to Azure Search.
- **Issue:** -,
- **Dependencies:** -,
- **Tag maintainer:** @eyurtsev,
- **Twitter handle:** @finnless
Right now the document key with the name FIELDS_ID is not included in
the FIELDS_METADATA field, and therefore is not included in the Document
returned from a query. This is really annoying if you want to be able to
modify that item in the vectorstore.
Other's thoughts on this are welcome.
Description: There's a copy-paste typo where on_llm_error() calls
_on_chain_error() instead of _on_llm_error().
Issue: #13580
Dependencies: None
Tag maintainer: @hwchase17
Twitter handle: @jwatte
"Run `make format`, `make lint` and `make test` to check this locally."
The test scripts don't work in a plain Ubuntu LTS 20.04 system.
It looks like the dev container pulling is stuck. Or maybe the internet
is just ornery today.
---------
Co-authored-by: jwatte <jwatte@observeinc.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
here it is validating shapely.geometry.point.Point: if not
isinstance(data_frame[page_content_column].iloc[0], gpd.GeoSeries):
raise ValueError(
f"Expected data_frame[{page_content_column}] to be a GeoSeries" you need
it to validate the geoSeries and not the shapely.geometry.point.Point
if not isinstance(data_frame[page_content_column], gpd.GeoSeries):
raise ValueError(
f"Expected data_frame[{page_content_column}] to be a GeoSeries"
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
**Description**
Implements `max_marginal_relevance_search` and
`max_marginal_relevance_search_by_vector` for the Momento Vector Index
vectorstore.
Additionally bumps the `momento` dependency in the lock file and adds
logging to the implementation.
**Dependencies**
✅ updates `momento` dependency in lock file
**Tag maintainer**
@baskaryan
**Twitter handle**
Please tag @momentohq for Momento Vector Index and @mloml for the
contribution 🙇
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Hi! I'm Alex, Python SDK Team Lead from
[Comet](https://www.comet.com/site/).
This PR contains our new integration between langchain and Comet -
`CometTracer` class which uses new `comet_llm` python package for
submitting data to Comet.
No additional dependencies for the langchain package are required
directly, but if the user wants to use `CometTracer`, `comet-llm>=2.0.0`
should be installed. Otherwise an exception will be raised from
`CometTracer.__init__`.
A test for the feature is included.
There is also an already existing callback (and .ipynb file with
example) which ideally should be deprecated in favor of a new tracer. I
wasn't sure how exactly you'd prefer to do it. For example we could open
a separate PR for that.
I'm open to your ideas :)
Running a large number of requests to Embaas' servers (or any server)
can result in intermittent network failures (both from local and
external network/service issues). This PR implements exponential backoff
retries to help mitigate this issue.
The Github utilities are fantastic, so I'm adding support for deeper
interaction with pull requests. Agents should read "regular" comments
and review comments, and the content of PR files (with summarization or
`ctags` abbreviations).
Progress:
- [x] Add functions to read pull requests and the full content of
modified files.
- [x] Function to use Github's built in code / issues search.
Out of scope:
- Smarter summarization of file contents of large pull requests (`tree`
output, or ctags).
- Smarter functions to checkout PRs and edit the files incrementally
before bulk committing all changes.
- Docs example for creating two agents:
- One watches issues: For every new issue, open a PR with your best
attempt at fixing it.
- The other watches PRs: For every new PR && every new comment on a PR,
check the status and try to finish the job.
<!-- Thank you for contributing to LangChain!
Replace this comment with:
- Description: a description of the change,
- Issue: the issue # it fixes (if applicable),
- Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!
Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use.
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: @baskaryan
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
- Models / Prompts: @hwchase17, @baskaryan
- Memory: @hwchase17
- Agents / Tools / Toolkits: @hinthornw
- Tracing / Callbacks: @agola11
- Async: @agola11
If no one reviews your PR within a few days, feel free to @-mention the
same people again.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
The `/docs/integrations/toolkits/vectorstore` page is not the
Integration page. The best place is in `/docs/modules/agents/how_to/`
- Moved the file
- Rerouted the page URL
Allow users to pass a generic `BaseStore[str, bytes]` to
MultiVectorRetriever, removing the need to use the `create_kv_docstore`
method. This encoding will now happen internally.
@rlancemartin @eyurtsev
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
**Description:**
When a RunnableLambda only receives a synchronous callback, this
callback is wrapped into an async one since #13408. However, this
wrapping with `(*args, **kwargs)` causes the `accepts_config` check at
[/libs/core/langchain_core/runnables/config.py#L342](ee94ef55ee/libs/core/langchain_core/runnables/config.py (L342))
to fail, as this checks for the presence of a "config" argument in the
method signature.
Adding a `functools.wraps` around it, resolves it.
If we are not going to make the existing Docstore class also implement
`BaseStore[str, Document]`, IMO all base store implementations should
always be `[str, bytes]` so that they are more interchangeable.
CC @rlancemartin @eyurtsev
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** The existing version hardcoded search.windows.net in
the base url. This is not compatible with the gov cloud. I am allowing
the user to override the default for gov cloud support.,
- **Issue:** N/A, did not write up in an issue,
- **Dependencies:** None
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Nicholas Ceccarelli <nceccarelli2@moog.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Obsidian templates can include
[variables](https://help.obsidian.md/Plugins/Templates#Template+variables)
using double curly braces. `ObsidianLoader` uses PyYaml to parse the
frontmatter of documents. This parsing throws an error when encountering
variables' curly braces. This is avoided by temporarily substituting
safe strings before parsing.
- **Issue:** #13887
- **Tag maintainer:** @hwchase17
Switches to a more maintained solution for building ipynb -> md files
(`quarto`)
Also bumps us down to python3.8 because it's significantly faster in the
vercel build step. Uses default openssl version instead of upgrading as
well.
**Description:**
Adds the document loader for [Couchbase](http://couchbase.com/), a
distributed NoSQL database.
**Dependencies:**
Added the Couchbase SDK as an optional dependency.
**Twitter handle:** nithishr
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Our PR is an integration of a Steam API Tool that
makes recommendations on steam games based on user's Steam profile and
provides information on games based on user provided queries.
- **Issue:** the issue # our PR implements:
https://github.com/langchain-ai/langchain/issues/12120
- **Dependencies:** python-steam-api library, steamspypi library and
decouple library
- **Tag maintainer:** @baskaryan, @hwchase17
- **Twitter handle:** N/A
Hello langchain Maintainers,
We are a team of 4 University of Toronto students contributing to
langchain as part of our course [CSCD01 (link to course
page)](https://cscd01.com/work/open-source-project). We hope our changes
help the community. We have run make format, make lint and make test
locally before submitting the PR. To our knowledge, our changes do not
introduce any new errors.
Our PR integrates the python-steam-api, steamspypi and decouple
packages. We have added integration tests to test our python API
integration into langchain and an example notebook is also provided.
Our amazing team that contributed to this PR: @JohnY2002, @shenceyang,
@andrewqian2001 and @muntaqamahmood
Thank you in advance to all the maintainers for reviewing our PR!
---------
Co-authored-by: Shence <ysc1412799032@163.com>
Co-authored-by: JohnY2002 <johnyuan0526@gmail.com>
Co-authored-by: Andrew Qian <andrewqian2001@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: JohnY <94477598+JohnY2002@users.noreply.github.com>
### Description
Starting from [openai version
1.0.0](17ac677995 (module-level-client)),
the camel case form of `openai.ChatCompletion` is no longer supported
and has been changed to lowercase `openai.chat.completions`. In
addition, the returned object only accepts attribute access instead of
index access:
```python
import openai
# optional; defaults to `os.environ['OPENAI_API_KEY']`
openai.api_key = '...'
# all client options can be configured just like the `OpenAI` instantiation counterpart
openai.base_url = "https://..."
openai.default_headers = {"x-foo": "true"}
completion = openai.chat.completions.create(
model="gpt-4",
messages=[
{
"role": "user",
"content": "How do I output all files in a directory using Python?",
},
],
)
print(completion.choices[0].message.content)
```
So I implemented a compatible adapter that supports both attribute
access and index access:
```python
In [1]: from langchain.adapters import openai as lc_openai
...: messages = [{"role": "user", "content": "hi"}]
In [2]: result = lc_openai.chat.completions.create(
...: messages=messages, model="gpt-3.5-turbo", temperature=0
...: )
In [3]: result.choices[0].message
Out[3]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}
In [4]: result["choices"][0]["message"]
Out[4]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}
In [5]: result = await lc_openai.chat.completions.acreate(
...: messages=messages, model="gpt-3.5-turbo", temperature=0
...: )
In [6]: result.choices[0].message
Out[6]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}
In [7]: result["choices"][0]["message"]
Out[7]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}
In [8]: for rs in lc_openai.chat.completions.create(
...: messages=messages, model="gpt-3.5-turbo", temperature=0, stream=True
...: ):
...: print(rs.choices[0].delta)
...: print(rs["choices"][0]["delta"])
...:
{'role': 'assistant', 'content': ''}
{'role': 'assistant', 'content': ''}
{'content': 'Hello'}
{'content': 'Hello'}
{'content': '!'}
{'content': '!'}
In [20]: async for rs in await lc_openai.chat.completions.acreate(
...: messages=messages, model="gpt-3.5-turbo", temperature=0, stream=True
...: ):
...: print(rs.choices[0].delta)
...: print(rs["choices"][0]["delta"])
...:
{'role': 'assistant', 'content': ''}
{'role': 'assistant', 'content': ''}
{'content': 'Hello'}
{'content': 'Hello'}
{'content': '!'}
{'content': '!'}
...
```
### Twitter handle
[lin_bob57617](https://twitter.com/lin_bob57617)
- **Description:** to support not only publicly available Hugging Face
endpoints, but also protected ones (created with "Inference Endpoints"
Hugging Face feature), I have added ability to specify custom api_url.
But if not specified, default behaviour won't change
- **Issue:** #9181,
- **Dependencies:** no extra dependencies
**Description:** The way the condition is checked in the
`return_stopped_response` function of `OpenAIAgent` may not be correct,
when the value returned is `AgentFinish` from the tools it does not work
properly.
Thanks for review, @baskaryan, @eyurtsev, @hwchase17.
- **Description:** As part of my conversation with Cerebrium team,
`model_api_request` will be no longer available in cerebrium lib so it
needs to be replaced.
- **Issue:** #12705 12705,
- **Dependencies:** Cerebrium team (agreed)
- **Tag maintainer:** @eyurtsev
- **Twitter handle:** No official Twitter account sorry :D
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
The `AWS` platform page has many missed integrations.
- added missed integration references to the `AWS` platform page
- added/updated descriptions and links in the referenced notebooks
- renamed two notebook files. They have file names != page Title, which
generate unordered ToC.
- reroute the URLs for renamed files
- fixed `amazon_textract` notebook: removed failed cell outputs
**Description:** Adding a possibility to use asynchronous callback
handler in human-in-the-loop validation tool. Very useful, for example,
if you want to implement a validation over Telegram bot.
**Issue:** -
**Dependencies:** -
---------
Co-authored-by: Daniyar_Supiyev <daniyar_supiyev@epam.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description**: This PR addresses an issue with the OpenAI API
streaming response, where initially the key (arguments) is provided but
the value is None. Subsequently, it updates with {"arguments": "{\n"},
leading to a type inconsistency that causes an exception. The specific
error encountered is ValueError: additional_kwargs["arguments"] already
exists in this message, but with a different type. This change aims to
resolve this inconsistency and ensure smooth API interactions.
- **Issue**: None.
- **Dependencies**: None.
- **Tag maintainer**: @eyurtsev
This is an updated version of #13229 based on the refactored code.
Credit goes to @superken01.
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** some vector stores have a flag for try deleting the
collection before creating it (such as ´vectorpg´). This is a useful
flag when prototyping indexing pipelines and also for integration tests.
Added the bool flag `pre_delete_collection ` to the constructor (default
False)
- **Tag maintainer:** @hemidactylus
- **Twitter handle:** nicoloboschi
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** This extends `OpenAIEmbeddings` to add support for
non-`tiktoken` based embeddings, specifically for use with the new
`text-generation-webui` API (`--extensions openai`) which does not
support `tiktoken` encodings, but rather strings
- **Issue:** Not found,
- **Dependencies:** HuggingFace `transformers.AutoTokenizer` is new
dependency for running the model without `tiktoken`
- **Tag maintainer:** @baskaryan based on last commit for
`langchain-core` refactor
- **Twitter handle:** @xychelsea
Modified the tokenization process to be model-agnostic, allowing for
both OpenAI and non-OpenAI model tokenizations, by setting the new
default `bool` flag `tiktoken_enabled` to `False`. This requeires
HuggingFace’s AutoTokenizer and handling tokenization for models
requiring different preprocessing steps to generate a chunked string
request rather than a list of integers.
Updated the embeddings generation process to accommodate non-OpenAI
models. This includes converting tokenized text into embeddings using
OpenAI’s and Hugging Face’s model architectures.
-->
Hi,
I made some code changes on the Hologres vector store to improve the
data insertion performance.
Also, this version of the code uses `hologres-vector` library. This
library is more convenient for us to update, and more efficient in
performance.
The code has passed the format/lint/spell check. I have run the unit
test for Hologres connecting to my own database.
Please check this PR again and tell me if anything needs to change.
Best,
Changgeng,
Developer @ Alibaba Cloud
Co-authored-by: Changgeng Zhao <zhaochanggeng.zcg@alibaba-inc.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
`Hugging Face` is definitely a platform. It includes many integrations
for many modules (LLM, Embedding, DocumentLoader, Tool)
So, a doc page was added that defines Hugging Face as a platform.
- **Description:** Fixes the Mathpix PDF loader API integration.
Specifically, ensures that Mathpix auth headers are provided for every
request, and ensures that we recognize all errors that can occur during
a request. Also, the option to provide API keys as kwargs never actually
worked before, but now that's fixed too.
- **Issue:** #11249
- **Dependencies:** None
- **Description:**
This PR introduces the Slack toolkit to LangChain, which allows users to
read and write to Slack using the Slack API. Specifically, we've added
the following tools.
1. get_channel: Provides a summary of all the channels in a workspace.
2. get_message: Gets the message history of a channel.
3. send_message: Sends a message to a channel.
4. schedule_message: Sends a message to a channel at a specific time and
date.
- **Issue:** This pull request addresses [Add Slack Toolkit
#11747](https://github.com/langchain-ai/langchain/issues/11747)
- **Dependencies:** package`slack_sdk`
Note: For this toolkit to function you will need to add a Slack app to
your workspace. Additional info can be found
[here](https://slack.com/help/articles/202035138-Add-apps-to-your-Slack-workspace).
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ArianneLavada <ariannelavada@gmail.com>
Co-authored-by: ArianneLavada <84357335+ArianneLavada@users.noreply.github.com>
Co-authored-by: ariannelavada@gmail.com <you@example.com>
- **Description:** : As described in the issue below,
https://python.langchain.com/docs/use_cases/summarization
I've modified the Python code in the above notebook to perform well.
I also modified the OpenAI LLM model to the latest version as shown
below.
`gpt-3.5-turbo-16k --> gpt-3.5-turbo-1106`
This is because it seems to be a bit more responsive.
- **Issue:** : #14066
Unnecessarily overridden methods:
- Give the idea the subclass is doing something special (when it isn't)
- Block CTRL-click to the actual method
This PR removes some unnecessarily overridden methods in
`StdOutCallbackHandler`
Supercedes https://github.com/langchain-ai/langchain/pull/12858
### Description
The `RateLimitError` initialization method has changed after openai v1,
and the usage of `patch` needs to be changed.
### Twitter handle
[lin_bob57617](https://twitter.com/lin_bob57617)
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Hi,
There is some unintended behavior in Html2TextTransformer.
The current code is **directly modifying the original documents that are
passed as arguments to the function.**
Therefore, not only the return of the function but also the input
variables are being modified simultaneously.
**To resolve this, I added unit test code as well.**
reference link: [Shallow vs Deep Copying of Python
Objects](https://realpython.com/copying-python-objects/)
Thanks! ☺️
Before, we need to use `params` to pass extra parameters:
```python
from langchain.llms import Databricks
Databricks(..., params={"temperature": 0.0})
```
Now, we can directly specify extra params:
```python
from langchain.llms import Databricks
Databricks(..., temperature=0.0)
```
This PR adds an "Azure AI data" document loader, which allows Azure AI
users to load their registered data assets as a document object in
langchain.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
See PR title.
From what I can see, `poetry` will auto-include this. Please let me know
if I am missing something here.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
… properly
Fixed a bug that was causing the streaming transfer to not work
properly.
- **Description:
1、The on_llm_new_token method in the streaming callback can now be
called properly in streaming transfer mode.
2、In streaming transfer mode, LLM can now correctly output the complete
response instead of just the first token.
- **Tag maintainer: @wangxuqi
- **Twitter handle: @kGX7XJjuYxzX9Km
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
* Add support for passing a specific file to the file system blob loader
* Allow specifying a class parameter for the parser for the generic
loader
```python
class AudioLoader(GenericLoader):
@staticmethod
def get_parser(**kwargs):
return MyAudioParser(**kwargs):
```
The intent of the GenericLoader is to provide on-ramps from different
sources (e.g., web, s3, file system).
An alternative is to use pipelining syntax or creating a Pipeline
```
FileSystemBlobLoader(...) | MyAudioParser
```
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Change instances of RunnableMap to RunnableParallel,
as that should be the one used going forward. This makes it consistent
across the codebase.
### Description:
Doc addition for LCEL introduction. Adds a more basic starter guide for
using LCEL.
---------
Co-authored-by: Alex Kira <akira@Alexs-MBP.local.tld>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** just a little change of ErnieChatBot class
description, sugguesting user to use more suitable class
- **Issue:** none,
- **Dependencies:** none,
- **Tag maintainer:** @baskaryan ,
- **Twitter handle:** none
**Description**
`embed_with_retry` is for sync operations and not for async operations.
Use `async_embed_with_retry` for appropriate async operations.
I'm using `OpenAIEmbedding(http_client=httpx.AsyncClient())` with only
async operations.
However, I got an error when I use `embedding.aembed_documents` because
`embed_with_retry` uses sync OpenAI client with async http client.
Description
when the desc of arg in python docstring contains ":", the
`_parse_python_function_docstring` will raise **ValueError: too many
values to unpack (expected 2)**.
A sample desc would be:
"""
Args:
error_arg: this is an arg with an additional ":" symbol
"""
So, set `maxsplit` parameter to fix it.
The number of times I try to format a string (especially in lcel) is
embarrassingly high. Think this may be more actionable than the default
error message. Now I get nice helpful errors
```
KeyError: "Input to ChatPromptTemplate is missing variable 'input'. Expected: ['input'] Received: ['dialogue']"
```
### Description
Now if `example` in Message is False, it will not be displayed. Update
the output in this document.
```python
In [22]: m = HumanMessage(content="Text")
In [23]: m
Out[23]: HumanMessage(content='Text')
In [24]: m = HumanMessage(content="Text", example=True)
In [25]: m
Out[25]: HumanMessage(content='Text', example=True)
```
### Twitter handle
[lin_bob57617](https://twitter.com/lin_bob57617)
**Description:** By combining the document timestamp refresh within a
single call to update(), this enables batching of multiple documents in
a single SQL statement. This is important for non-local databases where
tens of milliseconds has a huge impact on performance when doing
document-by-document SQL statements.
**Issue:** #11935
**Dependencies:** None
**Tag maintainer:** @eyurtsev
- **Description:** Touch up of the documentation page for Metaphor
Search Tool integration. Removes documentation for old built-in tool
wrapper.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
CC @baskaryan @hwchase17 @jmorganca
Having a bit of trouble importing `langchain_experimental` from a
notebook, will figure it out tomorrow
~Ah and also is blocked by #13226~
---------
Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
## Description
Related to https://github.com/mlflow/mlflow/pull/10420. MLflow AI
gateway will be deprecated and replaced by the `mlflow.deployments`
module. Happy to split this PR if it's too large.
```
pip install git+https://github.com/langchain-ai/langchain.git@refs/pull/13699/merge#subdirectory=libs/langchain
```
## Dependencies
Install mlflow from https://github.com/mlflow/mlflow/pull/10420:
```
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/10420/merge
```
## Testing plan
The following code works fine on local and databricks:
<details><summary>Click</summary>
<p>
```python
"""
Setup
-----
mlflow deployments start-server --config-path examples/gateway/openai/config.yaml
databricks secrets create-scope <scope>
databricks secrets put-secret <scope> openai-api-key --string-value $OPENAI_API_KEY
Run
---
python /path/to/this/file.py secrets/<scope>/openai-api-key
"""
from langchain.chat_models import ChatMlflow, ChatDatabricks
from langchain.embeddings import MlflowEmbeddings, DatabricksEmbeddings
from langchain.llms import Databricks, Mlflow
from langchain.schema.messages import HumanMessage
from langchain.chains.loading import load_chain
from mlflow.deployments import get_deploy_client
import uuid
import sys
import tempfile
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
###############################
# MLflow
###############################
chat = ChatMlflow(
target_uri="http://127.0.0.1:5000", endpoint="chat", params={"temperature": 0.1}
)
print(chat([HumanMessage(content="hello")]))
embeddings = MlflowEmbeddings(target_uri="http://127.0.0.1:5000", endpoint="embeddings")
print(embeddings.embed_query("hello")[:3])
print(embeddings.embed_documents(["hello", "world"])[0][:3])
llm = Mlflow(
target_uri="http://127.0.0.1:5000",
endpoint="completions",
params={"temperature": 0.1},
)
print(llm("I am"))
llm_chain = LLMChain(
llm=llm,
prompt=PromptTemplate(
input_variables=["adjective"],
template="Tell me a {adjective} joke",
),
)
print(llm_chain.run(adjective="funny"))
# serialization/deserialization
with tempfile.TemporaryDirectory() as tmpdir:
print(tmpdir)
path = f"{tmpdir}/llm.yaml"
llm_chain.save(path)
loaded_chain = load_chain(path)
print(loaded_chain("funny"))
###############################
# Databricks
###############################
secret = sys.argv[1]
client = get_deploy_client("databricks")
# External - chat
name = f"chat-{uuid.uuid4()}"
client.create_endpoint(
name=name,
config={
"served_entities": [
{
"name": "test",
"external_model": {
"name": "gpt-4",
"provider": "openai",
"task": "llm/v1/chat",
"openai_config": {
"openai_api_key": "{{" + secret + "}}",
},
},
}
],
},
)
try:
chat = ChatDatabricks(
target_uri="databricks", endpoint=name, params={"temperature": 0.1}
)
print(chat([HumanMessage(content="hello")]))
finally:
client.delete_endpoint(endpoint=name)
# External - embeddings
name = f"embeddings-{uuid.uuid4()}"
client.create_endpoint(
name=name,
config={
"served_entities": [
{
"name": "test",
"external_model": {
"name": "text-embedding-ada-002",
"provider": "openai",
"task": "llm/v1/embeddings",
"openai_config": {
"openai_api_key": "{{" + secret + "}}",
},
},
}
],
},
)
try:
embeddings = DatabricksEmbeddings(target_uri="databricks", endpoint=name)
print(embeddings.embed_query("hello")[:3])
print(embeddings.embed_documents(["hello", "world"])[0][:3])
finally:
client.delete_endpoint(endpoint=name)
# External - completions
name = f"completions-{uuid.uuid4()}"
client.create_endpoint(
name=name,
config={
"served_entities": [
{
"name": "test",
"external_model": {
"name": "gpt-3.5-turbo-instruct",
"provider": "openai",
"task": "llm/v1/completions",
"openai_config": {
"openai_api_key": "{{" + secret + "}}",
},
},
}
],
},
)
try:
llm = Databricks(
endpoint_name=name,
model_kwargs={"temperature": 0.1},
)
print(llm("I am"))
finally:
client.delete_endpoint(endpoint=name)
# Foundation model - chat
chat = ChatDatabricks(
endpoint="databricks-llama-2-70b-chat", params={"temperature": 0.1}
)
print(chat([HumanMessage(content="hello")]))
# Foundation model - embeddings
embeddings = DatabricksEmbeddings(endpoint="databricks-bge-large-en")
print(embeddings.embed_query("hello")[:3])
# Foundation model - completions
llm = Databricks(
endpoint_name="databricks-mpt-7b-instruct", model_kwargs={"temperature": 0.1}
)
print(llm("hello"))
llm_chain = LLMChain(
llm=llm,
prompt=PromptTemplate(
input_variables=["adjective"],
template="Tell me a {adjective} joke",
),
)
print(llm_chain.run(adjective="funny"))
# serialization/deserialization
with tempfile.TemporaryDirectory() as tmpdir:
print(tmpdir)
path = f"{tmpdir}/llm.yaml"
llm_chain.save(path)
loaded_chain = load_chain(path)
print(loaded_chain("funny"))
```
Output:
```
content='Hello! How can I assist you today?'
[-0.025058426, -0.01938856, -0.027781019]
[-0.025058426, -0.01938856, -0.027781019]
sorry, but I cannot continue the sentence as it is incomplete. Can you please provide more information or context?
Sure, here's a classic one for you:
Why don't scientists trust atoms?
Because they make up everything!
/var/folders/dz/cd_nvlf14g9g__n3ph0d_0pm0000gp/T/tmpx_4no6ad
{'adjective': 'funny', 'text': "Sure, here's a classic one for you:\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything!"}
content='Hello! How can I assist you today?'
[-0.025058426, -0.01938856, -0.027781019]
[-0.025058426, -0.01938856, -0.027781019]
a 23 year old female and I am currently studying for my master's degree
content="\nHello! It's nice to meet you. Is there something I can help you with or would you like to chat for a bit?"
[0.051055908203125, 0.007221221923828125, 0.003879547119140625]
[0.051055908203125, 0.007221221923828125, 0.003879547119140625]
hello back
Well, I don't really know many jokes, but I do know this funny story...
/var/folders/dz/cd_nvlf14g9g__n3ph0d_0pm0000gp/T/tmp7_ds72ex
{'adjective': 'funny', 'text': " Well, I don't really know many jokes, but I do know this funny story..."}
```
</p>
</details>
The existing workflow doesn't break:
<details><summary>click</summary>
<p>
```python
import uuid
import mlflow
from mlflow.models import ModelSignature
from mlflow.types.schema import ColSpec, Schema
class MyModel(mlflow.pyfunc.PythonModel):
def predict(self, context, model_input):
return str(uuid.uuid4())
with mlflow.start_run():
mlflow.pyfunc.log_model(
"model",
python_model=MyModel(),
pip_requirements=["mlflow==2.8.1", "cloudpickle<3"],
signature=ModelSignature(
inputs=Schema(
[
ColSpec("string", "prompt"),
ColSpec("string", "stop"),
]
),
outputs=Schema(
[
ColSpec(name=None, type="string"),
]
),
),
registered_model_name=f"lang-{uuid.uuid4()}",
)
# Manually create a serving endpoint with the registered model and run
from langchain.llms import Databricks
llm = Databricks(endpoint_name="<name>")
llm("hello") # 9d0b2491-3d13-487c-bc02-1287f06ecae7
```
</p>
</details>
## Follow-up tasks
(This PR is too large. I'll file a separate one for follow-up tasks.)
- Update `docs/docs/integrations/providers/mlflow_ai_gateway.mdx` and
`docs/docs/integrations/providers/databricks.md`.
---------
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
…parameters.
In Langchain's `dumps()` function, I've added a `**kwargs` parameter.
This allows users to pass additional parameters to the underlying
`json.dumps()` function, providing greater flexibility and control over
JSON serialization.
Many parameters available in `json.dumps()` can be useful or even
necessary in specific situations. For example, when using an Agent with
return_intermediate_steps set to true, the output is a list of
AgentAction objects. These objects can't be serialized without using
Langchain's `dumps()` function.
The issue arises when using the Agent with a language other than
English, which may contain non-ASCII characters like 'é'. The default
behavior of `json.dumps()` sets ensure_ascii to true, converting
`{"name": "José"}` into `{"name": "Jos\u00e9"}`. This can make the
output hard to read, especially in the case of intermediate steps in
agent logs.
By allowing users to pass additional parameters to `json.dumps()` via
Langchain's dumps(), we can solve this problem. For instance, users can
set `ensure_ascii=False` to maintain the original characters.
This update also enables users to pass other useful `json.dumps()`
parameters like `sort_keys`, providing even more flexibility.
The implementation takes into account edge cases where a user might pass
a "default" parameter, which is already defined by `dumps()`, or an
"indent" parameter, which is also predefined if `pretty=True` is set.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
### Description
Hello,
The [integration_test
README](https://github.com/langchain-ai/langchain/tree/master/libs/langchain/tests)
was indicating incorrect paths for the `.env.example` and `.env` files.
`tests/.env.example` ->`tests/integration_tests/.env.example`
While it’s a minor error, it could **potentially lead to confusion** for
the document’s readers, so I’ve made the necessary corrections.
Thank you! ☺️
### Related Issue
- https://github.com/langchain-ai/langchain/pull/2806
**Description:**
Added support for a Pandas DataFrame OutputParser with format
instructions, along with unit tests and a demo notebook. Namely, we've
added the ability to request data from a DataFrame, have the LLM parse
the request, and then use that request to retrieve a well-formatted
response.
Within LangChain, it seamlessly integrates with language models like
OpenAI's `text-davinci-003`, facilitating streamlined interaction using
the format instructions (just like the other output parsers).
This parser structures its requests as
`<operation/column/row>[<optional_array_params>]`. The instructions
detail permissible operations, valid columns, and array formats,
ensuring clarity and adherence to the required format.
For example:
- When the LLM receives the input: "Retrieve the mean of `num_legs` from
rows 1 to 3."
- The provided format instructions guide the LLM to structure the
request as: "mean:num_legs[1..3]".
The parser processes this formatted request, leveraging the LLM's
understanding to extract the mean of `num_legs` from rows 1 to 3 within
the Pandas DataFrame.
This integration allows users to communicate requests naturally, with
the LLM transforming these instructions into structured commands
understood by the `PandasDataFrameOutputParser`. The format instructions
act as a bridge between natural language queries and precise DataFrame
operations, optimizing communication and data retrieval.
**Issue:**
- https://github.com/langchain-ai/langchain/issues/11532
**Dependencies:**
No additional dependencies :)
**Tag maintainer:**
@baskaryan
**Twitter handle:**
No need. :)
---------
Co-authored-by: Wasee Alam <waseealam@protonmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
**Description:**
When using Vald, only insecure grpc connection was supported, so secure
connection is now supported.
In addition, grpc metadata can be added to Vald requests to enable
authentication with a token.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
grammar correction
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Response_if_no_docs_found is not implemented in
ConversationalRetrievalChain for async code paths. Implemented it and
added test cases
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
# Description
This PR implements Self-Query Retriever for MongoDB Atlas vector store.
I've implemented the comparators and operators that are supported by
MongoDB Atlas vector store according to the section titled "Atlas Vector
Search Pre-Filter" from
https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-stage/.
Namely:
```
allowed_comparators = [
Comparator.EQ,
Comparator.NE,
Comparator.GT,
Comparator.GTE,
Comparator.LT,
Comparator.LTE,
Comparator.IN,
Comparator.NIN,
]
"""Subset of allowed logical operators."""
allowed_operators = [
Operator.AND,
Operator.OR
]
```
Translations from comparators/operators to MongoDB Atlas filter
operators(you can find the syntax in the "Atlas Vector Search
Pre-Filter" section from the previous link) are done using the following
dictionary:
```
map_dict = {
Operator.AND: "$and",
Operator.OR: "$or",
Comparator.EQ: "$eq",
Comparator.NE: "$ne",
Comparator.GTE: "$gte",
Comparator.LTE: "$lte",
Comparator.LT: "$lt",
Comparator.GT: "$gt",
Comparator.IN: "$in",
Comparator.NIN: "$nin",
}
```
In visit_structured_query() the filters are passed as "pre_filter" and
not "filter" as in the MongoDB link above since langchain's
implementation of MongoDB atlas vector
store(libs\langchain\langchain\vectorstores\mongodb_atlas.py) in
_similarity_search_with_score() sets the "filter" key to have the value
of the "pre_filter" argument.
```
params["filter"] = pre_filter
```
Test cases and documentation have also been added.
# Issue
#11616
# Dependencies
No new dependencies have been added.
# Documentation
I have created the notebook mongodb_atlas_self_query.ipynb outlining the
steps to get the self-query mechanism working.
I worked closely with [@Farhan-Faisal](https://github.com/Farhan-Faisal)
on this PR.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
# Description
We implemented a simple tool for accessing the Merriam-Webster
Collegiate Dictionary API
(https://dictionaryapi.com/products/api-collegiate-dictionary).
Here's a simple usage example:
```py
from langchain.llms import OpenAI
from langchain.agents import load_tools, initialize_agent, AgentType
llm = OpenAI()
tools = load_tools(["serpapi", "merriam-webster"], llm=llm) # Serp API gives our agent access to Google
agent = initialize_agent(
tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
agent.run("What is the english word for the german word Himbeere? Define that word.")
```
Sample output:
```
> Entering new AgentExecutor chain...
I need to find the english word for Himbeere and then get the definition of that word.
Action: Search
Action Input: "English word for Himbeere"
Observation: {'type': 'translation_result'}
Thought: Now I have the english word, I can look up the definition.
Action: MerriamWebster
Action Input: raspberry
Observation: Definitions of 'raspberry':
1. rasp-ber-ry, noun: any of various usually black or red edible berries that are aggregate fruits consisting of numerous small drupes on a fleshy receptacle and that are usually rounder and smaller than the closely related blackberries
2. rasp-ber-ry, noun: a perennial plant (genus Rubus) of the rose family that bears raspberries
3. rasp-ber-ry, noun: a sound of contempt made by protruding the tongue between the lips and expelling air forcibly to produce a vibration; broadly : an expression of disapproval or contempt
4. black raspberry, noun: a raspberry (Rubus occidentalis) of eastern North America that has a purplish-black fruit and is the source of several cultivated varieties —called also blackcap
Thought: I now know the final answer.
Final Answer: Raspberry is an english word for Himbeere and it is defined as any of various usually black or red edible berries that are aggregate fruits consisting of numerous small drupes on a fleshy receptacle and that are usually rounder and smaller than the closely related blackberries.
> Finished chain.
```
# Issue
This closes#12039.
# Dependencies
We added no extra dependencies.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Lara <63805048+larkgz@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Update the document for drop box loader + made the
messages more verbose when loading pdf file since people were getting
confused
- **Issue:** #13952
- **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17,
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** Added a tool called RedditSearchRun and an
accompanying API wrapper, which searches Reddit for posts with support
for time filtering, post sorting, query string and subreddit filtering.
- **Issue:** #13891
- **Dependencies:** `praw` module is used to search Reddit
- **Tag maintainer:** @baskaryan , and any of the other maintainers if
needed
- **Twitter handle:** None.
Hello,
This is our first PR and we hope that our changes will be helpful to the
community. We have run `make format`, `make lint` and `make test`
locally before submitting the PR. To our knowledge, our changes do not
introduce any new errors.
Our PR integrates the `praw` package which is already used by
RedditPostsLoader in LangChain. Nonetheless, we have added integration
tests and edited unit tests to test our changes. An example notebook is
also provided. These changes were put together by me, @Anika2000,
@CharlesXu123, and @Jeremy-Cheng-stack
Thank you in advance to the maintainers for their time.
---------
Co-authored-by: What-Is-A-Username <49571870+What-Is-A-Username@users.noreply.github.com>
Co-authored-by: Anika2000 <anika.sultana@mail.utoronto.ca>
Co-authored-by: Jeremy Cheng <81793294+Jeremy-Cheng-stack@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Volc Engine MaaS serves as an enterprise-grade,
large-model service platform designed for developers. You can visit its
homepage at https://www.volcengine.com/docs/82379/1099455 for details.
This change will facilitate developers to integrate quickly with the
platform.
- **Issue:** None
- **Dependencies:** volcengine
- **Tag maintainer:** @baskaryan
- **Twitter handle:** @he1v3tica
---------
Co-authored-by: lvzhong <lvzhong@bytedance.com>
- **Description:** use post field validation for `CohereRerank`
- **Issue:** #12899 and #13058
- **Dependencies:**
- **Tag maintainer:** @baskaryan
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Update 5 pdf document loaders in
`langchain.document_loaders.pdf`, to store a url in the metadata
(instead of a temporary, local file path) if the user provides a web
path to a pdf: `PyPDFium2Loader`, `PDFMinerLoader`,
`PDFMinerPDFasHTMLLoader`, `PyMuPDFLoader`, and `PDFPlumberLoader` were
updated.
- The updates follow the approach used to update `PyPDFLoader` for the
same behavior in #12092
- The `PyMuPDFLoader` changes required additional work in updating
`langchain.document_loaders.parsers.pdf.PyMuPDFParser` to be able to
process either an `io.BufferedReader` (from local pdf) or `io.BytesIO`
(from online pdf)
- The `PDFMinerPDFasHTMLLoader` change used a simpler approach since the
metadata is assigned by the loader and not the parser
- **Issue:** Fixes#7034
- **Dependencies:** None
```python
# PyPDFium2Loader example:
# old behavior
>>> from langchain.document_loaders import PyPDFium2Loader
>>> loader = PyPDFium2Loader('https://arxiv.org/pdf/1706.03762.pdf')
>>> docs = loader.load()
>>> docs[0].metadata
{'source': '/var/folders/7z/d5dt407n673drh1f5cm8spj40000gn/T/tmpm5oqa92f/tmp.pdf', 'page': 0}
# new behavior
>>> from langchain.document_loaders import PyPDFium2Loader
>>> loader = PyPDFium2Loader('https://arxiv.org/pdf/1706.03762.pdf')
>>> docs = loader.load()
>>> docs[0].metadata
{'source': 'https://arxiv.org/pdf/1706.03762.pdf', 'page': 0}
```
- **Description:** Updated to remove deprecated parameter penalty_alpha,
and use string variation of prompt rather than json object for better
flexibility. - **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** N/A
- **Tag maintainer:** @eyurtsev
- **Twitter handle:** @symbldotai
---------
Co-authored-by: toshishjawale <toshish@symbl.ai>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Instead of using JSON-like syntax to describe node and relationship
properties we changed to a shorter and more concise schema description
Old:
```
Node properties are the following:
[{'properties': [{'property': 'name', 'type': 'STRING'}], 'labels': 'Movie'}, {'properties': [{'property': 'name', 'type': 'STRING'}], 'labels': 'Actor'}]
Relationship properties are the following:
[]
The relationships are the following:
['(:Actor)-[:ACTED_IN]->(:Movie)']
```
New:
```
Node properties are the following:
Movie {name: STRING},Actor {name: STRING}
Relationship properties are the following:
The relationships are the following:
(:Actor)-[:ACTED_IN]->(:Movie)
```
Implements
[#12115](https://github.com/langchain-ai/langchain/issues/12115)
Who can review?
@baskaryan , @eyurtsev , @hwchase17
Integrated Stack Exchange API into Langchain, enabling access to diverse
communities within the platform. This addition enhances Langchain's
capabilities by allowing users to query Stack Exchange for specialized
information and engage in discussions. The integration provides seamless
interaction with Stack Exchange content, offering content from varied
knowledge repositories.
A notebook example and test cases were included to demonstrate the
functionality and reliability of this integration.
- Add StackExchange as a tool.
- Add unit test for the StackExchange wrapper and tool.
- Add documentation for the StackExchange wrapper and tool.
If you have time, could you please review the code and provide any
feedback as necessary! My team is welcome to any suggestions.
---------
Co-authored-by: Yuval Kamani <yuvalkamani@gmail.com>
Co-authored-by: Aryan Thakur <aryanthakur@Aryans-MacBook-Pro.local>
Co-authored-by: Manas1818 <79381912+manas1818@users.noreply.github.com>
Co-authored-by: aryan-thakur <61063777+aryan-thakur@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** The class allows to only select between a few
predefined prompts from the paper. That is not ideal, since other use
cases might need a custom prompt. The changes made allow for this. To be
able to monitor those, I also added functionality to supply a custom
run_manager.
- **Issue:** no issue, but a new feature,
- **Dependencies:** none,
- **Tag maintainer:** @hwchase17,
- **Twitter handle:** @yvesloy
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Support providing whatever extra parameters you want
to the Mathpix PDF loader API request.
- **Issue:** #12773
- **Dependencies:** None
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Adds a tqdm progress bar to GooglePalmEmbeddings when
embedding a list.
- **Issue:** #13637
- **Dependencies:** TQDM as a main dependency (instead of extra)
Signed-off-by: ugm2 <unaigaraymaestre@gmail.com>
---------
Signed-off-by: ugm2 <unaigaraymaestre@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
langsmith is available on conda-forge as well and also a dependency of
the package so it gets installed either way by conda
306ed13308/recipe/meta.yaml (L43)
This PR is fixing an attributeError: object endpoint has no attribute
"_public_match_client" when using gcp matching engine with private VPC
network.
@baskaryan, @eyurtsev, @hwchase17.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** As of OpenAI's Python package 1.0, the existing
DallEAPIWrapper does not work correctly, so the example in the LangChain
Documentation link below does not work either.
https://python.langchain.com/docs/integrations/tools/dalle_image_generator
Also, since OpenAI only supports DALL-E version 2 or version 3, I
modified the DallEAPIWrapper to support it.
- **Issue:** #13825
- **Twitter handle:** ggeutzzang
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Small fix to _summarization_ example, `reduce_template` should use
`{docs}` variable.
Bug likely introduced as following code suggests using
`hub.pull("rlm/map-prompt")` instead of defined prompt.
- **Description:** According to the document
https://cloud.baidu.com/doc/WENXINWORKSHOP/s/6lp69is2a, add ERNIE-Bot-8K
model support for ErnieBotChat.
- **Dependencies:** Before using the ERNIE-Bot-8K, you should have the
model's access authority.
Replace this entire comment with:
- **Description:** updates `create_llm_result` function within
`openai.py` to consider latest `params`,
- **Issue:** #8928
- **Dependencies:** -,
- **Tag maintainer:** -
- **Twitter handle:** [burkomr](https://twitter.com/burkomr)
<!-- If no one reviews your PR within a few days, please @-mention one
of @baskaryan, @eyurtsev, @hwchase17. -->
---------
Co-authored-by: Burak Ömür <burakomur@retorio.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Replace this entire comment with:
- **Description:** VertexAI models are now GA, moved away from using
preview ones from the SDK
- **Issue:** #13606
---------
Co-authored-by: Nuno Campos <nuno@boringbits.io>
### Description:
Hey 👋🏽 this is a small docs example fix. Hoping it helps future developers who are working with Langchain.
### Problem:
Take a look at the original example code. You were not able to get the `dialogue_turn[0]` while it was a tuple.
Original code:
```python
def _format_chat_history(chat_history: List[Tuple]) -> str:
buffer = ""
for dialogue_turn in chat_history:
human = "Human: " + dialogue_turn[0]
ai = "Assistant: " + dialogue_turn[1]
buffer += "\n" + "\n".join([human, ai])
return buffer
```
In the original code you were getting this error:
```bash
human = "Human: " + dialogue_turn[0].content
~~~~~~~~~~~~~^^^
TypeError: 'HumanMessage' object is not subscriptable
```
### Solution:
The fix is to just for loop over the chat history and look to see if its a human or ai message and add it to the buffer.
**Description:**
Repair Wikipedia document loader `load_max_docs` and improve test
coverage.
**Issue:**
The Wikipedia document loader was not respecting the `load_max_docs`
paramater (not reported) and would always return a maximum of 10
documents. This is because the API wrapper (in `utilities/wikipedia.py`)
wasn't passing `top_k_results` to the underlying [Wikipedia
library](https://wikipedia.readthedocs.io/en/latest/code.html#module-wikipedia).
By default this library returns 10 results.
The default number of results for the document loader has been reduced
from 100 to 25. This is because loading 100 results takes a very long
time and is an inconvenient default. It should possibly be 10.
In addition, the documentation for the loader reported that there was a
hard limit (300) on the number of documents returned. In actuality 300
is the maximum Wikipedia query character length set by the API wrapper.
Tests have been added for the document loader (previously missing) and
to test the correct numbers of documents are being returned by each
class, both by default, and when overridden. Also repaired is the
`assert_docs` test which has been updated to correctly test for the
default metadata (which includes `source` in recent releases).
**Dependencies:**
nil
**Tag maintainer:**
@leo-gan
**Twitter handle:**
@queenvictoria
### **Description:**
Previously `python_repl` was a built-in tool, but now it has been moved
to `langchain_experimental`.
When I use `load_tools` I get an error:
```python
In [1]: from langchain.agents import load_tools
In [2]: load_tools(["python_repl"])
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
Cell In[2], line 1
----> 1 load_tools(["python_repl"])
File ~/workspace/langchain/libs/langchain/langchain/agents/load_tools.py:530, in load_tools(tool_names, llm, callbacks, **kwargs)
528 tool_names.extend(requests_method_tools)
529 elif name in _BASE_TOOLS:
--> 530 tools.append(_BASE_TOOLS[name]())
531 elif name in _LLM_TOOLS:
532 if llm is None:
File ~/workspace/langchain/libs/langchain/langchain/agents/load_tools.py:84, in _get_python_repl()
83 def _get_python_repl() -> BaseTool:
---> 84 raise ImportError(
85 "This tool has been moved to langchain experiment. "
86 "This tool has access to a python REPL. "
87 "For best practices make sure to sandbox this tool. "
88 "Read https://github.com/langchain-ai/langchain/blob/master/SECURITY.md "
89 "To keep using this code as is, install langchain experimental and "
90 "update relevant imports replacing 'langchain' with 'langchain_experimental'"
91 )
ImportError: This tool has been moved to langchain experiment. This tool has access to a python REPL. For best practices make sure to sandbox this tool. Read https://github.com/langchain-ai/langchain/blob/master/SECURITY.md To keep using this code as is, install langchain experimental and update relevant imports replacing 'langchain' with 'langchain_experimental'
```
In this case, it will be very confusing. I think it is no longer a
built-in tool now, so it can be removed from `_BASE_TOOLS`
### **Issue:**
https://github.com/langchain-ai/langchain/issues/13858,
https://github.com/langchain-ai/langchain/issues/13859,
https://github.com/langchain-ai/langchain/issues/13856
### **Twitter handle:**
[lin_bob57617](https://twitter.com/lin_bob57617)
The `integrations/vectorstores/matchingengine.ipynb` example has the
"Google Vertex AI Vector Search" title. This place this Title in the
wrong order in the ToC (it is sorted by the file name).
- Renamed `integrations/vectorstores/matchingengine.ipynb` into
`integrations/vectorstores/google_vertex_ai_vector_search.ipynb`.
- Updated a correspondent comment in docstring
- Rerouted old URL to a new URL
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
It was :
`from langchain.schema.prompts import BasePromptTemplate`
but because of the breaking change in the ns, it is now
`from langchain.schema.prompt_template import BasePromptTemplate`
This bug prevents building the API Reference for the langchain_experimental
Addressed this issue with the top menu: It allocates too much space. If the screen is small, then the top menu items are split into two lines and look unreadable.
Another issue is with several top menu items: "Chat our docs" and "Also by LangChain". They are compound of several words which also hurts readability. The top menu items should be 1-word size.
Updates:
- "Chat our docs" -> "Chat" (the meaning is clean after clicking/opening the item)
- "Also by LangChain" -> "🦜️🔗"
- "🦜️🔗" moved before "Chat" item. This new item is partially copied from the first left item, the "🦜️🔗 LangChain". This design (with two 🦜️🔗 elements, visually splits the top menu into two parts. The first item in each part holds the 🦜️🔗 symbols and, when we click the second 🦜️🔗 item, it opens the drop-down menu. So, we've got two visually similar parts, which visually split the top menu on the right side: the LangChain Docs (and Doc-related items) and the lift side: other LangChain.ai (company) products/docs.
There are the following main changes in this PR:
1. Rewrite of the DocugamiLoader to not do any XML parsing of the DGML
format internally, and instead use the `dgml-utils` library we are
separately working on. This is a very lightweight dependency.
2. Added MMR search type as an option to multi-vector retriever, similar
to other retrievers. MMR is especially useful when using Docugami for
RAG since we deal with large sets of documents within which a few might
be duplicates and straight similarity based search doesn't give great
results in many cases.
We are @docugami on twitter, and I am @tjaffri
---------
Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
- **Description:** Adds a retriever implementation for [Knowledge Bases
for Amazon Bedrock](https://aws.amazon.com/bedrock/knowledge-bases/), a
new service announced at AWS re:Invent, shortly before this PR was
opened. This depends on the `bedrock-agent-runtime` service, which will
be included in a future version of `boto3` and of `botocore`. We will
open a follow-up PR documenting the minimum required versions of `boto3`
and `botocore` after that information is available.
- **Issue:** N/A
- **Dependencies:** `boto3>=1.33.2, botocore>=1.33.2`
- **Tag maintainer:** @baskaryan
- **Twitter handles:** `@pjain7` `@dead_letter_q`
This PR includes a documentation notebook under
`docs/docs/integrations/retrievers`, which I (@dlqqq) have verified
independently.
EDIT: `bedrock-agent-runtime` service is now included in
`boto3>=1.33.2`:
5cf793f493
---------
Co-authored-by: Piyush Jain <piyushjain@duck.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Addressing incorrect order being sent to callbacks / tracers, due to the
nature of threading
---------
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Add arg to omit streamed_output list, in cases where final_output is
enough this saves bandwidth
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
This PR rearranges the docstring for the `AstraDB` vector store class so
as to have all useful information in the _class_ docstring for ease of
reading.
(incidentally, due to an oversight, the docstring that was in the
constructor ended up buried below some lines of code, thereby
disappearing altogether from accessibility. Apologies.)
- **Description:** Updates to `AnthropicFunctions` to be compatible with
the OpenAI `function_call` functionality.
- **Issue:** The functionality to indicate `auto`, `none` and a forced
function_call was not completely implemented in the existing code.
- **Dependencies:** None
- **Tag maintainer:** @baskaryan , and any of the other maintainers if
needed.
- **Twitter handle:** None
I have specifically tested this functionality via AWS Bedrock with the
Claude-2 and Claude-Instant models.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** dead link replacement
- **Issue:** no open issue
**Note:**
Hi langchain team,
Sorry to open a PR for this concern but we realized that one of the
links present in the documentation booklet was broken 😄
- **Description:** We are adding functionality to extract message
content from the `attributedBody` field of the database, in case the
content is not in the `text` field.
- **Issue:** Closes#13326 and #10680
- **Dependencies:** None.
- **Tag maintainer:** @eyurtsev, @hwchase17
---------
Co-authored-by: onotate <johnp.pham@mail.utoronto.ca>
- **Description:** Reduce image asset file size used in documentation by
running them via lossless image optimization
([tinypng](https://www.npmjs.com/package/tinypng-cli) was used in this
case). Images wider than 1916px (the maximum width of an image displayed
in documentation) where downsized.
- **Issue:** No issue is created for this, but the large image file
assets caused slow documentation load times
- **Dependencies:** No dependencies affected
- **Description:** Previously `MarkdownHeaderTextSplitter` did not
consider tilde-fenced code blocks
(https://spec.commonmark.org/0.30/#fenced-code-blocks). This PR fixes
that.
````md
# Bug caused by previous implementation:
~~~py
foo()
# This is a comment that would be considered header
bar()
~~~
````
- **Tag maintainer:** @baskaryan
Several bug fixes:
- emails: instead of `bcc` the `cc` is used.
- errors in the truncation descriptions
- no truncation of the `message_search`
Several updates:
- generalized UTC format
- truncation limit can be changed now in _call()
Fixes#13407.
This workaround consists in letting the RunnableLambda create its
self.afunc from its self.func when self.afunc is not provided; the
change has no dependency.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
```
---- chunk 1
{'actions': [AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})])],
'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]}
---- chunk 2
{'messages': [FunctionMessage(content="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”", name='Search')],
'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]), observation="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”")]}
---- chunk 3
{'actions': [AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})])],
'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]}
---- chunk 4
{'messages': [FunctionMessage(content='25 years', name='Search')],
'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]), observation='25 years')]}
---- chunk 5
{'actions': [AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})])],
'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})]}
---- chunk 6
{'messages': [FunctionMessage(content='Answer: 3.991298452658078', name='Calculator')],
'steps': [AgentStep(action=AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})]), observation='Answer: 3.991298452658078')]}
---- chunk 7
{'messages': [AIMessage(content="Leonardo DiCaprio's current girlfriend is the Italian model Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 power is approximately 3.99.")],
'output': "Leonardo DiCaprio's current girlfriend is the Italian model "
'Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 '
'power is approximately 3.99.'}
---- final
{'actions': [AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]),
AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]),
AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})])],
'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}}),
FunctionMessage(content="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”", name='Search'),
AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}}),
FunctionMessage(content='25 years', name='Search'),
AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}}),
FunctionMessage(content='Answer: 3.991298452658078', name='Calculator'),
AIMessage(content="Leonardo DiCaprio's current girlfriend is the Italian model Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 power is approximately 3.99.")],
'output': "Leonardo DiCaprio's current girlfriend is the Italian model "
'Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 '
'power is approximately 3.99.',
'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]), observation="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”"),
AgentStep(action=AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]), observation='25 years'),
AgentStep(action=AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})]), observation='Answer: 3.991298452658078')]}
```
- **Description:** Existing model used for Prompt Injection is quite
outdated but we fine-tuned and open-source a new model based on the same
model deberta-v3-base from Microsoft -
[laiyer/deberta-v3-base-prompt-injection](https://huggingface.co/laiyer/deberta-v3-base-prompt-injection).
It supports more up-to-date injections and less prone to
false-positives.
- **Dependencies:** No
- **Tag maintainer:** -
- **Twitter handle:** @alex_yaremchuk
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** The experimental package needs to be compatible with
the usage of importing agents
For example, if i use `from langchain.agents import
create_pandas_dataframe_agent`, running the program will prompt the
following information:
```
Traceback (most recent call last):
File "/Users/dongwm/test/main.py", line 1, in <module>
from langchain.agents import create_pandas_dataframe_agent
File "/Users/dongwm/test/venv/lib/python3.11/site-packages/langchain/agents/__init__.py", line 87, in __getattr__
raise ImportError(
ImportError: create_pandas_dataframe_agent has been moved to langchain experimental. See https://github.com/langchain-ai/langchain/discussions/11680 for more information.
Please update your import statement from: `langchain.agents.create_pandas_dataframe_agent` to `langchain_experimental.agents.create_pandas_dataframe_agent`.
```
But when I changed to `from langchain_experimental.agents import
create_pandas_dataframe_agent`, it was actually wrong:
```python
Traceback (most recent call last):
File "/Users/dongwm/test/main.py", line 2, in <module>
from langchain_experimental.agents import create_pandas_dataframe_agent
ImportError: cannot import name 'create_pandas_dataframe_agent' from 'langchain_experimental.agents' (/Users/dongwm/test/venv/lib/python3.11/site-packages/langchain_experimental/agents/__init__.py)
```
I should use `from langchain_experimental.agents.agent_toolkits import
create_pandas_dataframe_agent`. In order to solve the problem and make
it compatible, I added additional import code to the
langchain_experimental package. Now it can be like this Used `from
langchain_experimental.agents import create_pandas_dataframe_agent`
- **Twitter handle:** [lin_bob57617](https://twitter.com/lin_bob57617)
- **Description:** Adds a tqdm progress bar to OllamaEmbeddings when
embedding a list.
- **Issue:** Related to #13637, but extended to Ollama.
- **Dependencies:** `tqdm` made a necessary dependency.
Thanks to @ugm2 for helping identify a common problem. Embeddings take a
very long time to finish on local machines, and require a progress bar
to help identify if one should even attempt the workload.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Adding rag-opensearch template.
---------
Signed-off-by: kalyanr <kalyan.ben10@live.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Current docs for adapters are in the `Guides/Adapters which is not a
good place.
- moved Adapters into `Integratons/Components/Adapters/
- simplified the OpenAI adapter notebook
- rerouted the old OpenAI adapter page URL to a new one.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** Added a line to pass the tenant parameter to
add_data_object
- **Issue:** An extra line added from the fix for #9956
- **Dependencies:** n/a
- **Tag maintainer:** @baskaryan
Tested locally, works as expected with the line change.
---------
Co-authored-by: Simon Dai <simon6752@gmail.com>
Description: Some Elastic indexes do not return a 'metadata' field in
'_source'. However, prior to this PR, the code assumed there always is a
'metadata' field. This PR adds support for cases where the field is
missing by adding it manually.
Issue: #13869
**Description:**
This PR adds Databricks Vector Search as a new vector store in
LangChain.
- [x] Add `DatabricksVectorSearch` in `langchain/vectorstores/`
- [x] Unit tests
- [x] Add
[`databricks-vectorsearch`](https://pypi.org/project/databricks-vectorsearch/)
as a new optional dependency
We ran the following checks:
- `make format` passed ✅
- `make lint` failed but the failures were caused by other files
+ Files touched by this PR passed the linter ✅
- `make test` passed ✅
- `make coverage` failed but the failures were caused by other files.
Tests added by or related to this PR all passed
+ langchain/vectorstores/databricks_vector_search.py test coverage 94% ✅
- `make spell_check` passed ✅
The example notebook and updates to the [provider's documentation
page](https://github.com/langchain-ai/langchain/blob/master/docs/docs/integrations/providers/databricks.md)
will be added later in a separate PR.
**Dependencies:**
Optional dependency:
[`databricks-vectorsearch`](https://pypi.org/project/databricks-vectorsearch/)
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
The cookbook had some code to upload files, and wait for the processing
to finish.
This code is now moved to the `docugami` library so removing from the
cookbook to simplify.
Thanks @rlancemartin for suggesting this when working on evals.
---------
Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
This pull request addresses an issue found in the example code within
the docstring of `libs/core/langchain_core/runnables/passthrough.py`
The original code snippet caused a `NameError` due to the missing import
of `RunnableLambda`. The error was as follows:
```
12 return "completion"
13
---> 14 chain = RunnableLambda(fake_llm) | {
15 'original': RunnablePassthrough(), # Original LLM output
16 'parsed': lambda text: text[::-1] # Parsing logic
NameError: name 'RunnableLambda' is not defined
```
To resolve this, I have modified the example code to include the
necessary import statement for `RunnableLambda`. Additionally, I have
adjusted the indentation in the code snippet to ensure consistency and
readability.
The modified code now successfully defines and utilizes
`RunnableLambda`, ensuring that users referencing the docstring will
have a functional and clear example to follow.
There are no related GitHub issues for this particular change.
Modified Code:
```python
from langchain_core.runnables import RunnablePassthrough, RunnableParallel
from langchain_core.runnables import RunnableLambda
runnable = RunnableParallel(
origin=RunnablePassthrough(),
modified=lambda x: x+1
)
runnable.invoke(1) # {'origin': 1, 'modified': 2}
def fake_llm(prompt: str) -> str: # Fake LLM for the example
return "completion"
chain = RunnableLambda(fake_llm) | {
'original': RunnablePassthrough(), # Original LLM output
'parsed': lambda text: text[::-1] # Parsing logic
}
chain.invoke('hello') # {'original': 'completion', 'parsed': 'noitelpmoc'}
```
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Added a retriever for the Outline API to ask
questions on knowledge base
- **Issue:** resolves#11814
- **Dependencies:** None
- **Tag maintainer:** @baskaryan
- **Description:**
I encountered an issue while running the existing sample code on the
page https://python.langchain.com/docs/modules/agents/how_to/agent_iter
in an environment with Pydantic 2.0 installed. The following error was
triggered:
```python
ValidationError Traceback (most recent call last)
<ipython-input-12-2ffff2c87e76> in <cell line: 43>()
41
42 tools = [
---> 43 Tool(
44 name="GetPrime",
45 func=get_prime,
2 frames
/usr/local/lib/python3.10/dist-packages/pydantic/v1/main.py in __init__(__pydantic_self__, **data)
339 values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
340 if validation_error:
--> 341 raise validation_error
342 try:
343 object_setattr(__pydantic_self__, '__dict__', values)
ValidationError: 1 validation error for Tool
args_schema
subclass of BaseModel expected (type=type_error.subclass; expected_class=BaseModel)
```
I have made modifications to the example code to ensure it functions
correctly in environments with Pydantic 2.0.
- **Description:** Simple change, I just added title metadata to
GoogleDriveLoader for optional File Loaders
- **Dependencies:** no dependencies
- **Tag maintainer:** @hwchase17
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
This PR provides idiomatic implementations for the exact-match and the
semantic LLM caches using Astra DB as backend through the database's
HTTP JSON API. These caches require the `astrapy` library as dependency.
Comes with integration tests and example usage in the `llm_cache.ipynb`
in the docs.
@baskaryan this is the Astra DB counterpart for the Cassandra classes
you merged some time ago, tagging you for your familiarity with the
topic. Thank you!
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
This PR adds a chat message history component that uses Astra DB for
persistence through the JSON API.
The `astrapy` package is required for this class to work.
I have added tests and a small notebook, and updated the relevant
references in the other docs pages.
(@rlancemartin this is the counterpart of the Cassandra equivalent class
you so helpfully reviewed back at the end of June)
Thank you!
- **Description:** This commit fixed the problem that Redis vector store
will change the value of a metadata from 0 to empty when saving the
document, which should be an un-intended behavior.
- **Issue:** N/A
- **Dependencies:** N/A
**Description:** Currently, if we pass in a ToolMessage back to the
chain, it crashes with error
`Got unsupported message type: `
This fixes it.
Tested locally
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** BaseStringMessagePromptTemplate.from_template was
passing the value of partial_variables into cls(...) via **kwargs,
rather than passing it to PromptTemplate.from_template. Which resulted
in those *partial_variables being* lost and becoming required
*input_variables*.
Co-authored-by: Josep Pon Farreny <josep.pon-farreny@siemens.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Fix some circular deps:
- move PromptValue into top level module bc both PromptTemplates and
OutputParsers import
- move tracer context vars to `tracers.context` and import them in
functions in `callbacks.manager`
- add core import tests
Adds a cookbook for semi-structured RAG via Docugami. This follows the
same outline as the semi-structured RAG with Unstructured cookbook:
https://github.com/langchain-ai/langchain/blob/master/cookbook/Semi_Structured_RAG.ipynb
The main change is this cookbook uses Docugami instead of Unstructured
to find text and tables, and shows how XML markup in the output helps
with retrieval and generation.
We are \@docugami on twitter, I am \@tjaffri
---------
Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
- **Description:** We need to update the Dockerfile for templates to
also copy your README.md. This is because poetry requires that a readme
exists if it is specified in the pyproject.toml
Changes:
- remove langchain_core/schema since no clear distinction b/n schema and
non-schema modules
- make every module that doesn't end in -y plural
- where easy have 1-2 classes per file
- no more than one level of nesting in directories
- only import from top level core modules in langchain
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** fix a bug that prevented as_retriever() in Vectara to
use the desired input arguments
- **Issue:** as_retriever did not pass the arguments properly
- **Tag maintainer:** @baskaryan
- **Twitter handle:** @ofermend
I encountered this during summarization with VertexAI. I was receiving
an INVALID_ARGUMENT error, as it was trying to send a list of about
17000 single characters.
The [count_tokens
method](https://github.com/googleapis/python-aiplatform/blob/main/vertexai/language_models/_language_models.py#L658)
made available by Google takes in a list of prompts. It does not fail
for small texts, but it does for longer documents because the argument
list will be exceeding Googles allowed limit. Enforcing the list type
makes it work successfully.
This change will cast the input text to count to a list of that single
text so that the input format is always correct.
[Twitter](https://www.x.com/stijn_tratsaert)
- **Description:** ERNIE-Bot-Chat-4 Large Language Model adds the
ability of `Function Calling` by passing parameters through the
`functions` parameter in the request. To simplify function calling for
ERNIE-Bot-Chat-4, the `create_ernie_fn_chain()` function has been added.
The definition and usage of the `create_ernie_fn_chain()` function is
similar to that of the `create_openai_fn_chain()` function.
Examples as the follows:
```
import json
from langchain.chains.ernie_functions import (
create_ernie_fn_chain,
)
from langchain.chat_models import ErnieBotChat
from langchain.prompts import ChatPromptTemplate
def get_current_news(location: str) -> str:
"""Get the current news based on the location.'
Args:
location (str): The location to query.
Returs:
str: Current news based on the location.
"""
news_info = {
"location": location,
"news": [
"I have a Book.",
"It's a nice day, today."
]
}
return json.dumps(news_info)
def get_current_weather(location: str, unit: str="celsius") -> str:
"""Get the current weather in a given location
Args:
location (str): location of the weather.
unit (str): unit of the tempuature.
Returns:
str: weather in the given location.
"""
weather_info = {
"location": location,
"temperature": "27",
"unit": unit,
"forecast": ["sunny", "windy"],
}
return json.dumps(weather_info)
llm = ErnieBotChat(model_name="ERNIE-Bot-4")
prompt = ChatPromptTemplate.from_messages(
[
("human", "{query}"),
]
)
chain = create_ernie_fn_chain([get_current_weather, get_current_news], llm, prompt, verbose=True)
res = chain.run("北京今天的新闻是什么?")
print(res)
```
The running results of the above program are shown below:
```
> Entering new LLMChain chain...
Prompt after formatting:
Human: 北京今天的新闻是什么?
> Finished chain.
{'name': 'get_current_news', 'thoughts': '用户想要知道北京今天的新闻。我可以使用get_current_news工具来获取这些信息。', 'arguments': {'location': '北京'}}
```
- **Description:** during search with DeepLake some people are facing
backwards compatibility issues, this PR fixes it by making search
accessible for the older datasets
---------
Co-authored-by: adolkhan <adilkhan.sarsen@alumni.nu.edu.kz>
- **Description:**
- Fixes a `key_prefix` bug where passing it in on
`Redis.from_existing(...)` did not work properly. Updates doc strings
accordingly.
- Updates Redis filter classes logic with best practices on typing,
string formatting, and handling "empty" filters.
- Fixes a bug that would prevent multiple tag filters from being applied
together in some scenarios.
- Added a whole new filter unit testing module. Also updated code
formatting for a number of modules that were failing the `make`
commands.
- **Issue:** N/A
- **Dependencies:** N/A
- **Tag maintainer:** @baskaryan
- **Twitter handle:** @tchutch94
- **Description:** Fix typo in MongoDB memory docs
- **Tag maintainer:** @eyurtsev
<!-- Thank you for contributing to LangChain!
- **Description:** Fix typo in MongoDB memory docs
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** @baskaryan
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
In the `FORMAT_INSTRUCTIONS` template, 4 curly braces (escaping) are
used to get single curly brace after formatting:
```
"{{{ ... }}}}" -> format_instructions.format() -> "{{ ... }}" -> template.format() -> "{ ... }".
```
Tool's `args_schema` string contains single braces `{ ... }`, and is
also transformed to `{{{{ ... }}}}` form. But this is not really correct
since there is only one `format()` call:
```
"{{{{ ... }}}}" -> template.format() -> "{{ ... }}".
```
As a result we get double curly braces in the prompt:
````
Respond to the human as helpfully and accurately as possible. You have access to the following tools:
foo: Test tool FOO, args: {{'tool_input': {{'type': 'string'}}}} # <--- !!!
...
Provide only ONE action per $JSON_BLOB, as shown:
```
{
"action": $TOOL_NAME,
"action_input": $INPUT
}
```
````
This PR fixes curly braces escaping in the `args_schema` to have single
braces in the final prompt:
````
Respond to the human as helpfully and accurately as possible. You have access to the following tools:
foo: Test tool FOO, args: {'tool_input': {'type': 'string'}} # <--- !!!
...
Provide only ONE action per $JSON_BLOB, as shown:
```
{
"action": $TOOL_NAME,
"action_input": $INPUT
}
```
````
---------
Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>
Hi 👋 We are working with Llama2 on Bedrock, and would like to add it to
Langchain. We saw a [pull
request](https://github.com/langchain-ai/langchain/pull/13322) to add it
to the `llm.Bedrock` class, but since it concerns a chat model, we would
like to add it to `BedrockChat` as well.
- **Description:** Add support for Llama2 to `BedrockChat` in
`chat_models`
- **Issue:** the issue # it fixes (if applicable)
[#13316](https://github.com/langchain-ai/langchain/issues/13316)
- **Dependencies:** any dependencies required for this change `None`
- **Tag maintainer:** /
- **Twitter handle:** `@SimonBockaert @WouterDurnez`
---------
Co-authored-by: wouter.durnez <wouter.durnez@showpad.com>
Co-authored-by: Simon Bockaert <simon.bockaert@showpad.com>
- **Description:** This change adds an agent to the Azure Cognitive
Services toolkit for identifying healthcare entities
- **Dependencies:** azure-ai-textanalytics (Optional)
---------
Co-authored-by: James Beck <James.Beck@sa.gov.au>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Hi!
This short PR aims at:
* Fixing `OpenAIEmbeddings`' check on `chunk_size` when used with Azure
OpenAI (thus with openai < 1.0). Azure OpenAI embeddings support at most
16 chunks per batch, I believe we are supposed to take the min between
the passed value/default value and 16, not the max - which, I suppose,
was introduced by accident while refactoring the previous version of
this check from this other PR of mine: #10707
* Porting this fix to the newest class (`AzureOpenAIEmbeddings`) for
openai >= 1.0
This fixes#13539 (closed but the issue persists).
@baskaryan @hwchase17
- **Description:** There are several mistakes in the sample code in the
doc-string of `DashVector` class, and this pull request aims to correct
them.
The correction code has been tested against latest version (at the time
of creation of this pull request) of: `langchain==0.0.336`
`dashvector==1.0.6` .
- **Issue:** No issue is created for this.
- **Dependencies:** No dependency is required for this change,
<!-- - **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below), -->
- **Twitter handle:** `zeyanglin`
<!-- Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** AstraDB is going to deprecate the `$similarity`
projection property in favor of the ´includeSimilarity´ option flag. I
moved all the queries to the new format.
- **Tag maintainer:** @hemidactylus
- **Twitter handle:** nicoloboschi
Added a `search_kwargs` field to BingSearchAPIWrapper in
`bing_search.py,` enabling users to include extra keyword arguments in
Bing search queries. This update, like specifying language preferences,
adds more customization to searches. The `search_kwargs` seamlessly
merge with standard parameters in `_bing_search_results` method.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** Fix Astra integration tests that are failing. The
`delete` always return True as the deletion is successful if no errors
are thrown. I aligned the test to verify this behaviour
- **Tag maintainer:** @hemidactylus
- **Twitter handle:** nicoloboschi
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
The issue was accuring because of `openai` update in Completions. its
not accepting `api_key` and 'api_base' args.
The fix is we check for the openai version and if ats v1 then remove
these keys from args before passing them to `Compilation.create(...)`
when sending from `VLLMOpenAI`
Fixed: #13507
@eyu
@efriis
@hwchase17
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** In this pull request, we address an issue related to
assigning a schema to the SQLDatabase class when utilizing an Oracle
database. The current implementation encounters a bug where, upon
attempting to execute a query, the alter session parse is not
appropriately defined for Oracle, leading to an error,
- **Issue:** #7928,
- **Dependencies:** No dependencies,
- **Tag maintainer:** @baskaryan,
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** This change allows for the `MWDumpLoader` to load all
namespaces including custom by default instead of only loading the
[default
namespaces](https://www.mediawiki.org/wiki/Help:Namespaces#Localisation).
- **Tag maintainer:** @hwchase17
**Description:**
This commit adds embedchain retriever along with tests and docs.
Embedchain is a RAG framework to create data pipelines.
**Twitter handle:**
- [Taranjeet's twitter](https://twitter.com/taranjeetio) and
[Embedchain's twitter](https://twitter.com/embedchain)
**Reviewer**
@hwchase17
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:**
Enhance the functionality of YoutubeLoader to enable the translation of
available transcripts by refining the existing logic.
**Issue:**
Encountering a problem with YoutubeLoader (#13523) where the translation
feature is not functioning as expected.
Tag maintainers/contributors who might be interested:
@eyurtsev
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
BUG: langchain.agents.openai_assistant has a reference as
`from langchain_experimental.openai_assistant.base import
OpenAIAssistantRunnable`
should be
`from langchain.agents.openai_assistant.base import
OpenAIAssistantRunnable`
This prevents building of the API Reference docs
## Update 2023-09-08
This PR now supports further models in addition to Lllama-2 chat models.
See [this comment](#issuecomment-1668988543) for further details. The
title of this PR has been updated accordingly.
## Original PR description
This PR adds a generic `Llama2Chat` model, a wrapper for LLMs able to
serve Llama-2 chat models (like `LlamaCPP`,
`HuggingFaceTextGenInference`, ...). It implements `BaseChatModel`,
converts a list of chat messages into the [required Llama-2 chat prompt
format](https://huggingface.co/blog/llama2#how-to-prompt-llama-2) and
forwards the formatted prompt as `str` to the wrapped `LLM`. Usage
example:
```python
# uses a locally hosted Llama2 chat model
llm = HuggingFaceTextGenInference(
inference_server_url="http://127.0.0.1:8080/",
max_new_tokens=512,
top_k=50,
temperature=0.1,
repetition_penalty=1.03,
)
# Wrap llm to support Llama2 chat prompt format.
# Resulting model is a chat model
model = Llama2Chat(llm=llm)
messages = [
SystemMessage(content="You are a helpful assistant."),
MessagesPlaceholder(variable_name="chat_history"),
HumanMessagePromptTemplate.from_template("{text}"),
]
prompt = ChatPromptTemplate.from_messages(messages)
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chain = LLMChain(llm=model, prompt=prompt, memory=memory)
# use chat model in a conversation
# ...
```
Also part of this PR are tests and a demo notebook.
- Tag maintainer: @hwchase17
- Twitter handle: `@mrt1nz`
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** Added a method `fetch_valid_documents` to
`WebResearchRetriever` class that will test the connection for every url
in `new_urls` and remove those that raise a `ConnectionError`.
- **Issue:** [Previous
PR](https://github.com/langchain-ai/langchain/pull/13353),
- **Dependencies:** None,
- **Tag maintainer:** @efriis
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
## Description
This PR adds an option to allow unsigned requests to the Neptune
database when using the `NeptuneGraph` class.
```python
graph = NeptuneGraph(
host='<my-cluster>',
port=8182,
sign=False
)
```
Also, added is an option in the `NeptuneOpenCypherQAChain` to provide
additional domain instructions to the graph query generation prompt.
This will be injected in the prompt as-is, so you should include any
provider specific tags, for example `<instructions>` or `<INSTR>`.
```python
chain = NeptuneOpenCypherQAChain.from_llm(
llm=llm,
graph=graph,
extra_instructions="""
Follow these instructions to build the query:
1. Countries contain airports, not the other way around
2. Use the airport code for identifying airports
"""
)
```
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
**Description**
MongoDB drivers are used in various flavors and languages. Making sure
we exercise our due diligence in identifying the "origin" of the library
calls makes it best to understand how our Atlas servers get accessed.
The original notebook has the `faiss` title which is duplicated in
the`faiss.jpynb`. As a result, we have two `faiss` items in the
vectorstore ToC. And the first item breaks the searching order (it is
placed between `A...` items).
- I updated title to `Asynchronous Faiss`.
**Description/Issue:**
When OpenAI calls a function with no args, the args are `""` rather than
`"{}"`. Then `json.loads("")` blows up. This PR handles it correctly.
**Dependencies:** None
- Fixed titles for two notebooks. They were inconsistent with other
titles and clogged ToC.
- Added `Upstash` description and link
- Moved the authentication text up in the `Elasticsearch` nb, right
after package installation. It was on the end of the page which was a
wrong place.
This PR brings a few minor improvements to the docs, namely class/method
docstrings and the demo notebook.
- A note on how to control concurrency levels to tune performance in
bulk inserts, both in the class docstring and the demo notebook;
- Slightly increased concurrency defaults after careful experimentation
(still on the conservative side even for clients running on
less-than-typical network/hardware specs)
- renamed the DB token variable to the standardized
`ASTRA_DB_APPLICATION_TOKEN` name (used elsewhere, e.g. in the Astra DB
docs)
- added a note and a reference (add_text docstring, demo notebook) on
allowed metadata field names.
Thank you!
The current `integrations/document_loaders/` sidebar has the
`example_data` item, which is a menu with a single item: "Notebook".
It is happening because the `integrations/document_loaders/` folder has
the `example_data/notebook.md` file that is used to autogenerate the
above menu item.
- removed an example_data/notebook.md file. Docusaurus doesn't have
simple ways to fix this problem (to exclude folders/files from an
autogenerated sidebar). Removing this file didn't break any existing
examples, so this fix is safe.
Updated several notebooks:
- fixed titles which are inconsistent or break the ToC sorting order.
- added missed soruce descriptions and links
- fixed formatting
- the `SemaDB` notebook was placed in additional subfolder which breaks
the vectorstore ToC. I moved file up, removed this unnecessary
subfolder; updated the `vercel.json` with rerouting for the new URL
- Added SemaDB description and link
- improved text consistency
- Fixed the title of the notebook. It created an ugly ToC element as
`Activeloop DeepLake's DeepMemory + LangChain + ragas or how to get +27%
on RAG recall.`
- Added Activeloop description
- improved consistency in text
- fixed ToC (it was using HTML tagas that break left-side in-page ToC).
Now in-page ToC works
- Fixed headers (was more then 1 Titles)
- Removed security token value. It was OK to have it, because it is
temporary token, but the automatic security swippers raise warnings on
that.
- Added `ClickUp` service description and link.
…rnative LLMs until used
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
fix#13356
Add supports following properties for metadata to NotionDBLoader.
- `checkbox`
- `email`
- `number`
- `select`
There are no relevant tests for this code to be updated.
The `Integrations` site is hidden now.
I've added it into the `More` menu.
The name is `Integration Cards` otherwise, it is confused with the
`Integrations` menu.
---------
Co-authored-by: Erick Friis <erickfriis@gmail.com>
- **Description:** Adds `limit_to_domains` param to the APIChain based
tools (open_meteo, TMDB, podcast_docs, and news_api)
- **Issue:** I didn't open an issue, but after upgrading to 0.0.328
using these tools would throw an error.
- **Dependencies:** N/A
- **Tag maintainer:** @baskaryan
**Note**: I included the trailing / simply because the docs here did
fc886cc303/docs/docs/use_cases/apis.ipynb (L246)
, but I checked the code and it is using `urlparse`. SoI followed the
docs since it comes down to stylee.
The `langchain` repo was being flagged for using vulnerable
dependencies, some of which were in this template's lockfile. Updating
to newer versions should fix that.
Just `poetry lock` and moving `langchain` to the latest version, in case
folks copy this template.
This resolves some vulnerable dependency alerts GitHub code scanning was
flagging.
My thought is that the ==version would prevent pip from finding the
package on regular [pypi.org](http://pypi.org/), so it would look at
[test.pypi.org](http://test.pypi.org/) for that. Otherwise it'll pull
package from [pypi.org](http://pypi.org/) (e.g. sub deps)
Right now, the cli release is failing because it's going to
test.pypi.org by default, so it finds this incorrect FASTAPI package
instead of the real one: https://test.pypi.org/project/FASTAPI/
The new ruff version fixed the blocking bugs, and I was able to fairly
easily us to a passing state: ruff fixed some issues on its own, I fixed
a handful by hand, and I added a list of narrowly-targeted exclusions
for files that are currently failing ruff rules that we probably should
look into eventually.
I went pretty lenient on the docs / cookbooks rules, allowing dead code
and such things. Perhaps in the future we may want to tighten the rules
further, but this is already a good set of checks that found real issues
and will prevent them going forward.
Hi,
this PR adds support for OpenAI API v1 for Azure OpenAI completion API.
@baskaryan @hwchase17
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Bumps [pyarrow](https://github.com/apache/arrow) from 13.0.0 to 14.0.1.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="ba53748361"><code>ba53748</code></a>
MINOR: [Release] Update versions for 14.0.1</li>
<li><a
href="529f3768fa"><code>529f376</code></a>
MINOR: [Release] Update .deb/.rpm changelogs for 14.0.1</li>
<li><a
href="b84bbcac64"><code>b84bbca</code></a>
MINOR: [Release] Update CHANGELOG.md for 14.0.1</li>
<li><a
href="f141709763"><code>f141709</code></a>
<a
href="https://redirect.github.com/apache/arrow/issues/38607">GH-38607</a>:
[Python] Disable PyExtensionType autoload (<a
href="https://redirect.github.com/apache/arrow/issues/38608">#38608</a>)</li>
<li><a
href="5a37e74198"><code>5a37e74</code></a>
<a
href="https://redirect.github.com/apache/arrow/issues/38431">GH-38431</a>:
[Python][CI] Update fs.type_name checks for s3fs tests (<a
href="https://redirect.github.com/apache/arrow/issues/38455">#38455</a>)</li>
<li><a
href="2dcee3f82c"><code>2dcee3f</code></a>
MINOR: [Release] Update versions for 14.0.0</li>
<li><a
href="297428cbf2"><code>297428c</code></a>
MINOR: [Release] Update .deb/.rpm changelogs for 14.0.0</li>
<li><a
href="3e9734f883"><code>3e9734f</code></a>
MINOR: [Release] Update CHANGELOG.md for 14.0.0</li>
<li><a
href="9f90995c8c"><code>9f90995</code></a>
<a
href="https://redirect.github.com/apache/arrow/issues/38332">GH-38332</a>:
[CI][Release] Resolve symlinks in RAT lint (<a
href="https://redirect.github.com/apache/arrow/issues/38337">#38337</a>)</li>
<li><a
href="bd61239a32"><code>bd61239</code></a>
<a
href="https://redirect.github.com/apache/arrow/issues/35531">GH-35531</a>:
[Python] C Data Interface PyCapsule Protocol (<a
href="https://redirect.github.com/apache/arrow/issues/37797">#37797</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/apache/arrow/compare/go/v13.0.0...go/v14.0.1">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/langchain-ai/langchain/network/alerts).
</details>
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>
Hey @rlancemartin, @eyurtsev ,
I did some minimal changes to the `ElasticVectorSearch` client so that
it plays better with existing ES indices.
Main changes are as follows:
1. You can pass the dense vector field name into `_default_script_query`
2. You can pass a custom script query implementation and the respective
parameters to `similarity_search_with_score`
3. You can pass functions for building page content and metadata for the
resulting `Document`
<!-- Thank you for contributing to LangChain!
Replace this comment with:
- Description: a description of the change,
- Issue: the issue # it fixes (if applicable),
- Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
4. an example notebook showing its use.
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: @dev2049
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
- Models / Prompts: @hwchase17, @dev2049
- Memory: @hwchase17
- Agents / Tools / Toolkits: @vowelparrot
- Tracing / Callbacks: @agola11
- Async: @agola11
If no one reviews your PR within a few days, feel free to @-mention the
same people again.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
<!-- Thank you for contributing to LangChain!
Replace this comment with:
- Description: a description of the change,
- Issue: the issue # it fixes (if applicable),
- Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!
Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use.
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: @baskaryan
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
- Models / Prompts: @hwchase17, @baskaryan
- Memory: @hwchase17
- Agents / Tools / Toolkits: @hinthornw
- Tracing / Callbacks: @agola11
- Async: @agola11
If no one reviews your PR within a few days, feel free to @-mention the
same people again.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
hi!
This is pretty straight-forward: The sdist package does not contain the
license file (which is needed by e.g. conda) because the package is
built from the subdir and can't see the license.
I _copied_ the license but since I'm unfamiliar with the projects
direction, I'm not sure that's correct.
thanks!
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Fixes: #8207
Description:
Pinecone returns scores (not distances) with cosine similarity. The
values according to the docs are [-1, 1], although I could never
reproduce negative values.
This PR ensures that the score returned from Pinecone is preserved,
rather than inverted, so the most relevant documents can be filtered (eg
when using similarity thresholds)
I'll leave this as a draft PR as I couldn't run the tests (my pinecone
account might not be enough - some errors were being thrown around
namespaces) so hopefully someone who _can_ will pick this up.
Maintainers:
@rlancemartin, @eyurtsev
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** Fixed a serialization issue in the add_texts method
of the Matching Engine Vector Store caused by a typo, leading to an
attempt to serialize the json module itself.
- **Issue:** #12154
- **Dependencies:** ./.
- **Tag maintainer:**
- **Description:** Refine Weaviate tutorial and add an example for
Retrieval-Augmented Generation (RAG)
- **Issue:** (not applicable),
- **Dependencies:** none
- **Tag maintainer:** @baskaryan <!--
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Twitter handle:** @helloiamleonie
Co-authored-by: Leonie <leonie@Leonies-MBP-2.fritz.box>
Due to the possibility of external inputs including UUIDs, there may be
additional values in **kwargs, while Weaviate's `__init__` method does
not support passing extra **kwarg parameters.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
When calling max_marginal_relevance_search from PGVector the filter
param is not carried over to max_marginal_relevance_search_by_vector
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Uses `endpoint_url` if provided with a boto3 session.
When running dynamodb locally, credentials are required even if invalid.
With this change, it will be possible to pass a boto3 session with
credentials and specify an endpoint_url
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Replace this entire comment with:
- **Description:** Add MyScaleWithoutJSON which allows user to wrap
columns into Document's Metadata
- **Tag maintainer:** @baskaryan
**Description**
Bumps the Momento dependency to the latest version and refactors the
usage of `SearchHit` in the Momento Vector Index (MVI) vector store
integration. This change is a one liner where we use the preferred
attribute `score` to read the query-document similarity instead of
`distance`. The latest versions of Momento clients will use this
attribute going forward.
**Dependencies**
Updated the Momento dependency to latest version.
**Tests**
💚 I re-ran the existing MVI integration tests
(`tests/integration_tests/vectorstores/test_momento_vector_index.py`)
and they pass.
**Review**
cc @baskaryan @eyurtsev
On the [Defining Custom
Tools](https://python.langchain.com/docs/modules/agents/tools/custom_tools)
page, there's a 'Subclassing the BaseTool class' paragraph under the
'Completely New Tools - String Input and Output' header. Also there's
another 'Subclassing the BaseTool' paragraph under no header, which I
think may belong to the 'Custom Structured Tools' header.
Another thing is, there's a 'Using the tool decorator' and a 'Using the
decorator' paragraph, I think should belong to 'Completely New Tools -
String Input and Output' and 'Custom Structured Tools' separately.
This PR moves those paragraphs to corresponding headers.
**Description**
the ollama api now supports passing system prompt and template directly
instead of modifying the model file , but the ollama integration in
langchain did not have this change updated . The update just adds these
two parameters to it ( there are 2 more parameters that are pending to
be updated, I was not sure about their utility wrt to langchain )
Refer :
8713ac23a8
**Issue** : None Applicable
**Dependencies** : None Changed
**Twitter handle** : https://twitter.com/violetto96
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
added Parallel Function Calling for Structured Data Extraction notebook
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
`_get_kwarg_value` function is useless, one can rely on python builtin
functionalities to do the exact same thing.
- **Description:** Removed `_get_kwarg_value`. Helps with code
readability.
- **Issue:** the issue # it fixes (if applicable),
- **Twitter handle:** @Guillem_96
Improve CSV reader which can't call .strip() on NoneType if there are
less cells in the row compared to the header
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:**
I have a CSV file as followed
```
headerA,headerB,headerC
v1A,v1B,v1C,
v2A,v2B
v3A,v3B,v3C
```
In this case, row 2 is missing a value, which results in reading a None
type. The strip() method can not be called on None, hence raising. In
this PR I am making the change to only call strip if the value if not
None.
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
We updated MyScale free knowledge base, where you can try your RAG with
36 million paragraphs from wikipedia and 2 million paragraphs from
ArXiv.
The pod has two tables
```sql
CREATE TABLE default.ChatArXiv (
`abstract` String,
`id` String,
`vector` Array(Float32),
`metadata` Object('JSON'),
`pubdate` DateTime,
`title` String,
`categories` Array(String),
`authors` Array(String),
`comment` String,
`primary_category` String,
VECTOR INDEX vec_idx vector TYPE MSTG('metric_type=Cosine'),
CONSTRAINT vec_len CHECK length(vector) = 768)
ENGINE = ReplacingMergeTree ORDER BY id;
CREATE TABLE wiki.Wikipedia (
`id` String,
`title` String,
`text` String,
`url` String,
`wiki_id` UInt64,
`views` Float32,
`paragraph_id` UInt64,
`langs` UInt32,
`emb` Array(Float32),
VECTOR INDEX emb_idx emb TYPE MSTG('metric_type=Cosine'),
CONSTRAINT emb_len CHECK length(emb) = 768)
ENGINE = ReplacingMergeTree ORDER BY id;
```
You can connect those two tables using credentials below (just the same
to the old one)
URL: `msc-4a9e710a.us-east-1.aws.staging.myscale.cloud`
Port: `443`
Username: `chatdata`
Password: `myscale_rocks`
It's FREE and you can also use it with
ChatData: https://github.com/myscale/ChatData
Retrieval-QA-Benchmark:
https://github.com/myscale/Retrieval-QA-Benchmark
... and also LangChain!
Request for review @baskaryan
**Description:** This is like the rag-conversation template in many
ways. What's different is:
- support for a timescale vector store.
- support for time-based filters.
- support for metadata filters.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** Changed the fleet_context documentation to use
`context.download_embeddings()` from the latest release from our
package. More details here:
https://github.com/fleet-ai/context/tree/main#api
- **Issue:** n/a
- **Dependencies:** n/a
- **Tag maintainer:** @baskaryan
- **Twitter handle:** @andrewthezhou
Added a Docusaurus Loader
Issue: #6353
I had to implement this for working with the Ionic documentation, and
wanted to open this up as a draft to get some guidance on building this
out further. I wasn't sure if having it be a light extension of the
SitemapLoader was in the spirit of a proper feature for the library --
but I'm grateful for the opportunities Langchain has given me and I'd
love to build this out properly for the sake of the community.
Any feedback welcome!
- **Description:** `AzureMLChatOnlineEndpoint` object from
langchain/chat_models/azureml_endpoint.py safe to print
without having any secrets included in raw format in the string
representation.
- **Issue:** #12165,
- **Tag maintainer:** @eyurtsev
---------
Co-authored-by: Faysal Bougamale <faysal.bougamale@horiba.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Adding documentation to the runnable.
- Documentation is not organized in the best way for the runnable; i.e.,
in
terms of LCEL vs. other standard methods, will follow up with more
edits.
**Description:** Removing the single quote wrapper around the table
names in the SQL agent toolkit.py file as it misleads the LLM into
querying against tables with single quotes around their names.
**Issue:** #7457
**Dependencies:** None
**Tag maintainer:** @hwchase17
**Twitter handle:** None
- Implement config_specs to include session_id
- Remove Runnable method and update notebook
- Add more details to notebook, eg. show input schema and config schema
before and after adding message history
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
This commit fixes the issue that langchain.llms OpenAI completion
stopped working since the V1 openai client update.
Replace this entire comment with:
- **Description:** This PR fixes the issue [AttributeError: module
'openai' has no attribute
'Completion'](https://github.com/langchain-ai/langchain/issues/12967)
similar to
8e0cb2eb84
and https://github.com/langchain-ai/langchain/pull/12969,
- **Issue:** https://github.com/langchain-ai/langchain/issues/12967,
- **Dependencies:** `openai` v1.x.x client,
- **Tag maintainer:** @baskaryan,
- **Twitter handle:** @dosuken123
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
This adds the response message as a document to the rag retriever so
users can choose to use this. Also drops document limit.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
## Description
We need to centralize the API we use to get the project name for our
tracers. This PR makes it so we always get this from a shared function
in the langsmith sdk.
## Dependencies
Upgraded langsmith from 0.52 to 0.62 to include the new API
`get_tracer_project`
- **Description:**
Recently Chroma rolled out a breaking change on the way we handle
embedding functions, in order to support multi-modal collections.
This broke the way LangChain's `Chroma` objects get created, because we
were passing the EF down into the Chroma collection:
https://docs.trychroma.com/migration#migration-to-0416---november-7-2023
However, internally, we are never actually using embeddings on the
chroma collection - LangChain's `Chroma` object calls it instead. Thus
we just don't pass an `embedding_function` to Chroma itself, which fixes
the issue.
- **Description:** The issue was not listing the proper import error for
amazon textract loader.
- **Issue:** Time wasted trying to figure out what to install...
(langchain docs don't list the dependency either)
- **Dependencies:** N/A
- **Tag maintainer:** @sbusso
- **Twitter handle:** @h9ste
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
# Astra DB Vector store integration
- **Description:** This PR adds a `VectorStore` implementation for
DataStax Astra DB using its HTTP API
- **Issue:** (no related issue)
- **Dependencies:** A new required dependency is `astrapy` (`>=0.5.3`)
which was added to pyptoject.toml, optional, as per guidelines
- **Tag maintainer:** I recently mentioned to @baskaryan this
integration was coming
- **Twitter handle:** `@rsprrs` if you want to mention me
This PR introduces the `AstraDB` vector store class, extensive
integration test coverage, a reworking of the documentation which
conflates Cassandra and Astra DB on a single "provider" page and a new,
completely reworked vector-store example notebook (common to the
Cassandra store, since parts of the flow is shared by the two APIs). I
also took care in ensuring docs (and redirects therein) are behaving
correctly.
All style, linting, typechecks and tests pass as far as the `AstraDB`
integration is concerned.
I could build the documentation and check it all right (but ran into
trouble with the `api_docs_build` makefile target which I could not
verify: `Error: Unable to import module
'plan_and_execute.agent_executor' with error: No module named
'langchain_experimental'` was the first of many similar errors)
Thank you for a review!
Stefano
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** Correct naming for package in README
- **Issue:** README wasn't aligned with pyproject.toml, resulting in not
being able to install the rag-supabase package.
- **Tag maintainer:** @gregnr
Cohere released the new embedding API (Embed v3:
https://txt.cohere.com/introducing-embed-v3/) that treats document and
query embeddings differently. This PR updated the `CohereEmbeddings` to
use them appropriately. It also works with the old models.
Description: This PR masks API key secrets for the Nebula model from
Symbl.ai
Issue: #12165
Maintainer: @eyurtsev
---------
Co-authored-by: Praveen Venkateswaran <praveen.venkateswaran@ibm.com>
* ChatAnyscale was missing coercion to SecretStr for anyscale api key
* The model inherits from ChatOpenAI so it should not force the openai
api key to be secret str until openai model has the same changes
https://github.com/langchain-ai/langchain/issues/12841
- **Description:** Remove text "LangChain currently does not support"
which appears to be vestigial leftovers from a previous change.
- **Issue:** N/A
- **Dependencies:** N/A
- **Tag maintainer:** @baskaryan, @eyurtsev
- **Twitter handle:** thezanke
- **Description:** Noticed that the Hugging Face Pipeline documentation
was a bit out of date.
Updated with information about passing in a pipeline directly
(consistent with docstring) and a recent contribution of mine on adding
support for multi-gpu specifications with Accelerate in
21eeba075c
Qdrant was incorrectly calculating the cosine similarity and returning
`0.0` for the best match, instead of `1.0`. Internally Qdrant returns a
cosine score from `-1.0` (worst match) to `1.0` (best match), and the
current formula reflects it.
Possibility to pass on_artifacts to a conversation. It can be then
achieved by adding this way:
```python
result = agent.run(
input=message.text,
metadata={
"on_artifact": CALLBACK_FUNCTION
},
)
```
The line removed is not required as there are no other alternative
solutions above than that.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
This patch fixes a spelling typo in message
within wikibase_agent.ipynb.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
This PR adds a self-querying template using Qdrant as a vector store.
The template uses an artificial dataset and was implemented in a way
that simplifies passing different components and choosing LLM and
embedding providers.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Calls uvicorn directly from cli:
Reload works if you define app by import string instead of object.
(was doing subprocess in order to get reloading)
Version bump to 0.0.14
Remove the need for [serve] for simplicity.
Readmes are updated in #12847 to avoid cluttering this PR
Previously we treated trace_on_chain_group as a command to always start
tracing. This is unintuitive (makes the function do 2 things), and makes
it harder to toggle tracing
**Description**
Removed confusing sentence.
Not clear what "both" was referring to. The two required components
mentioned previously? The two methods listed below?
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
When you use a MultiQuery it might be useful to use the original query
as well as the newly generated ones to maximise the changes to retriever
the correct document. I haven't created an issue, it seems a very small
and easy thing.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
# Description
Add a RAG template showcasing Momento Vector Index as a vector store.
Includes a project directory and README.
# **Twitter handle**
Tag the company @momentohq for a mention and @mlonml for the
contribution.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
This change adds a new template for simple RAG using the SingleStoreDB
vectorstore.
Twitter: @alexjpeng
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** Correct number of elements in config list in
`batch()` and `abatch()` of `BaseLLM` in case `max_concurrency` is not
None.
- **Issue:** #12643
- **Twitter handle:** @akionux
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Zep now has the ability to search over chat history summaries. This PR
adds support for doing so. More here: https://blog.getzep.com/zep-v0-17/
@baskaryan @eyurtsev
…s present
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
### Enabling `device_map` in HuggingFacePipeline
For multi-gpu settings with large models, the
[accelerate](https://huggingface.co/docs/accelerate/usage_guides/big_modeling#using--accelerate)
library provides the `device_map` parameter to automatically distribute
the model across GPUs / disk.
The [Transformers
pipeline](3520e37e86/src/transformers/pipelines/__init__.py (L543))
enables users to specify `device` (or) `device_map`, and handles cases
(with warnings) when both are specified.
However, Langchain's HuggingFacePipeline only supports specifying
`device` when calling transformers which limits large models and
multi-gpu use-cases.
Additionally, the [default
value](8bd3ce59cd/libs/langchain/langchain/llms/huggingface_pipeline.py (L72))
of `device` is initialized to `-1` , which is incompatible with the
transformers pipeline when `device_map` is specified.
This PR addresses the addition of `device_map` as a parameter , and
solves the incompatibility of `device = -1` when `device_map` is also
specified.
An additional test has been added for this feature.
Additionally, some existing tests no longer work since
1. `max_new_tokens` has to be specified under `pipeline_kwargs` and not
`model_kwargs`
2. The GPT2 tokenizer raises a `ValueError: Pipeline with tokenizer
without pad_token cannot do batching`, since the `tokenizer.pad_token`
is `None` ([related
issue](https://github.com/huggingface/transformers/issues/19853) on the
transformers repo).
This PR handles fixing these tests as well.
Co-authored-by: Praveen Venkateswaran <praveen.venkateswaran@ibm.com>
[The python
spec](https://docs.python.org/3/reference/datamodel.html#object.__getattr__)
requires that `__getattr__` throw `AttributeError` for missing
attributes but there are several places throwing `ImportError` in the
current code base. This causes a specific problem with `hasattr` since
it calls `__getattr__` then looks only for `AttributeError` exceptions.
At present, calling `hasattr` on any of these modules will raise an
unexpected exception that most code will not handle as `hasattr`
throwing exceptions is not expected.
In our case this is triggered by an exception tracker (Airbrake) that
attempts to collect the version of all installed modules with code that
looks like: `if hasattr(mod, "__version__"):`. With `HEAD` this is
causing our exception tracker to fail on all exceptions.
I only changed instances of unknown attributes raising `ImportError` and
left instances of known attributes raising `ImportError`. It feels a
little weird but doesn't seem to break anything.
- **Description:** Use all Google search results data in SerpApi.com
wrapper instead of the first one only
- **Tag maintainer:** @hwchase17
_P.S. `libs/langchain/tests/integration_tests/utilities/test_serpapi.py`
are not executed during the `make test`._
- **Description:**
Corrected a specific link within the documentation.
- **Issue:**
#12490
- **Dependencies:**
- **Tag maintainer:**
- **Twitter handle:**
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
It was passing in message instead of generation
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Cookbook showing how to incoporate RAG search within a postgreSQL
database using pgvector.
---------
Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Fixed a typo
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** Fixed a typo on the code
- **Issue:** the issue # it fixes (if applicable),
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
* Restrict the chain to specific domains by default
* This is a breaking change, but it will fail loudly upon object
instantiation -- so there should be no silent errors for users
* Resolves CVE-2023-32786
* This is an opt-in feature, so users should be aware of risks if using
jinja2.
* Regardless we'll add sandboxing by default to jinja2 templates -- this
sandboxing is a best effort basis.
* Best strategy is still to make sure that jinja2 templates are only
loaded from trusted sources.
**Description:** Update `langchain.document_loaders.pdf.PyPDFLoader` to
store url in metadata (instead of a temporary file path) if user
provides a web path to a pdf
- **Issue:** Related to #7034; the reporter on that issue submitted a PR
updating `PyMuPDFParser` for this behavior, but it has unresolved merge
issues as of 20 Oct 2023 #7077
- In addition to `PyPDFLoader` and `PyMuPDFParser`, these other classes
in `langchain.document_loaders.pdf` exhibit similar behavior and could
benefit from an update: `PyPDFium2Loader`, `PDFMinerLoader`,
`PDFMinerPDFasHTMLLoader`, `PDFPlumberLoader` (I'm happy to contribute
to some/all of that, including assisting with `PyMuPDFParser`, if my
work is agreeable)
- The root cause is that the underlying pdf parser classes, e.g.
`langchain.document_loaders.parsers.pdf.PyPDFParser`, never receive
information about the url; the parsers receive a
`langchain.document_loaders.blob_loaders.blob`, which contains the pdf
contents and local file path, but not the url
- This update passes the web path directly to the parser since it's
minimally invasive and doesn't require further changes to maintain
existing behavior for local files... bigger picture, I'd consider
extending `blob` so that extra information like this can be
communicated, but that has much bigger implications on the codebase
which I think warrants maintainer input
- **Dependencies:** None
```python
# old behavior
>>> from langchain.document_loaders import PyPDFLoader
>>> loader = PyPDFLoader('https://arxiv.org/pdf/1706.03762.pdf')
>>> docs = loader.load()
>>> docs[0].metadata
{'source': '/var/folders/w2/zx77z1cs01s1thx5dhshkd58h3jtrv/T/tmpfgrorsi5/tmp.pdf', 'page': 0}
# new behavior
>>> from langchain.document_loaders import PyPDFLoader
>>> loader = PyPDFLoader('https://arxiv.org/pdf/1706.03762.pdf')
>>> docs = loader.load()
>>> docs[0].metadata
{'source': 'https://arxiv.org/pdf/1706.03762.pdf', 'page': 0}
```
- **Description:** #12273 's suggestion PR
Like other PDFLoader, loading pdf per each page and giving page
metadata.
- **Issue:** #12273
- **Twitter handle:** @blue0_0hope
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
This will allow you create the schema beforehand. The check was failing
and preventing importing into existing classes.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
**Description:** This template creates an agent that transforms a single
LLM into a cognitive synergist by engaging in multi-turn
self-collaboration with multiple personas.
**Tag maintainer:** @hwchase17
---------
Co-authored-by: Sayandip Sarkar <sayandip.sarkar@skypointcloud.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
PyPI trusted publishing wants to know which workflow is expected to do
the publish. We always want to publish from the same workflow, so we're
making `_test_release.yml` the only workflow that publishes to Test
PyPI.
This follows the principle of least privilege. Our `poetry build` step
doesn't need, and shouldn't get, access to our GitHub OIDC capability.
This is the same structure as I used in the already-merged PR for
refactoring the regular PyPI release workflow: #12578.
- a few instructions in the readme (load_documents -> ingest.py)
- added docker run command for local elastic
- adds input type definition to render playground properly
Prior to this PR, `ruff` was used only for linting and not for
formatting, despite the names of the commands. This PR makes it be used
for both linting code and autoformatting it.
This input key was missed in the last update PR:
https://github.com/langchain-ai/langchain/pull/7391
The input/output formats are intended to be like this:
```
{"inputs": [<prompt>]}
{"outputs": [<output_text>]}
```
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Matvey Arye <mat@timescale.com>
## Description
This PR adds support for
[lm-format-enforcer](https://github.com/noamgat/lm-format-enforcer) to
LangChain.

The library is similar to jsonformer / RELLM which are supported in
Langchain, but has several advantages such as
- Batching and Beam search support
- More complete JSON Schema support
- LLM has control over whitespace, improving quality
- Better runtime performance due to only calling the LLM's generate()
function once per generate() call.
The integration is loosely based on the jsonformer integration in terms
of project structure.
## Dependencies
No compile-time dependency was added, but if `lm-format-enforcer` is not
installed, a runtime error will occur if it is trying to be used.
## Tests
Due to the integration modifying the internal parameters of the
underlying huggingface transformer LLM, it is not possible to test
without building a real LM, which requires internet access. So, similar
to the jsonformer and RELLM integrations, the testing is via the
notebook.
## Twitter Handle
[@noamgat](https://twitter.com/noamgat)
Looking forward to hearing feedback!
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
This PR addresses what seems like a unnecessary Python version
restriction in the pyroject.toml specs within both Cassandra (/Astra DB)
templates. With "^3.11" I got some version incompatibilities with the
latest "langchain add [...]" commands, so these are now relaxed in line
with the other templates I could inspect.
Incidentally, in the "entomology" template, the need for an explicit
"setup" step for the user to carry on has been removed, replaced by a
check-and-execute-if-necessary instruction on app startup.
Thank you for your attention!
Best to review one commit at a time, since two of the commits are 100%
autogenerated changes from running `ruff format`:
- Install and use `ruff format` instead of black for code formatting.
- Output of `ruff format .` in the `langchain` package.
- Use `ruff format` in experimental package.
- Format changes in experimental package by `ruff format`.
- Manual formatting fixes to make `ruff .` pass.
I always take 20-30 seconds to re-discover where the
`convert_to_openai_function` wrapper lives in our codebase. Chat
langchain [has no
clue](https://smith.langchain.com/public/3989d687-18c7-4108-958e-96e88803da86/r)
what to do either. There's the older `create_openai_fn_chain` , but we
haven't been recommending it in LCEL. The example we show in the
[cookbook](https://python.langchain.com/docs/expression_language/how_to/binding#attaching-openai-functions)
is really verbose.
General function calling should be as simple as possible to do, so this
seems a bit more ergonomic to me (feel free to disagree). Another option
would be to directly coerce directly in the class's init (or when
calling invoke), if provided. I'm not 100% set against that. That
approach may be too easy but not simple. This PR feels like a decent
compromise between simple and easy.
```
from enum import Enum
from typing import Optional
from pydantic import BaseModel, Field
class Category(str, Enum):
"""The category of the issue."""
bug = "bug"
nit = "nit"
improvement = "improvement"
other = "other"
class IssueClassification(BaseModel):
"""Classify an issue."""
category: Category
other_description: Optional[str] = Field(
description="If classified as 'other', the suggested other category"
)
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI().bind_functions([IssueClassification])
llm.invoke("This PR adds a convenience wrapper to the bind argument")
# AIMessage(content='', additional_kwargs={'function_call': {'name': 'IssueClassification', 'arguments': '{\n "category": "improvement"\n}'}})
```
- Prefer lambda type annotations over inferred dict schema
- For sequences that start with RunnableAssign infer seq input type as
"input type of 2nd item in sequence - output type of runnable assign"
- **Description:**
Before:
`
To install modules needed for the common LLM providers, run:
`
After:
`
To install modules needed for the common LLM providers, run the
following command. Please bear in mind that this command is exclusively
compatible with the `bash` shell:
`
> This is required for the user so that the user will know if this
command is compatible with `zsh` or not.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Replace this entire comment with:
-Add MultiOn close function and update key value and add async
functionality
- solved the key value TabId not found.. (updated to use latest key
value)
@hwchase17
**Description:** Textract PDF Loader generating linearized output,
meaning it will replicate the structure of the source document as close
as possible based on the features passed into the call (e. g. LAYOUT,
FORMS, TABLES). With LAYOUT reading order for multi-column documents or
identification of lists and figures is supported and with TABLES it will
generate the table structure as well. FORMS will indicate "key: value"
with columms.
- **Issue:** the issue fixes#12068
- **Dependencies:** amazon-textract-textractor is added, which provides
the linearization
- **Tag maintainer:** @3coins
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Ran the following bash script for all templates
```bash
#!/bin/bash
set -e
current_dir="$(pwd)"
for directory in */; do
if [ -d "$directory" ]; then
(cd "$directory" && poetry lock --no-update)
fi
done
cd "$current_dir"
```
Co-authored-by: Bagatur <baskaryan@gmail.com>
can get the correct token count instead of using gpt-2 model
**Description:**
Implement get_num_tokens within VertexLLM to use google's count_tokens
function.
(https://cloud.google.com/vertex-ai/docs/generative-ai/get-token-count).
So we don't need to download gpt-2 model from huggingface, also when we
do the mapreduce chain we can get correct token count.
**Tag maintainer:**
@lkuligin
**Twitter handle:**
My twitter: @abehsu1992626
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Following this tutoral about using OpenAI Embeddings with FAISS
https://python.langchain.com/docs/integrations/vectorstores/faiss
```python
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.document_loaders import TextLoader
from langchain.document_loaders import TextLoader
loader = TextLoader("../../../extras/modules/state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
embeddings = OpenAIEmbeddings()
```
This works fine
```python
db = FAISS.from_documents(docs, embeddings)
query = "What did the president say about Ketanji Brown Jackson"
docs = db.similarity_search(query)
```
But the async version is not
```python
db = await FAISS.afrom_documents(docs, embeddings) # NotImplementedError
query = "What did the president say about Ketanji Brown Jackson"
docs = await db.asimilarity_search(query) # this will use await asyncio.get_event_loop().run_in_executor under the hood and will not call OpenAIEmbeddings.aembed_query but call OpenAIEmbeddings.embed_query
```
So this PR add async/await supports for FAISS
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Before making a new `langchain` release, we want to test that everything
works as expected. This PR lets us publish `langchain` to test PyPI,
then install it from there and run checks to ensure everything works
normally before publishing it "for real".
It also takes the opportunity to refactor the build process, splitting
up the build, release-creation, and PyPI upload steps into separate jobs
that do not share their elevated permissions with each other.
- Description: adding support to Activeloop's DeepMemory feature that
boosts recall up to 25%. Added Jupyter notebook showcasing the feature
and also made index params explicit.
- Twitter handle: will really appreciate if we could announce this on
twitter.
---------
Co-authored-by: adolkhan <adilkhan.sarsen@alumni.nu.edu.kz>
Hey, we're looking to invest more in adding cohere integrations to
langchain so would love to get more of an idea for how it's used.
Hopefully this pr is acceptable. This week I'm also going to be looking
into adding our new [retrieval augmented generation
product](https://txt.cohere.com/chat-with-rag/) to langchain.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
## **Description:**
When building our own readthedocs.io scraper, we noticed a couple
interesting things:
1. Text lines with a lot of nested <span> tags would give unclean text
with a bunch of newlines. For example, for [Langchain's
documentation](https://api.python.langchain.com/en/latest/document_loaders/langchain.document_loaders.readthedocs.ReadTheDocsLoader.html#langchain.document_loaders.readthedocs.ReadTheDocsLoader),
a single line is represented in a complicated nested HTML structure, and
the naive `soup.get_text()` call currently being made will create a
newline for each nested HTML element. Therefore, the document loader
would give a messy, newline-separated blob of text. This would be true
in a lot of cases.
<img width="945" alt="Screenshot 2023-10-26 at 6 15 39 PM"
src="https://github.com/langchain-ai/langchain/assets/44193474/eca85d1f-d2bf-4487-a18a-e1e732fadf19">
<img width="1031" alt="Screenshot 2023-10-26 at 6 16 00 PM"
src="https://github.com/langchain-ai/langchain/assets/44193474/035938a0-9892-4f6a-83cd-0d7b409b00a3">
Additionally, content from iframes, code from scripts, css from styles,
etc. will be gotten if it's a subclass of the selector (which happens
more often than you'd think). For example, [this
page](https://pydeck.gl/gallery/contour_layer.html#) will scrape 1.5
million characters of content that looks like this:
<img width="1372" alt="Screenshot 2023-10-26 at 6 32 55 PM"
src="https://github.com/langchain-ai/langchain/assets/44193474/dbd89e39-9478-4a18-9e84-f0eb91954eac">
Therefore, I wrote a recursive _get_clean_text(soup) class function that
1. skips all irrelevant elements, and 2. only adds newlines when
necessary.
2. Index pages (like [this
one](https://api.python.langchain.com/en/latest/api_reference.html))
would be loaded, chunked, and eventually embedded. This is really bad
not just because the user will be embedding irrelevant information - but
because index pages are very likely to show up in retrieved content,
making retrieval less effective (in our tests). Therefore, I added a
bool parameter `exclude_index_pages` defaulted to False (which is the
current behavior — although I'd petition to default this to True) that
will skip all pages where links take up 50%+ of the page. Through manual
testing, this seems to be the best threshold.
## Other Information:
- **Issue:** n/a
- **Dependencies:** n/a
- **Tag maintainer:** n/a
- **Twitter handle:** @andrewthezhou
---------
Co-authored-by: Andrew Zhou <andrew@heykona.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:**
* Add unit tests for document_transformers/beautiful_soup_transformer.py
* Basic functionality is tested (extract tags, remove tags, drop lines)
* add a FIXME comment about the order of tags that is not preserved
(and a passing test, but with the expected tags now out-of-order)
- **Issue:** None
- **Dependencies:** None
- **Tag maintainer:** @rlancemartin
- **Twitter handle:** `peter_v`
Please make sure your PR is passing linting and testing before
submitting.
=> OK: I ran `make format`, `make test` (passing after install of
beautifulsoup4) and `make lint`.
- **Description:** Added masking of the API Key for AI21 LLM when
printed and improved the docstring for AI21 LLM.
- Updated the AI21 LLM to utilize SecretStr from pydantic to securely
manage API key.
- Made improvements in the docstring of AI21 LLM. It now mentions that
the API key can also be passed as a named parameter to the constructor.
- Added unit tests.
- **Issue:** #12165
- **Tag maintainer:** @eyurtsev
---------
Co-authored-by: Anirudh Gautam <anirudh@Anirudhs-Mac-mini.local>
Currently this gives a bug:
```
from langchain.schema.runnable import RunnableLambda
bound = RunnableLambda(lambda x: x).with_config({"callbacks": []})
# ConfigError: field "callbacks" not yet prepared so type is still a ForwardRef, you might need to call RunnableConfig.update_forward_refs().
```
Rather than deal with cyclic imports and extra load time, etc., I think
it makes sense to just have a separate Callbacks definition here that is
a relaxed typehint.
1. Allow run evaluators to return {"results": [list of evaluation
results]} in the evaluator callback.
2. Allows run evaluators to pick the target run ID to provide feedback
to
(1) means you could do something like a function call that populates a
full rubric in one go (not sure how reliable that is in general though)
rather than splitting off into separate LLM calls - cheaper and less
code to write
(2) means you can provide feedback to runs on subsequent calls.
Immediate use case is if you wanted to add an evaluator to a chat bot
and assign to assign to previous conversation turns
have a corresponding one in the SDK
In the GoogleSerperResults class, the name field is defined as
'google_serrper_results_json'. This looks like a typo, and perhaps
should be 'google_serper_results_json'.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Add Redis langserve template! Eventually will add semantic caching to
this too. But I was struggling to get that to work for some reason with
the LCEL implementation here.
- **Description:** Introduces the Redis LangServe template. A simple RAG
based app built on top of Redis that allows you to chat with company's
public financial data (Edgar 10k filings)
- **Issue:** None
- **Dependencies:** The template contains the poetry project
requirements to run this template
- **Tag maintainer:** @baskaryan @Spartee
- **Twitter handle:** @tchutch94
**Note**: this requires the commit here that deletes the
`_aget_relevant_documents()` method from the Redis retriever class that
wasn't implemented. That was breaking the langserve app.
---------
Co-authored-by: Sam Partee <sam.partee@redis.com>
Updates the bedrock rag template.
- Removes pinecone and replaces with FAISS as the vector store
- Fixes the environment variables, setting defaults
- Adds a `main.py` test file quick sanity testing
- Updates README.md with correct instructions
-**Description** Adds returning the reranking score when using semantic
search
-**Issue:* #12317
---------
Co-authored-by: Adam Law <adamlaw@microsoft.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Bumps
[@babel/traverse](https://github.com/babel/babel/tree/HEAD/packages/babel-traverse)
from 7.22.8 to 7.23.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/babel/babel/releases"><code>@babel/traverse</code>'s
releases</a>.</em></p>
<blockquote>
<h2>v7.23.2 (2023-10-11)</h2>
<p><strong>NOTE</strong>: This release also re-publishes
<code>@babel/core</code>, even if it does not appear in the linked
release commit.</p>
<p>Thanks <a
href="https://github.com/jimmydief"><code>@jimmydief</code></a> for
your first PR!</p>
<h4>🐛 Bug Fix</h4>
<ul>
<li><code>babel-traverse</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/16033">#16033</a>
Only evaluate own String/Number/Math methods (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-preset-typescript</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/16022">#16022</a>
Rewrite <code>.tsx</code> extension when using
<code>rewriteImportExtensions</code> (<a
href="https://github.com/jimmydief"><code>@jimmydief</code></a>)</li>
</ul>
</li>
<li><code>babel-helpers</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/16017">#16017</a>
Fix: fallback to typeof when toString is applied to incompatible object
(<a href="https://github.com/JLHwung"><code>@JLHwung</code></a>)</li>
</ul>
</li>
<li><code>babel-helpers</code>,
<code>babel-plugin-transform-modules-commonjs</code>,
<code>babel-runtime-corejs2</code>, <code>babel-runtime-corejs3</code>,
<code>babel-runtime</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/16025">#16025</a>
Avoid override mistake in namespace imports (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
</ul>
</li>
</ul>
<h4>Committers: 5</h4>
<ul>
<li>Babel Bot (<a
href="https://github.com/babel-bot"><code>@babel-bot</code></a>)</li>
<li>Huáng Jùnliàng (<a
href="https://github.com/JLHwung"><code>@JLHwung</code></a>)</li>
<li>James Diefenderfer (<a
href="https://github.com/jimmydief"><code>@jimmydief</code></a>)</li>
<li>Nicolò Ribaudo (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
<li><a
href="https://github.com/liuxingbaoyu"><code>@liuxingbaoyu</code></a></li>
</ul>
<h2>v7.23.1 (2023-09-25)</h2>
<p>Re-publishing <code>@babel/helpers</code> due to a publishing error
in 7.23.0.</p>
<h2>v7.23.0 (2023-09-25)</h2>
<p>Thanks <a
href="https://github.com/lorenzoferre"><code>@lorenzoferre</code></a>
and <a
href="https://github.com/RajShukla1"><code>@RajShukla1</code></a> for
your first PRs!</p>
<h4>🚀 New Feature</h4>
<ul>
<li><code>babel-plugin-proposal-import-wasm-source</code>,
<code>babel-plugin-syntax-import-source</code>,
<code>babel-plugin-transform-dynamic-import</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15870">#15870</a>
Support transforming <code>import source</code> for wasm (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-helper-module-transforms</code>,
<code>babel-helpers</code>,
<code>babel-plugin-proposal-import-defer</code>,
<code>babel-plugin-syntax-import-defer</code>,
<code>babel-plugin-transform-modules-commonjs</code>,
<code>babel-runtime-corejs2</code>, <code>babel-runtime-corejs3</code>,
<code>babel-runtime</code>, <code>babel-standalone</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15878">#15878</a>
Implement <code>import defer</code> proposal transform support (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-generator</code>, <code>babel-parser</code>,
<code>babel-types</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15845">#15845</a>
Implement <code>import defer</code> parsing support (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
<li><a
href="https://redirect.github.com/babel/babel/pull/15829">#15829</a> Add
parsing support for the "source phase imports" proposal (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-generator</code>,
<code>babel-helper-module-transforms</code>, <code>babel-parser</code>,
<code>babel-plugin-transform-dynamic-import</code>,
<code>babel-plugin-transform-modules-amd</code>,
<code>babel-plugin-transform-modules-commonjs</code>,
<code>babel-plugin-transform-modules-systemjs</code>,
<code>babel-traverse</code>, <code>babel-types</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15682">#15682</a> Add
<code>createImportExpressions</code> parser option (<a
href="https://github.com/JLHwung"><code>@JLHwung</code></a>)</li>
</ul>
</li>
<li><code>babel-standalone</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15671">#15671</a>
Pass through nonce to the transformed script element (<a
href="https://github.com/JLHwung"><code>@JLHwung</code></a>)</li>
</ul>
</li>
<li><code>babel-helper-function-name</code>,
<code>babel-helper-member-expression-to-functions</code>,
<code>babel-helpers</code>, <code>babel-parser</code>,
<code>babel-plugin-proposal-destructuring-private</code>,
<code>babel-plugin-proposal-optional-chaining-assign</code>,
<code>babel-plugin-syntax-optional-chaining-assign</code>,
<code>babel-plugin-transform-destructuring</code>,
<code>babel-plugin-transform-optional-chaining</code>,
<code>babel-runtime-corejs2</code>, <code>babel-runtime-corejs3</code>,
<code>babel-runtime</code>, <code>babel-standalone</code>,
<code>babel-types</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15751">#15751</a> Add
support for optional chain in assignments (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-helpers</code>,
<code>babel-plugin-proposal-decorators</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15895">#15895</a>
Implement the "decorator metadata" proposal (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-traverse</code>, <code>babel-types</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15893">#15893</a> Add
<code>t.buildUndefinedNode</code> (<a
href="https://github.com/liuxingbaoyu"><code>@liuxingbaoyu</code></a>)</li>
</ul>
</li>
<li><code>babel-preset-typescript</code></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/babel/babel/blob/main/CHANGELOG.md"><code>@babel/traverse</code>'s
changelog</a>.</em></p>
<blockquote>
<h2>v7.23.2 (2023-10-11)</h2>
<h4>🐛 Bug Fix</h4>
<ul>
<li><code>babel-traverse</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/16033">#16033</a>
Only evaluate own String/Number/Math methods (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-preset-typescript</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/16022">#16022</a>
Rewrite <code>.tsx</code> extension when using
<code>rewriteImportExtensions</code> (<a
href="https://github.com/jimmydief"><code>@jimmydief</code></a>)</li>
</ul>
</li>
<li><code>babel-helpers</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/16017">#16017</a>
Fix: fallback to typeof when toString is applied to incompatible object
(<a href="https://github.com/JLHwung"><code>@JLHwung</code></a>)</li>
</ul>
</li>
<li><code>babel-helpers</code>,
<code>babel-plugin-transform-modules-commonjs</code>,
<code>babel-runtime-corejs2</code>, <code>babel-runtime-corejs3</code>,
<code>babel-runtime</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/16025">#16025</a>
Avoid override mistake in namespace imports (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
</ul>
</li>
</ul>
<h2>v7.23.0 (2023-09-25)</h2>
<h4>🚀 New Feature</h4>
<ul>
<li><code>babel-plugin-proposal-import-wasm-source</code>,
<code>babel-plugin-syntax-import-source</code>,
<code>babel-plugin-transform-dynamic-import</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15870">#15870</a>
Support transforming <code>import source</code> for wasm (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-helper-module-transforms</code>,
<code>babel-helpers</code>,
<code>babel-plugin-proposal-import-defer</code>,
<code>babel-plugin-syntax-import-defer</code>,
<code>babel-plugin-transform-modules-commonjs</code>,
<code>babel-runtime-corejs2</code>, <code>babel-runtime-corejs3</code>,
<code>babel-runtime</code>, <code>babel-standalone</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15878">#15878</a>
Implement <code>import defer</code> proposal transform support (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-generator</code>, <code>babel-parser</code>,
<code>babel-types</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15845">#15845</a>
Implement <code>import defer</code> parsing support (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
<li><a
href="https://redirect.github.com/babel/babel/pull/15829">#15829</a> Add
parsing support for the "source phase imports" proposal (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-generator</code>,
<code>babel-helper-module-transforms</code>, <code>babel-parser</code>,
<code>babel-plugin-transform-dynamic-import</code>,
<code>babel-plugin-transform-modules-amd</code>,
<code>babel-plugin-transform-modules-commonjs</code>,
<code>babel-plugin-transform-modules-systemjs</code>,
<code>babel-traverse</code>, <code>babel-types</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15682">#15682</a> Add
<code>createImportExpressions</code> parser option (<a
href="https://github.com/JLHwung"><code>@JLHwung</code></a>)</li>
</ul>
</li>
<li><code>babel-standalone</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15671">#15671</a>
Pass through nonce to the transformed script element (<a
href="https://github.com/JLHwung"><code>@JLHwung</code></a>)</li>
</ul>
</li>
<li><code>babel-helper-function-name</code>,
<code>babel-helper-member-expression-to-functions</code>,
<code>babel-helpers</code>, <code>babel-parser</code>,
<code>babel-plugin-proposal-destructuring-private</code>,
<code>babel-plugin-proposal-optional-chaining-assign</code>,
<code>babel-plugin-syntax-optional-chaining-assign</code>,
<code>babel-plugin-transform-destructuring</code>,
<code>babel-plugin-transform-optional-chaining</code>,
<code>babel-runtime-corejs2</code>, <code>babel-runtime-corejs3</code>,
<code>babel-runtime</code>, <code>babel-standalone</code>,
<code>babel-types</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15751">#15751</a> Add
support for optional chain in assignments (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-helpers</code>,
<code>babel-plugin-proposal-decorators</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15895">#15895</a>
Implement the "decorator metadata" proposal (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-traverse</code>, <code>babel-types</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15893">#15893</a> Add
<code>t.buildUndefinedNode</code> (<a
href="https://github.com/liuxingbaoyu"><code>@liuxingbaoyu</code></a>)</li>
</ul>
</li>
<li><code>babel-preset-typescript</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15913">#15913</a> Add
<code>rewriteImportExtensions</code> option to TS preset (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-parser</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15896">#15896</a>
Allow TS tuples to have both labeled and unlabeled elements (<a
href="https://github.com/yukukotani"><code>@yukukotani</code></a>)</li>
</ul>
</li>
</ul>
<h4>🐛 Bug Fix</h4>
<ul>
<li><code>babel-plugin-transform-block-scoping</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15962">#15962</a>
fix: <code>transform-block-scoping</code> captures the variables of the
method in the loop (<a
href="https://github.com/liuxingbaoyu"><code>@liuxingbaoyu</code></a>)</li>
</ul>
</li>
</ul>
<h4>💅 Polish</h4>
<ul>
<li><code>babel-traverse</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15797">#15797</a>
Expand evaluation of global built-ins in <code>@babel/traverse</code>
(<a
href="https://github.com/lorenzoferre"><code>@lorenzoferre</code></a>)</li>
</ul>
</li>
<li><code>babel-plugin-proposal-explicit-resource-management</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15985">#15985</a>
Improve source maps for blocks with <code>using</code> declarations (<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
</ul>
</li>
</ul>
<h4>🔬 Output optimization</h4>
<ul>
<li><code>babel-core</code>,
<code>babel-helper-module-transforms</code>,
<code>babel-plugin-transform-async-to-generator</code>,
<code>babel-plugin-transform-classes</code>,
<code>babel-plugin-transform-dynamic-import</code>,
<code>babel-plugin-transform-function-name</code>,
<code>babel-plugin-transform-modules-amd</code>,
<code>babel-plugin-transform-modules-commonjs</code>,
<code>babel-plugin-transform-modules-umd</code>,
<code>babel-plugin-transform-parameters</code>,
<code>babel-plugin-transform-react-constant-elements</code>,
<code>babel-plugin-transform-react-inline-elements</code>,
<code>babel-plugin-transform-runtime</code>,
<code>babel-plugin-transform-typescript</code>,
<code>babel-preset-env</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15984">#15984</a>
Inline <code>exports.XXX =</code> update in simple variable declarations
(<a
href="https://github.com/nicolo-ribaudo"><code>@nicolo-ribaudo</code></a>)</li>
</ul>
</li>
</ul>
<h2>v7.22.20 (2023-09-16)</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="b4b9942a6c"><code>b4b9942</code></a>
v7.23.2</li>
<li><a
href="b13376b346"><code>b13376b</code></a>
Only evaluate own String/Number/Math methods (<a
href="https://github.com/babel/babel/tree/HEAD/packages/babel-traverse/issues/16033">#16033</a>)</li>
<li><a
href="ca58ec15cb"><code>ca58ec1</code></a>
v7.23.0</li>
<li><a
href="0f333dafcf"><code>0f333da</code></a>
Add <code>createImportExpressions</code> parser option (<a
href="https://github.com/babel/babel/tree/HEAD/packages/babel-traverse/issues/15682">#15682</a>)</li>
<li><a
href="3744545649"><code>3744545</code></a>
Fix linting</li>
<li><a
href="c7e6806e21"><code>c7e6806</code></a>
Add <code>t.buildUndefinedNode</code> (<a
href="https://github.com/babel/babel/tree/HEAD/packages/babel-traverse/issues/15893">#15893</a>)</li>
<li><a
href="38ee8b4dd6"><code>38ee8b4</code></a>
Expand evaluation of global built-ins in <code>@babel/traverse</code>
(<a
href="https://github.com/babel/babel/tree/HEAD/packages/babel-traverse/issues/15797">#15797</a>)</li>
<li><a
href="9f3dfd9021"><code>9f3dfd9</code></a>
v7.22.20</li>
<li><a
href="3ed28b29c1"><code>3ed28b2</code></a>
Fully support <code>||</code> and <code>&&</code> in
<code>pluginToggleBooleanFlag</code> (<a
href="https://github.com/babel/babel/tree/HEAD/packages/babel-traverse/issues/15961">#15961</a>)</li>
<li><a
href="77b0d73599"><code>77b0d73</code></a>
v7.22.19</li>
<li>Additional commits viewable in <a
href="https://github.com/babel/babel/commits/v7.23.2/packages/babel-traverse">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/langchain-ai/langchain/network/alerts).
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
**Description:** Improve handling of empty queries in timescale-vector.
For timescale-vector it is more efficient to get a None embedding when
the embedding has no semantic meaning. It allows timescale-vector to
perform more optimizations. Thus, when the query is empty, use a None
embedding.
Also pass down constructor arguments to the timescale vector client.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
This code path is hit in the following case:
- Start in langchain code and manually provide a tracer
- Handoff to the traceable
- Hand back to langchain code.
Which happens for evaluating `@traceable` functions unfortunately
- **Description: To handle the hybrid search with RRF(Reciprocal Rank
Fusion) in the Elasticsearch, rrf argument was added for adjusting
'rank_constant' and 'window_size' to combine multiple result sets with
different relevance indicators into a single result set. (ref:
https://www.elastic.co/kr/blog/whats-new-elastic-enterprise-search-8-9-0),
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** No dependencies changed,
- **Tag maintainer:** @baskaryan,
Nice to meet you,
I'm a newbie for contributions and it's my first PR.
I only changed the langchain/vectorstores/elasticsearch.py file.
I did make format&lint
I got this message,
```shell
make lint_diff
./scripts/check_pydantic.sh .
./scripts/check_imports.sh
poetry run ruff .
[ "langchain/vectorstores/elasticsearch.py" = "" ] || poetry run black langchain/vectorstores/elasticsearch.py --check
All done! ✨🍰✨
1 file would be left unchanged.
[ "langchain/vectorstores/elasticsearch.py" = "" ] || poetry run mypy langchain/vectorstores/elasticsearch.py
langchain/__init__.py: error: Source file found twice under different module names: "mvp.nlp.langchain.libs.langchain.langchain" and "langchain"
Found 1 error in 1 file (errors prevented further checking)
make: *** [lint_diff] Error 2
```
Thank you
---------
Co-authored-by: 황중원 <jwhwang@amorepacific.com>
My postgres out of connections after continuous PGVector usage, and the
reason because it constantly creates new connections, so adding a
reusable pre established connection seems like solves an issue
---------
Co-authored-by: Roman Vasilyev <rvasilyev@mozilla.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
See discussion here:
https://github.com/langchain-ai/langchain/discussions/11680
The code is available for usage from langchain_experimental. The reason
for the deprecation is that the agents are relying on a Python REPL. The
code can only be run safely with appropriate sandboxing.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
The changes introduced in #12267 and #12190 broke the cost computation
of the `completion` tokens for fine-tuned models because of the early
return. This PR aims at fixing this.
@baskaryan.
Adds a `langchain-location` param to lint, so we can properly locate it.
Regular langchain and experimental lint steps are passing, so default
value seems to be working.
**Description:**
Revise `libs/langchain/langchain/document_loaders/async_html.py` to
store the HTML Title and Page Language in the `metadata` of
`AsyncHtmlLoader`.
Compare predicted json to reference. First canonicalize (sort keys, rm
whitespace separators), then return normalized string edit distance.
Not a silver bullet but maybe an easy way to capture structure
differences in a less flakey way
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Will run all CI because of _test change, but future PRs against CLI will
only trigger the new CLI one
Has a bunch of file changes related to formatting/linting.
No mypy yet - coming soon
**Description**
This small change will make chunk_size a configurable parameter for
loading documents into a Supabase database.
**Issue**
https://github.com/langchain-ai/langchain/issues/11422
**Dependencies**
No chanages
**Twitter**
@ j1philli
**Reminder**
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
---------
Co-authored-by: Greg Richardson <greg.nmr@gmail.com>
Description
* Add _generate and _agenerate to support Fireworks batching.
* Add stop words test cases
* Opt out retry mechanism
Issue - Not applicable
Dependencies - None
Tag maintainer - @baskaryan
- **Description:** refactors the redis vector field schema to properly
handle default values, includes a new unit test suite.
- **Issue:** N/A
- **Dependencies:** nothing new.
- **Tag maintainer:** @baskaryan @Spartee
- **Twitter handle:** this is a tiny fix/improvement :)
This issue was causing some clients/cuatomers issues when building a
vector index on Redis on smaller db instances (due to fault default
values in index configuration). It would raise an error like:
```redis.exceptions.ResponseError: Vector index initial capacity 20000 exceeded server limit (852 with the given parameters)```
This PR will address this moving forward.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
This PR replaces the previous `Intent` check with the new `Prompt
Safety` check. The logic and steps to enable chain moderation via the
Amazon Comprehend service, allowing you to detect and redact PII, Toxic,
and Prompt Safety information in the LLM prompt or answer remains
unchanged.
This implementation updates the code and configuration types with
respect to `Prompt Safety`.
### Usage sample
```python
from langchain_experimental.comprehend_moderation import (BaseModerationConfig,
ModerationPromptSafetyConfig,
ModerationPiiConfig,
ModerationToxicityConfig
)
pii_config = ModerationPiiConfig(
labels=["SSN"],
redact=True,
mask_character="X"
)
toxicity_config = ModerationToxicityConfig(
threshold=0.5
)
prompt_safety_config = ModerationPromptSafetyConfig(
threshold=0.5
)
moderation_config = BaseModerationConfig(
filters=[pii_config, toxicity_config, prompt_safety_config]
)
comp_moderation_with_config = AmazonComprehendModerationChain(
moderation_config=moderation_config, #specify the configuration
client=comprehend_client, #optionally pass the Boto3 Client
verbose=True
)
template = """Question: {question}
Answer:"""
prompt = PromptTemplate(template=template, input_variables=["question"])
responses = [
"Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like 323-22-9980. John Doe's phone number is (999)253-9876.",
"Final Answer: This is a really shitty way of constructing a birdhouse. This is fucking insane to think that any birds would actually create their motherfucking nests here."
]
llm = FakeListLLM(responses=responses)
llm_chain = LLMChain(prompt=prompt, llm=llm)
chain = (
prompt
| comp_moderation_with_config
| {llm_chain.input_keys[0]: lambda x: x['output'] }
| llm_chain
| { "input": lambda x: x['text'] }
| comp_moderation_with_config
)
try:
response = chain.invoke({"question": "A sample SSN number looks like this 123-456-7890. Can you give me some more samples?"})
except Exception as e:
print(str(e))
else:
print(response['output'])
```
### Output
```python
> Entering new AmazonComprehendModerationChain chain...
Running AmazonComprehendModerationChain...
Running pii Validation...
Running toxicity Validation...
Running prompt safety Validation...
> Finished chain.
> Entering new AmazonComprehendModerationChain chain...
Running AmazonComprehendModerationChain...
Running pii Validation...
Running toxicity Validation...
Running prompt safety Validation...
> Finished chain.
Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like XXXXXXXXXXXX John Doe's phone number is (999)253-9876.
```
---------
Co-authored-by: Jha <nikjha@amazon.com>
Co-authored-by: Anjan Biswas <anjanavb@amazon.com>
Co-authored-by: Anjan Biswas <84933469+anjanvb@users.noreply.github.com>
**Description:**
This PR adds support for the [Pro version of Titan Takeoff
Server](https://docs.titanml.co/docs/category/pro-features). Users of
the Pro version will have to import the TitanTakeoffPro model, which is
different from TitanTakeoff.
**Issue:**
Also minor fixes to docs for Titan Takeoff (Community version)
**Dependencies:**
No additional dependencies
**Twitter handle:** @becoming_blake
@baskaryan @hwchase17
- **Description:** Super simple fix for colab link on
code_understanding.ipynb,
- **Issue:** not applicable
- **Dependencies:** none,
- **Tag maintainer:** ,
- **Twitter handle:** @kengoodridge
- **Description:**
This PR adds `allowd_operators` property to `QdrantTranslator` to fix
the `TypeError: can only join an iterable` bug. This property is
required in `get_query_constructor_prompt` in
`query_constructor\base.py`:
```
allowed_operators=" | ".join(allowed_operators),
```
- **Issue:**
#12061
---------
Co-authored-by: XIE Qihui <qihui.xie@bopufund.com>
Problem statement:
In the `integrations/llms` and `integrations/chat` pages, we have a
sidebar with ToC, and we also have a ToC at the end of the page.
The ToC at the end of the page is not necessary, and it is confusing
when we mix the index page styles; moreover, it requires manual work.
So, I removed ToC at the end of the page (it was discussed with and
approved by @baskaryan)
If user function is wrapped as a traceable function, this will help hand
off the trace between the two.
Also update handling fields to reflect optional values
- **Description**: Fix for the SPARQL QA chain: fixed SPARQL queries for
retrieving information about relations in the graph to create a textual
description of the schema for the language model. This should resolve
#8907
- **Issue**: #8907
- **Dependencies**: None
- **Tag maintainer**: @baskaryan, @hwchase17
**Description:** When llms output leading or trailing whitespace for xml
(when using XMLOutputParser) the parser would raise a `ValueError: Could
not parse output: ...`. However, leading or trailing whitespace are
"ignorable" in the sense of XML standard.
**Issue:** I did not find an issue related.
**Dependencies:** None
**Tag maintainer:**
**Twitter handle:** donatoaz
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
Done, updated unit test and ran `make docker_test`.
- **Description:** Response parser for arcee retriever,
- **Issue:** follow-up pr on #11578 and
[discussion](https://github.com/arcee-ai/arcee-python/issues/15#issuecomment-1759874053),
- **Dependencies:** NA
This pr implements a parser for the response from ArceeRetreiver to
convert to langchain `Document`. This closes the loop of generation and
retrieval for Arcee DALMs in langchain.
The reference for the response parser is
[api-docs:retrieve](https://api.arcee.ai/docs#/v2/retrieve_model)
Attaching screenshot of working implementation:
<img width="1984" alt="Screenshot 2023-10-25 at 7 42 34 PM"
src="https://github.com/langchain-ai/langchain/assets/65639964/026987b9-34b2-4e4b-b87d-69fcd0c6641a">
\*api key deleted
---
Successful tests, lints, etc.
```shell
Re-run pytest with --snapshot-update to delete unused snapshots.
==================================================================================================================== slowest 5 durations =====================================================================================================================
1.56s call tests/unit_tests/schema/runnable/test_runnable.py::test_retrying
0.63s call tests/unit_tests/schema/runnable/test_runnable.py::test_map_astream
0.33s call tests/unit_tests/schema/runnable/test_runnable.py::test_map_stream_iterator_input
0.30s call tests/unit_tests/schema/runnable/test_runnable.py::test_map_astream_iterator_input
0.20s call tests/unit_tests/indexes/test_indexing.py::test_cleanup_with_different_batchsize
======================================================================================================= 1265 passed, 270 skipped, 32 warnings in 6.55s =======================================================================================================
[ "." = "" ] || poetry run black .
All done! ✨🍰✨
1871 files left unchanged.
[ "." = "" ] || poetry run ruff --select I --fix .
./scripts/check_pydantic.sh .
./scripts/check_imports.sh
poetry run ruff .
[ "." = "" ] || poetry run black . --check
All done! ✨🍰✨
1871 files would be left unchanged.
[ "." = "" ] || poetry run mypy .
Success: no issues found in 1868 source files
poetry run codespell --toml pyproject.toml
poetry run codespell --toml pyproject.toml -w
```
Co-authored-by: Shubham Kushwaha <shwu@Shubhams-MacBook-Pro.local>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
**Description:**
Documents further usage of RetrievalQAWithSourcesChain in an existing
test. I'd not found much documented usage of RetrievalQAWithSourcesChain
and how to get the sources out. This additional code will hopefully be
useful to other potential users of this retriever.
**Issue:** No raised issue
**Dependencies:** No new dependencies needed to run the test (it already
needs `open-ai`, `faiss-cpu` and `unstructured`).
Note - `make lint` showed 8 linting errors in unrelated files
---------
Co-authored-by: richarda23 <richard.c.adams@infinityworks.com>
If I go traceable -> runnable when the project is manually specified,
the runnable wont be logged. This makes sure the session/project is
threaded through appropriately.
- Move Document AI provider to the Google provider page
- Change Vertex AI Matching Engine to Vector Search
- Change references from GCP to Google Cloud
- Add Gmail chat loader to Google provider page
- Change Serper page title to "Serper - Google Search API" since it is
not a Google product.
* Add a type literal for the generation and sub-classes for serialization purposes.
* Fix the root validator of ChatGeneration to return ValueError instead of KeyError or Attribute error if intialized improperly.
* This change is done for langserve to make sure that llm related callbacks can be serialized/deserialized properly.
Fix Description:
For Redis Vector integration in add_texts method, there were two issues
that lead to this bug.
1. Vector index is not being created leading to no such_index error
2. `doc:index` prefix was also missing for Redis Keys.
resolves#11197
Maintainer: @baskaryan
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:**
Add cost calculation for fine tuned models (new and legacy), this is
required after OpenAI added new models for fine tuning and separated the
costs of I/O for fine tuned models.
Also I updated the relevant unit tests
see https://platform.openai.com/docs/guides/fine-tuning for more
information.
issue: https://github.com/langchain-ai/langchain/issues/11715
- **Issue:** 11715
- **Twitter handle:** @nirkopler
- replace `requests` package with `langchain.requests`
- add `_acall` support
- add `_stream` and `_astream`
- freshen up the documentation a bit
- update vendor doc
Allows for passing arguments into the LLM chains used by the
GraphCypherQAChain. This is to address a request by a user to include
memory in the Cypher creating chain. Will keep the prompt variables
as-is to be backward compatible. But, would be a good idea to deprecate
them and use the **kwargs variables. Added a test case.
In general, I think it would be good for any chain to automatically pass
in a readonlymemory(of its input) to its subchains whilist allowing for
an override. But, this would be a different change.
- **Description:**
Add missing apostrophe in `user's` in stuff_prompt's system_template.
The first sentence in the system template went from:
> Use the following pieces of context to answer the users question.
to
> Use the following pieces of context to answer the user's question.
- **Issue:**
- **Dependencies:** none
- **Tag maintainer:** @baskaryan
- **Twitter handle:** ojohnnyo
- This is used internally to gather aggregate usage metrics for the
LangChain integrations
- Note: This cannot be added to some of the Vertex AI integrations at
this time because the SDK doesn't allow overriding the
[`ClientInfo`](https://googleapis.dev/python/google-api-core/latest/client_info.html#module-google.api_core.client_info)
- Added to:
- BigQuery
- Google Cloud Storage
- Document AI
- Vertex AI Model Garden
- Document AI Warehouse
- Vertex AI Search
- Vertex AI Matching Engine (Cloud Storage Client)
@baskaryan, @eyurtsev, @hwchase17
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
code of conduct.md file is missing it is generally present in good repos
which have large community
Replace this entire comment with:
- **Description:** Added a `code_of_conduct.md` file to the repository
to establish community standards and guidelines for contributors.
- **Issue:** N/A
- **Dependencies:** N/A
- **Tag maintainer:** N/A
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
- **Description:** In the max_marginal_relevance_search function of the
ElasticsearchStore vector store, the name of the field corresponding to
the vector embedding of the document is hard coded in the delete
statement that drops the field from the document metadata. This results
in an exception if the vector embedding field is customized. This PR
changes the hard-coded "vector" into the vector_query_field variable.
- **Issue:** None
- **Dependencies:** None
- **Tag maintainer:** @hwchase17
Co-authored-by: Shilong Dai <sdai@viperfish.net>
**Description: Allow to inject boto3 client for Cross account access
type of scenarios in using SagemakerEndpointEmbeddings and also updated
the documentation for same in the sample notebook**
**Issue:SagemakerEndpointEmbeddings cross account capability #10634
#10184**
Dependencies: None
Tag maintainer:
Twitter handle:lethargicoder
Co-authored-by: Vikram(VS) <vssht@amazon.com>
- **Description:** sqlalchemy create_engine() does not take into account
connect_args which are mandatory for managed PGSQL instances on cloud
providers (ssl_context for example).
Also re-enabled create_vector_extension at post_init for using pgvector
class seamlessly
- **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17.
---------
Co-authored-by: Sami Bargaoui <bargaoui.sam@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
If non-pickleable objects (like locks) get passed to the tracing
callback, they'll fail in the deepcopy. Fallback to a shallow copy in
these instances .
We don't use any of the new functionality at the moment. Just making
sure we don't fall back on versions and fail to benefit from new
patches. This is an easy upgrade and it's always harder to upgrade
across multiple major versions at once.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Adding Tavily Search API as a tool. I will be the maintainer and
assaf_elovic is the twitter handler.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Current ChatTongyi is not compatible with DashScope API, which will
cause error when passing api key to chat model directly.
- **Description:** Update tongyi.py to be compatible with DashScope API.
Specifically, update parameter name "dashscope_api_key" to "api_key".
- **Issue:** None.
- **Dependencies:** Nothing new, Tongyi would require DashScope as
before.
- **Description:**
- Replace Telegram with Whatsapp in whatsapp.ipynb
- Add # to mark the telegram as heading in telegram.ipynb
- **Issue:** None
- **Dependencies:** None
- **Description:** Implementing the Google Scholar Tool as requested in
PR #11505. The tool will be using the [serpapi python
package](https://serpapi.com/integrations/python#search-google-scholar).
The main idea of the tool will be to return the results from a Google
Scholar search given a query as an input to the tool.
- **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17
Replace this entire comment with:
- **Description:** Fix superfluous [Auto-fixing
parser](https://python.langchain.com/docs/modules/model_io/output_parsers/output_fixing_parser)
docs. Also switching to `langchain.pydantic_v1` from the direct
reference to `pydantic`,
- **Issue:** N/A,
- **Dependencies:** N/A,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** @dosuken123
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
- Fixes error:
```
ValueError: "GoogleVertexAISearchRetriever" object has no field "_serving_config"
```
Introduced in #11736
@baskaryan, @eyurtsev, @hwchase17 if you could review and merge quickly,
that would be appreciated :)
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** The return info in the documentation for
similarity_search_by_vector and similarity_search_with_relevance_scores
is wrong
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
This reverts commit a46eef64a7.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** Provide a way to use different text for embedding.
- For example, if you are ingesting stack-overflow Q&As for RAG, you
would want to embed the questions and return the answer(s) for the hits.
With this change, the consumer of langchain can implement that easily.
- I noticed the similar function is added on faiss.py with #1912 which
was for performance reason, but I see the same function can be used to
achieve what I thought. So instead of changing Document class to have
embedding_content, I mimicked the implementation of faiss.py.
- The test should provide some guidance on how to use it. It would be
more intuitive if I just pass texts and embedding_texts as separate
arguments, but I chose to use `zip`-ed object for the consistency with
faiss.py implementation.
- I plan to make similar pull request for OpenSearch.
- **Issue:** N/A
- **Dependencies:** None other than the existing ones.
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Adding Pydantic v2 support for OpenAPI Specs
- **Issue:**
- OpenAPI spec support was disabled because `openapi-schema-pydantic`
doesn't support Pydantic v2:
#9205
- Caused errors in `get_openapi_chain`
- This may be the cause of #9520.
- **Tag maintainer:** @eyurtsev
- **Twitter handle:** kreneskyp
The root cause was that `openapi-schema-pydantic` hasn't been updated in
some time but
[openapi-pydantic](https://github.com/mike-oakley/openapi-pydantic)
forked and updated the project.
Added a notebook with examples of the creation of a retriever from the
SingleStoreDB vector store, and further usage.
Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
Updated the elasticsearch self query retriever to use the match clause
for LIKE operator instead of the non-analyzed fuzzy search clause.
Other small updates include:
- fixing the stack inference integration test where the index's default
pipeline didn't use the inference pipeline created
- adding a user-agent to the old implementation to track usage
- improved the documentation for ElasticsearchStore filters
### Description:
To provide an eas llm service access methods in this pull request by
impletementing `PaiEasEndpoint` and `PaiEasChatEndpoint` classes in
`langchain.llms` and `langchain.chat_models` modules. Base on this pr,
langchain users can build up a chain to call remote eas llm service and
get the llm inference results.
### About EAS Service
EAS is a Alicloud product on Alibaba Cloud Machine Learning Platform for
AI which is short for AliCloud PAI. EAS provides model inference
deployment services for the users. We build up a llm inference services
on EAS with a general llm docker images. Therefore, end users can
quickly setup their llm remote instances to load majority of the
hugginface llm models, and serve as a backend for most of the llm apps.
### Dependencies
This pr does't involve any new dependencies.
---------
Co-authored-by: 子洪 <gaoyihong.gyh@alibaba-inc.com>
Description: Supported RetryOutputParser & RetryWithErrorOutputParser
max_retries
- max_retries: Maximum number of retries to parser.
Issue: None
Dependencies: None
Tag maintainer: @baskaryan
Twitter handle:
We now require uses to have the pip package `llmonitor` installed. It
allows us to have cleaner code and avoid duplicates between our library
and our code in Langchain.
FAISS does not implement embeddings method and use embed_query to
embedding texts which is wrong for some embedding models.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
feat: Raise KeyError when 'prompt' key is missing in JSON response
This commit updates the error handling in the code to raise a KeyError
when the 'prompt' key is not found in the JSON response. This change
makes the code more explicit about the nature of the error, helping to
improve clarity and debugging.
@baskaryan, @eyurtsev.
I may be missing something but it seems like we inappropriately overrode
the 'stream()' method, losing callbacks in the process. I don't think
(?) it gave us anything in this case to customize it here?
See new trace:
https://smith.langchain.com/public/fbb82825-3a16-446b-8207-35622358db3b/r
and confirmed it streams.
Also fixes the stopwords issues from #12000
- **Description:** According to the document
https://cloud.baidu.com/doc/WENXINWORKSHOP/s/clntwmv7t, add ERNIE-Bot-4
model support for ErnieBotChat.
- **Dependencies:** Before using the ERNIE-Bot-4, you should have the
model's access authority.
By default replace input_variables with the correct value
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
.dict() is a Pydantic method that cannot raise exceptions, as it is used
eg. in `__eq__`
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Type hinting `*args` as `List[Any]` means that each positional argument
should be a list. Type hinting `**kwargs` as `Dict[str, Any]` means that
each keyword argument should be a dict of strings.
This is almost never what we actually wanted, and doesn't seem to be
what we want in any of the cases I'm replacing here.
The Docs folder changed its structure, and the notebook example for
SingleStoreDChatMessageHistory has not been copied to the new place due
to a merge conflict. Adding the example to the correct place.
Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
- Update Zep Memory and Retriever docstrings
- Zep Memory Retriever: Add support for native MMR
- Add MMR example to existing ZepRetriever Notebook
@baskaryan
Example
```
from langchain.schema.runnable import RunnableLambda
from langsmith import traceable
chain = RunnableLambda(lambda x: x)
@traceable(run_type = "chain")
def my_traceable(a):
chain.invoke(a)
my_traceable(5)
```
Would have a nested result.
This would NOT work for interleaving chains and traceables. E.g., things
like thiswould still not work well
```
from langchain.schema.runnable import RunnableLambda
from langsmith import traceable
@traceable()
def other_traceable(a):
return a
def foo(x):
return other_traceable(x)
chain = RunnableLambda(foo)
@traceable(run_type = "chain")
def my_traceable(a):
chain.invoke(a)
my_traceable(5)
```
Minor lint dependency version upgrade to pick up latest functionality.
Ruff's new v0.1 version comes with lots of nice features, like
fix-safety guarantees and a preview mode for not-yet-stable features:
https://astral.sh/blog/ruff-v0.1.0
- **Description:** Chroma >= 0.4.10 added support for batch sizes
validation of add/upsert. This batch size is dependent on the SQLite
limits of the target system and varies. In this change, for
Chroma>=0.4.10 batch splitting was added as the aforementioned
validation is starting to surface in the Chroma community (users using
LC)
- **Issue:** N/A
- **Dependencies:** N/A
- **Tag maintainer:** @eyurtsev
- **Twitter handle:** t_azarov
- Description: Considering the similarity computation method of
[BGE](https://github.com/FlagOpen/FlagEmbedding) model is cosine
similarity, set normalize_embeddings to be True.
- Tag maintainer: @baskaryan
Co-authored-by: Erick Friis <erick@langchain.dev>
Description: A large language models developed by Baichuan Intelligent
Technology,https://www.baichuan-ai.com/home
Issue: None
Dependencies: None
Tag maintainer:
Twitter handle:
This adds security notices to toolkits init, and to several toolkits.
We'll need to continue documenting the rest of the toolkits.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- Fix some typing issues found while doing that
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
the updated value was:
` Criteria.MISOGYNY: "Is the submission misogynistic? If so, respond Y."
`
The " If so, respond Y." should not be here. This sub-string is not
presented in any other criteria and should not be presented here.
I also added a synonym to "misogynistic" as it done in many other
criteria.
The current ToC on the index page and on navbar don't match. Page titles
and Titles in ToC doesn't match
Changes:
- made ToCs equal
- made titles equal
- updated some page formattings.
I have fixed some typos in file
`cookbook/Semi_structured_multi_modal_RAG_LLaMA2.ipynb`. I kindly
request the repo maintainers to review and merge it. Thanks!
I have fixed some typos in file
`cookbook/Semi_structured_and_multi_modal_RAG.ipynb`. I kindly request
the repo maintainers to review and merge it. Thanks!
**Description**
- Added the `SingleStoreDBChatMessageHistory` class that inherits
`BaseChatMessageHistory` and allows to use of a SingleStoreDB database
as a storage for chat message history.
- Added integration test to check that everything works (requires
`singlestoredb` to be installed)
- Added notebook with usage example
- Removed custom retriever for SingleStoreDB vector store (as it is
useless)
---------
Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
## Description
| Tool | Original Tool Name |
|-----------------------------|---------------------------|
| open-meteo-api | Open Meteo API |
| news-api | News API |
| tmdb-api | TMDB API |
| podcast-api | Podcast API |
| golden_query | Golden Query |
| dall-e-image-generator | Dall-E Image Generator |
| twilio | Text Message |
| searx_search_results | Searx Search Results |
| dataforseo | DataForSeo Results JSON |
When using these tools through `load_tools`, I encountered the following
validation error:
```console
openai.error.InvalidRequestError: 'TMDB API' does not match '^[a-zA-Z0-9_-]{1,64}$' - 'functions.0.name'
```
In order to avoid this error, I replaced spaces with hyphens in the tool
names:
| Tool | Corrected Tool Name |
|-----------------------------|---------------------------|
| open-meteo-api | Open-Meteo-API |
| news-api | News-API |
| tmdb-api | TMDB-API |
| podcast-api | Podcast-API |
| golden_query | Golden-Query |
| dall-e-image-generator | Dall-E-Image-Generator |
| twilio | Text-Message |
| searx_search_results | Searx-Search-Results |
| dataforseo | DataForSeo-Results-JSON |
This correction resolved the validation error.
Additionally, a unit test,
`tests/unit_tests/schema/runnable/test_runnable.py::test_stream_log_retriever`,
was failing at random. Upon further investigation, I confirmed that the
failure was not related to the above-mentioned changes. The `stream_log`
variable was generating the order of logs in two ways at random The
reason for this behavior is unclear, but in the assertion, I included
both possible orders to account for this variability.
Fixed a typo :
"asyncrhonized" > "asynchronized"
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Hello Folks,
Alibaba Cloud OpenSearch has released a new version of the vector
storage engine, which has significantly improved performance compared to
the previous version. At the same time, the sdk has also undergone
changes, requiring adjustments alibaba opensearch vector store code to
adapt.
This PR includes:
Adapt to the latest version of Alibaba Cloud OpenSearch API.
More comprehensive unit testing.
Improve documentation.
I have read your contributing guidelines. And I have passed the tests
below
- [x] make format
- [x] make lint
- [x] make coverage
- [x] make test
---------
Co-authored-by: zhaoshengbo <shengbo.zsb@alibaba-inc.com>
This patch fixes some spelling typo in
learned_prompt_optimization.ipynb.
It only changed messages, no logic changed.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
**Description:**
While working on the Docusaurus site loader #9138, I noticed some
outdated docs and tests for the Sitemap Loader.
**Issue:**
This is tangentially related to #6691 in reference to doc links. I plan
on digging in to a few of these issue when I find time next.
- **Description:** added examples to Vertex chat models as optional
class attributes, so that a model with examples can be used inside a
chain
- **Twitter handle:** lkuligin
Adding description of the `View deployment` button on the PR page. This
nice feature was not documented.
---------
Co-authored-by: Erick Friis <erickfriis@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Related to #10800
- Errors in the Docstring of GradientLLM / Gradient.ai LLM
- Renamed the `model_id` to `model` and adapting this in all tests.
Reason to so is to be in Sync with `GradientEmbeddings` and other LLM's.
- inmproving tests so they check the headers in the sent request.
- making the aiosession a private attribute in the docs, as in the
future `pip install gradientai` will be replacing aiosession.
- adding a example how to fine-tune on the Prompt Template as suggested
in #10800
- Description: Adds the ChatEverlyAI class with llama-2 7b on [EverlyAI
Hosted
Endpoints](https://everlyai.xyz/)
- It inherits from ChatOpenAI and requires openai (probably unnecessary
but it made for a quick and easy implementation)
---------
Co-authored-by: everly-studio <127131037+everly-studio@users.noreply.github.com>
- **Description:**
- If the Elasticsearch field used for Langchain > Document.page_content
is missing because the specific document is
somehow malformed fail gracefully.
- **Tag maintainer:**
- @joemcelroy
Reverts langchain-ai/langchain#11714
This has linting and formatting issues, plus it's added to chat models
folder but doesn't subclass Chat Model base class
Motivation and Context
At present, the Baichuan Large Language Model is relatively popular and
efficient in performance. Due to widespread market recognition, this
model has been added to enhance the scalability of Langchain's ability
to access the big language model, so as to facilitate application access
and usage for interested users.
System Info
langchain: 0.0.295
python:3.8.3
IDE:vs code
Description
Add the following files:
1. Add baichuan_baichuaninc_endpoint.py in the
libs/langchain/langchain/chat_models
2. Modify the __init__.py file,which is located in the
libs/langchain/langchain/chat_models/__init__.py:
a. Add "from langchain.chat_models.baichuan_baichuaninc_endpoint import
BaichuanChatEndpoint"
b. Add "BaichuanChatEndpoint" In the file's __ All__ method
Your contribution
I am willing to help implement this feature and submit a PR, but I would
appreciate guidance from the maintainers or community to ensure the
changes are made correctly and in line with the project's standards and
practices.
- **Description:** Add `TrainableLLM` for those LLM support fine-tuning
- **Tag maintainer:** @hwchase17
This PR add training methods to `GradientLLM`
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Hi there
This PR is aim to implement chat model for Alibaba Tongyi LLM model. It
contains work below:
1.Implement ChatTongyi chat model in langchain.chat_models.tongyi. Note
this is different with tongyi llm model to another PR
https://github.com/langchain-ai/langchain/pull/10878.
For detail it implements _generate() and _stream() function in
ChatTongyi.
2. Add some examples in chat/tongyi.ipynb.
3. Add integration test in chat_models/test_tongyi.py
Note async completion for the Text API is not yet supported.
Dependencies: dashscope. It will be installed manually cause it is not
need by everyone.
**Description**
This PR adds the `ElasticsearchChatMessageHistory` implementation that
stores chat message history in the configured
[Elasticsearch](https://www.elastic.co/elasticsearch/) deployment.
```python
from langchain.memory.chat_message_histories import ElasticsearchChatMessageHistory
history = ElasticsearchChatMessageHistory(
es_url="https://my-elasticsearch-deployment-url:9200", index="chat-history-index", session_id="123"
)
history.add_ai_message("This is me, the AI")
history.add_user_message("This is me, the human")
```
**Dependencies**
- [elasticsearch client](https://elasticsearch-py.readthedocs.io/)
required
Co-authored-by: Bagatur <baskaryan@gmail.com>
Instead of accessing `langchain.debug`, `langchain.verbose`, or
`langchain.llm_cache`, please use the new getter/setter functions in
`langchain.globals`:
- `langchain.globals.set_debug()` and `langchain.globals.get_debug()`
- `langchain.globals.set_verbose()` and
`langchain.globals.get_verbose()`
- `langchain.globals.set_llm_cache()` and
`langchain.globals.get_llm_cache()`
Using the old globals directly will now raise a warning.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
**Description:**
Add a document loader for the RSpace Electronic Lab Notebook
(www.researchspace.com), so that scientific documents and research notes
can be easily pulled into Langchain pipelines.
**Issue**
This is an new contribution, rather than an issue fix.
**Dependencies:**
There are no new required dependencies.
In order to use the loader, clients will need to install rspace_client
SDK using `pip install rspace_client`
---------
Co-authored-by: richarda23 <richard.c.adams@infinityworks.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** Update Indexing API docs to specify vectorstores that
are compatible with the Indexing API. I add a unit test to remind
developers to update the documentation whenever they add or change a
vectorstore in a way that affects compatibility. For the unit test I
repurposed existing code from
[here](https://github.com/langchain-ai/langchain/blob/v0.0.311/libs/langchain/langchain/indexes/_api.py#L245-L257).
This is my first PR to an open source project. This is a trivially
simple PR whose main purpose is to make me more comfortable submitting
Langchain PRs. If this PR goes through I plan to submit PRs with more
substantive changes in the near future.
**Issue:** Resolves
[10482](https://github.com/langchain-ai/langchain/discussions/10482).
**Dependencies:** No new dependencies.
**Twitter handle:** None.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Allows MMR functionality only for the case where we have access to the
embedding function. Also allows for users to request for fields from
elasticsearch store. These are added to the document metadata.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Description: Introducing an ability to load a transcription document of
audio file using [Yandex
SpeechKit](https://cloud.yandex.com/en-ru/services/speechkit)
Issue: None
Dependencies: yandex-speechkit
Tag maintainer: @rlancemartin, @eyurtsev
**Description**
This PR implements the usage of the correct tokenizer in Bedrock LLMs,
if using anthropic models.
**Issue:** #11560
**Dependencies:** optional dependency on `anthropic` python library.
**Twitter handle:** jtolgyesi
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** Modify Anyscale integration to work with [Anyscale
Endpoint](https://docs.endpoints.anyscale.com/)
and it supports invoke, async invoke, stream and async invoke features
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Should delegate to parse_result, not to aparse, as parse_result is a
method that some output parsers override
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
**Description:** Avoid huggingfacepipeline to truncate the response if
user setup return_full_text as False within huggingface pipeline.
**Dependencies:** : None
**Tag maintainer:** Maybe @sam-h-bean ?
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
No relevant documents may be found for a given question. In some use
cases, we could directly respond with a fixed message instead of doing
an LLM call with an empty context. This PR exposes this as an option:
response_if_no_docs_found.
---------
Co-authored-by: Sudharsan Rangarajan <sudranga@nile-global.com>
Fix the documentation in
https://python.langchain.com/docs/modules/model_io/prompts/example_selectors/ngram_overlap.
It's currently declaring unrelated variables, for example, `examples`
local variable is declared twice and the first one is overwritten
immediately.
- **Issue:** N/A
- **Dependencies:** N/A
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** @dosuken123
Replace this entire comment with:
- **Description:** In this modified version of the function, if the
metadatas parameter is not None, the function includes the corresponding
metadata in the JSON object for each text. This allows the metadata to
be stored alongside the text's embedding in the vector store.
-
- **Issue:** #10924
- **Dependencies:** None
- **Tag maintainer:** @hwchase17
@agola11
- **Twitter handle:** @MelliJoaco
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Added demo for QA system with anonymization. It will be part of
LangChain's privacy webinar.
@hwchase17 @baskaryan @nfcampos
Twitter handle: @MaksOpp
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** fixed a bug in pal-chain when it reports Python
code validation errors. When node.func does not have any ids, the
original code tried to print node.func.id in raising ValueError.
- **Issue:** n/a,
- **Dependencies:** no dependencies,
- **Tag maintainer:** @hazzel-cn, @eyurtsev
- **Twitter handle:** @lazyswamp
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
I am merely making some minor adjustments to the function documentation.
I hope to provide a small assistance to LangChain.
- **Description:** Change the docs of JSONAgentOutputParser. It will be
`JSON` better,
- **Issue:** no,
- **Dependencies:** no,
- **Tag maintainer:** @hwchase17,
- **Twitter handle:** Not worth mentioning.
**Description:** This PR adds support for ChatOpenAI models in the
Infino callback handler. In particular, this PR implements
`on_chat_model_start` callback, so that ChatOpenAI models are supported.
With this change, Infino callback handler can be used to track latency,
errors, and prompt tokens for ChatOpenAI models too (in addition to the
support for OpenAI and other non-chat models it has today). The existing
example notebook is updated to show how to use this integration as well.
cc/ @naman-modi @savannahar68
**Issue:** https://github.com/langchain-ai/langchain/issues/11607
**Dependencies:** None
**Tag maintainer:** @hwchase17
**Twitter handle:** [@vkakade](https://twitter.com/vkakade)
* Should use non chunked messages for Invoke/Batch
* After this PR, stream output type is not represented, do we want to
use the union?
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Adds standard `type` field for all messages that will be
serialized/validated by pydantic.
* The presence of `type` makes it easier for developers consuming
schemas to write client code to serialize/deserialize.
* In LangServe `type` will be used for both validation and will appear
in the generated openapi specs
Preventing error caused by attempting to move the model that was already
loaded on the GPU using the Accelerate module to the same or another
device. It is not possible to load model with Accelerate/PEFT to CPU for
now
Addresses:
[#10985](https://github.com/langchain-ai/langchain/issues/10985)
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** This is an update to OctoAI LLM provider that adds
support for llama2 endpoints hosted on OctoAI and updates MPT-7b url
with the current one.
@baskaryan
Thanks!
---------
Co-authored-by: ML Wiz <bassemgeorgi@gmail.com>
**Description:** I noticed the metadata returned by the url_selenium
loader was missing several values included by the web_base loader. (The
former returned `{source: ...}`, the latter returned `{source: ...,
title: ..., description: ..., language: ...}`.) This change fixes it so
both loaders return all 4 key value pairs.
Files have been properly formatted and all tests are passing. Note,
however, that I am not much of a python expert, so that whole "Adding
the imports inside the code so that tests pass" thing seems weird to me.
Please LMK if I did anything wrong.
- **Description:** Assigning the custom_llm_provider to the default
params function so that it will be passed to the litellm
- **Issue:** Even though the custom_llm_provider argument is being
defined it's not being assigned anywhere in the code and hence its not
being passed to litellm, therefore any litellm call which uses the
custom_llm_provider as required parameter is being failed. This
parameter is mainly used by litellm when we are doing inference via
Custom API server.
https://docs.litellm.ai/docs/providers/custom_openai_proxy
- **Dependencies:** No dependencies are required
@krrishdholakia , @baskaryan
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
There is some invalid link in open ai platform
[docs](https://python.langchain.com/docs/integrations/platforms/openai).
So i fixed it to valid links.
- `/docs/integrations/chat_models/openai` ->
`/docs/integrations/chat/openai`
- `/docs/integrations/chat_models/azure_openai` ->
`/docs/integrations/chat/azure_chat_openai`
Thanks! ☺️
- **Description:** This PR introduces a new LLM and Retriever API to
https://arcee.ai for the python client
- **Issue:** implements the integrations as requested in #11578 ,
- **Dependencies:** no dependencies are required,
- **Tag maintainer:** @hwchase17
- **Twitter handle:** shwooobham
**✅ `make format`, `make lint` and `make test` runs locally.**
```shell
=========== 1245 passed, 277 skipped, 20 warnings in 16.26s ===========
./scripts/check_pydantic.sh .
./scripts/check_imports.sh
poetry run ruff .
[ "." = "" ] || poetry run black . --check
All done! ✨🍰✨
1818 files would be left unchanged.
[ "." = "" ] || poetry run mypy .
Success: no issues found in 1815 source files
[ "." = "" ] || poetry run black .
All done! ✨🍰✨
1818 files left unchanged.
[ "." = "" ] || poetry run ruff --select I --fix .
poetry run codespell --toml pyproject.toml
poetry run codespell --toml pyproject.toml -w
```
**Contributions**
1. Arcee (langchain/llms), ArceeRetriever (langchain/retrievers),
ArceeWrapper (langchain/utilities)
2. docs for Arcee (llms/arcee.py) and
ArceeRetriever(retrievers/arcee.py)
3.
cc: @jacobsolawetz @ben-epstein
---------
Co-authored-by: Shubham <shubham@sORo.local>
jinja2 templates are not sandboxed and are at risk for arbitrary code
execution. To mitigate this risk:
- We no longer support loading jinja2-formatted prompt template files.
- `PromptTemplate` with jinja2 may still be constructed manually, but
the class carries a security warning reminding the user to not pass
untrusted input into it.
Resolves#4394.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
**Description:** CohereRerank is missing `cohere_api_key` as a field and
since extras are forbidden, it is not possible to pass-in the key. The
only way is to use an env variable named `COHERE_API_KEY`.
For example, if trying to create a compressor like this:
```python
cohere_api_key = "......Cohere api key......"
compressor = CohereRerank(cohere_api_key=cohere_api_key)
```
you will get the following error:
```
File "/langchain/.venv/lib/python3.10/site-packages/pydantic/v1/main.py", line 341, in __init__
raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for CohereRerank
cohere_api_key
extra fields not permitted (type=value_error.extra)
```
- **Description:** Fixes minor typo for the
query_sql_database_tool_description in the db toolkit
- **Issue:** N/A
- **Dependencies:** N/A
- **Tag maintainer:** @nfcampos
- **Twitter handle:** N/A
LangChain relies on NumPy to compute cosine distances, which becomes a
bottleneck with the growing dimensionality and number of embeddings. To
avoid this bottleneck, in our libraries at
[Unum](https://github.com/unum-cloud), we have created a specialized
package - [SimSIMD](https://github.com/ashvardanian/simsimd), that knows
how to use newer hardware capabilities. Compared to SciPy and NumPy, it
reaches 3x-200x performance for various data types. Since publication,
several LangChain users have asked me if I can integrate it into
LangChain to accelerate their workflows, so here I am 🤗
## Benchmarking
To conduct benchmarks locally, run this in your Jupyter:
```py
import numpy as np
import scipy as sp
import simsimd as simd
import timeit as tt
def cosine_similarity_np(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
X_norm = np.linalg.norm(X, axis=1)
Y_norm = np.linalg.norm(Y, axis=1)
with np.errstate(divide="ignore", invalid="ignore"):
similarity = np.dot(X, Y.T) / np.outer(X_norm, Y_norm)
similarity[np.isnan(similarity) | np.isinf(similarity)] = 0.0
return similarity
def cosine_similarity_sp(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
return 1 - sp.spatial.distance.cdist(X, Y, metric='cosine')
def cosine_similarity_simd(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
return 1 - simd.cdist(X, Y, metric='cosine')
X = np.random.randn(1, 1536).astype(np.float32)
Y = np.random.randn(1, 1536).astype(np.float32)
repeat = 1000
print("NumPy: {:,.0f} ops/s, SciPy: {:,.0f} ops/s, SimSIMD: {:,.0f} ops/s".format(
repeat / tt.timeit(lambda: cosine_similarity_np(X, Y), number=repeat),
repeat / tt.timeit(lambda: cosine_similarity_sp(X, Y), number=repeat),
repeat / tt.timeit(lambda: cosine_similarity_simd(X, Y), number=repeat),
))
```
## Results
I ran this on an M2 Pro Macbook for various data types and different
number of rows in `X` and reformatted the results as a table for
readability:
| Data Type | NumPy | SciPy | SimSIMD |
| :--- | ---: | ---: | ---: |
| `f32, 1` | 59,114 ops/s | 80,330 ops/s | 475,351 ops/s |
| `f16, 1` | 32,880 ops/s | 82,420 ops/s | 650,177 ops/s |
| `i8, 1` | 47,916 ops/s | 115,084 ops/s | 866,958 ops/s |
| `f32, 10` | 40,135 ops/s | 24,305 ops/s | 185,373 ops/s |
| `f16, 10` | 7,041 ops/s | 17,596 ops/s | 192,058 ops/s |
| `f16, 10` | 21,989 ops/s | 25,064 ops/s | 619,131 ops/s |
| `f32, 100` | 3,536 ops/s | 3,094 ops/s | 24,206 ops/s |
| `f16, 100` | 900 ops/s | 2,014 ops/s | 23,364 ops/s |
| `i8, 100` | 5,510 ops/s | 3,214 ops/s | 143,922 ops/s |
It's important to note that SimSIMD will underperform if both matrices
are huge.
That, however, seems to be an uncommon usage pattern for LangChain
users.
You can find a much more detailed performance report for different
hardware models here:
- [Apple M2
Pro](https://ashvardanian.com/posts/simsimd-faster-scipy/#appendix-1-performance-on-apple-m2-pro).
- [4th Gen Intel Xeon
Platinum](https://ashvardanian.com/posts/simsimd-faster-scipy/#appendix-2-performance-on-4th-gen-intel-xeon-platinum-8480).
- [AWS Graviton
3](https://ashvardanian.com/posts/simsimd-faster-scipy/#appendix-3-performance-on-aws-graviton-3).
## Additional Notes
1. Previous version used `X = np.array(X)`, to repackage lists of lists.
It's an anti-pattern, as it will use double-precision floating-point
numbers, which are slow on both CPUs and GPUs. I have replaced it with
`X = np.array(X, dtype=np.float32)`, but a more selective approach
should be discussed.
2. In numerical computations, it's recommended to explicitly define
tolerance levels, which were previously avoided in
`np.allclose(expected, actual)` calls. For now, I've set absolute
tolerance to distance computation errors as 0.01: `np.allclose(expected,
actual, atol=1e-2)`.
---
- **Dependencies:** adds `simsimd` dependency
- **Tag maintainer:** @hwchase17
- **Twitter handle:** @ashvardanian
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
#### Description
This PR adds the option to specify additional metadata columns in the
CSVLoader beyond just `Source`.
The current CSV loader includes all columns in `page_content` and if we
want to have columns specified for `page_content` and `metadata` we have
to do something like the below.:
```
csv = pd.read_csv(
"path_to_csv"
).to_dict("records")
documents = [
Document(
page_content=doc["content"],
metadata={
"last_modified_by": doc["last_modified_by"],
"point_of_contact": doc["point_of_contact"],
}
) for doc in csv
]
```
#### Usage
Example Usage:
```
csv_test = CSVLoader(
file_path="path_to_csv",
metadata_columns=["last_modified_by", "point_of_contact"]
)
```
Example CSV:
```
content, last_modified_by, point_of_contact
"hello world", "Person A", "Person B"
```
Example Result:
```
Document {
page_content: "hello world"
metadata: {
row: '0',
source: 'path_to_csv',
last_modified_by: 'Person A',
point_of_contact: 'Person B',
}
```
---------
Co-authored-by: Ben Chello <bchello@dropbox.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Fixes the comments in the ConvoOutputParser. Because
the \\\\ is escaping a single \\, they render something like:
`"action_input": string \ The input to the action` in the prompt.
Changing this to \\\\\\\\ lets it escape two slashes so that it renders
a proper comment: `"action_input": string \\ The input to the action`
- **Issue:** N/A
- **Dependencies:**
- **Tag maintainer:** @hwchase17
- **Twitter handle:**
**Description**:
- Added Momento Vector Index (MVI) as a vector store provider. This
includes an implementation with docstrings, integration tests, a
notebook, and documentation on the docs pages.
- Updated the Momento dependency in pyproject.toml and the lock file to
enable access to MVI.
- Refactored the Momento cache and chat history session store to prefer
using "MOMENTO_API_KEY" over "MOMENTO_AUTH_TOKEN" for consistency with
MVI. This change is backwards compatible with the previous "auth_token"
variable usage. Updated the code and tests accordingly.
**Dependencies**:
- Updated Momento dependency in pyproject.toml.
**Testing**:
- Run the integration tests with a Momento API key. Get one at the
[Momento Console](https://console.gomomento.com) for free. MVI is
available in AWS us-west-2 with a superuser key.
- `MOMENTO_API_KEY=<your key> poetry run pytest
tests/integration_tests/vectorstores/test_momento_vector_index.py`
**Tag maintainer:**
@eyurtsev
**Twitter handle**:
Please mention @momentohq for this addition to langchain. With the
integration of Momento Vector Index, Momento caching, and session store,
Momento provides serverless support for the core langchain data needs.
Also mention @mlonml for the integration.
**Description**
This PR adds an additional Example to the Redis integration
documentation. [The
example](https://learn.microsoft.com/azure/azure-cache-for-redis/cache-tutorial-vector-similarity)
is a step-by-step walkthrough of using Azure Cache for Redis and Azure
OpenAI for vector similarity search, using LangChain extensively
throughout.
**Issue**
Nothing specific, just adding an additional example.
**Dependencies**
None.
**Tag Maintainer**
Tagging @hwchase17 :)
Wraps every callback handler method in error handlers to avoid breaking
users' programs when an error occurs inside the handler.
Thanks @valdo99 for the suggestion 🙂
[The `duckduckgo-search` v3.9.2 was removed from
PyPi](https://pypi.org/project/duckduckgo-search/#history). That breaks
the build.
- **Description:** refreshes the Poetry dependency to v3.9.3
- **Tag maintainer:** @baskaryan
- **Twitter handle:** @ashvardanian
updating query constructor and self query retriever to
- make it easier to pass in examples
- validate attributes used in query
- remove invalid parts of query
- make it easier to get + edit prompt
- make query constructor a runnable
- make self query retriever use as runnable
- keep alias for RunnableMap
- update docs to use RunnableParallel and RunnablePassthrough.assign
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** Add support for a SQLRecordManager in async
environments. It includes the creation of `RecorManagerAsync` abstract
class.
- **Issue:** None
- **Dependencies:** Optional `aiosqlite`.
- **Tag maintainer:** @nfcampos
- **Twitter handle:** @jvelezmagic
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Use `.copy()` to fix the bug that the first `llm_inputs` element is
overwritten by the second `llm_inputs` element in `intermediate_steps`.
***Problem description:***
In [line 127](
c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L127C17-L127C17)),
the `llm_inputs` of the sql generation step is appended as the first
element of `intermediate_steps`:
```
intermediate_steps.append(llm_inputs) # input: sql generation
```
However, `llm_inputs` is a mutable dict, it is updated in [line
179](https://github.com/langchain-ai/langchain/blob/master/libs/experimental/langchain_experimental/sql/base.py#L179)
for the final answer step:
```
llm_inputs["input"] = input_text
```
Then, the updated `llm_inputs` is appended as another element of
`intermediate_steps` in [line
180](c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L180)):
```
intermediate_steps.append(llm_inputs) # input: final answer
```
As a result, the final `intermediate_steps` returned in [line
189](c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L189C43-L189C43))
actually contains two same `llm_inputs` elements, i.e., the `llm_inputs`
for the sql generation step overwritten by the one for final answer step
by mistake. Users are not able to get the actual `llm_inputs` for the
sql generation step from `intermediate_steps`
Simply calling `.copy()` when appending `llm_inputs` to
`intermediate_steps` can solve this problem.
### Description
This pull request involves modifications to the extraction method for
abstracts/summaries within the PubMed utility. A condition has been
added to verify the presence of unlabeled abstracts. Now an abstract
will be extracted even if it does not have a subtitle. In addition, the
extraction of the abstract was extended to books.
### Issue
The PubMed utility occasionally returns an empty result when extracting
abstracts from articles, despite the presence of an abstract for the
paper on PubMed. This issue arises due to the varying structure of
articles; some articles follow a "subtitle/label: text" format, while
others do not include subtitles in their abstracts. An example of the
latter case can be found at:
[https://pubmed.ncbi.nlm.nih.gov/37666905/](url)
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
### Description
SelfQueryRetriever is missing async support, so I am adding it.
I also removed deprecated predict_and_parse method usage here, and added
some tests.
### Issue
N/A
### Tag maintainer
Not yet
### Twitter handle
N/A
**Description**
It is for #10423 that it will be a useful feature if we can extract
images from pdf and recognize text on them. I have implemented it with
`PyPDFLoader`, `PyPDFium2Loader`, `PyPDFDirectoryLoader`,
`PyMuPDFLoader`, `PDFMinerLoader`, and `PDFPlumberLoader`.
[RapidOCR](https://github.com/RapidAI/RapidOCR.git) is used to recognize
text on extracted images. It is time-consuming for ocr so a boolen
parameter `extract_images` is set to control whether to extract and
recognize. I have tested the time usage for each parser on my own laptop
thinkbook 14+ with AMD R7-6800H by unit test and the result is:
| extract_images | PyPDFParser | PDFMinerParser | PyMuPDFParser |
PyPDFium2Parser | PDFPlumberParser |
| ------------- | ------------- | ------------- | ------------- |
------------- | ------------- |
| False | 0.27s | 0.39s | 0.06s | 0.08s | 1.01s |
| True | 17.01s | 20.67s | 20.32s | 19,75s | 20.55s |
**Issue**
#10423
**Dependencies**
rapidocr_onnxruntime in
[RapidOCR](https://github.com/RapidAI/RapidOCR/tree/main)
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Description: The previous version of the MarkdownHeaderTextSplitter
did not take into account the possibility of '#' appearing within code
blocks, which caused segmentation anomalies in these situations. This PR
has fixed this issue.
- Issue:
- Dependencies: No
- Tag maintainer:
- Twitter handle:
cc @baskaryan @eyurtsev @rlancemartin
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Description: This PR adds a new chain `rl_chain.PickBest` for learned
prompt variable injection, detailed description and usage can be found
in the example notebook added. It essentially adds a
[VowpalWabbit](https://github.com/VowpalWabbit/vowpal_wabbit) layer
before the llm call in order to learn or personalize prompt variable
selections.
Most of the code is to make the API simple and provide lots of defaults
and data wrangling that is needed to use Vowpal Wabbit, so that the user
of the chain doesn't have to worry about it.
- Dependencies:
[vowpal-wabbit-next](https://pypi.org/project/vowpal-wabbit-next/),
- sentence-transformers (already a dep)
- numpy (already a dep)
- tagging @ataymano who contributed to this chain
- Tag maintainer: @baskaryan
- Twitter handle: @olgavrou
Added example notebook and unit tests
Replace this entire comment with:
- **Description:** minor update to constructor to allow for
specification of "source"
- **Tag maintainer:** @baskaryan
- **Twitter handle:** @ofermend
# Description
Attempts to fix RedisCache for ChatGenerations using `loads` and `dumps`
used in SQLAlchemy cache by @hwchase17 . this is better than pickle
dump, because this won't execute any arbitrary code during
de-serialisation.
# Issues
#7722 & #8666
# Dependencies
None, but removes the warning introduced in #8041 by @baskaryan
Handle: @jaikanthjay46
- Description: Updated output parser for mrkl to remove any
hallucination actions after the final answer; this was encountered when
using Anthropic claude v2 for planning; reopening PR with updated unit
tests
- Issue: #10278
- Dependencies: N/A
- Twitter handle: @johnreynolds
Description: this PR changes the `ArcGISLoader` to set
`return_all_records` to `False` when `result_record_count` is provided
as a keyword argument. Previously, `return_all_records` was `True` by
default and this made the API ignore `result_record_count`.
Issue: `ArcGISLoader` would ignore `result_record_count` unless user
also passed `return_all_records=False`.
- **Description:** This commit corrects a minor typo in the
documentation. It changes "frum" to "from" in the sentence: "The results
from search are passed back to the LLM for synthesis into an answer" in
the file `docs/extras/use_cases/more/agents/agents.ipynb`. This typo fix
enhances the clarity and accuracy of the documentation.
- **Tag maintainer:** @baskaryan
- **Description:** Fix the `PyMuPDFLoader` to accept `loader_kwargs`
from the document loader's `loader_kwargs` option. This provides more
flexibility in formatting the output from documents.
- **Issue:** The `loader_kwargs` is not passed into the `load` method
from the document loader, which limits configuration options.
- **Dependencies:** None
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
I have restructured the code to ensure uniform handling of ImportError.
In place of previously used ValueError, I've adopted the standard
practice of raising ImportError with explanatory messages. This
modification enhances code readability and clarifies that any problems
stem from module importation.
- **Description:** Just docs related to csharp code splitter
- **Issue:** It's related to a request made by @baskaryan in a comment
on my previous PR #10350
- **Dependencies:** None
- **Twitter handle:** @ather19
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
### Description
Add instance anonymization - if `John Doe` will appear twice in the
text, it will be treated as the same entity.
The difference between `PresidioAnonymizer` and
`PresidioReversibleAnonymizer` is that only the second one has a
built-in memory, so it will remember anonymization mapping for multiple
texts:
```
>>> anonymizer = PresidioAnonymizer()
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Brett Russell. Hi Brett Russell!'
```
```
>>> anonymizer = PresidioReversibleAnonymizer()
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
```
### Twitter handle
@deepsense_ai / @MaksOpp
### Tag maintainer
@baskaryan @hwchase17 @hinthornw
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
PyPDF does not chunk at the character level to my understanding.
Description: PyPDF does not chunk at the character level, but instead
breaks up content by page. Fixup comment
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Description: There are cases when the output from the LLM comes fine
(i.e. function_call["arguments"] is a valid JSON object), but it does
not contain the key "actions". So I split the validation in 2 steps:
loading arguments as JSON and then checking for "actions" in it.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Description: Google Cloud Enterprise Search was renamed to Vertex AI
Search
-
https://cloud.google.com/blog/products/ai-machine-learning/vertex-ai-search-and-conversation-is-now-generally-available
- This PR updates the documentation and Retriever class to use the new
terminology.
- Changed retriever class from `GoogleCloudEnterpriseSearchRetriever` to
`GoogleVertexAISearchRetriever`
- Updated documentation to specify that `extractive_segments` requires
the new [Enterprise
edition](https://cloud.google.com/generative-ai-app-builder/docs/about-advanced-features#enterprise-features)
to be enabled.
- Fixed spelling errors in documentation.
- Change parameter for Retriever from `search_engine_id` to
`data_store_id`
- When this retriever was originally implemented, there was no
distinction between a data store and search engine, but now these have
been split.
- Fixed an issue blocking some users where the api_endpoint can't be set
### Description
When using Weaviate Self-Retrievers, certain common filter comparators
generated by user queries were unimplemented, resulting in errors. This
PR implements some of them. All linting and format commands have been
run and tests passed.
### Issue
#10474
### Dependencies
timestamp module
---------
Co-authored-by: Patrick Randell <prandell@deloitte.com.au>
**Description:** Previously if the access to Azure Cognitive Search was
not done via an API key, the default credential was called which doesn't
allow to use an interactive login. I simply added the option to use
"INTERACTIVE" as a key name, and this will launch a login window upon
initialization of the AzureSearch object.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
I was hoping this would pick up numpy 1.26, which is required to support
the new Python 3.12 release, but it didn't. It seems that some
transitive dependency requirement on numpy is preventing that, and the
highest we can currently go is 1.24.x.
But to find this out required a 15min `poetry lock`, so I figured we
might as well upgrade the dependencies we can and hopefully make the
next dependency upgrade a bit smaller.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
There are several pages in `integrations/providers/more` that belongs to
Google and AWS `integrations/providers`.
- moved content of these pages into the Google and AWS
`integrations/providers` pages
- removed these individual pages
Consolidating to a single README for now, will be easier to maintain we
can differentiate between poetry and pip later. Does not seem critical.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
First version of CLI command to create a new langchain project template
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
## Description
Currently SQLAlchemy >=1.4.0 is a hard requirement. We are unable to run
`from langchain.vectorstores import FAISS` with SQLAlchemy <1.4.0 due to
top-level imports, even if we aren't even using parts of the library
that use SQLAlchemy. See Testing section for repro. Let's make it so
that langchain is still compatible with SQLAlchemy <1.4.0, especially if
we aren't using parts of langchain that require it.
The main conflict is that SQLAlchemy removed `declarative_base` from
`sqlalchemy.ext.declarative` in 1.4.0 and moved it to `sqlalchemy.orm`.
We can fix this by try-catching the import. This is the same fix as
applied in https://github.com/langchain-ai/langchain/pull/883.
(I see that there seems to be some refactoring going on about isolating
dependencies, e.g.
c87e9fb2ce,
so if this issue will be eventually fixed by isolating imports in
langchain.vectorstores that also works).
## Issue
I can't find a matching issue.
## Dependencies
No additional dependencies
## Maintainer
@hwchase17 since you reviewed
https://github.com/langchain-ai/langchain/pull/883
## Testing
I didn't add a test, but I manually tested this.
1. Current failure:
```
langchain==0.0.305
sqlalchemy==1.3.24
```
``` python
python -i
>>> from langchain.vectorstores import FAISS
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/pay/src/zoolander/vendor3/lib/python3.8/site-packages/langchain/vectorstores/__init__.py", line 58, in <module>
from langchain.vectorstores.pgembedding import PGEmbedding
File "/pay/src/zoolander/vendor3/lib/python3.8/site-packages/langchain/vectorstores/pgembedding.py", line 10, in <module>
from sqlalchemy.orm import Session, declarative_base, relationship
ImportError: cannot import name 'declarative_base' from 'sqlalchemy.orm' (/pay/src/zoolander/vendor3/lib/python3.8/site-packages/sqlalchemy/orm/__init__.py)
```
2. This fix:
```
langchain==<this PR>
sqlalchemy==1.3.24
```
``` python
python -i
>>> from langchain.vectorstores import FAISS
<succeeds>
```
- Make logs a dictionary keyed by run name (and counter for repeats)
- Ensure no output shows up in lc_serializable format
- Fix up repr for RunLog and RunLogPatch
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- default MessagesPlaceholder one to list of messages
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Removes human prompt prefix before system message for anthropic models
Bedrock anthropic api enforces that Human and Assistant messages must be
interleaved (cannot have same type twice in a row). We currently treat
System Messages as human messages when converting messages -> string
prompt. Our validation when using Bedrock/BedrockChat raises an error
when this happens. For ChatAnthropic we don't validate this so no error
is raised, but perhaps the behavior is still suboptimal
- **Description:** add a paragraph to the GoogleDriveLoader doc on how
to bypass errors on authentication.
For some reason, specifying credential path via `credentials_path`
constructor parameter when creating `GoogleDriveLoader` makes it so that
the oAuth screen is never showing up when first using GoogleDriveLoader.
Instead, the `RefreshError: ('invalid_grant: Bad Request', {'error':
'invalid_grant', 'error_description': 'Bad Request'})` error happens.
Setting it via `os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = ...`
solves the problem. Also, `token_path` constructor parameter is
mandatory, otherwise another error happens when trying to `load()` for
the first time.
These errors are tricky and time-consuming to figure out, so I believe
it's good to mention them in the docs.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
**Description:**
Added support for Cohere command model via Bedrock.
With this change it is now possible to use the `cohere.command-text-v14`
model via Bedrock API.
About Streaming: Cohere model outputs 2 additional chunks at the end of
the text being generated via streaming: a chunk containing the text
`<EOS_TOKEN>`, and a chunk indicating the end of the stream. In this
implementation I chose to ignore both chunks. An alternative solution
could be to replace `<EOS_TOKEN>` with `\n`
Tests: manually tested that the new model work with both
`llm.generate()` and `llm.stream()`.
Tested with `temperature`, `p` and `stop` parameters.
**Issue:** #11181
**Dependencies:** No new dependencies
**Tag maintainer:** @baskaryan
**Twitter handle:** mangelino
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Description: Similar in concept to the `MarkdownHeaderTextSplitter`, the
`HTMLHeaderTextSplitter` is a "structure-aware" chunker that splits text
at the element level and adds metadata for each header "relevant" to any
given chunk. It can return chunks element by element or combine elements
with the same metadata, with the objectives of (a) keeping related text
grouped (more or less) semantically and (b) preserving context-rich
information encoded in document structures. It can be used with other
text splitters as part of a chunking pipeline.
Dependency: lxml python package
Maintainer: @hwchase17
Twitter handle: @MartinZirulnik
---------
Co-authored-by: PresidioVantage <github@presidiovantage.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** Doc corrections and resolve notebook rendering issue
on GH
- **Issue:** N/A
- **Dependencies:** N/A
- **Tag maintainer:** @baskaryan
- **Twitter handle:** `@isaacchung1217`
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
**Description:**
Examples in the "Select by similarity" section were not really
highlighting capabilities of similarity search.
E.g. "# Input is a measurement, so should select the tall/short example"
was still outputting the "mood" example.
I tweaked the inputs a bit and fixed the examples (checking that those
are indeed what the search outputs).
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
I've refactored the code to ensure that ImportError is consistently
handled. Instead of using ValueError as before, I've now followed the
standard practice of raising ImportError along with clear and
informative error messages. This change enhances the code's clarity and
explicitly signifies that any problems are associated with module
imports.
- **Description:** Fix typo about `RetrievalQAWithSourceChain` ->
`RetrievalQAWithSourcesChain`
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** Adds Kotlin language to `TextSplitter`
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
For external libraries that depend on `type_to_cls_dict`, adds a
workaround to continue using the old format.
Recommend people use `get_type_to_cls_dict()` instead and only resolve
the imports when they're used.
- **Description:** use term keyword according to the official python doc
glossary, see https://docs.python.org/3/glossary.html
- **Issue:** not applicable
- **Dependencies:** not applicable
- **Tag maintainer:** @hwchase17
- **Twitter handle:** vreyespue
The previous API of the `_execute()` function had a few rough edges that
this PR addresses:
- The `fetch` argument was type-hinted as being able to take any string,
but any string other than `"all"` or `"one"` would `raise ValueError`.
The new type hints explicitly declare that only those values are
supported.
- The return type was type-hinted as `Sequence` but using `fetch =
"one"` would actually return a single result item. This was incorrectly
suppressed using `# type: ignore`. We now always return a list.
- Using `fetch = "one"` would return a single item if data was found, or
an empty *list* if no data was found. This was confusing, and we now
always return a list to simplify.
- The return type was `Sequence[Any]` which was a bit difficult to use
since it wasn't clear what one could do with the returned rows. I'm
making the new type `Dict[str, Any]` that corresponds to the column
names and their values in the query.
I've updated the use of this method elsewhere in the file to match the
new behavior.
continuation of PR #8550
@hwchase17 please see and merge. And also close the PR #8550.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
therefor -> therefore
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Instead of:
```
client = Client()
with collect_runs() as cb:
chain.invoke()
run = cb.traced_runs[0]
client.get_run_url(run)
```
it's
```
with tracing_v2_enabled() as cb:
chain.invoke()
cb.get_run_url()
```
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- Description: a description of the change,
- Issue: the issue # it fixes (if applicable),
- Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. These live is docs/extras
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17, @rlancemartin.
-->
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Similarly to Vertex classes, PaLM classes weren't marked as
serialisable. Should be working fine with LangSmith.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
This PR uses 2 dedicated LangChain warnings types for deprecations
(mirroring python's built in deprecation and pending deprecation
warnings).
These deprecation types are unslienced during initialization in
langchain achieving the same default behavior that we have with our
current warnings approach. However, because these warnings have a
dedicated type, users will be able to silence them selectively (I think
this is strictly better than our current handling of warnings).
The PR adds a deprecation warning to llm symbolic math.
---------
Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>
- Also move RunnableBranch to its own file
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
### Description
When I was reading the document, I found that some examples had extra
spaces and violated "Unexpected spaces around keyword / parameter equals
(E251)" in pep8. I removed these extra spaces.
### Tag maintainer
@eyurtsev
### Twitter handle
[billvsme](https://twitter.com/billvsme)
### Description
renamed several repository links from `hwchase17` to `langchain-ai`.
### Why
I discovered that the README file in the devcontainer contains an old
repository name, so I took the opportunity to rename the old repository
name in all files within the repository, excluding those that do not
require changes.
### Dependencies
none
### Tag maintainer
@baskaryan
### Twitter handle
[kzk_maeda](https://twitter.com/kzk_maeda)
updated `YouTube` and `tutorial` videos with new links.
Removed couple of duplicates.
Reordered several links by view counters
Some formatting: emphasized the names of products
- updated titles and descriptions of the `integrations/memory` notebooks
into consistent and laconic format;
- removed
`docs/extras/integrations/memory/motorhead_memory_managed.ipynb` file as
a duplicate of the
`docs/extras/integrations/memory/motorhead_memory.ipynb`;
- added `integrations/providers` Integration Cards for `dynamodb`,
`motorhead`.
- updated `integrations/providers/redis.mdx` with links
- renamed several notebooks; updated `vercel.json` to reroute new names.
**Description:** Adds streaming and many more sampling parameters to the
DeepSparse interface
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Fix a code injection vuln by adding one more keyword
into the filtering list
- **Issue:** N/A
- **Dependencies:** N/A
- **Tag maintainer:**
- **Twitter handle:**
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Passes through dict input and assigns additional keys
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<img width="1728" alt="Screenshot 2023-09-28 at 20 15 01"
src="https://github.com/langchain-ai/langchain/assets/56902/ed0644c3-6db7-41b9-9543-e34fce46d3e5">
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Enviroment -> Environment
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Suppress warnings in interactive environments that can arise from users
relying on tab completion (without even using deprecated modules).
jupyter seems to filter warnings by default (at least for me), but
ipython surfaces them all
- **Description:** A Document Loader for MongoDB
- **Issue:** n/a
- **Dependencies:** Motor, the async driver for MongoDB
- **Tag maintainer:** n/a
- **Twitter handle:** pigpenblue
Note that an initial mongodb document loader was created 4 months ago,
but the [PR ](https://github.com/langchain-ai/langchain/pull/4285)was
never pulled in. @leo-gan had commented on that PR, but given it is
extremely far behind the master branch and a ton has changed in
Langchain since then (including repo name and structure), I rewrote the
branch and issued a new PR with the expectation that the old one can be
closed.
Please reference that old PR for comments/context, but it can be closed
in favor of this one. Thanks!
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
```
ChatPromptTemplate(messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template='You are a nice assistant.')), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['question'], template='{question}'))])
| RunnableLambda(lambda x: x)
| {
chat: FakeListChatModel(responses=["i'm a chatbot"]),
llm: FakeListLLM(responses=["i'm a textbot"])
}
```
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:**
be able to use langchain with other version than tiktoken 0.3.3 i.e
0.5.1
- **Issue:**
cannot installed the conda-forge version since it applied all optional
dependency:
https://github.com/conda-forge/langchain-feedstock/pull/85
replace "^0.3.2" by "">=0.3.2,<0.6.0" and "^3.9" by python=">=3.9"
Tested with python 3.10, langchain=0.0.288 and tiktoken==0.5.0
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
## Description
As of now, when instantiating and during inference, `LlamaCppEmbeddings`
outputs (a lot of) verbose when controlled from Langchain binding - it
is a bit annoying when computing the embeddings of long documents, for
instance.
This PR adds `verbose` for `LlamaCppEmbeddings` objects to be able
**not** to print the verbose of the model to `stderr`. It is natively
supported by `llama-cpp-python` and directly passed to the library – the
PR is hence very small.
The value of `verbose` is `True` by default, following the way it is
defined in [`LlamaCpp` (`llamacpp.py`
#L136-L137)](c87e9fb2ce/libs/langchain/langchain/llms/llamacpp.py (L136-L137))
## Issue
_No issue linked_
## Dependencies
_No additional dependency needed_
## To see it in action
```python
from langchain.embeddings import LlamaCppEmbeddings
MODEL_PATH = "<path_to_gguf_file>"
if __name__ == "__main__":
llm_embeddings = LlamaCppEmbeddings(
model_path=MODEL_PATH,
n_gpu_layers=1,
n_batch=512,
n_ctx=2048,
f16_kv=True,
verbose=False,
)
```
Co-authored-by: Bagatur <baskaryan@gmail.com>
Based on the customers' requests for native langchain integration,
SearchApi is ready to invest in AI and LLM space, especially in
open-source development.
- This is our initial PR and later we want to improve it based on
customers' and langchain users' feedback. Most likely changes will
affect how the final results string is being built.
- We are creating similar native integration in Python and JavaScript.
- The next plan is to integrate into Java, Ruby, Go, and others.
- Feel free to assign @SebastjanPrachovskij as a main reviewer for any
SearchApi-related searches. We will be glad to help and support
langchain development.
- **Description:**
- Make running integration test for opensearch easy
- Provide a way to use different text for embedding: refer to #11002 for
more of the use case and design decision.
- **Issue:** N/A
- **Dependencies:** None other than the existing ones.
Both black and mypy expect a list of files or directories as input.
As-is the Makefile computes a list files changed relative to the last
commit; these are passed to black and mypy in the `format_diff` and
`lint_diff` targets. This is done by way of the Makefile variable
`PYTHON_FILES`. This is to save time by skipping running mypy and black
over the whole source tree.
When no changes have been made, this variable is empty, so the call to
black (and mypy) lacks input files. The call exits with error causing
the Makefile target to error out with:
```bash
$ make format_diff
poetry run black
Usage: black [OPTIONS] SRC ...
One of 'SRC' or 'code' is required.
make: *** [format_diff] Error 1
```
This is unexpected and undesirable, as the naive caller (that's me! 😄 )
will think something else is wrong. This commit smooths over this by
short circuiting when `PYTHON_FILES` is empty.
- **Description:** The types of 'destination_chains' and 'default_chain'
in 'MultiPromptChain' were changed from 'LLMChain' to 'Chain'. and
removed variables declared overlapping with the parent class
- **Issue:** When a class that inherits only Chain and not LLMChain,
such as 'SequentialChain' or 'RetrievalQA', is entered in
'destination_chains' and 'default_chain', a pydantic validation error is
raised.
- - codes
```
retrieval_chain = ConversationalRetrievalChain(
retriever=doc_retriever,
combine_docs_chain=combine_docs_chain,
question_generator=question_gen_chain,
)
destination_chains = {
'retrieval': retrieval_chain,
}
main_chain = MultiPromptChain(
router_chain=router_chain,
destination_chains=destination_chains,
default_chain=default_chain,
verbose=True,
)
```
✅ `make format`, `make lint` and `make test`
## Description
Expanded the upper bound for `networkx` dependency to allow installation
of latest stable version. Tested the included sample notebook with
version 3.1, and all steps ran successfully.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Adds support for the `$vectorSearch` operator for
MongoDBAtlasVectorSearch, which was announced at .Local London
(September 26th, 2023). This change maintains breaks compatibility
support for the existing `$search` operator used by the original
integration (https://github.com/langchain-ai/langchain/pull/5338) due to
incompatibilities in the Atlas search implementations.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
We noticed that as we have been moving developers to the new
`ElasticsearchStore` implementation, we want to keep the
ElasticVectorSearch class still available as developers transition
slowly to the new store.
To speed up this process, I updated the blurb giving them a better
recommendation of why they should use ElasticsearchStore.
Description: Add "source" metadata to OutlookMessageLoader
This pull request adds the "source" metadata to the OutlookMessageLoader
class in the load method. The "source" metadata is required when
indexing with RecordManager in order to sync the index documents with a
source.
Issue: None
Dependencies: None
Twitter handle: @ATelders
Co-authored-by: Arthur Telders <arthur.telders@roquette.com>
- **Description:** Bedrock updated boto service name to
"bedrock-runtime" for the InvokeModel and InvokeModelWithResponseStream
APIs. This update also includes new model identifiers for Titan text,
embedding and Anthropic.
Co-authored-by: Mani Kumar Adari <maniadar@amazon.com>
The key of stopping strings used in text-generation-webui api is
[`stopping_strings`](https://github.com/oobabooga/text-generation-webui/blob/main/api-examples/api-example.py#L51),
not `stop`.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Fixed Typo Error in Update get_started.mdx file by addressing a minor
typographical error.
This improvement enhances the readability and correctness of the
notebook, making it easier for users to understand and follow the
demonstration. The commit aims to maintain the quality and accuracy of
the content within the repository.
please review the change at your convenience.
@baskaryan , @hwaking
- **Description:** Changed data type from `text` to `json` in xata for
improved performance. Also corrected the `additionalKwargs` key in the
`messages()` function to `additional_kwargs` to adhere to `BaseMessage`
requirements.
- **Issue:** The Chathisroty.messages() will return {} of
`additional_kwargs`, as the name is wrong for `additionalKwargs` .
- **Dependencies:** N/A
- **Tag maintainer:** N/A
- **Twitter handle:** N/A
My PR is passing linting and testing before submitting.
This adds `input_schema` and `output_schema` properties to all
runnables, which are Pydantic models for the input and output types
respectively. These are inferred from the structure of the Runnable as
much as possible, the only manual typing needed is
- optionally add type hints to lambdas (which get translated to
input/output schemas)
- optionally add type hint to RunnablePassthrough
These schemas can then be used to create JSON Schema descriptions of
input and output types, see the tests
- [x] Ensure no InputType and OutputType in our classes use abstract
base classes (replace with union of subclasses)
- [x] Implement in BaseChain and LLMChain
- [x] Implement in RunnableBranch
- [x] Implement in RunnableBinding, RunnableMap, RunnablePassthrough,
RunnableEach, RunnableRouter
- [x] Implement in LLM, Prompt, Chat Model, Output Parser, Retriever
- [x] Implement in RunnableLambda from function signature
- [x] Implement in Tool
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Adds LangServe package
* Integrate Runnables with Fast API creating Server and a RemoteRunnable
client
* Support multiple runnables for a given server
* Support sync/async/batch/abatch/stream/astream/astream_log on the
client side (using async implementations on server)
* Adds validation using annotations (relying on pydantic under the hood)
-- this still has some rough edges -- e.g., open api docs do NOT
generate correctly at the moment
* Uses pydantic v1 namespace
Known issues: type translation code doesn't handle a lot of types (e.g.,
TypedDicts)
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
The current behaviour just calls the handler without awaiting the
coroutine, which results in exceptions/warnings, and obviously doesn't
actually execute whatever the callback handler does
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** Prompt wrapping requirements have been implemented on
the service side of AWS Bedrock for the Anthropic Claude models to
provide parity between Anthropic's offering and Bedrock's offering. This
overnight change broke most existing implementations of Claude, Bedrock
and Langchain. This PR just steals the the Anthropic LLM implementation
to enforce alias/role wrapping and implements it in the existing
mechanism for building the request body. This has also been tested to
fix the chat_model implementation as well. Happy to answer any further
questions or make changes where necessary to get things patched and up
to PyPi ASAP, TY.
- **Issue:** No issue opened at the moment, though will update when
these roll in.
- **Dependencies:** None
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
### Description:
NotionDB supports a number of common property types. I have found three
common types that are not included in notiondb loader. When programs
loaded them with notiondb, which will cause some metadata information
not to be passed to langchain. Therefore, I added three common types:
- date
- created_time
- last_edit_time.
### Issue:
no
### Dependencies:
No dependencies added :)
### Tag maintainer:
@rlancemartin, @eyurtsev
### Twitter handle:
@BJTUTC
Reverts langchain-ai/langchain#8610
this is actually an oversight - this merges all dfs into one df. we DO
NOT want to do this - the idea is we work and manipulate multiple dfs
This removes the use of the intermediate df list and directly
concatenates the dataframes if path is a list of strings. The pd.concat
function combines the dataframes efficiently, making it faster and more
memory-efficient compared to appending dataframes to a list.
<!-- Thank you for contributing to LangChain!
Replace this comment with:
- Description: a description of the change,
- Issue: the issue # it fixes (if applicable),
- Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!
Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use.
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: @baskaryan
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
- Models / Prompts: @hwchase17, @baskaryan
- Memory: @hwchase17
- Agents / Tools / Toolkits: @hinthornw
- Tracing / Callbacks: @agola11
- Async: @agola11
If no one reviews your PR within a few days, feel free to @-mention the
same people again.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- Description: this PR adds the support for arxiv identifier of the
ArxivAPIWrapper. I modified the `run()` and `load()` functions in
`arxiv.py`, using regex to recognize if the query is in the form of
arxiv identifier (see
[https://info.arxiv.org/help/find/index.html](https://info.arxiv.org/help/find/index.html)).
If so, it will directly search the paper corresponding to the arxiv
identifier. I also modified and added tests in `test_arxiv.py`.
- Issue: #9047
- Dependencies: N/A
- Tag maintainer: N/A
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
The new Fireworks and FireworksChat implementations are awesome! Added
in this PR https://github.com/langchain-ai/langchain/pull/11117 thank
you @ZixinYang
However, I think stop words were not plumbed correctly. I've made some
simple changes to do that, and also updated the notebook to be a bit
clearer with what's needed to use both new models.
---------
Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
The intermediate steps example in docs has an example on how to retrieve
and display the intermediate steps.
But the intermediate steps object is of type AgentAction which cannot be
passed to json.dumps (it raises an error).
I replaced it with Langchain's dumps function (from langchain.load.dump
import dumps) which is the preferred way to do so.
**Description:**
As long as `enforce_stop_tokens` returns a first occurrence, we can
speed up the execution by setting the optional `maxsplit` parameter to
1.
Tag maintainer:
@agola11
@hwchase17
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** New metadata fields were added to
`unstructured==0.10.15`, and our hosted api has been updated to reflect
this. When users call `partition_via_api` with an older version of the
library, they'll hit a parsing error related to the new fields.
Description
* Refactor Fireworks within Langchain LLMs.
* Remove FireworksChat within Langchain LLMs.
* Add ChatFireworks (which uses chat completion api) to Langchain chat
models.
* Users have to install `fireworks-ai` and register an api key to use
the api.
Issue - Not applicable
Dependencies - None
Tag maintainer - @rlancemartin @baskaryan
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:**: Adds LLM as a judge as an eval chain
- **Tag maintainer:** @hwchase17
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
This enables bulk args like `chunk_size` to be passed down from the
ingest methods (from_text, from_documents) to be passed down to the bulk
API.
This helps alleviate issues where bulk importing a large amount of
documents into Elasticsearch was resulting in a timeout.
Contribution Shoutout
- @elastic
- [x] Updated Integration tests
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Fixed navbar:
- renamed several files, so ToC is sorted correctly
- made ToC items consistent: formatted several Titles
- added several links
- reformatted several docs to a consistent format
- renamed several files (removed `_example` suffix)
- added renamed files to the `docs/docs_skeleton/vercel.json`
Sometimes you don't want the LLM to be aware of the whole graph schema,
and want it to ignore parts of the graph when it is constructing Cypher
statements.
- **Description**: Adding retrievers for [kay.ai](https://kay.ai) and
SEC filings powered by Kay and Cybersyn. Kay provides context as a
service: it's an API built for RAG.
- **Issue**: N/A
- **Dependencies**: Just added a dep to the
[kay](https://pypi.org/project/kay/) package
- **Tag maintainer**: @baskaryan @hwchase17 Discussed in slack
- **Twtter handle:** [@vishalrohra_](https://twitter.com/vishalrohra_)
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
The huggingface pipeline in langchain (used for locally hosted models)
does not support batching. If you send in a batch of prompts, it just
processes them serially using the base implementation of _generate:
https://github.com/docugami/langchain/blob/master/libs/langchain/langchain/llms/base.py#L1004C2-L1004C29
This PR adds support for batching in this pipeline, so that GPUs can be
fully saturated. I updated the accompanying notebook to show GPU batch
inference.
---------
Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Closes#8842
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- Description: a description of the change,
- Issue: the issue # it fixes (if applicable),
- Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. These live is docs/extras
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17, @rlancemartin.
-->
- Description: fix `ChatMessageChunk` concat error
- Issue: #10173
- Dependencies: None
- Tag maintainer: @baskaryan, @eyurtsev, @rlancemartin
- Twitter handle: None
---------
Co-authored-by: wangshuai.scotty <wangshuai.scotty@bytedance.com>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
This PR aims at showcasing how to use vLLM's OpenAI-compatible chat API.
### Context
Lanchain already supports vLLM and its OpenAI-compatible `Completion`
API. However, the `ChatCompletion` API was not aligned with OpenAI and
for this reason I've waited for this
[PR](https://github.com/vllm-project/vllm/pull/852) to be merged before
adding this notebook to langchain.
### Description
This PR makes the following changes to OpenSearch:
1. Pass optional ids with `from_texts`
2. Pass an optional index name with `add_texts` and `search` instead of
using the same index name that was used during `from_texts`
### Issue
https://github.com/langchain-ai/langchain/issues/10967
### Maintainers
@rlancemartin, @eyurtsev, @navneet1v
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
LLMRails Embedding Integration
This PR provides integration with LLMRails. Implemented here are:
langchain/embeddings/llm_rails.py
docs/extras/integrations/text_embedding/llm_rails.ipynb
Hi @hwchase17 after adding our vectorstore integration to langchain with
confirmation of you and @baskaryan, now we want to add our embedding
integration
---------
Co-authored-by: Anar Aliyev <aaliyev@mgmt.cloudnet.services>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Adds support for gradient.ai's embedding model.
This will remain a Draft, as the code will likely be refactored with the
`pip install gradientai` python sdk.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** a fix for `index`.
- **Issue:** Not applicable.
- **Dependencies:** None
- **Tag maintainer:**
- **Twitter handle:** richarddwang
# Problem
Replication code
```python
from pprint import pprint
from langchain.embeddings import OpenAIEmbeddings
from langchain.indexes import SQLRecordManager, index
from langchain.schema import Document
from langchain.vectorstores import Qdrant
from langchain_setup.qdrant import pprint_qdrant_documents, create_inmemory_empty_qdrant
# Documents
metadata1 = {"source": "fullhell.alchemist"}
doc1_1 = Document(page_content="1-1 I have a dog~", metadata=metadata1)
doc1_2 = Document(page_content="1-2 I have a daugter~", metadata=metadata1)
doc1_3 = Document(page_content="1-3 Ahh! O..Oniichan", metadata=metadata1)
doc2 = Document(page_content="2 Lancer died again.", metadata={"source": "fate.docx"})
# Create empty vectorstore
collection_name = "secret_of_D_disk"
vectorstore: Qdrant = create_inmemory_empty_qdrant()
# Create record Manager
import tempfile
from pathlib import Path
record_manager = SQLRecordManager(
namespace="qdrant/{collection_name}",
db_url=f"sqlite:///{Path(tempfile.gettempdir())/collection_name}.sql",
)
record_manager.create_schema() # 必須
sync_result = index(
[doc1_1, doc1_2, doc1_2, doc2],
record_manager,
vectorstore,
cleanup="full",
source_id_key="source",
)
print(sync_result, end="\n\n")
pprint_qdrant_documents(vectorstore)
```
<details>
<summary>Code of helper functions `pprint_qdrant_documents` and
`create_inmemory_empty_qdrant`</summary>
```python
def create_inmemory_empty_qdrant(**from_texts_kwargs):
# Qdrant requires vector size, which can be only know after applying embedder
vectorstore = Qdrant.from_texts(["dummy"], location=":memory:", embedding=OpenAIEmbeddings(), **from_texts_kwargs)
dummy_document_id = vectorstore.client.scroll(vectorstore.collection_name)[0][0].id
vectorstore.delete([dummy_document_id])
return vectorstore
def pprint_qdrant_documents(vectorstore, limit: int = 100, **scroll_kwargs):
document_ids, documents = [], []
for record in vectorstore.client.scroll(
vectorstore.collection_name, limit=100, **scroll_kwargs
)[0]:
document_ids.append(record.id)
documents.append(
Document(
page_content=record.payload["page_content"],
metadata=record.payload["metadata"] or {},
)
)
pprint_documents(documents, document_ids=document_ids)
def pprint_document(document: Document = None, document_id=None, return_string=False):
displayed_text = ""
if document_id:
displayed_text += f"Document {document_id}:\n\n"
displayed_text += f"{document.page_content}\n\n"
metadata_text = pformat(document.metadata, indent=1)
if "\n" in metadata_text:
displayed_text += f"Metadata:\n{metadata_text}"
else:
displayed_text += f"Metadata:{metadata_text}"
if return_string:
return displayed_text
else:
print(displayed_text)
def pprint_documents(documents, document_ids=None):
if not document_ids:
document_ids = [i + 1 for i in range(len(documents))]
displayed_texts = []
for document_id, document in zip(document_ids, documents):
displayed_text = pprint_document(
document_id=document_id, document=document, return_string=True
)
displayed_texts.append(displayed_text)
print(f"\n{'-' * 100}\n".join(displayed_texts))
```
</details>
You will get
```
{'num_added': 3, 'num_updated': 0, 'num_skipped': 0, 'num_deleted': 0}
Document 1b19816e-b802-53c0-ad60-5ff9d9b9b911:
1-2 I have a daugter~
Metadata:{'source': 'fullhell.alchemist'}
----------------------------------------------------------------------------------------------------
Document 3362f9bc-991a-5dd5-b465-c564786ce19c:
1-1 I have a dog~
Metadata:{'source': 'fullhell.alchemist'}
----------------------------------------------------------------------------------------------------
Document a4d50169-2fda-5339-a196-249b5f54a0de:
1-2 I have a daugter~
Metadata:{'source': 'fullhell.alchemist'}
```
This is not correct. We should be able to expect that the vectorsotre
now includes doc1_1, doc1_2, and doc2, but not doc1_1, doc1_2, and
doc1_2.
# Reason
In `index`, the original code is
```python
uids = []
docs_to_index = []
for doc, hashed_doc, doc_exists in zip(doc_batch, hashed_docs, exists_batch):
if doc_exists:
# Must be updated to refresh timestamp.
record_manager.update([hashed_doc.uid], time_at_least=index_start_dt)
num_skipped += 1
continue
uids.append(hashed_doc.uid)
docs_to_index.append(doc)
```
In the aforementioned example, `len(doc_batch) == 4`, but
`len(hashed_docs) == len(exists_batch) == 3`. This is because the
deduplication of input documents [doc1_1, doc1_2, doc1_2, doc2] is
[doc1_1, doc1_2, doc2]. So `index` insert doc1_1, doc1_2, doc1_2 with
the uid of doc1_1, doc1_2, doc2.
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
This PR makes `ChatAnthropic.anthropic_api_key` a `pydantic.SecretStr`
to avoid inadvertently exposing API keys when the `ChatAnthropic` object
is represented as a str.
**Description**
Fixes broken link to `CONTRIBUTING.md` in `libs/langchain/README.md`.
Because`libs/langchain/README.md` was copied from the top level README,
and because the README contains a link to `.github/CONTRIBUTING.md`, the
copied README's link relative path must be updated. This commit fixes
that link.
@@ -5,10 +5,10 @@ This project includes a [dev container](https://containers.dev/), which lets you
You can use the dev container configuration in this folder to build and run the app without needing to install any of its tools locally! You can use it in [GitHub Codespaces](https://github.com/features/codespaces) or the [VS Code Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers).
## GitHub Codespaces
[](https://codespaces.new/hwchase17/langchain)
[](https://codespaces.new/langchain-ai/langchain)
You may use the button above, or follow these steps to open this repo in a Codespace:
1. Click the **Code** drop-down menu at the top of https://github.com/hwchase17/langchain.
1. Click the **Code** drop-down menu at the top of https://github.com/langchain-ai/langchain.
1. Click on the **Codespaces** tab.
1. Click **Create codespace on master** .
@@ -17,13 +17,16 @@ For more info, check out the [GitHub documentation](https://docs.github.com/en/f
## VS Code Dev Containers
[](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain)
Note: If you click this link you will open the main repo and not your local cloned repo, you can use this link and replace with your username and cloned repo name:
Note: If you click the link above you will open the main repo (langchain-ai/langchain) and not your local cloned repo. This is fine if you only want to run and test the library, but if you want to contribute you can use the link below and replace with your username and cloned repo name:
Then you will have a local cloned repo where you can contribute and then create pull requests.
If you already have VS Code and Docker installed, you can use the button above to get started. This will cause VS Code to automatically install the Dev Containers extension if needed, clone the source code into a container volume, and spin up a dev container for use.
You can also follow these steps to open this repo in a container using the VS Code Dev Containers extension:
Alternatively you can also follow these steps to open this repo in a container using the VS Code Dev Containers extension:
1. If this is your first time using a development container, please ensure your system meets the pre-reqs (i.e. have Docker installed) in the [getting started steps](https://aka.ms/vscode-remote/containers/getting-started).
Hi there! Thank you for even being interested in contributing to LangChain.
As an opensource project in a rapidly developing field, we are extremely open
to contributions, whether they be in the form of new features, improved infra, better documentation, or bug fixes.
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether they involve new features, improved infrastructure, better documentation, or bug fixes.
## 🗺️ Guidelines
### 👩💻 Contributing Code
To contribute to this project, please follow a ["fork and pull request"](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) workflow.
To contribute to this project, please follow the ["fork and pull request"](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) workflow.
Please do not try to push directly to this repo unless you are a maintainer.
Please follow the checked-in pull request template when opening pull requests. Note related issues and tag relevant
maintainers.
Pull requests cannot land without passing the formatting, linting and testing checks first. See
[Common Tasks](#-common-tasks) for how to run these checks locally.
Pull requests cannot land without passing the formatting, linting, and testing checks first. See [Testing](#testing) and
[Formatting and Linting](#formatting-and-linting) for how to run these checks locally.
It's essential that we maintain great documentation and testing. If you:
- Fix a bug
@@ -24,19 +23,17 @@ It's essential that we maintain great documentation and testing. If you:
- Update any affected example notebooks and documentation. These live in `docs`.
- Update unit and integration tests when relevant.
- Add a feature
- Add a demo notebook in `docs/modules`.
- Add a demo notebook in `docs/docs/`.
- Add unit and integration tests.
We're a small, building-oriented team. If there's something you'd like to add or change, opening a pull request is the
We are a small, progress-oriented team. If there's something you'd like to add or change, opening a pull request is the
best way to get our attention.
### 🚩GitHub Issues
Our [issues](https://github.com/hwchase17/langchain/issues) page is kept up to date
with bugs, improvements, and feature requests.
Our [issues](https://github.com/langchain-ai/langchain/issues) page is kept up to date with bugs, improvements, and feature requests.
There is a taxonomy of labels to help with sorting and discovery of issues of interest. Please use these to help
organize issues.
There is a taxonomy of labels to help with sorting and discovery of issues of interest. Please use these to help organize issues.
If you start working on an issue, please assign it to yourself.
@@ -59,47 +56,115 @@ we do not want these to get in the way of getting good code into the codebase.
## 🚀 Quick Start
> **Note:** You can run this repository locally (which is described below) or in a [development container](https://containers.dev/) (which is described in the [.devcontainer folder](https://github.com/hwchase17/langchain/tree/master/.devcontainer)).
This quick start guide explains how to run the repository locally.
For a [development container](https://containers.dev/), see the [.devcontainer folder](https://github.com/langchain-ai/langchain/tree/master/.devcontainer).
This project uses [Poetry](https://python-poetry.org/) v1.5.1 as a dependency manager. Check out Poetry's [documentation on how to install it](https://python-poetry.org/docs/#installation) on your system before proceeding.
### Dependency Management: Poetry and other env/dependency managers
❗Note: If you use `Conda` or `Pyenv` as your environment/package manager, avoid dependency conflicts by doing the following first:
1.*Before installing Poetry*, create and activate a new Conda env (e.g. `conda create -n langchain python=3.9`)
2. Install Poetry v1.5.1 (see above)
3. Tell Poetry to use the virtualenv python environment (`poetry config virtualenvs.prefer-active-python true`)
4. Continue with the following steps.
This project utilizes [Poetry](https://python-poetry.org/) v1.6.1+ as a dependency manager.
There are two separate projects in this repository:
-`langchain`: core langchain code, abstractions, and use cases
-`langchain.experimental`: more experimental code
❗Note: *Before installing Poetry*, if you use `Conda`, create and activate a new Conda env (e.g. `conda create -n langchain python=3.9`)
Each of these has their OWN development environment.
In order to run any of the commands below, please move into their respective directories.
For example, to contribute to `langchain` run `cd libs/langchain` before getting started with the below.
Install Poetry: **[documentation on how to install it](https://python-poetry.org/docs/#installation)**.
To install requirements:
❗Note: If you use `Conda` or `Pyenv` as your environment/package manager, after installing Poetry,
tell Poetry to use the virtualenv python environment (`poetry config virtualenvs.prefer-active-python true`)
### Core vs. Experimental
This repository contains three separate projects:
-`langchain`: core langchain code, abstractions, and use cases.
-`langchain_core`: contain interfaces for key abstractions as well as logic for combining them in chains (LCEL).
-`langchain_experimental`: see the [Experimental README](https://github.com/langchain-ai/langchain/tree/master/libs/experimental/README.md) for more information.
Each of these has its own development environment. Docs are run from the top-level makefile, but development
is split across separate test & release flows.
For this quickstart, start with langchain core:
```bash
cd libs/langchain
```
### Local Development Dependencies
Install langchain development requirements (for running langchain, running examples, linting, formatting, tests, and coverage):
```bash
poetry install --with test
```
This will install all requirements for running the package, examples, linting, formatting, tests, and coverage.
❗Note: If during installation you receive a `WheelFileValidationError` for `debugpy`, please make sure you are running Poetry v1.5.1. This bug was present in older versions of Poetry (e.g. 1.4.1) and has been resolved in newer releases. If you are still seeing this bug on v1.5.1, you may also try disabling "modern installation" (`poetry config installer.modern-installation false`) and re-installing requirements. See [this `debugpy` issue](https://github.com/microsoft/debugpy/issues/1246) for more details.
Now assuming `make` and `pytest` are installed, you should be able to run the common tasks in the following section. To double check, run `make test` under `libs/langchain`, all tests should pass. If they don't, you may need to pip install additional dependencies, such as `numexpr` and `openapi_schema_pydantic`.
## ✅ Common Tasks
Type `make` for a list of common tasks.
### Code Formatting
Formatting for this project is done via a combination of [Black](https://black.readthedocs.io/en/stable/) and [isort](https://pycqa.github.io/isort/).
To run formatting for this project:
Then verify dependency installation:
```bash
make test
```
If the tests don't pass, you may need to pip install additional dependencies, such as `numexpr` and `openapi_schema_pydantic`.
If during installation you receive a `WheelFileValidationError` for `debugpy`, please make sure you are running
Poetry v1.6.1+. This bug was present in older versions of Poetry (e.g. 1.4.1) and has been resolved in newer releases.
If you are still seeing this bug on v1.6.1, you may also try disabling "modern installation"
(`poetry config installer.modern-installation false`) and re-installing requirements.
See [this `debugpy` issue](https://github.com/microsoft/debugpy/issues/1246) for more details.
### Testing
_some test dependencies are optional; see section about optional dependencies_.
Unit tests cover modular logic that does not require calls to outside APIs.
If you add new logic, please add a unit test.
To run unit tests:
```bash
make test
```
To run unit tests in Docker:
```bash
make docker_tests
```
There are also [integration tests and code-coverage](https://github.com/langchain-ai/langchain/tree/master/libs/langchain/tests/README.md) available.
### Only develop langchain_core or langchain_experimental
If you are only developing `langchain_core` or `langchain_experimental`, you can simply install the dependencies for the respective projects and run tests:
```bash
cd libs/core
poetry install --with test
make test
```
Or:
```bash
cd libs/experimental
poetry install --with test
make test
```
### Formatting and Linting
Run these locally before submitting a PR; the CI system will check also.
#### Code Formatting
Formatting for this project is done via [ruff](https://docs.astral.sh/ruff/rules/).
To run formatting for docs, cookbook and templates:
```bash
make format
```
To run formatting for a library, run the same command from the relevant library directory:
```bash
cd libs/{LIBRARY}
make format
```
@@ -111,16 +176,23 @@ make format_diff
This is especially useful when you have made changes to a subset of the project and want to ensure your changes are properly formatted without affecting the rest of the codebase.
### Linting
#### Linting
Linting for this project is done via a combination of [Black](https://black.readthedocs.io/en/stable/), [isort](https://pycqa.github.io/isort/), [flake8](https://flake8.pycqa.org/en/latest/), and [mypy](http://mypy-lang.org/).
Linting for this project is done via a combination of [ruff](https://docs.astral.sh/ruff/rules/) and [mypy](http://mypy-lang.org/).
To run linting for this project:
To run linting for docs, cookbook and templates:
```bash
make lint
```
To run linting for a library, run the same command from the relevant library directory:
```bash
cd libs/{LIBRARY}
make lint
```
In addition, you can run the linter only on the files that have been modified in your current branch as compared to the master branch using the lint_diff command:
```bash
@@ -131,7 +203,7 @@ This can be very helpful when you've made changes to only certain parts of the p
We recognize linting can be annoying - if you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.
### Spellcheck
#### Spellcheck
Spellchecking for this project is done via [codespell](https://github.com/codespell-project/codespell).
Note that `codespell` finds common typos, so it could have false-positive (correctly spelled but rarely used) and false-negatives (not finding misspelled) words.
@@ -157,20 +229,14 @@ If codespell is incorrectly flagging a word, you can skip spellcheck for that wo
Is there any way that you could help, e.g. by submitting a PR? Make sure to read the CONTRIBUTING.MD [readme](https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md)
Is there any way that you could help, e.g. by submitting a PR? Make sure to read the CONTRIBUTING.MD [readme](https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md)
Input to this tool is a detailed question about the dataset, output is a result from the dataset. It will try to answer the question using the dataset, and if it cannot, it will ask for clarification.
"LLM chain for QueryPowerBITool must have input variables ['tool_input', 'tables', 'schemas', 'examples'], found %s",# noqa: C0301 E501 # pylint: disable=C0301
logger.info("0 records in result, query was valid.")
return(
None,
"0 rows returned, this might be correct, but please validate if all filter values were correct?",# noqa: E501
)
result=json_to_md(rows)
too_long,length=self._result_too_large(result)
iftoo_long:
return(
f"Result too large, please try to be more specific or use the `TOPN` function. The result is {length} tokens long, the limit is {self.output_token_limit} tokens.",# noqa: E501
"""Get a value from a dictionary or an environment variable."""
ifenv_keyinos.environandos.environ[env_key]:
returnos.environ[env_key]
elifdefaultisnotNone:
returndefault
else:
raiseValueError(
f"Did not find {key}, please add an environment variable"
f" `{env_key}` which contains it, or pass"
f" `{key}` as a named parameter."
)
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.