Compare commits

...

184 Commits

Author SHA1 Message Date
Harrison Chase
370f2e944e cr 2023-12-06 14:42:29 -08:00
Harrison Chase
50017094f2 cr 2023-12-06 14:37:21 -08:00
Amy L
66162ddf82 MongoDB agent fixes (#14362)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
- **Description:** Moved the agent designed to interact with and query a
MongoDB database using PyMongo into experimental, and fixed the unit
test issues that came about when using MongoDB Unix sockets.
- **Issue:** There were some test workflows not running in
`Harrison/mongo-agent` due to using nonexistent sockets, and the Mongo agent
should have been moved to experimental due to concerns mentioned in
#13991
  - **Dependencies:** `pymongo` in experimental, not optional
- Reason: GitHub workflows were throwing `pymongo ModuleNotFound` errors
when it was optional, even after adding it to extended testing, running
`poetry lock`, and doing everything the testing documentation said to do
  - **Related:** #14056, #13991
  - **Tag maintainers:** @hwchase17 @efriis

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-12-06 14:31:10 -08:00
Bagatur
ca21ef51a1 cr 2023-12-03 16:11:54 -08:00
Harrison Chase
09990827a0 Merge branch 'Haoming-jpg-3399-mongo-database-agent' into harrison/mongo-agent 2023-11-29 22:28:08 -05:00
Harrison Chase
c45b8ef283 cr 2023-11-29 22:28:00 -05:00
Harrison Chase
d7f693ab0a Update libs/langchain/langchain/tools/sql_database/tool.py 2023-11-29 21:36:08 -05:00
Harrison Chase
fa3a2630fc Update libs/langchain/langchain/tools/sql_database/tool.py 2023-11-29 21:36:03 -05:00
Harrison Chase
717cc4f3e8 Update libs/langchain/langchain/tools/sql_database/tool.py 2023-11-29 21:35:58 -05:00
Harrison Chase
ad93c6d296 Update docs/docs/integrations/toolkits/mongo_database.ipynb 2023-11-29 21:35:53 -05:00
amiaxys
14bc9e399b refactor: remove unused function 2023-11-28 14:33:59 -05:00
Amy L
ebe609987d Amy mongo integration (#3)
Fix formatting/linting and poetry.lock
2023-11-27 20:49:19 -05:00
maplepolis
c9309b3995 Merge branch 'Amy-mongo-integration' into dev 2023-11-27 18:04:24 -05:00
amiaxys
9564c1f3b2 fix: remove openai key 2023-11-27 17:51:37 -05:00
amiaxys
8b2c763cb8 fix: fix AI input issues + Mongo shell command output 2023-11-27 17:50:57 -05:00
amiaxys
341d1bc5f6 fix: exposed api key 2023-11-27 00:55:18 -05:00
amiaxys
b5b59b961a feat: added more to Mongo database notebook + fixed errors in prompts 2023-11-26 22:47:55 -05:00
amiaxys
6e11622625 fix: Mongo agent test not passing 2023-11-26 20:53:37 -05:00
amiaxys
14b03a85f1 fix: remove lint errors 2023-11-26 20:31:29 -05:00
amiaxys
f4e582ee09 fix: testing errors related to pymongo + imports 2023-11-26 20:20:23 -05:00
amiaxys
48a83dd731 fix: lint errors + minor Mongo database issues 2023-11-26 18:59:31 -05:00
amiaxys
1cb6e91b01 feat: fix unused variables + add mongo agent test file 2023-11-26 18:15:51 -05:00
amiaxys
71f882d616 feat: shortened Mongo database class + added tests and Jupyter notebook file 2023-11-26 17:48:08 -05:00
amiaxys
327ee26af8 feat: add basic Mongo database unit test file 2023-11-26 17:08:10 -05:00
amiaxys
cecc0b5679 fix: _id showing as not unique in indexes + rename document name function 2023-11-26 15:47:07 -05:00
amiaxys
6d811c0ba5 fix: Mongo database agent toolkit/class errors 2023-11-26 15:30:12 -05:00
amiaxys
0b8cfdd4f0 Merge remote-tracking branch 'origin/Kai-toolkit.py' into Amy-mongo-integration 2023-11-25 15:56:05 -05:00
amiaxys
ac860688d1 fix: linting errors in mongo tools/database 2023-11-25 15:55:57 -05:00
maplepolis
45e4b6a556 Finished mongo toolkit base, prompt, and toolkit 2023-11-23 21:07:56 -05:00
Haoming Hu
8fc76a1ad1 Merge pull request #2 from Haoming-jpg/Haoming-tool.py
Re-write the MongoDB query checker
2023-11-23 19:18:26 -05:00
Haoming-jpg
53b1d82555 Re-write the MongoDB query checker 2023-11-23 19:17:53 -05:00
maplepolis
839a9f8f95 Added files 2023-11-23 19:12:57 -05:00
amiaxys
234a8d2579 fix: nonexistent variables/functions + run format for Mongo database 2023-11-23 19:11:42 -05:00
Haoming Hu
079d20cfe6 Merge pull request #1 from Haoming-jpg/Haoming-tool.py
Haoming tool.py
2023-11-23 16:45:48 -05:00
Haoming-jpg
e26672914c Basic structure of MongoDB tool 2023-11-23 16:45:02 -05:00
Haoming-jpg
227c65b4d3 Merge remote-tracking branch 'origin/dev' into Haoming-tool.py 2023-11-23 16:12:07 -05:00
amiaxys
e286f30f8e Merge upstream dev 2023-11-23 16:11:39 -05:00
Bagatur
65c4ea5067 Revert "INFRA: temp rm master condition (#13753)" (#13759) 2023-11-23 16:06:19 -05:00
Bagatur
fc898d7797 INFRA: temp rm master condition (#13753) 2023-11-23 16:06:19 -05:00
Bagatur
b5f8d6305a IMPROVEMENT: filter global warnings properly (#13754) 2023-11-23 16:06:19 -05:00
William FH
68e7f81686 Add Batch Size kwarg to the llm start callback (#13483)
So you can more easily use the token counts directly from the API
endpoint for batch size of 1
2023-11-23 16:06:19 -05:00
Bagatur
19a511d19c DOCS: core editable dep api refs (#13747) 2023-11-23 16:06:19 -05:00
Bagatur
132a07592e RELEASE: 0.0.339rc1 (#13746) 2023-11-23 16:06:19 -05:00
Bagatur
4d86bb6894 RELEASE: core 0.0.4 (#13745) 2023-11-23 16:06:19 -05:00
Bagatur
34144e8e6a INFRA: run LC ci after core changes (#13742) 2023-11-23 16:06:19 -05:00
Bagatur
26a85f199b DOCS: fix core api ref build (#13744) 2023-11-23 16:06:19 -05:00
Bagatur
1f37594667 REFACTOR: combine core documents files (#13733) 2023-11-23 16:06:19 -05:00
h3l
7335fdf95e DOCS: Fix typo/line break in python code (#13708) 2023-11-23 16:06:19 -05:00
William FH
6394b1b545 Fix locking (#13725) 2023-11-23 16:06:19 -05:00
Bagatur
094777e10e BUGFIX: add prompt imports for backwards compat (#13702) 2023-11-23 16:06:19 -05:00
Erick Friis
469734f525 TEMPLATES Metadata (#13691)
Co-authored-by: Lance Martin <lance@langchain.dev>
2023-11-23 16:06:19 -05:00
Bagatur
9890396dbf IMPROVEMENT: Conditionally import core type hints (#13700) 2023-11-23 16:06:19 -05:00
dandanwei
139d9e7473 BUGFIX: redis vector store overwrites falsey metadata (#13652)
- **Description:** This commit fixes a problem where the Redis vector store
would change a metadata value from 0 to empty when saving the
document, which was unintended behavior.
  - **Issue:** N/A
  - **Dependencies:** N/A
2023-11-23 16:06:19 -05:00
Bagatur
01dadd679d BUGFIX: llm backwards compat imports (#13698) 2023-11-23 16:06:19 -05:00
Yujie Qian
5dc865539b IMPROVEMENT: VoyageEmbeddings embed_general_texts (#13620)
- **Description:** add method embed_general_texts in VoyageEmbeddings to
support input_type
  - **Twitter handle:** @Voyage_AI_
2023-11-23 16:06:19 -05:00
tanujtiwari-at
bc6bde7252 BUGFIX: handle tool message type when converting to string (#13626)
**Description:** Currently, if we pass a ToolMessage back into the
chain, it crashes with the error

`Got unsupported message type: `

This fixes it.

Tested locally
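A minimal sketch of the kind of fix described (the helper name and role prefixes below are illustrative, not LangChain's actual implementation):

```python
def message_to_string(message_type: str, content: str) -> str:
    # Map each chat message type to a role prefix; "tool" was the kind of
    # missing entry that triggers "Got unsupported message type: ".
    prefixes = {"human": "Human", "ai": "AI", "system": "System", "tool": "Tool"}
    if message_type not in prefixes:
        raise ValueError(f"Got unsupported message type: {message_type}")
    return f"{prefixes[message_type]}: {content}"

print(message_to_string("tool", "search returned 3 results"))
```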

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-23 16:06:19 -05:00
Josep Pon Farreny
622e02571d Added partial_variables to BaseStringMessagePromptTemplate.from_template(...) (#13645)
**Description:** BaseStringMessagePromptTemplate.from_template was
passing the value of partial_variables into cls(...) via **kwargs,
rather than passing it to PromptTemplate.from_template, which resulted
in those *partial_variables* being lost and becoming required
*input_variables*.
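The effect can be sketched with plain string templates (the `split_variables` helper is hypothetical, for illustration only): placeholders covered by `partial_variables` should no longer count as required inputs.

```python
from string import Formatter

def split_variables(template: str, partial_variables: dict) -> list:
    # Collect every placeholder in the template, then subtract the ones
    # already satisfied by partial_variables; only the remainder should
    # stay as required input_variables.
    all_vars = {name for _, name, _, _ in Formatter().parse(template) if name}
    return sorted(all_vars - set(partial_variables))

print(split_variables("{greeting}, {name}!", {"greeting": "Hello"}))  # ['name']
```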

Co-authored-by: Josep Pon Farreny <josep.pon-farreny@siemens.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-23 16:06:19 -05:00
Erick Friis
9ac9c17994 INFRA: Lint for imports (#13632)
- Adds pydantic/import linting to core
- Adds a check for `langchain_experimental` imports to langchain
2023-11-23 16:06:19 -05:00
Erick Friis
6ae39133c1 BUGFIX: anthropic models on bedrock (#13629)
Introduced in #13403
2023-11-23 16:06:19 -05:00
David Ruan
79e1461036 BUGFIX: Update bedrock.py to fix provider bug (#13646)
Provider check was incorrectly failing for anything other than "meta"
2023-11-23 16:06:19 -05:00
Guangya Liu
7af9f67ab7 DOCS: remove openai api key from cookbook (#13633) 2023-11-23 16:06:19 -05:00
Guangya Liu
56859e893c DOCS: fixed import error for BashOutputParser (#13680) 2023-11-23 16:06:19 -05:00
Bagatur
821c308a5d IMPROVEMENT: bump core dep 0.0.3 (#13690) 2023-11-23 16:06:19 -05:00
Bagatur
ab7f41fd99 add callback import test (#13689) 2023-11-23 16:06:19 -05:00
Bagatur
cb8ac333e6 BUG: Add core utils imports (#13688) 2023-11-23 16:06:19 -05:00
Bagatur
545c9f64e7 BUG: more core fixes (#13665)
Fix some circular deps:
- move PromptValue into top level module bc both PromptTemplates and
OutputParsers import
- move tracer context vars to `tracers.context` and import them in
functions in `callbacks.manager`
- add core import tests
2023-11-23 16:06:19 -05:00
William FH
fe4b3c48e3 Update name (#13676) 2023-11-23 16:06:19 -05:00
Erick Friis
16089651fb CLI 0.0.19 (#13677) 2023-11-23 16:06:19 -05:00
Taqi Jaffri
9f98a97390 docugami cookbook (#13183)
Adds a cookbook for semi-structured RAG via Docugami. This follows the
same outline as the semi-structured RAG with Unstructured cookbook:
https://github.com/langchain-ai/langchain/blob/master/cookbook/Semi_Structured_RAG.ipynb

The main change is this cookbook uses Docugami instead of Unstructured
to find text and tables, and shows how XML markup in the output helps
with retrieval and generation.

We are \@docugami on twitter, I am \@tjaffri

---------

Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
2023-11-23 16:06:19 -05:00
jakerachleff
2cc9cf09ce update langserve to v0.0.30 (#13673)
Upgrade langserve template version to 0.0.30 to include new improvements
2023-11-23 16:06:19 -05:00
jakerachleff
4fd13680df fix templates dockerfile (#13672)
- **Description:** We need to update the Dockerfile for templates to
also copy your README.md. This is because poetry requires that a readme
exists if it is specified in the pyproject.toml
2023-11-23 16:06:19 -05:00
Bagatur
255e67b231 bump 0.0.339rc0 (#13664) 2023-11-23 16:06:19 -05:00
Bagatur
f53e0be4a7 REFACTOR: Refactor langchain_core (#13627)
Changes:
- remove langchain_core/schema since no clear distinction b/n schema and
non-schema modules
- make every module that doesn't end in -y plural
- where easy have 1-2 classes per file
- no more than one level of nesting in directories
- only import from top level core modules in langchain
2023-11-23 16:06:18 -05:00
William FH
6f278120c6 Add error rate (#13568)
To the in-memory outputs. Separate it out from the outputs so it's
present in the dataframe.describe() results
2023-11-23 16:06:18 -05:00
Nuno Campos
63f13a46a2 Use pytest asyncio auto mode (#13643)
2023-11-23 16:06:18 -05:00
Lance Martin
7369d648b1 Add template for gpt-crawler (#13625)
Template for RAG using
[gpt-crawler](https://github.com/BuilderIO/gpt-crawler).

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-23 16:06:18 -05:00
Bagatur
633e06dbcb REFACTOR: Add core as dep (#13623) 2023-11-23 16:06:18 -05:00
Harrison Chase
308a0d83ef Separate out langchain_core package (#13577)
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-23 16:06:18 -05:00
Bagatur
447e4c6f35 DOCS: update rag use case images (#13615) 2023-11-23 16:06:18 -05:00
Bagatur
c9ebd8a866 RELEASE: bump 339 (#13613) 2023-11-23 16:06:18 -05:00
Ofer Mendelevitch
81dd259480 BUG: Fix search_kwargs in Vectara retriever (#13299)
- **Description:** fix a bug that prevented as_retriever() in Vectara from
using the desired input arguments
  - **Issue:** as_retriever did not pass the arguments properly
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** @ofermend
2023-11-23 16:06:18 -05:00
Holt Skinner
434a3825cf IMPROVEMENT: Reduce post-processing time for DocAIParser (#13210)
- Removes the `WrappedDocument` introduced in
https://github.com/langchain-ai/langchain/pull/11413
- See https://github.com/googleapis/python-documentai-toolbox/issues/198 in
Document AI Toolbox, which improves initialization time for the `WrappedDocument`
object.

@lkuligin

@baskaryan

@hwchase17
2023-11-23 16:06:18 -05:00
Leonid Kuligin
5cf7959285 fixed an UnboundLocalError when no documents are found (#12995)
  - **Description:** fixed a bug
  - **Issue:** #12780
2023-11-23 16:06:18 -05:00
Stijn Tratsaert
c386265acb VertexAI LLM count_tokens method requires list of prompts (#13451)
I encountered this during summarization with VertexAI. I was receiving
an INVALID_ARGUMENT error, as it was trying to send a list of about
17000 single characters.

The [count_tokens
method](https://github.com/googleapis/python-aiplatform/blob/main/vertexai/language_models/_language_models.py#L658)
made available by Google takes in a list of prompts. It does not fail
for small texts, but it does for longer documents because the argument
list will exceed Google's allowed limit. Enforcing the list type
makes it work successfully.

This change will cast the input text to count to a list of that single
text so that the input format is always correct.
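A minimal sketch of the cast described (the helper name is illustrative; the actual change lives in the VertexAI LLM code):

```python
def ensure_prompt_list(text):
    # A bare string would be iterated character-by-character by the
    # count_tokens endpoint, so wrap it in a single-element list.
    return [text] if isinstance(text, str) else list(text)

print(ensure_prompt_list("a long document"))  # one prompt, not 15 characters
```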

[Twitter](https://www.x.com/stijn_tratsaert)
2023-11-23 16:06:18 -05:00
Wang Wei
abc3e867ba feat: add ERNIE-Bot-4 Function Calling (#13320)
- **Description:** ERNIE-Bot-Chat-4 Large Language Model adds the
ability of `Function Calling` by passing parameters through the
`functions` parameter in the request. To simplify function calling for
ERNIE-Bot-Chat-4, the `create_ernie_fn_chain()` function has been added.
The definition and usage of the `create_ernie_fn_chain()` function is
similar to that of the `create_openai_fn_chain()` function.

Examples as follows:

```python
import json

from langchain.chains.ernie_functions import (
    create_ernie_fn_chain,
)
from langchain.chat_models import ErnieBotChat
from langchain.prompts import ChatPromptTemplate

def get_current_news(location: str) -> str:
    """Get the current news based on the location.

    Args:
        location (str): The location to query.
    
    Returns:
        str: Current news based on the location.
    """

    news_info = {
        "location": location,
        "news": [
            "I have a Book.",
            "It's a nice day, today."
        ]
    }

    return json.dumps(news_info)

def get_current_weather(location: str, unit: str="celsius") -> str:
    """Get the current weather in a given location

    Args:
        location (str): location of the weather.
        unit (str): unit of the temperature.
    
    Returns:
        str: weather in the given location.
    """

    weather_info = {
        "location": location,
        "temperature": "27",
        "unit": unit,
        "forecast": ["sunny", "windy"],
    }
    return json.dumps(weather_info)

llm = ErnieBotChat(model_name="ERNIE-Bot-4")
prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "{query}"),
    ]
)

chain = create_ernie_fn_chain([get_current_weather, get_current_news], llm, prompt, verbose=True)
res = chain.run("北京今天的新闻是什么?")
print(res)
```

The running results of the above program are shown below:
```
> Entering new LLMChain chain...
Prompt after formatting:
Human: 北京今天的新闻是什么?



> Finished chain.
{'name': 'get_current_news', 'thoughts': '用户想要知道北京今天的新闻。我可以使用get_current_news工具来获取这些信息。', 'arguments': {'location': '北京'}}
```
2023-11-23 16:06:18 -05:00
Adilkhan Sarsen
1ba426de8f DeepLake Backwards compatibility fix (#13388)
- **Description:** during search with DeepLake, some people are facing
backwards-compatibility issues; this PR fixes that by making search
accessible for the older datasets

---------

Co-authored-by: adolkhan <adilkhan.sarsen@alumni.nu.edu.kz>
2023-11-23 16:06:18 -05:00
Tyler Hutcherson
4577b46584 IMPROVEMENT: Minor redis improvements (#13381)
- **Description:**
- Fixes a `key_prefix` bug where passing it in on
`Redis.from_existing(...)` did not work properly. Updates doc strings
accordingly.
- Updates Redis filter classes logic with best practices on typing,
string formatting, and handling "empty" filters.
- Fixes a bug that would prevent multiple tag filters from being applied
together in some scenarios.
- Added a whole new filter unit testing module. Also updated code
formatting for a number of modules that were failing the `make`
commands.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Tag maintainer:** @baskaryan 
  - **Twitter handle:** @tchutch94
2023-11-23 16:06:18 -05:00
Sijun He
ca9955f5f5 DOCS: Fix typo in MongoDB memory docs (#13588)
- **Description:** Fix typo in MongoDB memory docs
  - **Tag maintainer:** @eyurtsev

2023-11-23 16:06:18 -05:00
Sergey Kozlov
7a4cb85c38 Fix tool arguments formatting in StructuredChatAgent (#10480)
In the `FORMAT_INSTRUCTIONS` template, 4 curly braces (escaping) are
used to get single curly brace after formatting:

```
"{{{{ ... }}}}" -> format_instructions.format() ->  "{{ ... }}" -> template.format() -> "{ ... }".
```

Tool's `args_schema` string contains single braces `{ ... }`, and is
also transformed to `{{{{ ... }}}}` form. But this is not really correct
since there is only one `format()` call:

```
"{{{{ ... }}}}" -> template.format() -> "{{ ... }}".
```

As a result we get double curly braces in the prompt:
````
Respond to the human as helpfully and accurately as possible. You have access to the following tools:

foo: Test tool FOO, args: {{'tool_input': {{'type': 'string'}}}}    # <--- !!!
...
Provide only ONE action per $JSON_BLOB, as shown:

```
{
  "action": $TOOL_NAME,
  "action_input": $INPUT
}
```
````

This PR fixes curly braces escaping in the `args_schema` to have single
braces in the final prompt:
````
Respond to the human as helpfully and accurately as possible. You have access to the following tools:

foo: Test tool FOO, args: {'tool_input': {'type': 'string'}}    # <--- !!!
...
Provide only ONE action per $JSON_BLOB, as shown:

```
{
  "action": $TOOL_NAME,
  "action_input": $INPUT
}
```
````
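The escaping behaviour is easy to reproduce with plain `str.format` (a standalone sketch, not the agent's actual code):

```python
args_schema = "{'tool_input': {'type': 'string'}}"

# Buggy: quadruple braces collapse only once, since the prompt goes
# through a single .format() call -- doubled braces reach the LLM.
over_escaped = args_schema.replace("{", "{{{{").replace("}", "}}}}")
print(over_escaped.format())   # {{'tool_input': {{'type': 'string'}}}}

# Fixed: escape once, so the final prompt shows single braces.
escaped = args_schema.replace("{", "{{").replace("}", "}}")
print(escaped.format())        # {'tool_input': {'type': 'string'}}
```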

---------

Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>
2023-11-23 16:06:18 -05:00
Wouter Durnez
645d8a7019 Add llama2-13b-chat-v1 support to chat_models.BedrockChat (#13403)
Hi 👋 We are working with Llama2 on Bedrock, and would like to add it to
Langchain. We saw a [pull
request](https://github.com/langchain-ai/langchain/pull/13322) to add it
to the `llm.Bedrock` class, but since it concerns a chat model, we would
like to add it to `BedrockChat` as well.

- **Description:** Add support for Llama2 to `BedrockChat` in
`chat_models`
- **Issue:**
[#13316](https://github.com/langchain-ai/langchain/issues/13316)
  - **Dependencies:** any dependencies required for this change `None`
  - **Tag maintainer:** /
  - **Twitter handle:** `@SimonBockaert @WouterDurnez`

---------

Co-authored-by: wouter.durnez <wouter.durnez@showpad.com>
Co-authored-by: Simon Bockaert <simon.bockaert@showpad.com>
2023-11-23 16:06:18 -05:00
jwbeck97
cb3b8d7a61 FEAT: Add azure cognitive health tool (#13448)
- **Description:** This change adds an agent to the Azure Cognitive
Services toolkit for identifying healthcare entities
  - **Dependencies:** azure-ai-textanalytics (Optional)

---------

Co-authored-by: James Beck <James.Beck@sa.gov.au>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-23 16:06:17 -05:00
Massimiliano Pronesti
7368249852 BUG: Limit Azure OpenAI embeddings chunk size (#13425)
Hi! 
This short PR aims at:
* Fixing `OpenAIEmbeddings`' check on `chunk_size` when used with Azure
OpenAI (thus with openai < 1.0). Azure OpenAI embeddings support at most
16 chunks per batch, I believe we are supposed to take the min between
the passed value/default value and 16, not the max - which, I suppose,
was introduced by accident while refactoring the previous version of
this check from this other PR of mine: #10707
* Porting this fix to the newest class (`AzureOpenAIEmbeddings`) for
openai >= 1.0

This fixes #13539 (closed but the issue persists).  
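The core of the fix can be sketched as a one-line clamp (the constant and helper names below are illustrative):

```python
AZURE_MAX_BATCH = 16  # Azure OpenAI embeddings accept at most 16 texts per request

def effective_chunk_size(requested: int) -> int:
    # min(), not max(): never exceed the service limit, while still
    # honoring a smaller user-supplied chunk_size.
    return min(requested, AZURE_MAX_BATCH)

print(effective_chunk_size(1000), effective_chunk_size(4))  # 16 4
```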

@baskaryan @hwchase17
2023-11-23 16:06:17 -05:00
Zeyang Lin
8975dfb8e7 DOCS: doc-string - langchain.vectorstores.dashvector.DashVector (#13502)
- **Description:** There are several mistakes in the sample code in the
doc-string of `DashVector` class, and this pull request aims to correct
them.
The correction code has been tested against latest version (at the time
of creation of this pull request) of: `langchain==0.0.336`
`dashvector==1.0.6` .
- **Issue:** No issue is created for this.
- **Dependencies:** No dependency is required for this change,
- **Twitter handle:** `zeyanglin`

2023-11-23 16:06:17 -05:00
John Mai
e04ef6ed56 BUG: fix hunyuan appid type (#13496)
- **Description:** fix hunyuan appid type
- **Issue:**
https://github.com/langchain-ai/langchain/pull/12022#issuecomment-1815627855
2023-11-23 16:06:17 -05:00
Leonid Ganeline
5daf77292d docs updating AzureML notebooks (#13492)
- Added/updated descriptions and links

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-23 16:06:17 -05:00
Nicolò Boschi
4c8c63233a AstraDB: use includeSimilarity option instead of $similarity (#13512)
- **Description:** AstraDB is going to deprecate the `$similarity`
projection property in favor of the `includeSimilarity` option flag. I
moved all the queries to the new format.
- **Tag maintainer:** @hemidactylus 
- **Twitter handle:** nicoloboschi
2023-11-23 16:06:17 -05:00
shumpei
35a1cd2fea Introduce search_kwargs for Custom Parameters in BingSearchAPIWrapper (#13525)
Added a `search_kwargs` field to BingSearchAPIWrapper in
`bing_search.py`, enabling users to include extra keyword arguments in
Bing search queries. This adds more customization to searches, such as
specifying language preferences. The `search_kwargs` seamlessly
merge with standard parameters in the `_bing_search_results` method.
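The merge can be sketched as a dict update (names are illustrative, not the wrapper's exact code):

```python
def build_bing_params(query: str, count: int, search_kwargs=None) -> dict:
    params = {"q": query, "count": count}
    # Extra kwargs, e.g. {"mkt": "ja-JP"} for a language preference,
    # are merged on top of the standard parameters.
    params.update(search_kwargs or {})
    return params

print(build_bing_params("langchain", 10, {"mkt": "ja-JP"}))
```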

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-23 16:06:17 -05:00
Nicolò Boschi
0b030d06f8 Fix Astra integration tests (#13520)
- **Description:** Fix Astra integration tests that are failing. `delete`
always returns True, as the deletion is successful if no errors
are thrown. I aligned the test to verify this behaviour.
  - **Tag maintainer:** @hemidactylus 
  - **Twitter handle:** nicoloboschi

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-23 16:06:17 -05:00
umair mehmood
53a6fd0863 fix: VLLMOpenAI -- create() got an unexpected keyword argument 'api_key' (#13517)
The issue was occurring because of the `openai` update to Completions: it
no longer accepts the `api_key` and `api_base` args.

The fix: we check the openai version and, if it is v1, remove
these keys from the args before passing them to `Completion.create(...)`
when sending from `VLLMOpenAI`.

Fixed: #13507 
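A hedged sketch of the version check described (the helper name and exact key set are illustrative):

```python
def strip_client_kwargs(openai_version: str, params: dict) -> dict:
    # In openai>=1.0 the client object owns api_key/api_base, so these
    # keys must not be forwarded to the completion call.
    if int(openai_version.split(".")[0]) >= 1:
        return {k: v for k, v in params.items() if k not in ("api_key", "api_base")}
    return dict(params)

print(strip_client_kwargs("1.3.5", {"model": "m", "api_key": "sk-..."}))
```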

@eyu
@efriis 
@hwchase17

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-23 16:06:17 -05:00
Manuel Alemán Cueto
b7f6afe676 Fix for oracle schema parsing stated on the issue #7928 (#13545)
- **Description:** In this pull request, we address an issue related to
assigning a schema to the SQLDatabase class when utilizing an Oracle
database. The current implementation encounters a bug where, upon
attempting to execute a query, the alter session statement is not
appropriately defined for Oracle, leading to an error,
  - **Issue:** #7928,
  - **Dependencies:** No dependencies,
  - **Tag maintainer:** @baskaryan,
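A sketch of the Oracle-specific behavior (the helper and dialect names are illustrative): Oracle selects the active schema per session with `ALTER SESSION`, so the statement has to be issued on each connection before queries run.

```python
def schema_statement(dialect: str, schema: str) -> str:
    # Oracle has no search_path; the current schema is set per session.
    if dialect == "oracle":
        return f"ALTER SESSION SET CURRENT_SCHEMA = {schema}"
    # e.g. a PostgreSQL-style fallback
    return f"SET search_path TO {schema}"

print(schema_statement("oracle", "HR"))
```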

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-23 16:06:17 -05:00
Andrew Teeter
93a2dd2c56 feat: load all namespaces (#13549)
- **Description:** This change allows for the `MWDumpLoader` to load all
namespaces including custom by default instead of only loading the
[default
namespaces](https://www.mediawiki.org/wiki/Help:Namespaces#Localisation).
  - **Tag maintainer:** @hwchase17
2023-11-23 16:06:17 -05:00
Taranjeet Singh
a516099200 Add embedchain retriever (#13553)
**Description:**

This commit adds embedchain retriever along with tests and docs.
Embedchain is a RAG framework to create data pipelines.

**Twitter handle:**
- [Taranjeet's twitter](https://twitter.com/taranjeetio) and
[Embedchain's twitter](https://twitter.com/embedchain)

**Reviewer**
@hwchase17

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-23 16:06:17 -05:00
rafly lesmana
8b83df6d29 fix: Make YoutubeLoader support on demand language translation (#13583)
**Description:**
Enhance the functionality of YoutubeLoader to enable the translation of
available transcripts by refining the existing logic.

**Issue:**
Encountering a problem with YoutubeLoader (#13523) where the translation
feature is not functioning as expected.

Tag maintainers/contributors who might be interested:
@eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-23 16:06:17 -05:00
Leonid Ganeline
92d012d5af DOCS langchain decorators update (#13535)
added disclaimer

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
2023-11-23 16:06:17 -05:00
Brace Sproul
fcc95b65c5 DOCS: updated langchain stack img to be svg (#13540) 2023-11-23 16:06:17 -05:00
Bagatur
a27422f26d bump 338, exp 42 (#13564) 2023-11-23 16:06:17 -05:00
Bagatur
b0d945679a update multi index templates (#13569) 2023-11-23 16:06:17 -05:00
Harrison Chase
8e682be8b2 move streaming stdout (#13559) 2023-11-23 16:06:17 -05:00
Leonid Ganeline
2bdfa85f85 BUG fixed openai_assistant namespace (#13543)
BUG: langchain.agents.openai_assistant has a reference as
`from langchain_experimental.openai_assistant.base import
OpenAIAssistantRunnable`
should be 
`from langchain.agents.openai_assistant.base import
OpenAIAssistantRunnable`

This prevents building of the API Reference docs
2023-11-23 16:06:17 -05:00
Bassem Yacoube
c71ede3edf IMPROVEMENT Adds support for new OctoAI endpoints (#13521)
small fix to add support for new OctoAI LLM endpoints
2023-11-23 16:06:17 -05:00
Mark Silverberg
e9f93c4c39 Fix typo/line break in the middle of a word (#13314)
- **Description:** a simple typo/extra line break fix
  - **Dependencies:** none
2023-11-23 16:06:17 -05:00
William FH
4bea0b1649 Use random seed (#13544)
For default eval llm
2023-11-23 16:06:17 -05:00
Martin Krasser
e31afdcef9 EXPERIMENTAL Generic LLM wrapper to support chat model interface with configurable chat prompt format (#8295)
## Update 2023-09-08

This PR now supports further models in addition to Llama-2 chat models.
See [this comment](#issuecomment-1668988543) for further details. The
title of this PR has been updated accordingly.

## Original PR description

This PR adds a generic `Llama2Chat` model, a wrapper for LLMs able to
serve Llama-2 chat models (like `LlamaCPP`,
`HuggingFaceTextGenInference`, ...). It implements `BaseChatModel`,
converts a list of chat messages into the [required Llama-2 chat prompt
format](https://huggingface.co/blog/llama2#how-to-prompt-llama-2) and
forwards the formatted prompt as `str` to the wrapped `LLM`. Usage
example:

```python
# uses a locally hosted Llama2 chat model
llm = HuggingFaceTextGenInference(
    inference_server_url="http://127.0.0.1:8080/",
    max_new_tokens=512,
    top_k=50,
    temperature=0.1,
    repetition_penalty=1.03,
)

# Wrap llm to support Llama2 chat prompt format.
# Resulting model is a chat model
model = Llama2Chat(llm=llm)

messages = [
    SystemMessage(content="You are a helpful assistant."),
    MessagesPlaceholder(variable_name="chat_history"),
    HumanMessagePromptTemplate.from_template("{text}"),
]

prompt = ChatPromptTemplate.from_messages(messages)
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chain = LLMChain(llm=model, prompt=prompt, memory=memory)

# use chat model in a conversation
# ...
```

Also part of this PR are tests and a demo notebook.

- Tag maintainer: @hwchase17
- Twitter handle: `@mrt1nz`

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-23 16:06:17 -05:00
William FH
a804ec8c9f Add execution time (#13542)
And warn instead of raising an error, since the chain API is too
inconsistent.
2023-11-23 16:06:17 -05:00
pedro-inf-custodio
e532e1612b IMPROVEMENT WebResearchRetriever error handling in urls with connection error (#13401)
- **Description:** Added a method `fetch_valid_documents` to
`WebResearchRetriever` class that will test the connection for every url
in `new_urls` and remove those that raise a `ConnectionError`.
- **Issue:** [Previous
PR](https://github.com/langchain-ai/langchain/pull/13353),
  - **Dependencies:** None,
  - **Tag maintainer:** @efriis 
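The connection-testing idea described above can be sketched as a small helper. This is a simplified illustration, not the actual `fetch_valid_documents` implementation: the name `filter_valid_urls` and the injected `fetch` callable are assumptions for the example, standing in for whatever HTTP client the retriever uses.

```python
def filter_valid_urls(urls, fetch):
    """Keep only URLs that `fetch` can reach without a ConnectionError.

    `fetch` is any callable that raises ConnectionError on failure
    (a stand-in for the retriever's real HTTP request).
    """
    valid = []
    for url in urls:
        try:
            fetch(url)
        except ConnectionError:
            # Unreachable URL: drop it instead of letting the error propagate.
            continue
        valid.append(url)
    return valid
```

Injecting the fetch callable keeps the filtering logic testable without network access.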

2023-11-23 16:06:17 -05:00
Piyush Jain
c8a4914587 IMPROVEMENT Neptune graph updates (#13491)
## Description
This PR adds an option to allow unsigned requests to the Neptune
database when using the `NeptuneGraph` class.

```python
graph = NeptuneGraph(
    host='<my-cluster>',
    port=8182,
    sign=False
)
```

Also, added is an option in the `NeptuneOpenCypherQAChain` to provide
additional domain instructions to the graph query generation prompt.
This will be injected in the prompt as-is, so you should include any
provider specific tags, for example `<instructions>` or `<INSTR>`.

```python
chain = NeptuneOpenCypherQAChain.from_llm(
    llm=llm,
    graph=graph,
    extra_instructions="""
    Follow these instructions to build the query:
    1. Countries contain airports, not the other way around
    2. Use the airport code for identifying airports
    """
)
```
2023-11-23 16:06:17 -05:00
William FH
d4f097051e Override Keys Option (#13537)
Should be able to override the global key if you want to evaluate
different outputs in a single run
2023-11-23 16:06:17 -05:00
Bagatur
c45e3d2f23 bump 337 (#13534) 2023-11-23 16:06:17 -05:00
Wietse Venema
a03927e813 TEMPLATE Add VertexAI Chuck Norris template (#13531)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-23 16:06:17 -05:00
Bagatur
866ab76dfe FEATURE: Runnable with message history (#13418)
Add RunnableWithMessageHistory class that can wrap certain runnables and manages chat history for them.
2023-11-23 16:06:17 -05:00
Bagatur
d461e25772 IMPROVEMENT: update assistants output and doc (#13480) 2023-11-23 16:06:17 -05:00
Bagatur
09a3acdfa0 TEMPLATES: Add multi-index templates (#13490)
One that routes and one that fuses

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-23 16:06:17 -05:00
Hugues Chocart
447729e1da [LLMonitorCallbackHandler] Various improvements (#13151)
Small improvements for the llmonitor callback handler, like better
support for non-openai models.


---------

Co-authored-by: vincelwt <vince@lyser.io>
2023-11-23 16:06:16 -05:00
Noah Stapp
40b3ae1294 Add Wrapping Library Metadata to MongoDB vector store (#13084)
**Description**
MongoDB drivers are used in various flavors and languages. Exercising due diligence in identifying the "origin" of library calls helps us understand how our Atlas servers are accessed.
2023-11-23 16:06:16 -05:00
Leonid Ganeline
92882f1374 DOCS updated data_connection index page (#13426)
- the `Index` section was missing; created it.
- text simplification

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-23 16:06:16 -05:00
Guy Korland
008e6c17ea Add optional arguments to FalkorDBGraph constructor (#13459)
**Description:** Add optional arguments to FalkorDBGraph constructor
**Tag maintainer:** baskaryan 
**Twitter handle:** @g_korland
2023-11-23 16:06:16 -05:00
Leonid Ganeline
ea2579260a docs integrations/vectorstores/ cleanup (#13487)
- updated titles to consistent format
- added/updated descriptions and links
- format heading
2023-11-23 16:06:16 -05:00
Leonid Ganeline
73d39d689e DOCS updated async-faiss example (#13434)
The original notebook has the `faiss` title, which is duplicated in `faiss.ipynb`. As a result, we have two `faiss` items in the
vectorstore ToC, and the first item breaks the sorting order (it is
placed between `A...` items).
- I updated the title to `Asynchronous Faiss`.
2023-11-23 16:06:16 -05:00
Erick Friis
d7d1d6cbb8 IMPROVEMENT Allow openai v1 in all templates that require it (#13489)
- pyproject change
- lockfiles
2023-11-23 16:06:16 -05:00
chris stucchio
ea22be9323 Bug: OpenAIFunctionsAgentOutputParser doesn't handle functions with no args (#13467)
**Description/Issue:** 
When OpenAI calls a function with no args, the args are `""` rather than
`"{}"`. Then `json.loads("")` blows up. This PR handles it correctly.

**Dependencies:** None
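The fix described above boils down to treating an empty argument string as an empty object before parsing. A minimal sketch of that defensive parse (the helper name `parse_function_args` is illustrative, not the parser's actual API):

```python
import json


def parse_function_args(raw_args: str) -> dict:
    # OpenAI can return "" (rather than "{}") when a function takes no
    # arguments; json.loads("") raises a JSONDecodeError, so treat
    # empty/whitespace input as "no arguments".
    if not raw_args.strip():
        return {}
    return json.loads(raw_args)
```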
2023-11-23 16:06:16 -05:00
Yujie Qian
1b23836fed IMPROVEMENT: add input_type to VoyageEmbeddings (#13488)
- **Description:** add input_type to VoyageEmbeddings
2023-11-23 16:06:16 -05:00
David Duong
402b3b915d Add serialisation arguments to Bedrock and ChatBedrock (#13465) 2023-11-23 16:06:16 -05:00
Erick Friis
6ffb292701 IMPROVEMENT Lock pydantic v1 in app template, cli 0.0.18 (#13485) 2023-11-23 16:06:16 -05:00
Erick Friis
a40d622a6f BUG Fix app_name in cli app new (#13482) 2023-11-23 16:06:16 -05:00
Leonid Ganeline
479ac5357d DOCS updated memory Titles (#13435)
- Fixed titles for two notebooks. They were inconsistent with other
titles and clogged ToC.
- Added `Upstash` description and link
- Moved the authentication text up in the `Elasticsearch` nb, right
after package installation. It was at the end of the page, which was the
wrong place.
2023-11-23 16:06:16 -05:00
ifduyue
b5469ec7e2 Use List instead of list (#13443)
Unify `List` usage in libs/langchain/langchain/text_splitter.py: only one
place used `list`; all other occurrences are `List`.
2023-11-23 16:06:16 -05:00
Stefano Lottini
71a89cbb0e Astra DB: minor improvements to docstrings and demo notebook (#13449)
This PR brings a few minor improvements to the docs, namely class/method
docstrings and the demo notebook.

- A note on how to control concurrency levels to tune performance in
bulk inserts, both in the class docstring and the demo notebook;
- Slightly increased concurrency defaults after careful experimentation
(still on the conservative side even for clients running on
less-than-typical network/hardware specs)
- renamed the DB token variable to the standardized
`ASTRA_DB_APPLICATION_TOKEN` name (used elsewhere, e.g. in the Astra DB
docs)
- added a note and a reference (add_text docstring, demo notebook) on
allowed metadata field names.

Thank you!
2023-11-23 16:06:16 -05:00
Eugene Yurtsev
411526a83d Add ahandle_event to _all_ (#13469)
Add ahandle_event for backwards compatibility as it is used by langserve
2023-11-23 16:06:16 -05:00
Leonid Ganeline
3133404d05 DOCS fix for integratons/document_loaders sidebar (#13471)
The current `integrations/document_loaders/` sidebar has the
`example_data` item, which is a menu with a single item: "Notebook".
This happens because the `integrations/document_loaders/` folder has
the `example_data/notebook.md` file that is used to autogenerate the
above menu item.
- removed the `example_data/notebook.md` file. Docusaurus doesn't have a
simple way to exclude folders/files from an autogenerated sidebar.
Removing this file didn't break any existing examples, so this fix is
safe.
2023-11-23 16:06:16 -05:00
Leonid Ganeline
42f593b06b DOCS: integrations/text_embeddings/ cleanup (#13476)
Updated several notebooks:
- fixed titles that were inconsistent or broke the ToC sorting order
- added missing source descriptions and links
- fixed formatting
2023-11-23 16:06:16 -05:00
Bagatur
05278ef2b8 Update chain of note README.md (#13473) 2023-11-23 16:06:16 -05:00
Lance Martin
7e3e8d7a41 Update multi-modal RAG cookbook (#13429)
Use example
[blog](https://cloudedjudgement.substack.com/p/clouded-judgement-111023)
w/ tables, charts as images.
2023-11-23 16:06:16 -05:00
Bagatur
9cd3fd08e5 Bagatur/chain of note template(#13470) 2023-11-23 16:06:16 -05:00
Leonid Ganeline
0d8d9282aa DOCS updated semadb example (#13431)
- the `SemaDB` notebook was placed in an additional subfolder, which broke
the vectorstore ToC. I moved the file up, removed the unnecessary
subfolder, and updated `vercel.json` with rerouting for the new URL
- Added SemaDB description and link
- improved text consistency
2023-11-23 16:06:16 -05:00
Leonid Ganeline
ad6d761114 DOCS updated Activeloop DeepMemory notebook (#13428)
- Fixed the title of the notebook. It created an ugly ToC element as
`Activeloop DeepLake's DeepMemory + LangChain + ragas or how to get +27%
on RAG recall.`
- Added Activeloop description
- improved consistency in text
- fixed the ToC (it was using HTML tags that broke the left-side in-page
ToC). Now the in-page ToC works
2023-11-23 16:06:16 -05:00
Harrison Chase
f26dbbf0b9 callback refactor (#13372)
Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-11-23 16:06:16 -05:00
Bagatur
4d4d34a9d9 DOCS: rag nit (#13436) 2023-11-23 16:06:16 -05:00
Leonid Ganeline
af1c28d03c updated clickup example (#13424)
- Fixed headers (there was more than one title)
- Removed a security token value. It was OK to have it, because it is a
temporary token, but the automatic security sweepers raise warnings on
it.
- Added `ClickUp` service description and link.
2023-11-23 16:06:16 -05:00
Brace Sproul
c55b233e1f Fix a link in docs (#13423) 2023-11-23 16:06:15 -05:00
Nuno Campos
6eff45e535 IMPROVEMENT pirate-speak-configurable alternatives env vars (#13395)
…rnative LLMs until used

2023-11-23 16:06:15 -05:00
Bagatur
e38b2d70d2 DOCS: langchain stack img update (#13421) 2023-11-23 16:06:15 -05:00
Bagatur
400084f4bf bump 336, exp 44 (#13420) 2023-11-23 16:06:15 -05:00
Bagatur
c9d61b38db FIX: Infer runnable agent single or multi action (#13412) 2023-11-23 16:06:15 -05:00
Eugene Yurtsev
20ecfd2d92 Use secretstr for api keys for javelin-ai-gateway (#13417)
- Make javelin_ai_gateway_api_key a SecretStr

---------

Co-authored-by: Hiroshi Tashiro <hiroshitash@gmail.com>
2023-11-23 16:06:15 -05:00
William FH
19c71197e7 Fix Runnable Lambda Afunc Repr (#13413)
Otherwise, you get an error when using async functions.


h/t to Chris Ruppelt
2023-11-23 16:06:15 -05:00
Sumukh Sridhara
5624ee866e Merge pull request #13232
* PGVector needs to close its connection if it's garbage collected
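The cleanup pattern described above can be sketched with a `__del__` finalizer. This is a hypothetical illustration of the idea, not PGVector's actual code (which may use a different mechanism, e.g. weakref finalizers); the `ConnectionGuard` name is invented for the example.

```python
import gc


class ConnectionGuard:
    """Close the wrapped DB connection when this holder is garbage collected."""

    def __init__(self, conn):
        self._conn = conn

    def __del__(self):
        try:
            self._conn.close()
        except Exception:
            # Never let an exception escape __del__; the interpreter may be
            # shutting down and the connection may already be gone.
            pass
```

Note that `__del__` timing is implementation-dependent: CPython's reference counting usually runs it promptly, but other interpreters may delay it until a GC cycle.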
2023-11-23 16:06:15 -05:00
Nuno Campos
5546424107 IMPROVEMENT Passthrough kwargs in runnable lambda (#13405)
2023-11-23 16:06:15 -05:00
Bagatur
ef6527d806 DOCS: update rag use case (#13319) 2023-11-23 16:06:15 -05:00
Bagatur
8a4afce236 DOCS: install nit (#13380) 2023-11-23 16:06:15 -05:00
Clay Elmore
df225da97d FEAT Bedrock cohere embedding support (#13366)
- **Description:** adding cohere embedding support to bedrock embedding
class
  - **Issue:** N/A
  - **Dependencies:** None
  - **Tag maintainer:** @3coins 
  - **Twitter handle:** celmore25

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-23 16:06:15 -05:00
Bagatur
9ecefa68b5 Agent window management how to (#13033) 2023-11-23 16:06:15 -05:00
Nuno Campos
70faf4bc10 Make it easier to subclass RunnableEach (#13346)
2023-11-23 16:06:15 -05:00
Erick Friis
af7009b0bf IMPROVEMENT research-assistant configurable report type (#13312)

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-11-23 16:06:15 -05:00
竹内謙太
af163059e3 FEAT Add some properties to NotionDBLoader (#13358)

fix #13356

Adds support for the following metadata properties to NotionDBLoader:

- `checkbox`
- `email`
- `number`
- `select`

There are no relevant tests for this code to be updated.
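Notion's API wraps each property value in an object keyed by its type, so supporting the types listed above amounts to unwrapping per type. A hedged sketch of that mapping (the helper name and the exact payload shapes are assumptions based on Notion's property-value format, not NotionDBLoader's actual code):

```python
def extract_property_value(prop: dict):
    """Unwrap a Notion property payload into a plain metadata value.

    Covers the newly supported types: checkbox, email, number, select.
    """
    prop_type = prop["type"]
    if prop_type == "checkbox":
        return prop["checkbox"]  # bool
    if prop_type == "email":
        return prop["email"]  # str or None
    if prop_type == "number":
        return prop["number"]  # int/float or None
    if prop_type == "select":
        # A select value is an object with a "name", or None if unset.
        selected = prop["select"]
        return selected["name"] if selected else None
    return None
```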
2023-11-23 16:06:15 -05:00
Leonid Ganeline
158d29e390 FEAT docs integration cards site (#13379)
The `Integrations` site is hidden now.
I've added it into the `More` menu.
The name is `Integration Cards`; otherwise it would be confused with the
`Integrations` menu.

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
2023-11-23 16:06:15 -05:00
Erick Friis
4bbd4ec9bd api doc newlines (#13378)
cc @leo-gan 

Deploying at
https://api.python.langchain.com/en/erick-api-doc-newlines-/api_reference.html
(will take a bit)
2023-11-23 16:06:15 -05:00
Fielding Johnston
2d9a3a762b BUG Add limit_to_domains to APIChain based tools (#13367)
- **Description:** Adds `limit_to_domains` param to the APIChain based
tools (open_meteo, TMDB, podcast_docs, and news_api)
- **Issue:** I didn't open an issue, but after upgrading to 0.0.328
using these tools would throw an error.
  - **Dependencies:** N/A
  - **Tag maintainer:** @baskaryan 
  
  
**Note**: I included the trailing / simply because the docs here did
fc886cc303/docs/docs/use_cases/apis.ipynb (L246)
, but I checked the code and it is using `urlparse`. So I followed the
docs since it comes down to style.
2023-11-23 16:06:15 -05:00
Predrag Gruevski
2644e1cd2d Update templates/rag-self-query with newer dependencies without CVEs. (#13362)
The `langchain` repo was being flagged for using vulnerable
dependencies, some of which were in this template's lockfile. Updating
to newer versions should fix that.
2023-11-23 16:06:15 -05:00
Predrag Gruevski
810ebf1cfd Update rag-timescale-conversation to dependencies without CVEs. (#13364)
Just `poetry lock` and moving `langchain` to the latest version, in case
folks copy this template.

This resolves some vulnerable dependency alerts GitHub code scanning was
flagging.
2023-11-23 16:06:15 -05:00
Leonid Ganeline
283498a3a8 Yi model from 01.ai , example (#13375)
Added an example with the new state-of-the-art `Yi` model to the `HuggingFace-hub` notebook
2023-11-23 16:06:15 -05:00
amiaxys
9b46d4434c feat: add run function in mongo_database.py 2023-11-23 16:06:15 -05:00
Giselle Wang
6d16d1e61b feat: add functions to Mongo database class 2023-11-23 16:05:58 -05:00
amiaxys
8fc45afb99 feat: add sample documents and execute functions to Mongo database class 2023-11-23 16:01:54 -05:00
amiaxys
aa56c422a4 feat: add collection indexes function to Mongo database class 2023-11-23 16:01:11 -05:00
Giselle Wang
955ab9028b feat: add basic template for Mongo database class 2023-11-23 16:00:39 -05:00
amiaxys
82e834b07b Merge branch 'dev' of https://github.com/Haoming-jpg/team-skill-issue-langchain into dev 2023-11-23 15:50:52 -05:00
amiaxys
80e52bd865 Merge branch 'Amy-mongo_database.py' into dev 2023-11-23 15:49:32 -05:00
amiaxys
6054f462a4 feat: add run function in mongo_database.py 2023-11-23 15:49:00 -05:00
amiaxys
bb8faa3c5a Merge branch Giselle-mongo_database.py into Amy-mongo_database.py 2023-11-23 15:48:25 -05:00
amiaxys
55c666a740 feat: add _get_sample_documents and _execute functions to mongo_database.py 2023-11-23 15:20:50 -05:00
amiaxys
cd840f1f78 feat: add _get_collection_indexes 2023-11-14 22:42:20 -05:00
Haoming-jpg
a53d3708c7 Create tools/mongo_database/
Create files for tools for interacting with a MongoDB database
2023-11-14 21:24:46 -05:00
Giselle Wang
41f5fcdf09 feat: add functions 2023-11-14 10:56:41 -05:00
Giselle Wang
db54101b61 feat: add basic template 2023-11-12 12:57:15 -05:00
21 changed files with 2028 additions and 813 deletions

View File

@@ -36,6 +36,13 @@ jobs:
working-directory: ${{ inputs.working-directory }}
cache-key: compile-integration
- name: MongoDB in GitHub Actions
uses: supercharge/mongodb-github-action@1.10.0
with:
mongodb-version: 7.0
mongodb-replica-set: test
mongodb-port: 27017
- name: Install integration dependencies
shell: bash
run: poetry install --with=test_integration

View File

@@ -0,0 +1,204 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# MongoDB Database\n",
"\n",
"This notebook showcases an experimental agent designed to interact with and query a `MongoDB` database using PyMongo. \n",
"The agent is similar to [SQL Database](https://python.langchain.com/docs/integrations/toolkits/sql_database).\n",
"\n",
"As this agent is in development, it currently supports only one database per `MongoDatabase` instance, connected via URI. Additionally, not all answers may be correct, and it is not guaranteed that the agent won't perform destructive commands on your database (or in general) given certain questions."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialization"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents.agent_types import AgentType\n",
"from langchain.llms.openai import OpenAI\n",
"from langchain_experimental.agents.agent_toolkits import (\n",
" MongoDatabaseToolkit,\n",
" create_mongo_agent,\n",
")\n",
"from langchain_experimental.utilities import MongoDatabase\n",
"\n",
"db = MongoDatabase.from_uri(\"mongodb://localhost:27017/my_db\")\n",
"db._client[\"my_db\"][\"my_collection\"].insert_many(\n",
" [\n",
" {\"text\": \"Hello, world!\", \"language\": \"en\"},\n",
" {\"text\": \"Bonjour, monde!\", \"language\": \"fr\"},\n",
" {\"text\": \"Hola, mundo!\", \"language\": \"es\"},\n",
" {\"text\": \"Hallo, Welt!\", \"language\": \"de\"},\n",
" {\"text\": \"Ciao, mondo!\", \"language\": \"it\"},\n",
" {\"text\": \"Olá, mundo!\", \"language\": \"pt\"},\n",
" {\"text\": \"Привет, мир!\", \"language\": \"ru\"},\n",
" {\"text\": \"你好,世界!\", \"language\": \"zh\"},\n",
" {\"text\": \"こんにちは世界!\", \"language\": \"ja\"},\n",
" {\"text\": \"안녕, 세상아!\", \"language\": \"ko\"},\n",
" ]\n",
")\n",
"# insert more documents if you would like\n",
"toolkit = MongoDatabaseToolkit(db=db, llm=OpenAI(temperature=0))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using `ZERO_SHOT_REACT_DESCRIPTION`\n",
"\n",
"This shows how to initialize the agent using the `ZERO_SHOT_REACT_DESCRIPTION` agent type."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"agent_executor = create_mongo_agent(\n",
" llm=OpenAI(temperature=0),\n",
" toolkit=toolkit,\n",
" verbose=True,\n",
" agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using OpenAI Functions\n",
"\n",
"This shows how to initialize the agent using the `OPENAI_FUNCTIONS` agent type. Note that this is an alternative to the above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# agent_executor = create_mongo_agent(\n",
"# llm=ChatOpenAI(temperature=0, model=\"gpt-3.5-turbo-0613\"),\n",
"# toolkit=toolkit,\n",
"# verbose=True,\n",
"# agent_type=AgentType.OPENAI_FUNCTIONS\n",
"# )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example: querying documents"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mAction: mongo_db_list\n",
"Action Input: \u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mmy_collection\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I should check the schema of my_collection\n",
"Action: mongo_db_schema\n",
"Action Input: my_collection\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mCollection Name: my_collection\n",
"\n",
"3 sample documents from my_collection:\n",
"{'_id': ObjectId('65650d24fd0c16012a7ed210'),\n",
" 'language': 'en',\n",
" 'text': 'Hello, world!'}\n",
"{'_id': ObjectId('65650d24fd0c16012a7ed211'),\n",
" 'language': 'fr',\n",
" 'text': 'Bonjour, monde!'}\n",
"{'_id': ObjectId('65650d24fd0c16012a7ed212'),\n",
" 'language': 'es',\n",
" 'text': 'Hola, mundo!'}\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I should query the documents with language field equal to 'fr'\n",
"Action: mongo_db_query\n",
"Action Input: db.my_collection.find({'language': 'fr'}).limit(10)\u001b[0m\n",
"Observation: \u001b[38;5;200m\u001b[1;3mResult:\n",
"{'_id': ObjectId('65650d24fd0c16012a7ed211'),\n",
" 'language': 'fr',\n",
" 'text': 'Bonjour, monde!'}\n",
"{'_id': ObjectId('65650d2bfd0c16012a7ed21c'),\n",
" 'language': 'fr',\n",
" 'text': 'Bonjour, monde!'}\n",
"{'_id': ObjectId('65650f9af8d68bbbc66c0c2e'),\n",
" 'language': 'fr',\n",
" 'text': 'Bonjour, monde!'}\n",
"{'_id': ObjectId('656510ddf351300b9c26690c'),\n",
" 'language': 'fr',\n",
" 'text': 'Bonjour, monde!'}\n",
"{'_id': ObjectId('65651b16b16e9f51a2df5856'),\n",
" 'language': 'fr',\n",
" 'text': 'Bonjour, monde!'}\n",
"{'_id': ObjectId('65651b4e1b66b2ae39ced4a6'),\n",
" 'language': 'fr',\n",
" 'text': 'Bonjour, monde!'}\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Bonjour, monde!\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'Bonjour, monde!'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent_executor.run(\"Find hello world in french\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -1,4 +1,8 @@
from langchain_experimental.agents.agent_toolkits.csv.base import create_csv_agent
from langchain_experimental.agents.agent_toolkits.mongo.base import (
MongoDatabaseToolkit,
create_mongo_agent,
)
from langchain_experimental.agents.agent_toolkits.pandas.base import (
create_pandas_dataframe_agent,
)
@@ -16,4 +20,6 @@ __all__ = [
"create_spark_dataframe_agent",
"create_python_agent",
"create_csv_agent",
"create_mongo_agent",
"MongoDatabaseToolkit",
]

View File

@@ -0,0 +1 @@
"""MongoDB agent."""

View File

@@ -0,0 +1,98 @@
"""MongoDB agent."""
from typing import Any, Dict, List, Optional, Sequence
from langchain.agents.agent import AgentExecutor, BaseSingleActionAgent
from langchain.agents.agent_types import AgentType
from langchain.agents.mrkl.base import ZeroShotAgent
from langchain.agents.mrkl.prompt import FORMAT_INSTRUCTIONS
from langchain.agents.openai_functions_agent.base import OpenAIFunctionsAgent
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains.llm import LLMChain
from langchain.tools import BaseTool
from langchain_core.language_models import BaseLanguageModel
from langchain_core.messages import AIMessage, SystemMessage
from langchain_core.prompts.chat import (
ChatPromptTemplate,
HumanMessagePromptTemplate,
MessagesPlaceholder,
)
from langchain_experimental.agents.agent_toolkits.mongo.prompt import (
MONGO_FUNCTIONS_SUFFIX,
MONGO_PREFIX,
MONGO_SUFFIX,
)
from langchain_experimental.agents.agent_toolkits.mongo.toolkit import (
MongoDatabaseToolkit,
)
def create_mongo_agent(
llm: BaseLanguageModel,
toolkit: MongoDatabaseToolkit,
agent_type: AgentType = AgentType.ZERO_SHOT_REACT_DESCRIPTION,
callback_manager: Optional[BaseCallbackManager] = None,
prefix: str = MONGO_PREFIX,
suffix: Optional[str] = None,
format_instructions: str = FORMAT_INSTRUCTIONS,
input_variables: Optional[List[str]] = None,
top_k: int = 10,
max_iterations: Optional[int] = 15,
max_execution_time: Optional[float] = None,
early_stopping_method: str = "force",
verbose: bool = False,
agent_executor_kwargs: Optional[Dict[str, Any]] = None,
extra_tools: Sequence[BaseTool] = (),
**kwargs: Any,
) -> AgentExecutor:
"""Construct a MongoDB agent from an LLM and tools."""
tools = toolkit.get_tools() + list(extra_tools)
prefix = prefix.format(top_k=top_k)
agent: BaseSingleActionAgent
if agent_type == AgentType.ZERO_SHOT_REACT_DESCRIPTION:
prompt = ZeroShotAgent.create_prompt(
tools,
prefix=prefix,
suffix=suffix or MONGO_SUFFIX,
format_instructions=format_instructions,
input_variables=input_variables,
)
llm_chain = LLMChain(
llm=llm,
prompt=prompt,
callback_manager=callback_manager,
)
tool_names = [tool.name for tool in tools]
agent = ZeroShotAgent(llm_chain=llm_chain, allowed_tools=tool_names, **kwargs)
elif agent_type == AgentType.OPENAI_FUNCTIONS:
messages = [
SystemMessage(content=prefix),
HumanMessagePromptTemplate.from_template("{input}"),
AIMessage(content=suffix or MONGO_FUNCTIONS_SUFFIX),
MessagesPlaceholder(variable_name="agent_scratchpad"),
]
input_variables = ["input", "agent_scratchpad"]
_prompt = ChatPromptTemplate(input_variables=input_variables, messages=messages)
agent = OpenAIFunctionsAgent(
llm=llm,
prompt=_prompt,
tools=tools,
callback_manager=callback_manager,
**kwargs,
)
else:
raise ValueError(f"Agent type {agent_type} not supported at the moment.")
return AgentExecutor.from_agent_and_tools(
agent=agent,
tools=tools,
callback_manager=callback_manager,
verbose=verbose,
max_iterations=max_iterations,
max_execution_time=max_execution_time,
early_stopping_method=early_stopping_method,
**(agent_executor_kwargs or {}),
)

View File

@@ -0,0 +1,22 @@
# flake8: noqa
MONGO_PREFIX = """You are an agent designed to interact with a MongoDB database.
Given an input question, create a syntactically correct MongoDB PyMongo query, then look at the results of the query and return the answer.
Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most {top_k} results.
You can order the results by a relevant field to return the most interesting examples in the database.
You have access to tools for interacting with the database.
Only use the below tools. Only use the information returned by the below tools to construct your final answer.
You MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.
DO NOT make any DML commands (insert, update, delete, etc.) to the database.
If the question does not seem related to the database, just return "I don't know" as the answer.
"""
MONGO_SUFFIX = """Begin!
Question: {input}
Thought: I should look at the collections in the database to see what I can query using PyMongo. Then I should inspect the fields of the most relevant collections, checking each query with the query checker before running it.
{agent_scratchpad}"""
MONGO_FUNCTIONS_SUFFIX = """I should look at the collections in the database to see what I can query using PyMongo. Then I should inspect the fields of the most relevant collections, checking each query with the query checker before running it."""
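The `{top_k}` placeholder in `MONGO_PREFIX` is filled in by the agent constructor (`prefix.format(top_k=top_k)`) before the prompt ever reaches the LLM. A minimal sketch of that step, using a shortened excerpt of the prefix rather than the full template:

```python
# Shortened excerpt of MONGO_PREFIX; the full template is defined above.
prefix_excerpt = (
    "You are an agent designed to interact with a MongoDB database.\n"
    "Unless the user specifies a specific number of examples they wish to obtain, "
    "always limit your query to at most {top_k} results.\n"
)

# create_mongo_agent performs this substitution before building the prompt.
prompt_prefix = prefix_excerpt.format(top_k=10)
print(prompt_prefix)
```

Because the cap is baked into the system prompt rather than enforced in code, the limit is advisory: the model is instructed, not prevented, from returning more results.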


@@ -0,0 +1,64 @@
"""Toolkit for interacting with a Mongo database."""
from typing import List
from langchain.agents.agent_toolkits.base import BaseToolkit
from langchain.tools import BaseTool
from langchain_core.language_models import BaseLanguageModel
from langchain_core.pydantic_v1 import Field
from langchain_experimental.tools.mongo_database.tool import (
InfoMongoDBTool,
ListMongoDBTool,
QueryMongoDBCheckerTool,
QueryMongoDBTool,
)
from langchain_experimental.utilities.mongo_database import MongoDatabase
class MongoDatabaseToolkit(BaseToolkit):
llm: BaseLanguageModel = Field(exclude=True)
db: MongoDatabase = Field(exclude=True)
class Config:
"""Configuration for this pydantic object."""
arbitrary_types_allowed = True
def get_tools(self) -> List[BaseTool]:
"""Get the tools in the toolkit."""
list_mongo_database_tool = ListMongoDBTool(db=self.db)
info_mongo_database_tool_description = (
"Input to this tool is a comma-separated list of collections, output is "
"the name, indexes and sample documents for those collections. "
"Be sure that the collections actually exist by calling "
f"{list_mongo_database_tool.name} first! "
"Example Input: collection1, collection2, collection3"
)
info_mongo_database_tool = InfoMongoDBTool(
db=self.db, description=info_mongo_database_tool_description
)
query_mongo_database_tool_description = (
"Input to this tool is a detailed and correct MongoDB query, "
"output is a result from the database. If the query is not "
"correct, an error message will be returned. If an error is returned, "
"rewrite the query, check the query, and try again. If you encounter an "
"issue with Unknown field 'xxxx' in 'field list', use "
f"{info_mongo_database_tool.name} to query the correct document fields."
)
query_mongo_database_tool = QueryMongoDBTool(
db=self.db, description=query_mongo_database_tool_description
)
query_mongo_checker_tool_description = (
"Use this tool to double check if your query is correct before executing "
"it. Always use this tool before executing a query with "
f"{query_mongo_database_tool.name}."
)
query_mongo_checker_tool = QueryMongoDBCheckerTool(
db=self.db, llm=self.llm, description=query_mongo_checker_tool_description
)
return [
list_mongo_database_tool,
info_mongo_database_tool,
query_mongo_database_tool,
query_mongo_checker_tool,
]


@@ -1,3 +1,16 @@
from langchain_experimental.tools.mongo_database.tool import (
InfoMongoDBTool,
ListMongoDBTool,
QueryMongoDBCheckerTool,
QueryMongoDBTool,
)
from langchain_experimental.tools.python.tool import PythonAstREPLTool, PythonREPLTool
__all__ = ["PythonREPLTool", "PythonAstREPLTool"]
__all__ = [
"PythonREPLTool",
"PythonAstREPLTool",
"InfoMongoDBTool",
"ListMongoDBTool",
"QueryMongoDBCheckerTool",
"QueryMongoDBTool",
]


@@ -0,0 +1 @@
"""Tools for interacting with a MongoDB database."""


@@ -0,0 +1,18 @@
# flake8: noqa
QUERY_CHECKER = """
{query}
Double check the MongoDB query above for common mistakes, including:
- Not using PyMongo syntax and instead using MongoDB shell syntax
- No quotes around keys in find() or find_one() filters
- Improper use of $nin operator with null values
- Using $merge instead of $concat for combining arrays
- Incorrect use of $not or $ne for exclusive ranges
- Data type mismatch in query conditions
- Improperly referencing field names in queries
- Using incorrect syntax for aggregation functions
- Casting to the incorrect BSON data type
- Using the improper fields for $lookup in aggregations
If there are any of the above mistakes, rewrite the query. If there are no mistakes, just reproduce the original query.
MongoDB Query: """


@@ -0,0 +1,131 @@
# flake8: noqa
"""Tools for interacting with a MongoDB database."""
from typing import Any, Dict, Optional
from langchain.pydantic_v1 import BaseModel, Extra, Field, root_validator
from langchain.schema.language_model import BaseLanguageModel
from langchain.callbacks.manager import (
AsyncCallbackManagerForToolRun,
CallbackManagerForToolRun,
)
from langchain.chains.llm import LLMChain
from langchain.prompts import PromptTemplate
from langchain_experimental.utilities.mongo_database import MongoDatabase
from langchain.tools.base import BaseTool
from langchain_experimental.tools.mongo_database.prompt import QUERY_CHECKER
class BaseMongoDBTool(BaseModel):
"""Base tool for interacting with a MongoDB database."""
db: MongoDatabase = Field(exclude=True)
class Config(BaseTool.Config):
pass
class QueryMongoDBTool(BaseMongoDBTool, BaseTool):
"""Tool for querying a MongoDB database."""
name: str = "mongo_db_query"
description: str = """
Input to this tool is a detailed and correct MongoDB query, output is a result from the database.
If the query is not correct, an error message will be returned.
If an error is returned, rewrite the query, check the query, and try again.
"""
def _run(
self,
query: str,
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
"""Execute the query, return the results or an error message."""
return self.db.run_no_throw(query)
class InfoMongoDBTool(BaseMongoDBTool, BaseTool):
"""Tool for getting metadata about a MongoDB database."""
name: str = "mongo_db_schema"
description: str = """
Input to this tool is a comma-separated list of collections, output is the name, indexes, and sample documents for those collections.
Example Input: "collection1, collection2, collection3"
"""
def _run(
self,
collection_names: str,
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
"""Get information about specified collections."""
return self.db.get_collection_info_no_throw(collection_names.split(", "))
class ListMongoDBTool(BaseMongoDBTool, BaseTool):
"""Tool for listing collections in a MongoDB database."""
name: str = "mongo_db_list"
description: str = """
Input is an empty string, output is a comma separated list of collections in the database.
"""
def _run(
self,
tool_input: str = "",
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
"""Get a list of collections in the database."""
return ", ".join(self.db.get_usable_collection_names)
class QueryMongoDBCheckerTool(BaseMongoDBTool, BaseTool):
"""Use an LLM to check if a query is correct."""
template: str = QUERY_CHECKER
llm: BaseLanguageModel
llm_chain: LLMChain = Field(init=False)
name: str = "mongo_db_query_checker"
description: str = """
Use this tool to double check a MongoDB query for common mistakes.
"""
@root_validator(pre=True)
def _init_llm_chain(cls, values: Dict[str, Any]) -> Dict[str, Any]:
"""Initialize the LLM chain."""
if "llm_chain" not in values:
values["llm_chain"] = LLMChain(
llm=values.get("llm"),
prompt=PromptTemplate(
template=QUERY_CHECKER, input_variables=["query"]
),
)
if values["llm_chain"].prompt.input_variables != ["query"]:
raise ValueError(
"LLM chain for QueryCheckerTool must have input variables ['query']"
)
return values
def _run(
self,
query: str,
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
"""Use the LLM to check the query."""
return self.llm_chain.predict(
query=query,
callbacks=run_manager.get_child() if run_manager else None,
)
async def _arun(
self,
query: str,
run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
) -> str:
return await self.llm_chain.apredict(
query=query,
callbacks=run_manager.get_child() if run_manager else None,
)
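`QueryMongoDBCheckerTool` renders `QUERY_CHECKER` with the incoming query before calling the LLM. Since `{query}` is the template's only placeholder, plain `str.format` reproduces what the `PromptTemplate` does; a sketch with a shortened stand-in template:

```python
# Shortened stand-in for QUERY_CHECKER; the full template lives in prompt.py.
query_checker_template = (
    "\n{query}\n"
    "Double check the MongoDB query above for common mistakes.\n"
    "If there are no mistakes, just reproduce the original query.\n"
    "MongoDB Query: "
)

# The braces inside the query string are plain text to str.format,
# because only the template's own placeholders are substituted.
rendered = query_checker_template.format(
    query="db.users.find({'age': {'$gt': 21}})"
)
print(rendered)
```

The template ends with `MongoDB Query: ` so the LLM's completion is the (possibly rewritten) query itself, which the agent can then execute.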


@@ -1,3 +1,4 @@
from langchain_experimental.utilities.mongo_database import MongoDatabase
from langchain_experimental.utilities.python import PythonREPL
__all__ = ["PythonREPL"]
__all__ = ["PythonREPL", "MongoDatabase"]


@@ -0,0 +1,194 @@
"""PyMongo wrapper around a MongoDB database."""
from __future__ import annotations
import re
from ast import literal_eval
from pprint import pformat
from typing import Any, Iterable, List, Optional
from pymongo import MongoClient
from pymongo.errors import PyMongoError
def _format_index(index: dict) -> str:
"""Format an index for display."""
index_keys = index["key"]
index_keys_formatted = ", ".join(f"{k[0]}: {k[1]}" for k in index_keys)
# The default _id index is not reported as unique.
if index_keys[0][0] == "_id" and not index["unique"]:
unique = ""
else:
unique = f' Unique: {index["unique"]},'
return f'Name: {index["name"]},{unique}' f' Keys: {{ {index_keys_formatted} }}'
class MongoDatabase:
"""PyMongo wrapper around a MongoDB database."""
def __init__(
self,
client: MongoClient,
ignore_collections: Optional[List[str]] = None,
include_collections: Optional[List[str]] = None,
sample_documents_in_collection_info: int = 3,
indexes_in_collection_info: bool = False,
):
# Store the provided PyMongo client
self._client = client
if not isinstance(sample_documents_in_collection_info, int):
raise TypeError("sample_documents_in_collection_info must be an integer")
db = self._client.get_default_database()
self._all_collections = set(db.list_collection_names())
self._include_collections = (
set(include_collections) if include_collections else set()
)
if self._include_collections:
missing_collections = self._include_collections - self._all_collections
if missing_collections:
raise ValueError(
f"collections {missing_collections} not found in database"
)
self._ignore_collections = (
set(ignore_collections) if ignore_collections else set()
)
if self._ignore_collections:
missing_collections = self._ignore_collections - self._all_collections
if missing_collections:
raise ValueError(
f"collections {missing_collections} not found in database"
)
self._sample_documents_in_collection_info = sample_documents_in_collection_info
self._indexes_in_collection_info = indexes_in_collection_info
@classmethod
def from_uri(cls, database_uri: str, **kwargs: Any) -> MongoDatabase:
"""Construct a MongoDatabase from a database URI; extra kwargs go to this class."""
return cls(MongoClient(host=database_uri), **kwargs)
@property
def get_usable_collection_names(self) -> Iterable[str]:
"""Get names of collections available."""
if self._include_collections:
return sorted(self._include_collections)
return sorted(self._all_collections - self._ignore_collections)
@property
def collection_info(self) -> str:
"""Information about all collections in the database."""
return self.get_collection_info()
def get_collection_info(self, collection_names: Optional[List[str]] = None) -> str:
"""Get information about specified collections."""
all_collection_names = self.get_usable_collection_names
if collection_names is not None:
missing_collections = set(collection_names).difference(all_collection_names)
if missing_collections:
raise ValueError(
f"collection_names {missing_collections} not found in database"
)
all_collection_names = collection_names
collections = []
for collection_name in all_collection_names:
# Add document information
document_info = f"Collection Name: {collection_name}\n"
# Add indexes information
if self._indexes_in_collection_info:
document_info += f"\n{self._get_collection_indexes(collection_name)}\n"
# Sample rows or documents info (if required)
if self._sample_documents_in_collection_info:
document_info += f"\n{self._get_sample_documents(collection_name)}\n"
collections.append(document_info)
collections.sort()
final_str = "\n\n".join(collections)
return final_str
def get_collection_info_no_throw(
self, collection_names: Optional[List[str]] = None
) -> str:
"""Get information about specified collections.
If the collection does not exist, an error message is returned."""
try:
return self.get_collection_info(collection_names)
except ValueError as e:
return f"Error: {e}"
def _get_collection_indexes(self, collection_name: str) -> str:
"""Get indexes of a collection."""
db = self._client.get_default_database()
indexes = db[collection_name].index_information()
indexes_cleaned = [
{"name": k, "key": v["key"], "unique": "unique" in v and v["unique"]}
for k, v in indexes.items()
]
indexes_formatted = "\n".join(map(_format_index, indexes_cleaned))
return f"Collection Indexes:\n{indexes_formatted}"
def _get_sample_documents(self, collection_name: str) -> str:
db = self._client.get_default_database()
documents = (
db[collection_name].find().limit(self._sample_documents_in_collection_info)
)
documents_formatted = "\n".join(map(pformat, documents))
return (
f"{self._sample_documents_in_collection_info} sample documents from "
f"{collection_name}:\n{documents_formatted}"
)
def _execute(self, command: str) -> dict[str, Any]:
"""Execute a command and return the result."""
db = self._client.get_default_database()
result = {}
try:
command_dict = literal_eval(command)
if isinstance(command_dict, dict):
result = db.command(command_dict)
except ValueError:
pass
# checks if command is a find query
if not result and re.match(r"^db\.\w+\.find\w*\(\{.*\}\)", command):
cursor = eval(command) # dangerous, might need to find a better solution
result_list = []
for doc in cursor:
result_list.append(doc)
result = {"cursor": result_list}
return result
def run(self, command: str) -> str:
"""Run a command and return a string representing the results."""
result = self._execute(command)
result_formatted = ""
if "cursor" in result:
if "firstBatch" in result["cursor"]:
result_formatted = "\n".join(
map(pformat, list(result["cursor"]["firstBatch"]))
)
else:
result_formatted = "\n".join(map(pformat, result["cursor"]))
else:
result_formatted = pformat(result)
return f"Result:\n{result_formatted}"
def run_no_throw(self, command: str) -> str:
"""Run a command and return a string representing the results.
If the statement throws an error, the error message is returned."""
try:
return self.run(command)
except PyMongoError as e:
return f"Error: {e}"


@@ -787,6 +787,25 @@ files = [
{file = "defusedxml-0.7.1.tar.gz", hash = "sha256:1bb3032db185915b62d7c6209c5a8792be6a32ab2fedacc84e01b52c51aa3e69"},
]
[[package]]
name = "dnspython"
version = "2.4.2"
description = "DNS toolkit"
optional = true
python-versions = ">=3.8,<4.0"
files = [
{file = "dnspython-2.4.2-py3-none-any.whl", hash = "sha256:57c6fbaaeaaf39c891292012060beb141791735dbb4004798328fc2c467402d8"},
{file = "dnspython-2.4.2.tar.gz", hash = "sha256:8dcfae8c7460a2f84b4072e26f1c9f4101ca20c071649cb7c34e8b6a93d58984"},
]
[package.extras]
dnssec = ["cryptography (>=2.6,<42.0)"]
doh = ["h2 (>=4.1.0)", "httpcore (>=0.17.3)", "httpx (>=0.24.1)"]
doq = ["aioquic (>=0.9.20)"]
idna = ["idna (>=2.1,<4.0)"]
trio = ["trio (>=0.14,<0.23)"]
wmi = ["wmi (>=1.5.1,<2.0.0)"]
[[package]]
name = "exceptiongroup"
version = "1.1.3"
@@ -2816,6 +2835,108 @@ files = [
[package.extras]
plugins = ["importlib-metadata"]
[[package]]
name = "pymongo"
version = "4.6.1"
description = "Python driver for MongoDB <http://www.mongodb.org>"
optional = true
python-versions = ">=3.7"
files = [
{file = "pymongo-4.6.1-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:4344c30025210b9fa80ec257b0e0aab5aa1d5cca91daa70d82ab97b482cc038e"},
{file = "pymongo-4.6.1-cp310-cp310-manylinux1_i686.whl", hash = "sha256:1c5654bb8bb2bdb10e7a0bc3c193dd8b49a960b9eebc4381ff5a2043f4c3c441"},
{file = "pymongo-4.6.1-cp310-cp310-manylinux2014_aarch64.whl", hash = "sha256:eaf2f65190c506def2581219572b9c70b8250615dc918b3b7c218361a51ec42e"},
{file = "pymongo-4.6.1-cp310-cp310-manylinux2014_i686.whl", hash = "sha256:262356ea5fcb13d35fb2ab6009d3927bafb9504ef02339338634fffd8a9f1ae4"},
{file = "pymongo-4.6.1-cp310-cp310-manylinux2014_ppc64le.whl", hash = "sha256:2dd2f6960ee3c9360bed7fb3c678be0ca2d00f877068556785ec2eb6b73d2414"},
{file = "pymongo-4.6.1-cp310-cp310-manylinux2014_s390x.whl", hash = "sha256:ff925f1cca42e933376d09ddc254598f8c5fcd36efc5cac0118bb36c36217c41"},
{file = "pymongo-4.6.1-cp310-cp310-manylinux2014_x86_64.whl", hash = "sha256:3cadf7f4c8e94d8a77874b54a63c80af01f4d48c4b669c8b6867f86a07ba994f"},
{file = "pymongo-4.6.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:55dac73316e7e8c2616ba2e6f62b750918e9e0ae0b2053699d66ca27a7790105"},
{file = "pymongo-4.6.1-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:154b361dcb358ad377d5d40df41ee35f1cc14c8691b50511547c12404f89b5cb"},
{file = "pymongo-4.6.1-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:2940aa20e9cc328e8ddeacea8b9a6f5ddafe0b087fedad928912e787c65b4909"},
{file = "pymongo-4.6.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:010bc9aa90fd06e5cc52c8fac2c2fd4ef1b5f990d9638548dde178005770a5e8"},
{file = "pymongo-4.6.1-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:e470fa4bace5f50076c32f4b3cc182b31303b4fefb9b87f990144515d572820b"},
{file = "pymongo-4.6.1-cp310-cp310-win32.whl", hash = "sha256:da08ea09eefa6b960c2dd9a68ec47949235485c623621eb1d6c02b46765322ac"},
{file = "pymongo-4.6.1-cp310-cp310-win_amd64.whl", hash = "sha256:13d613c866f9f07d51180f9a7da54ef491d130f169e999c27e7633abe8619ec9"},
{file = "pymongo-4.6.1-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:6a0ae7a48a6ef82ceb98a366948874834b86c84e288dbd55600c1abfc3ac1d88"},
{file = "pymongo-4.6.1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5bd94c503271e79917b27c6e77f7c5474da6930b3fb9e70a12e68c2dff386b9a"},
{file = "pymongo-4.6.1-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:2d4ccac3053b84a09251da8f5350bb684cbbf8c8c01eda6b5418417d0a8ab198"},
{file = "pymongo-4.6.1-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:349093675a2d3759e4fb42b596afffa2b2518c890492563d7905fac503b20daa"},
{file = "pymongo-4.6.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:88beb444fb438385e53dc9110852910ec2a22f0eab7dd489e827038fdc19ed8d"},
{file = "pymongo-4.6.1-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:d8e62d06e90f60ea2a3d463ae51401475568b995bafaffd81767d208d84d7bb1"},
{file = "pymongo-4.6.1-cp311-cp311-win32.whl", hash = "sha256:5556e306713e2522e460287615d26c0af0fe5ed9d4f431dad35c6624c5d277e9"},
{file = "pymongo-4.6.1-cp311-cp311-win_amd64.whl", hash = "sha256:b10d8cda9fc2fcdcfa4a000aa10413a2bf8b575852cd07cb8a595ed09689ca98"},
{file = "pymongo-4.6.1-cp312-cp312-macosx_10_9_universal2.whl", hash = "sha256:b435b13bb8e36be11b75f7384a34eefe487fe87a6267172964628e2b14ecf0a7"},
{file = "pymongo-4.6.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e438417ce1dc5b758742e12661d800482200b042d03512a8f31f6aaa9137ad40"},
{file = "pymongo-4.6.1-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:8b47ebd89e69fbf33d1c2df79759d7162fc80c7652dacfec136dae1c9b3afac7"},
{file = "pymongo-4.6.1-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:bbed8cccebe1169d45cedf00461b2842652d476d2897fd1c42cf41b635d88746"},
{file = "pymongo-4.6.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c30a9e06041fbd7a7590693ec5e407aa8737ad91912a1e70176aff92e5c99d20"},
{file = "pymongo-4.6.1-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:b8729dbf25eb32ad0dc0b9bd5e6a0d0b7e5c2dc8ec06ad171088e1896b522a74"},
{file = "pymongo-4.6.1-cp312-cp312-win32.whl", hash = "sha256:3177f783ae7e08aaf7b2802e0df4e4b13903520e8380915e6337cdc7a6ff01d8"},
{file = "pymongo-4.6.1-cp312-cp312-win_amd64.whl", hash = "sha256:00c199e1c593e2c8b033136d7a08f0c376452bac8a896c923fcd6f419e07bdd2"},
{file = "pymongo-4.6.1-cp37-cp37m-manylinux1_i686.whl", hash = "sha256:13552ca505366df74e3e2f0a4f27c363928f3dff0eef9f281eb81af7f29bc3c5"},
{file = "pymongo-4.6.1-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:77e0df59b1a4994ad30c6d746992ae887f9756a43fc25dec2db515d94cf0222d"},
{file = "pymongo-4.6.1-cp37-cp37m-manylinux2014_aarch64.whl", hash = "sha256:3a7f02a58a0c2912734105e05dedbee4f7507e6f1bd132ebad520be0b11d46fd"},
{file = "pymongo-4.6.1-cp37-cp37m-manylinux2014_i686.whl", hash = "sha256:026a24a36394dc8930cbcb1d19d5eb35205ef3c838a7e619e04bd170713972e7"},
{file = "pymongo-4.6.1-cp37-cp37m-manylinux2014_ppc64le.whl", hash = "sha256:3b287e814a01deddb59b88549c1e0c87cefacd798d4afc0c8bd6042d1c3d48aa"},
{file = "pymongo-4.6.1-cp37-cp37m-manylinux2014_s390x.whl", hash = "sha256:9a710c184ba845afb05a6f876edac8f27783ba70e52d5eaf939f121fc13b2f59"},
{file = "pymongo-4.6.1-cp37-cp37m-manylinux2014_x86_64.whl", hash = "sha256:30b2c9caf3e55c2e323565d1f3b7e7881ab87db16997dc0cbca7c52885ed2347"},
{file = "pymongo-4.6.1-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ff62ba8ff70f01ab4fe0ae36b2cb0b5d1f42e73dfc81ddf0758cd9f77331ad25"},
{file = "pymongo-4.6.1-cp37-cp37m-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:547dc5d7f834b1deefda51aedb11a7af9c51c45e689e44e14aa85d44147c7657"},
{file = "pymongo-4.6.1-cp37-cp37m-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:1de3c6faf948f3edd4e738abdb4b76572b4f4fdfc1fed4dad02427e70c5a6219"},
{file = "pymongo-4.6.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a2831e05ce0a4df10c4ac5399ef50b9a621f90894c2a4d2945dc5658765514ed"},
{file = "pymongo-4.6.1-cp37-cp37m-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:144a31391a39a390efce0c5ebcaf4bf112114af4384c90163f402cec5ede476b"},
{file = "pymongo-4.6.1-cp37-cp37m-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:33bb16a07d3cc4e0aea37b242097cd5f7a156312012455c2fa8ca396953b11c4"},
{file = "pymongo-4.6.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl", hash = "sha256:b7b1a83ce514700276a46af3d9e481ec381f05b64939effc9065afe18456a6b9"},
{file = "pymongo-4.6.1-cp37-cp37m-win32.whl", hash = "sha256:3071ec998cc3d7b4944377e5f1217c2c44b811fae16f9a495c7a1ce9b42fb038"},
{file = "pymongo-4.6.1-cp37-cp37m-win_amd64.whl", hash = "sha256:2346450a075625c4d6166b40a013b605a38b6b6168ce2232b192a37fb200d588"},
{file = "pymongo-4.6.1-cp38-cp38-macosx_11_0_universal2.whl", hash = "sha256:061598cbc6abe2f382ab64c9caa83faa2f4c51256f732cdd890bcc6e63bfb67e"},
{file = "pymongo-4.6.1-cp38-cp38-manylinux1_i686.whl", hash = "sha256:d483793a384c550c2d12cb794ede294d303b42beff75f3b3081f57196660edaf"},
{file = "pymongo-4.6.1-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:f9756f1d25454ba6a3c2f1ef8b7ddec23e5cdeae3dc3c3377243ae37a383db00"},
{file = "pymongo-4.6.1-cp38-cp38-manylinux2014_aarch64.whl", hash = "sha256:1ed23b0e2dac6f84f44c8494fbceefe6eb5c35db5c1099f56ab78fc0d94ab3af"},
{file = "pymongo-4.6.1-cp38-cp38-manylinux2014_i686.whl", hash = "sha256:3d18a9b9b858ee140c15c5bfcb3e66e47e2a70a03272c2e72adda2482f76a6ad"},
{file = "pymongo-4.6.1-cp38-cp38-manylinux2014_ppc64le.whl", hash = "sha256:c258dbacfff1224f13576147df16ce3c02024a0d792fd0323ac01bed5d3c545d"},
{file = "pymongo-4.6.1-cp38-cp38-manylinux2014_s390x.whl", hash = "sha256:f7acc03a4f1154ba2643edeb13658d08598fe6e490c3dd96a241b94f09801626"},
{file = "pymongo-4.6.1-cp38-cp38-manylinux2014_x86_64.whl", hash = "sha256:76013fef1c9cd1cd00d55efde516c154aa169f2bf059b197c263a255ba8a9ddf"},
{file = "pymongo-4.6.1-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3f0e6a6c807fa887a0c51cc24fe7ea51bb9e496fe88f00d7930063372c3664c3"},
{file = "pymongo-4.6.1-cp38-cp38-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:dd1fa413f8b9ba30140de198e4f408ffbba6396864c7554e0867aa7363eb58b2"},
{file = "pymongo-4.6.1-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:8d219b4508f71d762368caec1fc180960569766049bbc4d38174f05e8ef2fe5b"},
{file = "pymongo-4.6.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:27b81ecf18031998ad7db53b960d1347f8f29e8b7cb5ea7b4394726468e4295e"},
{file = "pymongo-4.6.1-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:56816e43c92c2fa8c11dc2a686f0ca248bea7902f4a067fa6cbc77853b0f041e"},
{file = "pymongo-4.6.1-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:ef801027629c5b511cf2ba13b9be29bfee36ae834b2d95d9877818479cdc99ea"},
{file = "pymongo-4.6.1-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl", hash = "sha256:d4c2be9760b112b1caf649b4977b81b69893d75aa86caf4f0f398447be871f3c"},
{file = "pymongo-4.6.1-cp38-cp38-win32.whl", hash = "sha256:39d77d8bbb392fa443831e6d4ae534237b1f4eee6aa186f0cdb4e334ba89536e"},
{file = "pymongo-4.6.1-cp38-cp38-win_amd64.whl", hash = "sha256:4497d49d785482cc1a44a0ddf8830b036a468c088e72a05217f5b60a9e025012"},
{file = "pymongo-4.6.1-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:69247f7a2835fc0984bbf0892e6022e9a36aec70e187fcfe6cae6a373eb8c4de"},
{file = "pymongo-4.6.1-cp39-cp39-manylinux1_i686.whl", hash = "sha256:7bb0e9049e81def6829d09558ad12d16d0454c26cabe6efc3658e544460688d9"},
{file = "pymongo-4.6.1-cp39-cp39-manylinux1_x86_64.whl", hash = "sha256:6a1810c2cbde714decf40f811d1edc0dae45506eb37298fd9d4247b8801509fe"},
{file = "pymongo-4.6.1-cp39-cp39-manylinux2014_aarch64.whl", hash = "sha256:e2aced6fb2f5261b47d267cb40060b73b6527e64afe54f6497844c9affed5fd0"},
{file = "pymongo-4.6.1-cp39-cp39-manylinux2014_i686.whl", hash = "sha256:d0355cff58a4ed6d5e5f6b9c3693f52de0784aa0c17119394e2a8e376ce489d4"},
{file = "pymongo-4.6.1-cp39-cp39-manylinux2014_ppc64le.whl", hash = "sha256:3c74f4725485f0a7a3862cfd374cc1b740cebe4c133e0c1425984bcdcce0f4bb"},
{file = "pymongo-4.6.1-cp39-cp39-manylinux2014_s390x.whl", hash = "sha256:9c79d597fb3a7c93d7c26924db7497eba06d58f88f58e586aa69b2ad89fee0f8"},
{file = "pymongo-4.6.1-cp39-cp39-manylinux2014_x86_64.whl", hash = "sha256:8ec75f35f62571a43e31e7bd11749d974c1b5cd5ea4a8388725d579263c0fdf6"},
{file = "pymongo-4.6.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a5e641f931c5cd95b376fd3c59db52770e17bec2bf86ef16cc83b3906c054845"},
{file = "pymongo-4.6.1-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:9aafd036f6f2e5ad109aec92f8dbfcbe76cff16bad683eb6dd18013739c0b3ae"},
{file = "pymongo-4.6.1-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:1f2b856518bfcfa316c8dae3d7b412aecacf2e8ba30b149f5eb3b63128d703b9"},
{file = "pymongo-4.6.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:5ec31adc2e988fd7db3ab509954791bbc5a452a03c85e45b804b4bfc31fa221d"},
{file = "pymongo-4.6.1-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:9167e735379ec43d8eafa3fd675bfbb12e2c0464f98960586e9447d2cf2c7a83"},
{file = "pymongo-4.6.1-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:1461199b07903fc1424709efafe379205bf5f738144b1a50a08b0396357b5abf"},
{file = "pymongo-4.6.1-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.whl", hash = "sha256:3094c7d2f820eecabadae76bfec02669567bbdd1730eabce10a5764778564f7b"},
{file = "pymongo-4.6.1-cp39-cp39-win32.whl", hash = "sha256:c91ea3915425bd4111cb1b74511cdc56d1d16a683a48bf2a5a96b6a6c0f297f7"},
{file = "pymongo-4.6.1-cp39-cp39-win_amd64.whl", hash = "sha256:ef102a67ede70e1721fe27f75073b5314911dbb9bc27cde0a1c402a11531e7bd"},
{file = "pymongo-4.6.1.tar.gz", hash = "sha256:31dab1f3e1d0cdd57e8df01b645f52d43cc1b653ed3afd535d2891f4fc4f9712"},
]
[package.dependencies]
dnspython = ">=1.16.0,<3.0.0"
[package.extras]
aws = ["pymongo-auth-aws (<2.0.0)"]
encryption = ["certifi", "pymongo[aws]", "pymongocrypt (>=1.6.0,<2.0.0)"]
gssapi = ["pykerberos", "winkerberos (>=0.5.0)"]
ocsp = ["certifi", "cryptography (>=2.5)", "pyopenssl (>=17.2.0)", "requests (<3.0.0)", "service-identity (>=18.1.0)"]
snappy = ["python-snappy"]
test = ["pytest (>=7)"]
zstd = ["zstandard"]
[[package]]
name = "pytest"
version = "7.4.3"
@@ -4926,9 +5047,9 @@ docs = ["furo", "jaraco.packaging (>=9.3)", "jaraco.tidelift (>=1.4)", "rst.link
testing = ["big-O", "jaraco.functools", "jaraco.itertools", "more-itertools", "pytest (>=6)", "pytest-black (>=0.3.7)", "pytest-checkdocs (>=2.4)", "pytest-cov", "pytest-enabler (>=2.2)", "pytest-ignore-flaky", "pytest-mypy (>=0.9.1)", "pytest-ruff"]
[extras]
extended-testing = ["faker", "presidio-analyzer", "presidio-anonymizer", "sentence-transformers", "vowpal-wabbit-next"]
extended-testing = ["faker", "presidio-analyzer", "presidio-anonymizer", "pymongo", "sentence-transformers", "vowpal-wabbit-next"]
[metadata]
lock-version = "2.0"
python-versions = ">=3.8.1,<4.0"
content-hash = "82bebfc5475be48f180bcb5013850eb88f451ffdc1f126a12112e10ed56f6529"
content-hash = "a299bf636b758f1242cfbd5d944c9d82e3d73fcb2a37faa910cc8617f7b1a98c"


@@ -17,6 +17,7 @@ presidio-analyzer = {version = "^2.2.33", optional = true}
faker = {version = "^19.3.1", optional = true}
vowpal-wabbit-next = {version = "0.6.0", optional = true}
sentence-transformers = {version = "^2", optional = true}
pymongo = {version = "^4.6.1", optional = true}
[tool.poetry.group.lint.dependencies]
ruff = "^0.1.5"
@@ -54,6 +55,7 @@ extended_testing = [
"faker",
"vowpal-wabbit-next",
"sentence-transformers",
"pymongo",
]
[tool.ruff]


@@ -0,0 +1,20 @@
from langchain_experimental.agents.agent_toolkits import (
MongoDatabaseToolkit,
create_mongo_agent,
)
from langchain_experimental.utilities import MongoDatabase
from tests.unit_tests.fake_llm import FakeLLM
def test_create_mongo_agent() -> None:
db = MongoDatabase.from_uri("mongodb://localhost:27017/test_db")
queries = {"foo": "Final Answer: baz"}
llm = FakeLLM(queries=queries, sequential_responses=True)
toolkit = MongoDatabaseToolkit(db=db, llm=llm)
agent_executor = create_mongo_agent(
llm=llm,
toolkit=toolkit,
)
assert agent_executor.run("hello") == "baz"


@@ -0,0 +1,80 @@
"""Test MongoDB database wrapper."""
import re
from pymongo import MongoClient
from langchain_experimental.utilities.mongo_database import MongoDatabase
uri = "mongodb://localhost:27017/test_db"
def test_collection_info() -> None:
"""Test that collection info is constructed properly."""
db = MongoDatabase.from_uri(uri)
collection = db._client["test_db"]["test_collection"]
if not collection.find_one({"test": "test"}):
collection.insert_many(
[
{"test": "test"},
{"test2": "test"},
{"test3": "test"},
{"test4": "test"},
]
)
output = db.collection_info
expected_output = """
Collection Name: test_collection
3 sample documents from test_collection:
{'_id': , 'test': 'test'}
{'_id': , 'test2': 'test'}
{'_id': , 'test3': 'test'}
"""
output = re.sub(r"ObjectId\('.+'\)", "", output)
assert sorted(" ".join(output.split())) == sorted(" ".join(expected_output.split()))
def test_collection_info_w_sample_documents() -> None:
"""Test that collection info is constructed properly."""
db = MongoDatabase(
MongoClient(uri),
sample_documents_in_collection_info=2,
)
collection = db._client["test_db"]["test_collection"]
if not collection.find_one({"test": "test"}):
collection.insert_many(
[
{"test": "test"},
{"test2": "test"},
{"test3": "test"},
{"test4": "test"},
]
)
output = db.collection_info
expected_output = """
Collection Name: test_collection
2 sample documents from test_collection:
{'_id': , 'test': 'test'}
{'_id': , 'test2': 'test'}
"""
output = re.sub(r"ObjectId\('.+'\)", "", output)
assert sorted(" ".join(output.split())) == sorted(" ".join(expected_output.split()))
def test_mongo_database_run() -> None:
"""Test that run works properly."""
db = MongoDatabase.from_uri(uri)
output = db.run("{ 'find': 'test_collection', 'filter': { 'test4': 'test' } }")
expected_output = """
Result:
{'_id': , 'test4': 'test'}
"""
output = re.sub(r"ObjectId\('.+'\)", "", output)
assert sorted(" ".join(output.split())) == sorted(" ".join(expected_output.split()))
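Each test above compares output after stripping generated ObjectIds and collapsing whitespace. That normalization could be factored into a small helper (an assumed refactor, not part of the PR):

```python
import re


def normalize(output: str) -> str:
    # Strip ObjectId('...') values, then collapse all runs of whitespace,
    # so comparisons ignore generated ids and incidental formatting.
    without_ids = re.sub(r"ObjectId\('.+'\)", "", output)
    return " ".join(without_ids.split())


raw = "Collection Name: test_collection {'_id': ObjectId('657a'), 'test': 'test'}"
expected = "Collection Name: test_collection {'_id': , 'test': 'test'}"
assert normalize(raw) == normalize(expected)
```

With this helper the assertions reduce to `assert normalize(output) == normalize(expected_output)`, which also sidesteps the character-level `sorted(...)` comparison used above.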


@@ -0,0 +1,175 @@
import asyncio
import os
import time
import urllib.request
import uuid
from enum import Enum
from typing import Any
from urllib.error import HTTPError

import pytest
from langchain.agents import AgentType, initialize_agent
from langchain.agents.agent_toolkits.ainetwork.toolkit import AINetworkToolkit
from langchain.chat_models import ChatOpenAI
from langchain.tools.ainetwork.utils import authenticate


class Match(Enum):
    __test__ = False
    ListWildcard = 1
    StrWildcard = 2
    DictWildcard = 3
    IntWildcard = 4
    FloatWildcard = 5
    ObjectWildcard = 6

    @classmethod
    def match(cls, value: Any, template: Any) -> bool:
        if template is cls.ListWildcard:
            return isinstance(value, list)
        elif template is cls.StrWildcard:
            return isinstance(value, str)
        elif template is cls.DictWildcard:
            return isinstance(value, dict)
        elif template is cls.IntWildcard:
            return isinstance(value, int)
        elif template is cls.FloatWildcard:
            return isinstance(value, float)
        elif template is cls.ObjectWildcard:
            return True
        elif type(value) != type(template):
            return False
        elif isinstance(value, dict):
            if len(value) != len(template):
                return False
            for k, v in value.items():
                if k not in template or not cls.match(v, template[k]):
                    return False
            return True
        elif isinstance(value, list):
            if len(value) != len(template):
                return False
            for i in range(len(value)):
                if not cls.match(value[i], template[i]):
                    return False
            return True
        else:
            return value == template
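The `Match` helper compares live chain state against templates whose slots may be typed wildcards. A stripped-down sketch of the same recursive idea (assumed names, not the class above) shows how a template pins some fields and wildcards others:

```python
from enum import Enum
from typing import Any


class Wildcard(Enum):
    # Typed placeholders: a template slot matches any value of that type.
    Str = 1
    Int = 2


def matches(value: Any, template: Any) -> bool:
    if template is Wildcard.Str:
        return isinstance(value, str)
    if template is Wildcard.Int:
        return isinstance(value, int)
    if isinstance(template, dict) and isinstance(value, dict):
        # Same keys, and every slot matches recursively.
        return value.keys() == template.keys() and all(
            matches(value[k], template[k]) for k in template
        )
    return value == template


# A concrete owner-config document matches a template that pins the admin
# address but wildcards the numeric nonce.
doc = {"admin": {"0xabc": True}, "nonce": 7}
template = {"admin": {"0xabc": True}, "nonce": Wildcard.Int}
assert matches(doc, template)
assert not matches({"admin": {}, "nonce": "7"}, template)
```

This is why the test can validate agent-created on-chain state without knowing exact generated values.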
@pytest.mark.requires("ain")
def test_ainetwork_toolkit() -> None:
    def get(path: str, type: str = "value", default: Any = None) -> Any:
        ref = ain.db.ref(path)
        value = asyncio.run(
            {
                "value": ref.getValue,
                "rule": ref.getRule,
                "owner": ref.getOwner,
            }[type]()
        )
        return default if value is None else value

    def validate(path: str, template: Any, type: str = "value") -> bool:
        value = get(path, type)
        return Match.match(value, template)

    if not os.environ.get("AIN_BLOCKCHAIN_ACCOUNT_PRIVATE_KEY", None):
        from ain.account import Account

        account = Account.create()
        os.environ["AIN_BLOCKCHAIN_ACCOUNT_PRIVATE_KEY"] = account.private_key

    interface = authenticate(network="testnet")
    toolkit = AINetworkToolkit(network="testnet", interface=interface)
    llm = ChatOpenAI(model="gpt-4", temperature=0)
    agent = initialize_agent(
        tools=toolkit.get_tools(),
        llm=llm,
        verbose=True,
        agent=AgentType.OPENAI_FUNCTIONS,
    )
    ain = interface
    self_address = ain.wallet.defaultAccount.address
    co_address = "0x6813Eb9362372EEF6200f3b1dbC3f819671cBA69"

    # Test creating an app
    UUID = uuid.UUID(
        int=(int(time.time() * 1000) << 64) | (uuid.uuid4().int & ((1 << 64) - 1))
    )
    app_name = f"_langchain_test__{str(UUID).replace('-', '_')}"
    agent.run(f"""Create app {app_name}""")
    validate(f"/manage_app/{app_name}/config", {"admin": {self_address: True}})
    validate(f"/apps/{app_name}/DB", None, "owner")

    # Test reading owner config
    agent.run(f"""Read owner config of /apps/{app_name}/DB .""")
    assert ...

    # Test granting owner config
    agent.run(
        f"""Grant owner authority to {co_address} for edit write rule permission of /apps/{app_name}/DB_co ."""  # noqa: E501
    )
    validate(
        f"/apps/{app_name}/DB_co",
        {
            ".owner": {
                "owners": {
                    co_address: {
                        "branch_owner": False,
                        "write_function": False,
                        "write_owner": False,
                        "write_rule": True,
                    }
                }
            }
        },
        "owner",
    )

    # Test reading owner config
    agent.run(f"""Read owner config of /apps/{app_name}/DB_co .""")
    assert ...

    # Test reading owner config
    agent.run(f"""Read owner config of /apps/{app_name}/DB .""")
    assert ...  # Check if owner {self_address} exists

    # Test reading a value
    agent.run(f"""Read value in /apps/{app_name}/DB""")
    assert ...  # empty

    # Test writing a value
    agent.run(f"""Write value {{1: 1904, 2: 43}} in /apps/{app_name}/DB""")
    validate(f"/apps/{app_name}/DB", {1: 1904, 2: 43})

    # Test reading a value
    agent.run(f"""Read value in /apps/{app_name}/DB""")
    assert ...  # check value

    # Test reading a rule
    agent.run(f"""Read write rule of app {app_name} .""")
    assert ...  # check rule that self_address exists

    # Test sending AIN
    self_balance = get(f"/accounts/{self_address}/balance", default=0)
    transaction_history = get(f"/transfer/{self_address}/{co_address}", default={})
    if self_balance < 1:
        try:
            with urllib.request.urlopen(
                f"http://faucet.ainetwork.ai/api/test/{self_address}/"
            ) as response:
                try_test = response.getcode()
        except HTTPError as e:
            try_test = e.getcode()
    else:
        try_test = 200
    if try_test == 200:
        agent.run(f"""Send 1 AIN to {co_address}""")
        transaction_update = get(f"/transfer/{self_address}/{co_address}", default={})
        assert any(
            transaction_update[key]["value"] == 1
            for key in transaction_update.keys() - transaction_history.keys()
        )
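The final assertion detects the new transfer by set-subtracting the pre-send transaction keys from the post-send ones, then checking only the newly added entries. The same pattern in isolation (illustrative data, not real chain state):

```python
# Transfer history snapshots before and after sending 1 AIN.
history_before = {"tx1": {"value": 5}}
history_after = {"tx1": {"value": 5}, "tx2": {"value": 1}}

# dict.keys() supports set operations, so this yields only keys added
# by the new transaction, ignoring any pre-existing transfers.
new_keys = history_after.keys() - history_before.keys()
assert any(history_after[k]["value"] == 1 for k in new_keys)
```

This avoids false positives from older transfers that happen to have a value of 1.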


@@ -0,0 +1,47 @@
import pytest
from langchain.agents.agent_toolkits import PowerBIToolkit, create_pbi_agent
from langchain.chat_models import ChatOpenAI
from langchain.utilities.powerbi import PowerBIDataset
from langchain.utils import get_from_env


def azure_installed() -> bool:
    try:
        from azure.core.credentials import TokenCredential  # noqa: F401
        from azure.identity import DefaultAzureCredential  # noqa: F401

        return True
    except Exception as e:
        print(f"azure not installed, skipping test {e}")
        return False


@pytest.mark.skipif(not azure_installed(), reason="requires azure package")
def test_daxquery() -> None:
    from azure.identity import DefaultAzureCredential

    DATASET_ID = get_from_env("", "POWERBI_DATASET_ID")
    TABLE_NAME = get_from_env("", "POWERBI_TABLE_NAME")
    NUM_ROWS = get_from_env("", "POWERBI_NUMROWS")

    fast_llm = ChatOpenAI(
        temperature=0.5, max_tokens=1000, model_name="gpt-3.5-turbo", verbose=True
    )
    smart_llm = ChatOpenAI(
        temperature=0, max_tokens=100, model_name="gpt-4", verbose=True
    )

    toolkit = PowerBIToolkit(
        powerbi=PowerBIDataset(
            dataset_id=DATASET_ID,
            table_names=[TABLE_NAME],
            credential=DefaultAzureCredential(),
        ),
        llm=smart_llm,
    )

    agent_executor = create_pbi_agent(llm=fast_llm, toolkit=toolkit, verbose=True)
    output = agent_executor.run(f"How many rows are in the table, {TABLE_NAME}")
    assert NUM_ROWS in output
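The test pulls its dataset id, table name, and expected row count from environment variables so it can run against any Power BI workspace. A minimal sketch of that required-environment-variable pattern (an assumed helper, not langchain's actual `get_from_env`):

```python
import os


def get_required_env(env_key: str) -> str:
    # Assumed sketch: read a required test setting from the environment,
    # failing with a pointer to the missing variable name.
    value = os.environ.get(env_key)
    if value is None:
        raise ValueError(f"Set the {env_key} environment variable.")
    return value


os.environ["POWERBI_TABLE_NAME"] = "sales"  # simulated for the example
assert get_required_env("POWERBI_TABLE_NAME") == "sales"
```

Failing loudly when a variable is unset makes the integration-test prerequisites explicit instead of producing a confusing downstream error.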

poetry.lock generated (1,628 lines changed) — file diff suppressed because it is too large.