Compare commits

...

438 Commits

Author SHA1 Message Date
Harrison Chase
035ad33a5b bump ver to 225 (#7244) 2023-07-05 21:22:18 -04:00
Shantanu Nair
cabd358c3a Add missing token_max in reduce.py acombine_docs (#7241)
- Description: reduce.py reduce chain implementation's acombine_docs
call does not propagate token_max. Without this, the async call will end
up using 3000 tokens, the default, for the collapse chain.
  - Tag maintainer: @hwchase17 @agola11 @baskaryan 
  - Twitter handle: https://twitter.com/ShantanuNair

Related PR: https://github.com/hwchase17/langchain/pull/7201 and
https://github.com/hwchase17/langchain/pull/7204
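
The pattern behind this fix, sketched under assumptions: `ReduceChainSketch` and its helper methods are hypothetical stand-ins, not the actual classes in reduce.py. Only the `token_max` name and the 3000 default come from the description above.

```python
import asyncio
from typing import List, Optional

DEFAULT_TOKEN_MAX = 3000  # default mentioned in the description above


class ReduceChainSketch:
    """Hypothetical stand-in for a reduce chain; not LangChain's real class."""

    def _collapse(self, docs: List[str], token_max: Optional[int] = None) -> str:
        # Fall back to the default only when the caller gave no limit.
        limit = token_max if token_max is not None else DEFAULT_TOKEN_MAX
        return f"collapsed {len(docs)} docs with token_max={limit}"

    async def _acollapse(self, docs: List[str], token_max: Optional[int] = None) -> str:
        return self._collapse(docs, token_max=token_max)

    def combine_docs(self, docs: List[str], token_max: Optional[int] = None) -> str:
        return self._collapse(docs, token_max=token_max)

    async def acombine_docs(self, docs: List[str], token_max: Optional[int] = None) -> str:
        # The fix: forward token_max on the async path as well.
        # Dropping the keyword here silently falls back to the default.
        return await self._acollapse(docs, token_max=token_max)
```

With the keyword forwarded, the sync and async paths report the same limit for the same input.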

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-05 21:02:45 -04:00
Harrison Chase
52b016920c Harrison/update anthropic (#7237)
Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
2023-07-05 21:02:35 -04:00
Harrison Chase
695e7027e6 Harrison/parameter (#7081)
add parameter to use original question or not

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-05 20:51:25 -04:00
Yevgnen
930e319ca7 Add concurrency to GitbookLoader (#7069)
- Description: Fetch all pages concurrently.
- Dependencies: `scrape_all` -> `fetch_all` -> `_fetch_with_rate_limit`
-> `_fetch` (might be broken currently:
https://github.com/hwchase17/langchain/pull/6519)
  - Tag maintainer: @rlancemartin, @eyurtsev
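
A rough sketch of the call chain listed above, assuming a semaphore is what limits concurrency. The function names come from the description; the bodies, the `max_concurrency` parameter, and the simulated `_fetch` are assumptions for illustration, not the loader's actual code.

```python
import asyncio
from typing import List


async def _fetch(url: str) -> str:
    # Placeholder for a real HTTP request; only simulates latency.
    await asyncio.sleep(0.01)
    return f"<html>{url}</html>"


async def _fetch_with_rate_limit(url: str, semaphore: asyncio.Semaphore) -> str:
    # At most `max_concurrency` fetches hold the semaphore at once.
    async with semaphore:
        return await _fetch(url)


async def fetch_all(urls: List[str], max_concurrency: int = 2) -> List[str]:
    semaphore = asyncio.Semaphore(max_concurrency)
    tasks = [_fetch_with_rate_limit(url, semaphore) for url in urls]
    # gather() returns results in the same order as `urls`.
    return list(await asyncio.gather(*tasks))


def scrape_all(urls: List[str]) -> List[str]:
    # Synchronous entry point that drives the async pipeline.
    return asyncio.run(fetch_all(urls))
```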

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-05 20:51:10 -04:00
Hashem Alsaket
6aa66fd2b0 Update Hugging Face Hub notebook (#7236)
Description: `flan-t5-xl` hangs, so the notebook now uses `flan-t5-xxl`.
Tested all stabilityai LLMs; all of them hang, so they were removed from
the tutorial. Set temperature > 0 to prevent unintended determinism.
Issue: #3275 
Tag maintainer: @baskaryan
2023-07-05 20:45:02 -04:00
Mykola Zomchak
8afc8e6f5d Fix web_base.py (#6519)
Fix for bug in SitemapLoader

`aiohttp` `get` does not accept `verify` argument, and currently throws
error, so SitemapLoader is not working

This PR fixes it by removing `verify` param for `get` function call

Fixes #6107

#### Who can review?

Tag maintainers/contributors who might be interested:

@eyurtsev

---------

Co-authored-by: techcenary <127699216+techcenary@users.noreply.github.com>
2023-07-05 16:53:57 -07:00
William FH
f891f7d69f Skip evaluation of unfinished runs (#7235)
Cut down on errors logged

Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>
2023-07-05 16:35:20 -07:00
William FH
83cf01683e Add 'eval' tag (#7209)
Add an "eval" tag to traced evaluation runs

Most of this PR is actually
https://github.com/hwchase17/langchain/pull/7207 but I can't diff off
two separate PRs

---------

Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>
2023-07-05 16:28:34 -07:00
William FH
607708a411 Add tags support for langchaintracer (#7207) 2023-07-05 16:19:04 -07:00
William FH
75aa408f10 Send evaluator logs to new session (#7206)
Also stop specifying "eval" mode since explicit project modes are
deprecated
2023-07-05 16:15:29 -07:00
Harrison Chase
0dc700eebf Harrison/scene xplain (#7228)
Co-authored-by: Kevin Pham <37129444+deoxykev@users.noreply.github.com>
2023-07-05 18:34:50 -04:00
Harrison Chase
d6541da161 remove arize nb (#7238)
was causing some issues with docs build
2023-07-05 18:34:20 -04:00
Mike Nitsenko
d669b9ece9 Document loader for Cube Semantic Layer (#6882)
### Description

This pull request introduces the "Cube Semantic Layer" document loader,
which demonstrates the retrieval of Cube's data model metadata in a
format suitable for passing to LLMs as embeddings. This enhancement aims
to provide contextual information and improve the understanding of data.

Twitter handle:
@the_cube_dev

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
2023-07-05 15:18:12 -07:00
Tom
e533da8bf2 Adding Marqo to vectorstore ecosystem (#7068)
This PR brings in a vectorstore interface for
[Marqo](https://www.marqo.ai/).

The Marqo vectorstore exposes some of Marqo's functionality in addition
to the VectorStore base class. The Marqo vectorstore also makes the
embedding parameter optional because inference for embeddings is an
inherent part of Marqo.

Docs, notebook examples and integration tests included.

Related PR:
https://github.com/hwchase17/langchain/pull/2807

---------

Co-authored-by: Tom Hamer <tom@marqo.ai>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-05 14:44:12 -07:00
Filip Haltmayer
836d2009cb Update milvus and zilliz docstring (#7216)
Description:

Updating the docstrings for Milvus and Zilliz so that they appear
correctly on https://integrations.langchain.com/vectorstores. No changes
done to code.

Maintainer: 

@baskaryan

Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>
2023-07-05 17:03:51 -04:00
Matt Robinson
d65b1951bd docs: update docs strings for base unstructured loaders (#7222)
### Summary

Updates the docstrings for the unstructured base loaders so more useful
information appears on the integrations page. If these look good, will
add similar docstrings to the other loaders.

### Reviewers
  - @rlancemartin
  - @eyurtsev
  - @hwchase17
2023-07-05 17:02:26 -04:00
Mike Salvatore
265f05b10e Enable InMemoryDocstore to be constructed without providing a dict (#6976)
- Description: Allow `InMemoryDocstore` to be created without passing a
dict to the constructor; the constructor can create a dict at runtime if
one isn't provided.
- Tag maintainer: @dev2049
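
A minimal sketch of the idea, assuming the usual `None` default idiom. `InMemoryDocstoreSketch` is a hypothetical stand-in and its methods are simplified; it is not the real class.

```python
from typing import Dict, Optional


class InMemoryDocstoreSketch:
    """Hypothetical, simplified docstore used only to show the constructor."""

    def __init__(self, _dict: Optional[Dict[str, str]] = None) -> None:
        # Create a fresh dict per instance; a `{}` default argument would be
        # shared between all instances.
        self._dict = _dict if _dict is not None else {}

    def add(self, texts: Dict[str, str]) -> None:
        self._dict.update(texts)

    def search(self, key: str) -> str:
        return self._dict.get(key, f"ID {key} not found.")
```

Usage: `InMemoryDocstoreSketch()` now works without arguments, and two stores built this way do not share state.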
2023-07-05 16:56:31 -04:00
Harrison Chase
47e7d09dff fix arize nb (#7227) 2023-07-05 16:55:48 -04:00
Feras Almannaa
79b59a8e06 optimize pgvector add_texts (#7185)
- Description: At the moment, inserting new embeddings into pgvector
queries all embeddings every time, because the defined `embeddings`
relationship uses the default params, which set `lazy="select"`.
This change drastically improves performance and adds a few
additional cleanups:
  * remove `collection.embeddings.append`, as it was querying all
    embeddings on insert; replace it with the `collection_id` param
  * centralize storing logic in the `add_embeddings` function to reduce
    duplication
  * remove boilerplate

- Issue: No issue was opened.
- Dependencies: None.
- Tag maintainer: this is a vectorstore update, so I think
@rlancemartin, @eyurtsev
- Twitter handle: @falmannaa
2023-07-05 13:19:42 -07:00
Harrison Chase
6711854e30 Harrison/dataforseo (#7214)
Co-authored-by: Alexander <sune357@gmail.com>
2023-07-05 16:02:02 -04:00
Richy Wang
cab7d86f23 Implement delete interface of vector store on AnalyticDB (#7170)
Hi there,
This pull request contains two commits:
**1. Implement delete interface with optional ids parameter on
AnalyticDB.**
**2. Allow customization of database connection behavior by exposing
engine_args parameter in interfaces.**
- This commit adds the `engine_args` parameter to the interfaces,
allowing users to customize the behavior of the database connection. The
`engine_args` parameter accepts a dictionary of additional arguments
that will be passed to the create_engine function. Users can now modify
various aspects of the database connection, such as connection pool size
and recycle time. This enhancement provides more flexibility and control
to users when interacting with the database through the exposed
interfaces.

This commit is related to VectorStores @rlancemartin @eyurtsev 

Thank you for your attention and consideration.
2023-07-05 13:01:00 -07:00
Mike Salvatore
3ae11b7582 Handle kwargs in FAISS.load_local() (#6987)
- Description: This allows parameters such as `relevance_score_fn` to be
passed to the `FAISS` constructor via the `load_local()` class method.
-  Tag maintainer: @rlancemartin @eyurtsev
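
Roughly, the change amounts to forwarding `**kwargs` from the class method to the constructor. A sketch under assumptions: `VectorStoreSketch` is hypothetical, and the "index" is a placeholder dict rather than anything read from disk.

```python
from typing import Any, Callable, Optional


class VectorStoreSketch:
    """Hypothetical stand-in for a vector store with a load_local() method."""

    def __init__(
        self,
        index: Any,
        relevance_score_fn: Optional[Callable[[float], float]] = None,
    ) -> None:
        self.index = index
        self.relevance_score_fn = relevance_score_fn

    @classmethod
    def load_local(cls, folder_path: str, **kwargs: Any) -> "VectorStoreSketch":
        # Placeholder for deserializing an index from `folder_path`.
        index = {"loaded_from": folder_path}
        # Forward any extra keyword arguments to the constructor.
        return cls(index, **kwargs)
```

Usage: `VectorStoreSketch.load_local("some/folder", relevance_score_fn=lambda d: 1 - d)` yields an instance with the custom scoring function set.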
2023-07-05 15:56:40 -04:00
Jamal
a2f191a322 Replace JIRA Arbitrary Code Execution vulnerability with finer grain API wrapper (#6992)
This fixes #4833 and the critical vulnerability
https://nvd.nist.gov/vuln/detail/CVE-2023-34540

Previously, the JIRA API Wrapper had a mode that simply pipelined user
input into an `exec()` function.
[The intended use of the 'other' mode is to cover any of Atlassian's API
that don't have an existing
interface](cc33bde74f/langchain/tools/jira/prompt.py (L24))

Fortunately all of the [Atlassian JIRA API methods are subfunctions of
their `Jira`
class](https://atlassian-python-api.readthedocs.io/jira.html), so this
implementation calls these subfunctions directly.

As well as passing a string representation of the function to call, the
implementation flexibly allows for optionally passing args and/or
keyword-args. These are given as part of the dictionary input. Example:
```
    {
        "function": "update_issue_field",   #function to execute
        "args": [                           #list of ordered args similar to other examples in this JiraAPIWrapper
            "key",
            {"summary": "New summary"}
        ],
        "kwargs": {}                        #dict of key value keyword-args pairs
    }
```

the above is equivalent to `self.jira.update_issue_field("key",
{"summary": "New summary"})`
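
A minimal sketch of this kind of dispatch, assuming the input arrives as a JSON string. `FakeJira` and `run_other` are hypothetical names used only for illustration; the real wrapper may differ.

```python
import json
from typing import Any


class FakeJira:
    """Hypothetical stand-in for the Atlassian `Jira` client."""

    def update_issue_field(self, key: str, fields: dict) -> str:
        return f"updated {key} with {fields}"


def run_other(jira: Any, query: str) -> Any:
    params = json.loads(query)
    # Look the method up by name instead of passing user input to exec().
    func = getattr(jira, params["function"])
    args = params.get("args", [])
    kwargs = params.get("kwargs", {})
    return func(*args, **kwargs)
```

Note that `getattr` still reaches any attribute of the client, so an allow-list of method names may be worth considering on top of this.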

Alternate query schema designs are welcome to make querying easier
without passing and evaluating arbitrary python code. I considered
parsing (without evaluating) input python code and extracting the
function, args, and kwargs from there and then pipelining them into the
callable function via `f(*args, **kwargs)` - but this seemed more
direct.

@vowelparrot @dev2049

---------

Co-authored-by: Jamal Rahman <jamal.rahman@builder.ai>
2023-07-05 15:56:01 -04:00
Hakan Tekgul
61938a02a1 Create arize_llm_observability.ipynb (#7000)
Adding documentation and notebook for Arize callback handler. 

  - @dev2049
  - Agents / Tools / Toolkits: @vowelparrot
  - Tracing / Callbacks: @agola11
2023-07-05 15:55:47 -04:00
Leonid Ganeline
ecee4d6e92 docs: update youtube videos and tutorials (#6515)
added tutorials.mdx; updated youtube.mdx

Rationale: the Tutorials section in the documentation is top-priority.
(for example, https://pytorch.org/docs/stable/index.html) Not every
project has resources to make tutorials. We have such a privilege.
Community experts created several tutorials on YouTube. But the tutorial
links are now hidden on the YouTube page and not easily discovered by
first-time visitors.

- Added new videos and tutorials that were created since the last
update.
- Reprioritized the videos based on their view counts.

#### Who can review?

  - @hwchase17
    - @dev2049
2023-07-05 12:50:31 -07:00
Santiago Delgado
fa55c5a16b Fixed Office365 tool __init__.py files, tests, and get_tools() function (#7046)
## Description
Added Office365 tool modules to `__init__.py` files
## Issue
As described in Issue
https://github.com/hwchase17/langchain/issues/6936, the Office365
toolkit can't be loaded easily because it is not included in the
`__init__.py` files.
## Reviewer
@dev2049
2023-07-05 15:46:21 -04:00
wewebber-merlin
8a7c95e555 Retryable exception for empty OpenAI embedding. (#7070)
Description:

The OpenAI "embeddings" API intermittently falls into a failure state
where an embedding is returned as [ Nan ], rather than the expected 1536
floats. This patch checks for that state (specifically, for an embedding
of length 1) and if it occurs, throws an ApiError, which will cause the
chunk to be retried.
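
A sketch of the check-and-retry idea, under assumptions: `EmptyEmbeddingError`, `checked_embed`, and `embed_with_retry` are hypothetical names, and the explicit loop stands in for whatever retry mechanism the embeddings class already uses.

```python
from typing import Callable, List


class EmptyEmbeddingError(Exception):
    """Hypothetical retryable error; the patch raises an API error instead."""


def checked_embed(embed: Callable[[str], List[float]], text: str) -> List[float]:
    embedding = embed(text)
    # A length-1 result (e.g. [nan]) signals the failure state described above.
    if len(embedding) == 1:
        raise EmptyEmbeddingError("got an embedding of length 1")
    return embedding


def embed_with_retry(
    embed: Callable[[str], List[float]], text: str, max_retries: int = 3
) -> List[float]:
    for attempt in range(1, max_retries + 1):
        try:
            return checked_embed(embed, text)
        except EmptyEmbeddingError:
            if attempt == max_retries:
                raise  # give up after the last attempt
    raise ValueError("max_retries must be at least 1")
```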

Issue:

I have been unable to find an official langchain issue for this problem,
but it is discussed (by another user) at
https://stackoverflow.com/questions/76469415/getting-embeddings-of-length-1-from-langchain-openaiembeddings

Maintainer: @dev2049

Testing: 

Since this is an intermittent OpenAI issue, I have not provided a unit
or integration test. The provided code has, though, been run
successfully over several million tokens.

---------

Co-authored-by: William Webber <william@williamwebber.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-05 15:23:45 -04:00
Nuno Campos
e4459e423b Mark some output parsers as serializable (cross-checked w/ JS) (#7083)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @dev2049
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @dev2049
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @vowelparrot
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-05 14:53:56 -04:00
Ankush Gola
4c1c05c2c7 support adding custom metadata to runs (#7120)
- [x] wire up tools
- [x] wire up retrievers
- [x] add integration test

2023-07-05 11:11:38 -07:00
Josh Reini
30d8d1d3d0 add trulens integration (#7096)
Description: Add TruLens integration.

Twitter: @trulensml

For review:
  - Tracing: @agola11
  - Tools: @hinthornw
2023-07-05 14:04:55 -04:00
Hyoseung Kim
9abf1847f4 Fix steamship import error (#7133)
Description: Fix steamship import error

When running multi_modal_output_agent:
field "steamship" not yet prepared so type is still a ForwardRef, you
might need to call SteamshipImageGenerationTool.update_forward_refs().

Tag maintainer: @hinthornw

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-05 14:04:38 -04:00
Mohammad Mohtashim
7d92e9407b Jinja2 validation changed to issue warnings rather than issuing exceptions. (#7161)
- Description: If there are missing or extra variables when validating
a Jinja2 template, a warning is issued instead of raising an
exception. This allows for better flexibility for the developer as
described in #7044. Also changed the relevant test so pytest is checking
for raised warnings rather than exceptions.
  - Issue: #7044 
  - Tag maintainer: @hwchase17, @baskaryan
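
A simplified sketch of warn-instead-of-raise validation. To stay dependency-free it assumes a regex that only recognizes plain `{{ name }}` placeholders; real code would rely on Jinja2's own parser. `find_variables` and `validate_template` are hypothetical names.

```python
import re
import warnings
from typing import List, Set


def find_variables(template: str) -> Set[str]:
    # Simplification: matches only plain `{{ name }}` placeholders.
    return set(re.findall(r"\{\{\s*(\w+)\s*\}\}", template))


def validate_template(template: str, input_variables: List[str]) -> None:
    declared = set(input_variables)
    used = find_variables(template)
    missing = used - declared  # used in the template but not declared
    extra = declared - used    # declared but never used in the template

    parts = []
    if missing:
        parts.append(f"Missing variables: {sorted(missing)}")
    if extra:
        parts.append(f"Extra variables: {sorted(extra)}")
    if parts:
        # Warn instead of raising, so the caller can still proceed.
        warnings.warn(" ".join(parts))
```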

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-05 14:04:29 -04:00
whying
e288410e72 fix: Chroma filter symbols not supporting LIKE and CONTAIN (#7169)
Fixing issue with SelfQueryRetriever due to unsupported LIKE and CONTAIN
comparators in Chroma's WHERE filter statements. This pull request
introduces a redefined set of comparators in Chroma to address the
problem and make it compatible with SelfQueryRetriever. For information
on the comparators supported by Chroma's filter, please refer to
https://docs.trychroma.com/usage-guide#using-where-filters.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-05 14:04:18 -04:00
Nuno Campos
26409b01bd Remove extra base model (#7213)
2023-07-05 14:02:27 -04:00
Samhita Alla
6f358bb04a make textstat optional in the flyte callback handler (#7186)
This PR makes the `textstat` library optional in the Flyte callback
handler.
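
One common way to make a dependency optional, sketched under assumptions: `import_optional` and `analyze_text` are hypothetical helpers, not the Flyte handler's actual code. `textstat.flesch_reading_ease` is only called when the library is importable.

```python
import importlib
from typing import Any, Dict, Optional


def import_optional(module_name: str) -> Optional[Any]:
    # Return the module if it is installed, otherwise None.
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def analyze_text(text: str) -> Dict[str, float]:
    metrics: Dict[str, float] = {"num_chars": float(len(text))}
    textstat = import_optional("textstat")
    if textstat is not None:
        # Readability metrics are added only when textstat is available.
        metrics["flesch_reading_ease"] = textstat.flesch_reading_ease(text)
    return metrics
```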

@hinthornw, would you mind reviewing this PR since you merged the flyte
callback handler code previously?

---------

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>
2023-07-05 13:15:56 -04:00
Conrad Fernandez
6eff0fa2ca Added documentation for add_texts function for Pinecone integration (#7134)
- Description: added some documentation to the Pinecone vector store
docs page.
- Issue: #7126 
- Dependencies: None
- Tag maintainer: @baskaryan 

I can add more documentation on the Pinecone integration functions as I
am going to go in great depth into this area. Just wanted to check with
the maintainers if this is all good.
2023-07-05 13:11:37 -04:00
Nuno Campos
81e5b1ad36 Add serialized object to retriever start callback (#7074)
2023-07-05 18:04:43 +01:00
Efkan S. Goktepe
baf48d3583 Replace stop clause with shorter, pythonic alternative (#7159)
- Description: Replace `if var is not None:` with `if var:`, a concise
and pythonic alternative
  - Issue: N/A
  - Dependencies: None
  - Tag maintainer: Unsure
  - Twitter handle: N/A
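
For reference, a small sketch of how the two checks behave. The function names are hypothetical. The checks agree for `None` and for non-empty values but differ for an empty list, so whether the shorter form is a drop-in replacement depends on the call site.

```python
from typing import List, Optional


def uses_is_not_none(stop: Optional[List[str]]) -> bool:
    # True for every value except None, including an empty list.
    return stop is not None


def uses_truthiness(stop: Optional[List[str]]) -> bool:
    # True only for non-empty values; None and [] are both falsy.
    return bool(stop)


for value in (None, [], ["\n"]):
    print(repr(value), uses_is_not_none(value), uses_truthiness(value))
```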

Signed-off-by: serhatgktp <efkan@ibm.com>
2023-07-05 13:03:22 -04:00
Shuqian
8045870a0f fix: prevent adding an empty string to the result queue in AsyncIteratorCallbackHandler (#7180)
- Description: Modify the code for
AsyncIteratorCallbackHandler.on_llm_new_token to ensure that it does not
add an empty string to the result queue.
- Tag maintainer: @agola11

When using AsyncIteratorCallbackHandler with OpenAIFunctionsAgent, if
the LLM responds with a function_call instead of a direct answer,
AsyncIteratorCallbackHandler.on_llm_new_token is called with an empty
string.
See also: langchain.chat_models.openai.ChatOpenAI._generate

An alternative solution is to modify
langchain.chat_models.openai.ChatOpenAI._generate so that it does not call
run_manager.on_llm_new_token when the token is an empty string.
I am not sure which solution is better.
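
The first option, sketched under assumptions: `TokenQueueHandler` and `collect` are hypothetical and much simpler than the real callback handler. The point is only the guard that drops empty tokens.

```python
import asyncio
from typing import List


class TokenQueueHandler:
    """Hypothetical, simplified handler that queues streamed tokens."""

    def __init__(self) -> None:
        self.queue: "asyncio.Queue[str]" = asyncio.Queue()

    async def on_llm_new_token(self, token: str) -> None:
        # Skip empty strings so consumers never see a blank token.
        if token:
            self.queue.put_nowait(token)


async def collect(tokens: List[str]) -> List[str]:
    handler = TokenQueueHandler()
    for token in tokens:
        await handler.on_llm_new_token(token)
    received = []
    while not handler.queue.empty():
        received.append(handler.queue.get_nowait())
    return received
```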

@hwchase17

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-05 13:00:35 -04:00
felixocker
db98c44f8f Support for SPARQL (#7165)
# [SPARQL](https://www.w3.org/TR/rdf-sparql-query/) for
[LangChain](https://github.com/hwchase17/langchain)

## Description
LangChain support for knowledge graphs relying on W3C standards using
RDFlib: SPARQL/ RDF(S)/ OWL with special focus on RDF \
* Works with local files, files from the web, and SPARQL endpoints
* Supports both SELECT and UPDATE queries
* Includes both a Jupyter notebook with an example and integration tests

## Contribution compared to related PRs and discussions
* [Wikibase agent](https://github.com/hwchase17/langchain/pull/2690) -
uses SPARQL, but specifically for wikibase querying
* [Cypher qa](https://github.com/hwchase17/langchain/pull/5078) - graph
DB question answering for Neo4J via Cypher
* [PR 6050](https://github.com/hwchase17/langchain/pull/6050) - tries
something similar, but does not cover UPDATE queries and supports only
RDF
* Discussions on [w3c mailing list](mailto:semantic-web@w3.org) related
to the combination of LLMs (specifically ChatGPT) and knowledge graphs

## Dependencies
* [RDFlib](https://github.com/RDFLib/rdflib)

## Tag maintainer
Graph database related to memory -> @hwchase17
2023-07-05 13:00:16 -04:00
Paul Cook
7cd0936b1c Update in_memory.py to fix "TypeError: keywords must be strings" (#7202)
Update in_memory.py to fix "TypeError: keywords must be strings" on
certain dictionaries

Simple fix to prevent a "TypeError: keywords must be strings" error I
encountered in my use case.
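
One way this error can arise, shown as a sketch. I am assuming the cause was merging dictionaries through keyword unpacking into `dict()`; the commit message does not show the actual diff. The function names are hypothetical.

```python
from typing import Any, Dict


def merge_with_kwargs(existing: Dict[Any, str], new_items: Dict[Any, str]) -> Dict[Any, str]:
    # dict(mapping, **other) treats the keys of `other` as keyword names,
    # so non-string keys raise "TypeError: keywords must be strings".
    return dict(existing, **new_items)


def merge_with_unpacking(existing: Dict[Any, str], new_items: Dict[Any, str]) -> Dict[Any, str]:
    # A dict display accepts keys of any hashable type.
    return {**existing, **new_items}
```

Usage: with `new_items = {1: "doc one"}`, the first function raises the TypeError, while the second returns the merged dictionary.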

@baskaryan 

Thanks! Hope useful!

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-05 12:48:38 -04:00
Prakul Agarwal
38f853dfa3 Fixed typos in MongoDB Atlas Vector Search documentation (#7174)
Fix for typos in MongoDB Atlas Vector Search documentation
2023-07-05 12:48:00 -04:00
Shuqian
ee1d488c03 fix: rename the invalid function name of GoogleSerperResults Tool for OpenAIFunctionCall (#7176)
- Description: rename the invalid function name of GoogleSerperResults
Tool for OpenAIFunctionCall
- Tag maintainer: @hinthornw

When I use the GoogleSerperResults in OpenAIFunctionCall agent, the
following error occurs:
```shell
openai.error.InvalidRequestError: 'Google Serrper Results JSON' does not match '^[a-zA-Z0-9_-]{1,64}$' - 'functions.0.name'
```

So I renamed the "name" property of GoogleSerperResults from "Google
Serrper Results JSON" to "google_serrper_results_json", just like
GoogleSerperRun's name, "google_serper", and it works.
I guess this should be reasonable.
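
The pattern in the error message can be checked locally before a tool is registered. A small sketch: the regular expression is copied from the error above, while `is_valid_function_name` and `to_function_name` are hypothetical helpers.

```python
import re

# Pattern quoted in the error message above.
FUNCTION_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9_-]{1,64}$")


def is_valid_function_name(name: str) -> bool:
    return FUNCTION_NAME_PATTERN.match(name) is not None


def to_function_name(display_name: str) -> str:
    # Replace runs of disallowed characters with "_" and lowercase the result.
    cleaned = re.sub(r"[^a-zA-Z0-9_-]+", "_", display_name.strip())
    return cleaned.lower()[:64]
```

Usage: `to_function_name("Google Serrper Results JSON")` gives `"google_serrper_results_json"`, which passes the check.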
2023-07-05 12:47:50 -04:00
Nir Gazit
6666e422c6 fix: missing parameter in POST/PUT/PATCH HTTP requests (#7194)
@hinthornw

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-05 12:47:30 -04:00
Harrison Chase
8410c6a747 add token max parameter (#7204) 2023-07-05 12:09:25 -04:00
Harrison Chase
7b585c7585 add tqdm to embeddings (#7205)
for longer running embeddings, can be helpful to visualize
2023-07-05 12:04:22 -04:00
Raouf Chebri
6fc24743b7 Add pg_hnsw vectorstore integration (#6893)
Hi @rlancemartin, @eyurtsev!

- Description: Adds HNSW extension support for Postgres. Similar to the
pgvector vectorstore, with 3 differences:
      1. It uses the HNSW extension for exact and ANN searches.
      2. Vectors are of type array of real.
      3. Only L2 distance is supported.
      
- Dependencies: [HNSW](https://github.com/knizhnik/hnsw) extension for
Postgres
  
  - Example:
  ```python
  db = HNSWVectoreStore.from_documents(
      embedding=embeddings,
      documents=docs,
      collection_name=collection_name,
      connection_string=connection_string,
  )

  query = "What did the president say about Ketanji Brown Jackson"
  docs_with_score: List[Tuple[Document, float]] = db.similarity_search_with_score(query)
  ```

The example notebook is in the PR too.
2023-07-05 08:10:10 -07:00
Harrison Chase
79fb90aafd bump version to 224 (#7203) 2023-07-05 10:41:26 -04:00
Harrison Chase
1415966d64 propogate token max (#7201) 2023-07-05 10:25:48 -04:00
Harrison Chase
a94c4cca68 more formatting (#7200) 2023-07-05 10:03:02 -04:00
Harrison Chase
e18e838aae fix weird bold issues in docs (#7198) 2023-07-05 09:52:49 -04:00
Baichuan Sun
e27ba9d92b fix AmazonAPIGateway _identifying_params (#7167)
- correct `endpoint_name` to `api_url`
- add `headers`

2023-07-04 23:14:51 -04:00
Harrison Chase
39e685b80f Harrison/conv retrieval docs (#7080)
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-04 20:17:43 -04:00
Shuqian
bf9e4ef35f feat: implement python repl tool arun (#7125)
Description: implement python repl tool arun
Tag maintainer: @agola11
2023-07-04 20:15:49 -04:00
Alex Iribarren
9cfb311ecb Remove duplicate lines (#7138)
I believe these two lines are unnecessary; the variable `function_call`
is already defined.
2023-07-04 20:13:27 -04:00
volodymyr-memsql
405865c91a feat(SingleStoreVectorStore): change connection attributes in the database connection (#7142)
Minor change to the SingleStoreVectorStore:

Updated connection attributes names according to the SingleStoreDB
recommendations

@rlancemartin, @eyurtsev

---------

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
2023-07-04 20:12:56 -04:00
Hashem Alsaket
c9f696f063 LlamaCppEmbeddings not under langchain.llms (#7164)
Description: doc string suggests `from langchain.llms import
LlamaCppEmbeddings` under `LlamaCpp()` class example but
`LlamaCppEmbeddings` is not in `langchain.llms`
Issue: None open
Tag maintainer: @baskaryan
2023-07-04 19:32:40 -04:00
Harrison Chase
e8531769f7 improve docstring of doc formatting (#7162)
so it shows up nice
2023-07-04 19:31:29 -04:00
Max Cembalest
2984803597 cleaned Arthur tracking demo notebook (#7147)
Cleaned title and reduced clutter for integration demo notebook for the
Arthur callback handler
2023-07-04 18:15:25 -04:00
Deepankar Mahapatro
da69a6771f docs: update Jina ecosystem (#7149)
Documentation update for [Jina
ecosystem](https://python.langchain.com/docs/ecosystem/integrations/jina)
and `langchain-serve` in the deployments section to latest features.

@hwchase17 

2023-07-04 18:07:50 -04:00
Harrison Chase
b39017dc11 add docstring for in memory class (#7160) 2023-07-04 14:59:17 -07:00
Bagatur
898087d02c bump 223 (#7155) 2023-07-04 14:13:41 -06:00
Harrison Chase
0ad984fa27 Docs combine document chain (#6994)
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-04 12:51:04 -06:00
Simon Cheung
81eebc4070 Add HugeGraphQAChain to support gremlin generating chain (#7132)
[Apache HugeGraph](https://github.com/apache/incubator-hugegraph) is a
convenient, efficient, and adaptable graph database, compatible with the
Apache TinkerPop3 framework and the Gremlin query language.

In this PR, `HugeGraph` and `HugeGraphQAChain` provide the same
functionality as the existing Neo4j integration and enable query
generation and question answering over a HugeGraph database. The
difference is that the graph query language supported by HugeGraph is
not Cypher but another very popular graph query language,
[Gremlin](https://tinkerpop.apache.org/gremlin.html).

A notebook example and a simple test case have also been added.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-04 10:21:21 -06:00
Saverio Proto
5585607654 Improve Bing Search example (#7128)
# Description

Improve Bing Search example:
2023-07-04 09:58:03 -06:00
Lance Martin
265c285057 Fix GPT4All bug w/ "n_ctx" param (#7093)
Running `GPT4All` per the
[docs](https://python.langchain.com/docs/modules/model_io/models/llms/integrations/gpt4all),
I see:

```
$ from langchain.llms import GPT4All
$ model = GPT4All(model=local_path)
$ model("The capital of France is ", max_tokens=10)
TypeError: generate() got an unexpected keyword argument 'n_ctx'
```

It appears `n_ctx` is [no longer a supported
param](https://docs.gpt4all.io/gpt4all_python.html#gpt4all.gpt4all.GPT4All.generate)
in the GPT4All API from https://github.com/nomic-ai/gpt4all/pull/1090.

It now uses `max_tokens`, so I set this.

And I also set other defaults used in GPT4All client
[here](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-bindings/python/gpt4all/gpt4all.py).

Confirm it now works:
```
$ from langchain.llms import GPT4All
$ model = GPT4All(model=local_path)
$ model("The capital of France is ", max_tokens=10)
< Model logging > 
"....Paris."
```

---------

Co-authored-by: R. Lance Martin <rlm@Rs-MacBook-Pro.local>
2023-07-04 08:53:52 -07:00
Stefano Lottini
6631fd5168 Align cassio versions between examples for Cassandra integration (#7099)
Just reducing confusion by requiring cassio>=0.0.7 consistently across
examples.
2023-07-04 04:21:48 -06:00
Nuno Campos
696886f397 Use serialized format for messages in tracer (#6827)
2023-07-04 10:19:08 +01:00
Ruixi Fan
0b69a7e9ab [Document fix] Fix an expired link qa_benchmarking_pg.ipynb (#7110)
## Change description

- Description: Fix an expired link that points to the readthedocs site.
  - Dependencies: No
2023-07-03 19:03:16 -06:00
Lance Martin
9ca4c54428 Minor updates to notebook for MultiQueryRetriever (#7102)
* Add an easier-to-run example.
* Add logging per https://github.com/hwchase17/langchain/pull/6891.
* Updated params per https://github.com/hwchase17/langchain/pull/5962.

---------

Co-authored-by: R. Lance Martin <rlm@Rs-MacBook-Pro.local>
Co-authored-by: Lance Martin <lance@langchain.dev>
2023-07-03 17:32:50 -07:00
William FH
dfa48dc3b5 Update sdk version (#7109) 2023-07-03 16:42:08 -07:00
William FH
04001ff077 Log errors (#7105)
Re-add change that was inadvertently undone in #6995
2023-07-03 14:47:32 -07:00
William FH
3f9744c9f4 Accept no 'reasoning' response in qa evaluator (#7107)
Re add since #6995 inadvertently undid #7031
2023-07-03 14:47:17 -07:00
Bagatur
fd3f8efec7 fix retriever signatures (#7097) 2023-07-03 14:21:36 -06:00
Nicolas
490fcf9d98 docs: New experimental UI for Mendable Search (#6558)
This PR introduces a new Mendable UI tailored to a better search
experience.

We're more closely integrating our traditional search with our AI
generation.
With this change, you won't have to tab back and forth between the
Mendable bot and the keyword search. Both types of search are handled in
the same bar. This should make the docs easier to navigate, while still
letting users get code generations or AI-summarized answers if they so
wish. Also, it should reduce the cost.

Would love to hear your feedback :)

Cc: @dev2049 @hwchase17
2023-07-03 20:52:13 +01:00
Nuno Campos
c8f8b1b327 Add events to tracer runs (#7090)
2023-07-03 12:43:43 -07:00
genewoo
e49abd1277 Add Metal support to llama.cpp doc (#7092)
- Description: Add Metal support to llama.cpp doc
  - Issue: #7091 
  - Dependencies: N/A
  - Twitter handle: gene_wu
2023-07-03 13:35:39 -06:00
Bagatur
fad2c7e5e0 update pr tmpl (#7095) 2023-07-03 13:34:03 -06:00
Nuno Campos
98dbea6310 Add tags to all callback handler methods (#7073)
2023-07-03 10:39:46 -07:00
Mike Salvatore
d0c7f7c317 Remove None default value for FAISS relevance_score_fn (#7085)
## Description

The type hint for `FAISS.__init__()`'s `relevance_score_fn` parameter
allowed the parameter to be set to `None`. However, a default function
is provided by the constructor. This led to an unnecessary check in the
code, as well as a test to verify this check.

**ASSUMPTION**: There's no reason to ever set `relevance_score_fn` to
`None`.

This PR changes the type hint and removes the unnecessary code.
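
The pattern described above can be sketched in isolation. This is a minimal illustration, not the FAISS wrapper's actual code: `Store`, `default_relevance_score_fn`, and the distance-to-score formula are hypothetical names chosen for the example.

```python
from typing import Callable


def default_relevance_score_fn(distance: float) -> float:
    """Hypothetical default: map a non-negative distance to a score in (0, 1]."""
    return 1.0 / (1.0 + distance)


class Store:
    """Hypothetical stand-in for a vector store wrapper, not the real FAISS class."""

    def __init__(
        self,
        # Typed as a plain Callable rather than Optional[Callable]: a default
        # is always supplied, so callers never need to pass None.
        relevance_score_fn: Callable[[float], float] = default_relevance_score_fn,
    ) -> None:
        self.relevance_score_fn = relevance_score_fn

    def score(self, distance: float) -> float:
        # No "is None" guard is needed before calling the function.
        return self.relevance_score_fn(distance)


store = Store()
custom = Store(relevance_score_fn=lambda d: -d)
```

With a non-optional type hint, both the `None` check and the test that exercised it become unnecessary.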
2023-07-03 10:11:49 -06:00
Bagatur
719316e84c bump 222 (#7086) 2023-07-03 10:03:55 -06:00
rjarun8
e2d61ab85a Add SpacyEmbeddings class (#6967)
- Description: Added a new SpacyEmbeddings class for generating
embeddings using the Spacy library.
- Issue: Sentencebert/Bert/Spacy/Doc2vec embedding support #6952
- Dependencies: This change requires the Spacy library and the
'en_core_web_sm' Spacy model.
- Tag maintainer: @dev2049
- Twitter handle: N/A

This change includes a new SpacyEmbeddings class, but does not include a
test or an example notebook.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-03 09:38:31 -06:00
Leonid Ganeline
16fbd528c5 docs: commented out editUrl option (#6440) 2023-07-03 07:59:11 -07:00
adam91holt
80e86b602e Remove duplicate mongodb integration doc (#7006) 2023-07-03 02:23:33 -06:00
joaomsimoes
c669d98693 Update get_started.mdx (#7005)
Typo: `chat = ChatOpenAI(open_api_key="...")` should use `openai_api_key`.
2023-07-03 02:23:12 -06:00
Bagatur
1cdb33a090 openapi chain nit (#7012) 2023-07-03 02:22:53 -06:00
Johnny Lim
a081e419a0 Fix sample in FAISS section (#7050)
This PR fixes a sample in the FAISS section in the reference docs.
2023-07-03 02:18:32 -06:00
Ikko Eltociear Ashimine
be93775ebc Fix typo in google_places_api.py (#7055) 2023-07-03 02:14:18 -06:00
Harrison Chase
60b05511d3 move base prompt to schema (#6995)
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-02 22:38:59 -04:00
Leonid Ganeline
200be43da6 added Brave Search document_loader (#6989)
- Added `Brave Search` document loader.
- Refactored BraveSearch wrapper
- Added a Jupyter Notebook example
- Added `Ecosystem/Integrations` BraveSearch page 

Please review:
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
2023-07-02 19:01:24 -07:00
Sergey Kozlov
6d15854cda Add JSON Lines support to JSONLoader (#6913)
**Description**:

The JSON Lines format is used by some services such as OpenAI and
HuggingFace. It's also a convenient alternative to CSV.

This PR adds JSON Lines support to `JSONLoader` and also updates related
tests.
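
For readers unfamiliar with the format, JSON Lines stores one JSON value per line. A minimal sketch using only the standard library follows; `load_json_lines` is a hypothetical helper for illustration and is not the `JSONLoader` API.

```python
import json
from typing import Any, Iterator


def load_json_lines(text: str) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of JSON Lines text."""
    for line in text.splitlines():
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


sample = '{"prompt": "a", "completion": "b"}\n{"prompt": "c", "completion": "d"}\n'
records = list(load_json_lines(sample))
```

Each record can be parsed independently, which is what makes the format convenient for streaming and for appending new rows.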

**Tag maintainer**: @rlancemartin, @eyurtsev.

PS I was not able to build docs locally so didn't update related
section.
2023-07-02 12:32:41 -07:00
Ofer Mendelevitch
153b56d19b Vectara upd2 (#6506)
Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
2023-07-02 12:15:50 -07:00
Leonid Ganeline
1feac83323 docstrings document_loaders 2 (#6890)
updated docstring for the `document_loaders`

Maintainer responsibilities:
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
2023-07-02 12:14:22 -07:00
Leonid Ganeline
77ae8084a0 docstrings document_loaders 1 (#6847)
- Updated docstrings in `document_loaders`
- several code fixes.
- added `docs/extras/ecosystem/integrations/airtable.md`

@rlancemartin, @eyurtsev
2023-07-02 12:13:04 -07:00
0xcha05
e41b382e1c Added filter and delete all option to delete function in Pinecone integration, updated base VectorStore's delete function (#6876)
### Description:
Updated the delete function in the Pinecone integration to allow for
deletion of vectors by specifying a filter condition, and to delete all
vectors in a namespace.

Made the ids parameter optional in the delete function in the base
VectorStore class and allowed for additional keyword arguments.

Updated the delete function in several classes (Redis, Chroma, Supabase,
Deeplake, Elastic, Weaviate, and Cassandra) to match the changes made in
the base VectorStore class. This involved making the ids parameter
optional and allowing for additional keyword arguments.
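
The signature change can be illustrated with a toy in-memory store. This is only a sketch under stated assumptions: `BaseStore`, `InMemoryStore`, and the `delete_all` and `filter` parameter names are hypothetical and do not reproduce the Pinecone integration.

```python
from typing import Any, Dict, List, Optional


class BaseStore:
    """Hypothetical base class; ids is optional and extra kwargs are allowed."""

    def delete(self, ids: Optional[List[str]] = None, **kwargs: Any) -> None:
        raise NotImplementedError


class InMemoryStore(BaseStore):
    """Toy store mapping ids to metadata dicts."""

    def __init__(self) -> None:
        self.rows: Dict[str, Dict[str, Any]] = {}

    def delete(
        self,
        ids: Optional[List[str]] = None,
        delete_all: bool = False,
        filter: Optional[Dict[str, Any]] = None,
        **kwargs: Any,
    ) -> None:
        if delete_all:
            self.rows.clear()
        elif ids is not None:
            for row_id in ids:
                self.rows.pop(row_id, None)
        elif filter is not None:
            # Remove rows whose metadata matches every key/value in the filter.
            matching = [
                row_id
                for row_id, meta in self.rows.items()
                if all(meta.get(k) == v for k, v in filter.items())
            ]
            for row_id in matching:
                del self.rows[row_id]
        else:
            raise ValueError("Provide ids, a filter, or delete_all=True.")


store = InMemoryStore()
store.rows = {"1": {"lang": "en"}, "2": {"lang": "fr"}, "3": {"lang": "en"}}
store.delete(filter={"lang": "en"})
remaining_after_filter = sorted(store.rows)
store.delete(ids=["2"])
```

Because the base signature accepts `**kwargs`, subclasses can add store-specific options without breaking callers that only pass `ids`.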
2023-07-02 11:46:19 -07:00
Bagatur
5a45363954 bump 221 (#7047) 2023-07-02 08:32:15 -06:00
Bagatur
7acd524210 Rm retriever kwargs (#7013)
Doesn't actually limit the Retriever interface but hopefully in practice
it does
2023-07-02 08:22:24 -06:00
Johnny Lim
9dc77614e3 Polish reference docs (#7045)
This PR fixes broken links in the reference docs.
2023-07-02 08:08:51 -06:00
skspark
e5f6f0ffc4 Support params on GoogleSearchApiWrapper (#6810) (#7014)
## Description
Support search params in `GoogleSearchApiWrapper`'s result call so that
searches can be filtered with the extra query parameters that Google CSE
provides:

https://developers.google.com/custom-search/v1/reference/rest/v1/cse/list?hl=ko

## Issue
#6810
2023-07-02 01:18:38 -06:00
Johnny Lim
052c797429 Fix typo (#7023)
This PR fixes a typo.
2023-07-02 01:17:30 -06:00
Alex Iribarren
dc2264619a Fix openai multi functions agent docs (#7028) 2023-07-02 01:16:40 -06:00
William FH
6a64870ea0 Accept no 'reasoning' response in qa evaluator (#7030) 2023-07-01 12:46:19 -07:00
William FH
7ebb76a5fa Log Errors in Evaluator Callback (#7031) 2023-07-01 12:10:00 -07:00
Stefano Lottini
8d2281a8ca Second Attempt - Add concurrent insertion of vector rows in the Cassandra Vector Store (#7017)
Retrying with the same improvements as in #6772, this time trying not to
mess up the branches.

@rlancemartin doing a fresh new PR from a branch with a new name. This
should do. Thank you for your help!

---------

Co-authored-by: Jonathan Ellis <jbellis@datastax.com>
Co-authored-by: rlm <pexpresss31@gmail.com>
2023-07-01 11:09:52 -07:00
Harrison Chase
3bfe7cf467 Harrison/split schema dir (#7025)
should be no functional changes

also keep __init__ exposing a lot for backwards compat

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-01 13:39:19 -04:00
Davis Chase
556c425042 Improve docstrings for langchain.schema.py (#6802)
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-01 09:46:52 -07:00
Matt Robinson
0498dad562 feat: enable UnstructuredEmailLoader to process attachments (#6977)
### Summary

Updates `UnstructuredEmailLoader` so that it can process attachments in
addition to the e-mail content. The loader will process attachments if
the `process_attachments` kwarg is passed when the loader is
instantiated.

### Testing

```python

from langchain.document_loaders import UnstructuredEmailLoader

file_path = "fake-email-attachment.eml"
loader = UnstructuredEmailLoader(
    file_path, mode="elements", process_attachments=True
)
docs = loader.load()
docs[-1]
```

### Reviewers

-  @rlancemartin 
-  @eyurtsev
- @hwchase17
2023-07-01 06:09:26 -07:00
Matthew Foster Walsh
59697b406d Fix typo in quickstart.mdx (#6985)
Removed an extra "to" from a sentence. @dev2049 very minor documentation
fix.
2023-07-01 02:53:52 -06:00
Paul Grillenberger
aa37b10b28 Fix: Correct typo (#6988)
Description: Correct a minor typo in the docs. @dev2049
2023-07-01 02:53:34 -06:00
Zander Chase
b0859c9b18 Add New Retriever Interface with Callbacks (#5962)
Handle the new retriever events in a way that (I think) is entirely
backwards compatible? Needs more testing for some of the chain changes
and all.

This creates an entire new run type, however. We could also just treat
this as an event within a chain run presumably (same with memory)

Adds a subclass initializer that upgrades old retriever implementations
to the new schema, along with tests to ensure they work.
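
One way such a subclass initializer could work is sketched below. This is an illustration under assumptions, not the code from this PR: the base class, the `callbacks` parameter, and the callback protocol (plain callables receiving a string) are all hypothetical simplifications.

```python
import inspect
from typing import Any, Callable, List, Optional


class BaseRetriever:
    """Hypothetical base class; not LangChain's actual implementation."""

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        method = cls.__dict__.get("get_relevant_documents")
        if method is None:
            return
        if "callbacks" not in inspect.signature(method).parameters:
            # Old "v1" style method: wrap it so callers can pass callbacks.
            def upgraded(
                self: Any,
                query: str,
                callbacks: Optional[List[Callable[[str], None]]] = None,
            ) -> List[str]:
                for callback in callbacks or []:
                    callback(f"retriever start: {query}")
                docs = method(self, query)
                for callback in callbacks or []:
                    callback(f"retriever end: {len(docs)} docs")
                return docs

            cls.get_relevant_documents = upgraded


class OldStyleRetriever(BaseRetriever):
    # Written against the old interface: no callbacks parameter.
    def get_relevant_documents(self, query: str) -> List[str]:
        return [f"doc about {query}"]


events: List[str] = []
docs = OldStyleRetriever().get_relevant_documents("cats", callbacks=[events.append])
```

The wrapper is installed once, at class creation time, so existing subclasses keep working without edits.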

First commit doesn't upgrade any of our retriever implementations (to
show that we can pass the tests along with additional ones testing the
upgrade logic).

Second commit upgrades the known universe of retrievers in langchain.

- [X] Add callback handling methods for retriever start/end/error (open
to renaming to 'retrieval' if you want that)
- [X] Update BaseRetriever schema to support callbacks
- [X] Tests for upgrading old "v1" retrievers for backwards
compatibility
- [X] Update existing retriever implementations to implement the new
interface
- [X] Update calls within chains to `.{a}get_relevant_documents` to pass
the child callback manager
- [X] Update the notebooks/docs to reflect the new interface
- [X] Test notebooks thoroughly


Not handled:
- Memory pass throughs: retrieval memory doesn't have a parent callback
manager passed through the method

---------

Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
2023-06-30 14:44:03 -07:00
William FH
a5b206caf3 Remove Promptlayer Notebook (#6996)
It's breaking our docs build
2023-06-30 14:30:24 -07:00
Daniel Chalef
b26cca8008 Zep Authentication (#6728)
## Description: Add Zep API Key argument to ZepChatMessageHistory and
ZepRetriever
- correct docs site links
- add zep api_key auth to constructors

ZepChatMessageHistory: @hwchase17, 
ZepRetriever: @rlancemartin, @eyurtsev
2023-06-30 14:24:26 -07:00
William FH
e4625846e5 Add Flyte Callback Handler (#6139) (#6986)
Signed-off-by: Samhita Alla <aallasamhita@gmail.com>
Co-authored-by: Samhita Alla <aallasamhita@gmail.com>
2023-06-30 12:25:22 -07:00
Bagatur
e3b7effc8f Beef up import test (#6979) 2023-06-30 09:26:05 -07:00
Bagatur
1ce9ef3828 Rm pytz dep (#6978) 2023-06-30 09:24:01 -07:00
Davis Chase
eb180e321f Page per class-style api reference (#6560)
can make it prettier, but what do we think of overall structure?

https://api.python.langchain.com/en/dev2049-page_per_class/api_ref.html

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-06-30 09:23:32 -07:00
William FH
64039b9f11 Promptlayer Callback (#6975)
Co-authored-by: Saleh Hindi <saleh.hindi.one@gmail.com>
Co-authored-by: jped <jonathanped@gmail.com>
2023-06-30 08:32:42 -07:00
William FH
13c62cf6b1 Arthur Callback (#6972)
Co-authored-by: Max Cembalest <115359769+arthuractivemodeling@users.noreply.github.com>
2023-06-30 07:48:02 -07:00
William FH
8c73037dff Simplify eval arg names (#6944)
It'll be easier to switch between these if the names of predictions are
consistent
2023-06-30 07:47:53 -07:00
Bagatur
8f5eca236f release v220 (#6962) 2023-06-30 06:52:09 -07:00
Bagatur
60b0d6ea35 Bagatur/openllm ensure available (#6960)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-30 00:54:23 -07:00
Siraj Aizlewood
521c6f0233 Provided default values for tags and inheritable_tags args in BaseRun… (#6858)
When running `AsyncCallbackManagerForChainRun` (`from
langchain.callbacks.manager import AsyncCallbackManagerForChainRun`),
provided default values of empty lists for `tags` and `inheritable_tags`
in `BaseRunManager` in manager.py.
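
A common way to give list parameters an empty default is sketched below. It is not necessarily how the PR implements it: `RunManager` is a hypothetical stand-in for `BaseRunManager`, and the `None` sentinel is an assumption made to avoid sharing one mutable default between instances.

```python
from typing import List, Optional


class RunManager:
    """Hypothetical stand-in for a run manager; not LangChain's BaseRunManager."""

    def __init__(
        self,
        run_id: str,
        tags: Optional[List[str]] = None,
        inheritable_tags: Optional[List[str]] = None,
    ) -> None:
        self.run_id = run_id
        # Build a fresh list per instance so callers may omit both arguments
        # without instances sharing one mutable default.
        self.tags = list(tags) if tags is not None else []
        self.inheritable_tags = (
            list(inheritable_tags) if inheritable_tags is not None else []
        )


first = RunManager("run-1")
second = RunManager("run-2", tags=["demo"])
first.tags.append("only-on-first")
```

Callers that never dealt with tags can then construct the manager without passing them.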


- Description: In manager.py, `BaseRunManager`, default values were
provided for the `__init__` args `tags` and `inheritable_tags`. They
default to empty lists (`[]`).
- Issue: When trying to use Nvidia NeMo Guardrails with LangChain, the
following exception was raised:
2023-06-29 22:01:08 -07:00
Davis Chase
bd6a0ee9e9 Redirect vecstores (#6948) 2023-06-29 19:22:21 -07:00
Davis Chase
f780678910 Add back in clickhouse mongo vecstore notebooks (#6949) 2023-06-29 19:21:47 -07:00
Jacob Lee
73831ef3d8 Change code block color scheme (#6945)
Adds contrast, makes code blocks more readable.
2023-06-29 19:21:11 -07:00
Tahjyei Thompson
7d8830f707 Add OpenAIMultiFunctionsAgent to import list in agents directory (#6824)
- Added OpenAIMultiFunctionsAgent to the import list of the Agents
directory

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-29 18:34:26 -07:00
Matt Florence
0f6737735d Order messages in PostgresChatMessageHistory (#6830)
Fixes issue: https://github.com/hwchase17/langchain/issues/6829

This guarantees message history is in the correct order. 
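
The underlying issue is that SQL makes no promise about row order without an explicit `ORDER BY`. The sketch below shows the idea with `sqlite3` as a stand-in for Postgres; the table name and columns are hypothetical and not taken from `PostgresChatMessageHistory`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message_store ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, session_id TEXT, message TEXT)"
)
rows = [("s1", "hello"), ("s2", "other session"), ("s1", "how are you?")]
conn.executemany(
    "INSERT INTO message_store (session_id, message) VALUES (?, ?)", rows
)

# Sorting by the auto-incrementing id returns messages in insertion order.
history = [
    message
    for (message,) in conn.execute(
        "SELECT message FROM message_store WHERE session_id = ? ORDER BY id",
        ("s1",),
    )
]
conn.close()
```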

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-29 18:10:28 -07:00
lucasiscovici
e9950392dd Add password to PyPDF loader and parser (#6908)
Add password to PyPDF loader and parser

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-29 17:35:50 -07:00
Zander Chase
429f4dbe4d Add Input Mapper in run_on_dataset (#6894)
If you create a dataset from runs and run the same chain or LLM on it
later, it usually works great.

If you have an agent dataset and want to run a different agent on it, or
have a more complex schema, it's hard for us to automatically map these
values every time. This PR lets you pass in an `input_mapper` function
that converts the example inputs to whatever format your model expects.
Lei Pan
76d03f398d support max_chunk_bytes in OpensearchVectorSearch to pass down to bulk (#6855)
Support `max_chunk_bytes` kwargs to pass down to the `bulk` helper, in order
to support the request limits in OpenSearch locally and in AWS.
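
To show what a byte cap on bulk requests means in practice, here is a small, self-contained sketch of byte-limited batching. It is not the OpenSearch client's implementation; `chunk_by_bytes` is a hypothetical helper, and measuring size via `json.dumps` is an assumption.

```python
import json
from typing import Any, Dict, Iterator, List


def chunk_by_bytes(
    docs: List[Dict[str, Any]], max_chunk_bytes: int
) -> Iterator[List[Dict[str, Any]]]:
    """Group documents so each group's serialized size stays under the limit.

    A single document larger than the limit still gets its own group.
    """
    chunk: List[Dict[str, Any]] = []
    size = 0
    for doc in docs:
        doc_size = len(json.dumps(doc).encode("utf-8"))
        if chunk and size + doc_size > max_chunk_bytes:
            yield chunk
            chunk, size = [], 0
        chunk.append(doc)
        size += doc_size
    if chunk:
        yield chunk


docs = [{"text": "a" * 10}, {"text": "b" * 10}, {"text": "c" * 10}]
chunks = list(chunk_by_bytes(docs, max_chunk_bytes=50))
```

Each serialized document here is 22 bytes, so two fit under the 50-byte cap and the third starts a new chunk.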

@rlancemartin, @eyurtsev
2023-06-29 15:50:08 -07:00
Hashem Alsaket
5861770a53 Updated QA notebook (#6801)
Description: `all_metadatas` was not defined and `OpenAIEmbeddings` was
not imported.
Issue: #6723
Dependencies: lark
Tag maintainer: @vowelparrot, @dev2049

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
2023-06-29 15:41:53 -07:00
Kacper Łukawski
140ba682f1 Support named vectors in Qdrant (#6871)
# Description

This PR makes it possible to use named vectors from Qdrant in Langchain.
That was requested multiple times, as people want to reuse externally
created collections in Langchain. It doesn't change anything for the
existing applications. The changes were covered with some integration
tests and included in the docs.

## Example

```python
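from langchain.vectorstores import Qdrant

# `docs` and `embeddings` are assumed to be defined elsewhere.
Qdrant.from_documents(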
    docs,
    embeddings,
    location=":memory:",
    collection_name="my_documents",
    vector_name="custom_vector",
)
```

### Issue: #2594 

Tagging @rlancemartin & @eyurtsev. I'd appreciate your review.
2023-06-29 15:14:22 -07:00
bradcrossen
9ca1cf003c Re-add Support for SQLAlchemy <1.4 (#6895)
Support for SQLAlchemy 1.3 was removed in version 0.0.203 by change
#6086. Re-adding support.

- Description: Imports SQLAlchemy Row at class creation time instead of
at init to support SQLAlchemy <1.4. This is the only breaking change and
was introduced in version 0.0.203 #6086.
  
A similar change was merged before:
https://github.com/hwchase17/langchain/pull/4647
  
  - Dependencies: Reduces SQLAlchemy dependency to > 1.3
  - Tag maintainer: @rlancemartin, @eyurtsev, @hwchase17, @wangxuqi

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
2023-06-29 14:49:35 -07:00
corranmac
20c6ade2fc Grobid parser for Scientific Articles from PDF (#6729)
### Scientific Article PDF Parsing via Grobid

`Description:`
This change adds the `GrobidParser` class, which uses the Grobid library
to parse scientific articles into a universal XML format containing the
article title, references, sections, section text, etc. The `GrobidParser`
uses a local Grobid server to return PDF documents as XML and parses the
XML to optionally produce documents of individual sentences or of whole
paragraphs. Metadata includes the text, paragraph number, PDF-relative
bounding boxes, pages (text may span two pages), section title
(Introduction, Methodology, etc.), section number (e.g. 1.1, 2.3), the
title of the paper and finally the file path.

Grobid parsing is useful beyond standard PDF parsing as it accurately
outputs sections and the paragraphs within them. This allows for
post-filtering of results for specific sections, e.g. limiting results to
the methodology or results section. While sections are split via
headings, ideally they could be classified specifically into
introduction, methodology, results, discussion and conclusion. I'm
currently experimenting with chatgpt-3.5 for this function, which could
later be implemented as a text splitter.

`Dependencies:`
To use it, the Grobid repo must be cloned and Java must be installed. For
Colab this is:

```
import os

!apt-get install -y openjdk-11-jdk -q
!update-alternatives --set java /usr/lib/jvm/java-11-openjdk-amd64/bin/java
!git clone https://github.com/kermitt2/grobid.git
os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-11-openjdk-amd64"
os.chdir('grobid')
!./gradlew clean install
```

Once installed, the server is run on localhost:8070 via
```
get_ipython().system_raw('nohup ./gradlew run > grobid.log 2>&1 &')
```

@rlancemartin, @eyurtsev

Twitter Handle: @Corranmac

Grobid Demo Notebook is
[here](https://colab.research.google.com/drive/1X-St_mQRmmm8YWtct_tcJNtoktbdGBmd?usp=sharing).

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
2023-06-29 14:29:29 -07:00
Baichuan Sun
6157bdf9d9 Add API Header for Amazon API Gateway Authentication (#6902)
Add API Headers support for Amazon API Gateway to enable Authentication
using DynamoDB.

2023-06-29 12:58:07 -07:00
Wey Gu
1c66aa6d56 chore: NebulaGraph prompt optimization (#6904)
While preparing a demo project for NebulaGraphQAChain, I found that the
prompt needed to be optimized a little bit.

Please @hwchase17 kindly help review.

Thanks!
2023-06-29 12:57:39 -07:00
Harrison Chase
0ba175e13f move octo notebook (#6901) 2023-06-29 12:20:55 -07:00
Stefano Lottini
75fb9d2fdc Cassandra support for chat history using CassIO library (#6771)
### Overview

This PR aims at building on #4378, expanding the capabilities and
building on top of the `cassIO` library to interface with the database
(as opposed to using the core drivers directly).

Usage of `cassIO` (a library abstracting Cassandra access for
ML/GenAI-specific purposes) is already established since #6426 was
merged, so no new dependencies are introduced.

In the same spirit, we try to unify the interface for using Cassandra
instances throughout LangChain: all our appreciation of the work by
@jj701 notwithstanding, who paved the way for this incremental work
(thank you!), we identified a few reasons for changing the way a
`CassandraChatMessageHistory` is instantiated. Advocating a syntax
change is something we don't take lightly, so we add some
explanations about this below.

Additionally, this PR expands on integration testing, enables use of
Cassandra's native Time-to-Live (TTL) features and improves the phrasing
around the notebook example and the short "integrations" documentation
paragraph.

We would kindly request @hwchase to review (since this is an elaboration
and proposed improvement of #4378, which had the same reviewer).

### About the __init__ breaking changes

There are
[many](https://docs.datastax.com/en/developer/python-driver/3.28/api/cassandra/cluster/)
options when creating the `Cluster` object, and new ones might be added
at any time. Choosing some of them and exposing them as `__init__`
parameters of `CassandraChatMessageHistory` will prove to be insufficient
for at least some users.

On the other hand, working through `kwargs` or adding a long, long list
of arguments to `__init__` is not a desirable option either. For this
reason, (as done in #6426), we propose that whoever instantiates the
Chat Message History class provide a Cassandra `Session` object, ready
to use. This also enables easier injection of mocks and usage of
Cassandra-compatible connections (such as those to the cloud database
DataStax Astra DB, obtained with a different set of init parameters than
`contact_points` and `port`).
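
The benefit of passing in a ready-made session, including easy mock injection, can be sketched as follows. This is a hypothetical illustration rather than the class proposed in this PR: `ChatHistory`, `FakeSession`, and the table layout are invented for the example, and only the idea of an object exposing `execute` is borrowed from the Cassandra driver's `Session`.

```python
from typing import Any, List, Tuple


class FakeSession:
    """Test double recording statements instead of talking to a database."""

    def __init__(self) -> None:
        self.executed: List[Tuple[str, tuple]] = []

    def execute(self, statement: str, parameters: tuple = ()) -> None:
        self.executed.append((statement, parameters))


class ChatHistory:
    """Hypothetical history class that receives a ready-to-use session."""

    def __init__(self, session_id: str, session: Any, keyspace: str) -> None:
        self.session_id = session_id
        self.session = session  # created and owned by the caller
        self.keyspace = keyspace

    def add_message(self, text: str) -> None:
        self.session.execute(
            f"INSERT INTO {self.keyspace}.messages (session_id, body) "
            "VALUES (%s, %s)",
            (self.session_id, text),
        )


session = FakeSession()
history = ChatHistory("user-42", session, keyspace="demo")
history.add_message("hello")
```

Since the class never builds its own connection, any connection options remain the caller's concern and the constructor signature stays small.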

We feel that a breaking change might still be acceptable since LangChain
is at `0.*`. However, while maintaining that the approach we propose
will be more flexible in the future, room could be made for a
"compatibility layer" that respects the current init method. Honestly,
we would do that only if there are strong reasons for it, as that would
entail an additional maintenance burden.

### Other changes

We propose to remove the keyspace creation from the class code for two
reasons: first, production Cassandra instances often employ RBAC so that
the database user reading/writing from tables does not necessarily (and
generally shouldn't) have permission to create keyspaces, and second,
programmatic keyspace creation is not a best practice (it should be
done more or less manually, with extra care about schema mismatches
among nodes, etc.). Removing this (usually unnecessary) operation from
the `__init__` path would also improve initialization performance
(shorter time).

We suggest, likewise, to remove the `__del__` method (which would close
the database connection), for the following reason: it is the
recommended best practice to create a single Cassandra `Session` object
throughout an application (it is a resource-heavy object capable of
handling concurrency internally), so in case Cassandra is used in other
ways by the app there is the risk of truncating the connection for all
usages when the history instance is destroyed. Moreover, the `Session`
object, in typical applications, is best left to garbage-collect itself
automatically.

As mentioned above, we defer the actual database I/O to the `cassIO`
library, which is designed to encode practices optimized for LLM
applications (among others) without the need to expose LangChain
developers to the internals of CQL (Cassandra Query Language). CassIO is
already employed by LangChain's Vector Store support for Cassandra.

We added a few more connection options in the companion notebook example
(most notably, Astra DB) to encourage usage by anyone who cannot run
their own Cassandra cluster.

We surface the `ttl_seconds` option for automatic handling of an
expiration time for chat history messages, a likely useful feature given
that very old messages generally lose their importance.

We elaborated a bit more on the integration testing (Time-to-live,
separation of "session ids", ...).

### Remarks from linter & co.

We reinstated `cassio` as a dependency both in the "optional" group and
in the "integration testing" group of `pyproject.toml`. This might not
be the right thing to do, in which case the author of this PR offers his
apologies (lack of confidence with Poetry - happy to be pointed in the
right direction, though!).

During linter tests, we were hit by some errors which appear unrelated
to the code in the PR. We left them untouched and report them here for
awareness:

```
langchain/vectorstores/mongodb_atlas.py:137: error: Argument 1 to "insert_many" of "Collection" has incompatible type "List[Dict[str, Sequence[object]]]"; expected "Iterable[Union[MongoDBDocumentType, RawBSONDocument]]"  [arg-type]
langchain/vectorstores/mongodb_atlas.py:186: error: Argument 1 to "aggregate" of "Collection" has incompatible type "List[object]"; expected "Sequence[Mapping[str, Any]]"  [arg-type]

langchain/vectorstores/qdrant.py:16: error: Name "grpc" is not defined  [name-defined]
langchain/vectorstores/qdrant.py:19: error: Name "grpc" is not defined  [name-defined]
langchain/vectorstores/qdrant.py:20: error: Name "grpc" is not defined  [name-defined]
langchain/vectorstores/qdrant.py:22: error: Name "grpc" is not defined  [name-defined]
langchain/vectorstores/qdrant.py:23: error: Name "grpc" is not defined  [name-defined]
```

In the same spirit, we observe that to even get `import langchain` to
run, it seems that a `pip install bs4` is missing from the minimal
package installation path.

Thank you!
2023-06-29 10:50:34 -07:00
Zander Chase
f5663603cf Throw error if evaluation key not present (#6874) 2023-06-29 10:30:39 -07:00
Zander Chase
be164b20d8 Accept any single input (#6888)
If I upload a dataset with a single input and output column, we should
be able to let the chain prepare the input without having to maintain a
strict dataset format.
2023-06-29 10:29:16 -07:00
Harrison Chase
8502117f62 bump version to 219 (#6899) 2023-06-28 23:48:42 -07:00
Pablo
6370808d41 Adding support for async (_acall) for VertexAICommon LLM (#5588)
# Adding support for async (_acall) for VertexAICommon LLM

This PR implements the `_acall` method under `_VertexAICommon`. Because
VertexAI itself does not provide an async interface, I implemented it
via a ThreadPoolExecutor that can delegate execution of VertexAI calls
to other threads.
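
The delegation approach described above can be sketched in a few lines. This is a minimal illustration, not the PR's code: `blocking_predict` is a hypothetical stand-in for a synchronous SDK call, and `apredict` is a made-up name.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def blocking_predict(prompt: str, temperature: float = 0.0) -> str:
    """Hypothetical stand-in for a synchronous SDK call."""
    time.sleep(0.05)
    return f"echo: {prompt}"


async def apredict(executor: ThreadPoolExecutor, prompt: str, **kwargs) -> str:
    loop = asyncio.get_running_loop()
    # run_in_executor only forwards positional arguments, so bind kwargs with partial.
    return await loop.run_in_executor(
        executor, partial(blocking_predict, prompt, **kwargs)
    )


async def main() -> list:
    with ThreadPoolExecutor(max_workers=4) as executor:
        # The three blocking calls run in worker threads while the event loop stays free.
        return await asyncio.gather(*(apredict(executor, p) for p in ["a", "b", "c"]))


results = asyncio.run(main())
```

`asyncio.gather` returns results in the order the awaitables were passed, regardless of which thread finishes first.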

Twitter handle: @polecitoem : )


## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

fyi - @agola11 for async functionality
fyi - @Ark-kun from VertexAI
2023-06-28 23:07:41 -07:00
Mike Salvatore
cbd759aaeb Fix inconsistent logging_and_data_dir parameter in AwaDB (#6775)
## Description

Tag maintainer: @rlancemartin, @eyurtsev 

### log_and_data_dir
`AwaDB.__init__()` accepts a parameter named `log_and_data_dir`. But
`AwaDB.from_texts()` and `AwaDB.from_documents()` accept a parameter
named `logging_and_data_dir`. This inconsistency in this parameter name
can lead to confusion on the part of the caller.

This PR renames `logging_and_data_dir` to `log_and_data_dir` to make all
functions consistent with the constructor.

### embedding

`AwaDB.__init__()` accepts a parameter named `embedding_model`. But
`AwaDB.from_texts()` and `AwaDB.from_documents()` accept a parameter
named `embeddings`. This inconsistency in this parameter name can lead
to confusion on the part of the caller.

This PR renames `embedding_model` to `embeddings` to make AwaDB's
constructor consistent with the classmethod "constructors" as specified
by `VectorStore` abstract base class.
2023-06-28 23:06:52 -07:00
Harrison Chase
3ac08c3de4 Harrison/octo ml (#6897)
Co-authored-by: Bassem Yacoube <125713079+AI-Bassem@users.noreply.github.com>
Co-authored-by: Shotaro Kohama <khmshtr28@gmail.com>
Co-authored-by: Rian Dolphin <34861538+rian-dolphin@users.noreply.github.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
Co-authored-by: Shashank Deshpande <shashankdeshpande18@gmail.com>
2023-06-28 23:04:11 -07:00
Jiří Moravčík
a6b40b73e5 Add call_actor_task to the Apify integration (#6862)
A user has been testing the Apify integration inside langchain and he
was not able to run saved Actor tasks.

This PR adds support for calling saved Actor tasks on the Apify platform
to the existing integration. The structure is very similar to that of
calling Actors.
2023-06-28 22:13:47 -07:00
Shashank Deshpande
99cfe192da added example notebook - use custom functions with openai agent (#6865)
2023-06-28 22:07:33 -07:00
Rian Dolphin
2e39ede848 add with score option for max marginal relevance (#6867)
### Adding the functionality to return the scores with retrieved documents when using the max marginal relevance
- Description: Add the method
`max_marginal_relevance_search_with_score_by_vector` to the FAISS
wrapper. It operates the same way as
`similarity_search_with_score_by_vector`, except that it uses the max
marginal relevance retrieval framework, as in the
`max_marginal_relevance_search_by_vector` method.
  - Dependencies: None
  - Tag maintainer: @rlancemartin @eyurtsev 
  - Twitter handle: @RianDolphin
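
To make the idea concrete, here is a small pure-Python sketch of max marginal relevance selection that also returns a score per selected item. It is not the FAISS wrapper's implementation: the function names are hypothetical, and the score returned here is cosine similarity to the query, which is an assumption made for this example.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mmr_with_scores(
    query: Sequence[float],
    candidates: List[Sequence[float]],
    k: int = 2,
    lambda_mult: float = 0.5,
) -> List[Tuple[int, float]]:
    """Return (candidate index, similarity to query) pairs chosen by MMR."""
    query_sims = [cosine(query, c) for c in candidates]
    selected: List[int] = []
    while len(selected) < min(k, len(candidates)):
        best_idx, best_value = -1, -math.inf
        for i, cand in enumerate(candidates):
            if i in selected:
                continue
            # Redundancy: highest similarity to anything already selected.
            redundancy = max(
                (cosine(cand, candidates[j]) for j in selected), default=0.0
            )
            value = lambda_mult * query_sims[i] - (1 - lambda_mult) * redundancy
            if value > best_value:
                best_idx, best_value = i, value
        selected.append(best_idx)
    return [(i, query_sims[i]) for i in selected]


picked = mmr_with_scores(
    query=[1.0, 0.0],
    candidates=[[1.0, 0.0], [0.9, 0.1], [0.6, 0.8]],
    k=2,
    lambda_mult=0.3,
)
```

With a low `lambda_mult`, the second pick favors the more distinct third candidate over the near-duplicate second one, and the caller still sees how similar each pick is to the query.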

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-28 22:00:34 -07:00
Shotaro Kohama
398e4cd2dc Update langchain.chains.create_extraction_chain_pydantic to parse results successfully (#6887)
- Description: The current code uses `PydanticSchema.schema()` and
`_get_extraction_function` at the same time. As a result, a response
from OpenAI has two nested `info`, and
`PydanticAttrOutputFunctionsParser` fails to parse it. This PR will use
the pydantic class given as an arg instead.
- Issue: no related issue yet
- Dependencies: no dependency change
- Tag maintainer: @dev2049
- Twitter handle: @shotarok28
2023-06-28 21:57:41 -07:00
Eduard van Valkenburg
57f370cde9 PowerBI Toolkit additional logs (#6881)
Added some additional logs to better be able to troubleshoot and
understand the performance of the call to PBI vs the rest of the work.
2023-06-28 18:16:41 -07:00
Robert Lewis
c9c8d2599e Update Zapier Jupyter notebook to include brief OAuth example (#6892)
Description: Adds a brief example of using an OAuth access token with
the Zapier wrapper. Also links to the Zapier documentation to learn more
about OAuth flows.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-28 18:06:22 -07:00
Zhicheng Geng
16b11bda83 Use getLogger instead of basicConfig in multi_query.py (#6891)
Remove `logging.basicConfig`, which turns on logging. Use `getLogger`
instead
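
As a brief illustration of the pattern (the `generate_queries` function is a hypothetical placeholder, not the retriever's code): `logging.basicConfig` configures the root logger for the whole process, whereas a module-level logger leaves that decision to the application.

```python
import logging

# Module-level logger: library code emits records, the application decides
# whether and where they are shown.
logger = logging.getLogger(__name__)


def generate_queries(question: str) -> list:
    queries = [question, f"{question} (rephrased)"]
    logger.info("Generated queries: %s", queries)
    return queries


queries = generate_queries("What is task decomposition?")
```

An application that wants to see these messages can opt in, for example by calling `logging.basicConfig()` itself and setting the level of the module's logger to `logging.INFO`.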
2023-06-28 18:06:10 -07:00
Davis Chase
f07dd02b50 Docs /redirects (#6790)
Auto-generated a bunch of redirects from initial docs refactor commit
2023-06-28 17:07:53 -07:00
Harrison Chase
e5611565b7 bump version to 218 (#6857) 2023-06-27 23:36:37 -07:00
Yaohui Wang
9d1bd18596 feat (documents): add LarkSuite document loader (#6420)
### Summary

This PR adds a LarkSuite (FeiShu) document loader. 
> [LarkSuite](https://www.larksuite.com/) is an enterprise collaboration
platform developed by ByteDance.

### Tests

- an integration test case is added
- an example notebook showing usage is added. [Notebook
preview](https://github.com/yaohui-wyh/langchain/blob/master/docs/extras/modules/data_connection/document_loaders/integrations/larksuite.ipynb)

### Who can review?

- PTAL @eyurtsev @hwchase17


---------

Co-authored-by: Yaohui Wang <wangyaohui.01@bytedance.com>
2023-06-27 23:08:05 -07:00
Jingsong Gao
a435a436c1 feat(document_loaders): add tencent cos directory and file loader (#6401)
- add tencent cos directory and file support for document-loader

#### Who can review?

@eyurtsev
2023-06-27 23:07:20 -07:00
Ninely
d6cd0deaef feat: Add streaming only final aiter of agent (#6274)
#### Add streaming only final async iterator of agent
This callback returns an async iterator and only streams the final
output of an agent.

#### Who can review?

Tag maintainers/contributors who might be interested: @agola11

2023-06-27 23:06:25 -07:00
Shashank Deshpande
1db266b20d Update link in apis.mdx (#6812)
2023-06-27 23:00:26 -07:00
Lance Martin
3f9900a864 Create MultiQueryRetriever (#6833)
Distance-based vector database retrieval embeds (represents) queries in
high-dimensional space and finds similar embedded documents based on
"distance". However, retrieval may produce different results with subtle
changes in query wording or if the embeddings do not capture the
semantics of the data well. Prompt engineering / tuning is sometimes
done to manually address these problems, but can be tedious.

The `MultiQueryRetriever` automates the process of prompt tuning by
using an LLM to generate multiple queries from different perspectives
for a given user input query. For each query, it retrieves a set of
relevant documents and takes the unique union across all queries to get
a larger set of potentially relevant documents. By generating multiple
perspectives on the same question, the `MultiQueryRetriever` might be
able to overcome some of the limitations of the distance-based retrieval
and get a richer set of results.
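
The flow described above (generate several queries, retrieve for each, take the unique union) can be sketched as follows. This is an illustration only: `fake_generate_queries` stands in for the LLM call and `fake_retrieve` for the vector store, and none of these names come from the `MultiQueryRetriever` code.

```python
from typing import Callable, Dict, List


def unique_union(doc_lists: List[List[str]]) -> List[str]:
    """Merge result lists, dropping duplicates while keeping first-seen order."""
    seen = set()
    merged: List[str] = []
    for docs in doc_lists:
        for doc in docs:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


def multi_query_retrieve(
    question: str,
    generate_queries: Callable[[str], List[str]],
    retrieve: Callable[[str], List[str]],
) -> List[str]:
    queries = generate_queries(question)
    return unique_union([retrieve(q) for q in queries])


# Hypothetical stand-ins for the LLM and the vector store.
FAKE_INDEX: Dict[str, List[str]] = {
    "task decomposition": ["doc-a", "doc-b"],
    "how do agents split tasks?": ["doc-b", "doc-c"],
}


def fake_generate_queries(question: str) -> List[str]:
    return list(FAKE_INDEX)


def fake_retrieve(query: str) -> List[str]:
    return FAKE_INDEX.get(query, [])


docs = multi_query_retrieve("task decomposition", fake_generate_queries, fake_retrieve)
```

The second phrasing surfaces `doc-c`, which the original wording alone would have missed, while `doc-b` appears only once in the merged result.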

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-27 22:59:40 -07:00
Tim Asp
3ca1a387c2 Web Loader: Add proxy support (#6792)
Proxies are helpful, especially when you start querying against more
anti-bot websites.

[Proxy
services](https://developers.oxylabs.io/advanced-proxy-solutions/web-unblocker/making-requests)
(of which there are many) and `requests` make it easy to rotate IPs to
prevent banning by just passing along a simple dict to `requests`.
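
For reference, the "simple dict" mentioned above maps URL schemes to a proxy address. A minimal sketch, assuming a placeholder proxy URL; `build_proxies` and `fetch` are hypothetical helper names and not part of the loader:

```python
from typing import Dict


def build_proxies(proxy_url: str) -> Dict[str, str]:
    """Return the scheme-to-proxy mapping that `requests` accepts via `proxies=`."""
    return {"http": proxy_url, "https": proxy_url}


def fetch(url: str, proxy_url: str) -> str:
    # Third-party dependency, imported lazily so build_proxies has none.
    import requests

    response = requests.get(url, proxies=build_proxies(proxy_url), timeout=10)
    response.raise_for_status()
    return response.text


# Placeholder credentials and host, for illustration only.
proxies = build_proxies("http://user:password@proxy.example.com:8080")
```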

CC @rlancemartin, @eyurtsev
2023-06-27 22:27:49 -07:00
Ayan Bandyopadhyay
f92ccf70fd Update to the latest Psychic python library version (#6804)
Update the Psychic document loader to use the latest `psychicapi` python
library version: `0.8.0`
2023-06-27 22:26:38 -07:00
Hun-soo Jung
f3d178f600 Specify utilities package in SerpAPIWrapper docstring (#6821)
- Description: Specify utilities package in SerpAPIWrapper docstring
  - Issue: Not an issue
  - Dependencies: (n/a)
  - Tag maintainer: @dev2049 
  - Twitter handle: (n/a)
2023-06-27 22:26:20 -07:00
Matt Robinson
dd2a151543 Docs/unstructured api key (#6781)
### Summary

The Unstructured API will soon begin requiring API keys. This PR updates
the Unstructured integrations docs with instructions on how to generate
Unstructured API keys.

### Reviewers

@rlancemartin
@eyurtsev
@hwchase17
2023-06-27 16:54:15 -07:00
Matthew Plachter
d6664af0ee add async to zapier nla tools (#6791)
- Description: Add async functionality to Zapier NLA tools
- Issue: n/a
- Dependencies: n/a
- Tag maintainer: @vowelparrot (Agents / Tools / Toolkits), @agola11 (Async)
2023-06-27 16:53:35 -07:00
Neil Neuwirth
efe0d39c6a Adjusted OpenAI cost calculation (#6798)
Added parentheses to ensure the division operation is performed before
multiplication. This now calculates the cost by dividing the number of
tokens by 1000 first (to get the number of 1k-token units), and then
multiplying the result by the model's cost per 1k tokens @agola11
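
The calculation described above, written out as a sketch. The function name and the rate used in the example are hypothetical and are not taken from the LangChain source or from any price list.

```python
def completion_cost(num_tokens: int, cost_per_1k_tokens: float) -> float:
    """Cost = (tokens / 1000) * price per 1k tokens, with the division made explicit."""
    return (num_tokens / 1000) * cost_per_1k_tokens


# 1500 tokens at a made-up rate of 0.02 per 1k tokens.
cost = completion_cost(1500, 0.02)
```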
2023-06-27 16:53:06 -07:00
Ian
b4c196f785 fix pinecone delete bug (#6816)
The implementation of `delete` in the Pinecone vector store omits the
namespace, which causes the delete to fail
2023-06-27 16:50:17 -07:00
Janos Tolgyesi
f1070de038 WebBaseLoader: optionally raise exception in the case of http error (#6823)
- **Description**: this PR adds the possibility to raise an exception
when the http request does not return a 2xx status code. This is
particularly useful when the url points to a non-existent web page: the
server returns a http status of 404 NOT FOUND, but WebBaseLoader
nevertheless parses and returns the http body of the error message.
  - **Dependencies**: none,
  - **Tag maintainer**: @rlancemartin, @eyurtsev,
  - **Twitter handle**: jtolgyesi
2023-06-27 16:43:59 -07:00
rafael
ef72a7cf26 rail_parser: Allow creation from pydantic (#6832)
Adds a way to create the guardrails output parser from a pydantic model.
2023-06-27 16:40:52 -07:00
Augustine Theodore
a980095efc Enhancement : Ignore deleted messages and media in WhatsAppChatLoader (#6839)
- Description: Ignore deleted messages and media
  - Issue: #6838 
  - Dependencies: No new dependencies
  - Tag maintainer: @rlancemartin, @eyurtsev
2023-06-27 16:36:55 -07:00
Robert Lewis
74848aafea Zapier - Add better error messaging for 401 responses (#6840)
Description: When a 401 response is given back by Zapier, hint to the
end user why that may have occurred

- If an API Key was initialized with the wrapper, ask them to check
their API Key value
- If an access token was initialized with the wrapper, ask them to check
their access token or verify that it doesn't need to be refreshed.

Tag maintainer: @dev2049
2023-06-27 16:35:42 -07:00
Matt Robinson
b24472eae3 feat: Add UnstructuredOrgModeLoader (#6842)
### Summary

Adds `UnstructuredOrgModeLoader` for processing
[Org-mode](https://en.wikipedia.org/wiki/Org-mode) documents.

### Testing

```python
from langchain.document_loaders import UnstructuredOrgModeLoader

loader = UnstructuredOrgModeLoader(
    file_path="example_data/README.org", mode="elements"
)
docs = loader.load()
print(docs[0])
```

### Reviewers

- @rlancemartin
- @eyurtsev
- @hwchase17
2023-06-27 16:34:17 -07:00
Piyush Jain
e53995836a Added missing attribute value object (#6849)
## Description
Adds a missing type class for
[AdditionalResultAttributeValue](https://docs.aws.amazon.com/kendra/latest/APIReference/API_AdditionalResultAttributeValue.html).
Fixes a validation failure for query API responses that contain
`AdditionalAttributes`.

cc @dev2049 
cc @zhichenggeng
2023-06-27 16:30:11 -07:00
Cristóbal Carnero Liñán
e494b0a09f feat (documents): add a source code loader based on AST manipulation (#6486)
#### Summary

A new approach to loading source code is implemented:

Each top-level function and class in the code is loaded into separate
documents. Then, an additional document is created with the top-level
code, but without the already loaded functions and classes.

This could improve the accuracy of QA chains over source code.

For instance, having this script:

```
class MyClass:
    def __init__(self, name):
        self.name = name

    def greet(self):
        print(f"Hello, {self.name}!")

def main():
    name = input("Enter your name: ")
    obj = MyClass(name)
    obj.greet()

if __name__ == '__main__':
    main()
```

The loader will create three documents with this content:

First document:
```
class MyClass:
    def __init__(self, name):
        self.name = name

    def greet(self):
        print(f"Hello, {self.name}!")
```

Second document:
```
def main():
    name = input("Enter your name: ")
    obj = MyClass(name)
    obj.greet()
```

Third document:
```
# Code for: class MyClass:

# Code for: def main():

if __name__ == '__main__':
    main()
```

A threshold parameter is added to control whether small scripts are
split in this way or not.

At this moment, only Python and JavaScript are supported. The
appropriate parser is determined by examining the file extension.
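
For Python sources, the splitting described above can be approximated with the standard `ast` module. The following is a simplified sketch and not the loader's implementation: `split_python_source` is a hypothetical name, there is no threshold handling, and decorators are ignored.

```python
import ast
from typing import List, Optional

SAMPLE = '''class MyClass:
    def greet(self):
        print("Hello!")

def main():
    MyClass().greet()

if __name__ == '__main__':
    main()
'''


def split_python_source(source: str) -> List[str]:
    """One document per top-level function/class, plus the simplified remainder."""
    lines = source.splitlines()
    documents: List[str] = []
    simplified: List[Optional[str]] = list(lines)

    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            start, end = node.lineno - 1, node.end_lineno  # end_lineno: Python 3.8+
            documents.append("\n".join(lines[start:end]))
            # Leave a one-line marker where the definition used to be.
            simplified[start] = f"# Code for: {lines[start]}"
            for i in range(start + 1, end):
                simplified[i] = None

    documents.append("\n".join(line for line in simplified if line is not None))
    return documents


docs = split_python_source(SAMPLE)
```

Here `docs` holds the class, then `main`, then the remaining top-level code with `# Code for:` markers, mirroring the three documents listed above.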

#### Tests

This PR adds:

- Unit tests
- Integration tests

#### Dependencies

Only one dependency was added as optional (needed for the JavaScript
parser).

#### Documentation

A notebook is added showing how the loader can be used.

#### Who can review?

@eyurtsev @hwchase17

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
2023-06-27 15:58:47 -07:00
Robert Lewis
da462d9dd4 Zapier update oauth support (#6780)
Description: Update documentation to

1) point to updated documentation links at Zapier.com (we've revamped
our help docs and paths), and
2) provide clarity on how to use the wrapper with an access token for
OAuth support

Demo:

Initializing the Zapier Wrapper with an OAuth Access Token

`ZapierNLAWrapper(zapier_nla_oauth_access_token="<redacted>")`

Using LangChain to resolve the current weather in Vancouver BC,
leveraging Zapier NLA to look up weather by coords.

```
> Entering new  chain...
 I need to use a tool to get the current weather.
Action: The Weather: Get Current Weather
Action Input: Get the current weather for Vancouver BC
Observation: {"coord__lon": -123.1207, "coord__lat": 49.2827, "weather": [{"id": 802, "main": "Clouds", "description": "scattered clouds", "icon": "03d", "icon_url": "http://openweathermap.org/img/wn/03d@2x.png"}], "weather[]icon_url": ["http://openweathermap.org/img/wn/03d@2x.png"], "weather[]icon": ["03d"], "weather[]id": [802], "weather[]description": ["scattered clouds"], "weather[]main": ["Clouds"], "base": "stations", "main__temp": 71.69, "main__feels_like": 71.56, "main__temp_min": 67.64, "main__temp_max": 76.39, "main__pressure": 1015, "main__humidity": 64, "visibility": 10000, "wind__speed": 3, "wind__deg": 155, "wind__gust": 11.01, "clouds__all": 41, "dt": 1687806607, "sys__type": 2, "sys__id": 2011597, "sys__country": "CA", "sys__sunrise": 1687781297, "sys__sunset": 1687839730, "timezone": -25200, "id": 6173331, "name": "Vancouver", "cod": 200, "summary": "scattered clouds", "_zap_search_was_found_status": true}
Thought: I now know the current weather in Vancouver BC.
Final Answer: The current weather in Vancouver BC is scattered clouds with a temperature of 71.69 and wind speed of 3
```
2023-06-27 11:46:32 -07:00
Joshua Carroll
24e4ae95ba Initial Streamlit callback integration doc (md) (#6788)
**Description:** Add a documentation page for the Streamlit Callback
Handler integration (#6315)

Notes:
- Implemented as a markdown file instead of a notebook since example
code runs in a Streamlit app (happy to discuss / consider alternatives
now or later)
- Contains an embedded Streamlit app
(https://mrkl-minimal.streamlit.app/). Currently this app is hosted out
of a Streamlit repo, but we're working to migrate the code to a
LangChain-owned repo


![streamlit_docs](https://github.com/hwchase17/langchain/assets/116604821/0b7a6239-361f-470c-8539-f22c40098d1a)

cc @dev2049 @tconkling
2023-06-27 11:43:49 -07:00
Harrison Chase
8392ca602c bump version to 217 (#6831) 2023-06-27 09:39:56 -07:00
Ismail Pelaseyed
fcb3a64799 Add support for passing headers and search params to openai openapi chain (#6782)
- Description: add support for passing headers and search params to
OpenAI OpenAPI chains.
  - Issue: n/a
  - Dependencies: n/a
  - Tag maintainer: @hwchase17
  - Twitter handle: @pelaseyed

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-27 09:09:03 -07:00
Zander Chase
e1fdb67440 Update description in Evals notebook (#6808) 2023-06-27 00:26:49 -07:00
Zander Chase
ad028bbb80 Permit Constitutional Principles (#6807)
In the criteria evaluator.
2023-06-27 00:23:54 -07:00
Zander Chase
6ca383ecf6 Update to RunOnDataset helper functions to accept evaluator callbacks (#6629)
Also improve docstrings and update the tracing datasets notebook to
focus on "debug, evaluate, monitor"
2023-06-26 23:58:13 -07:00
WaseemH
7ac9b22886 RecusiveUrlLoader to RecursiveUrlLoader (#6787) 2023-06-26 23:12:14 -07:00
Mshoven
4535b0b41e 🎯Bug: format the url and path_params (#6755)
- Description: format the url and path_params correctly, 
  - Issue: #6753,
  - Dependencies: None,
  - Tag maintainer: @vowelparrot,
  - Twitter handle: @0xbluesecurity
2023-06-26 23:03:57 -07:00
Zander Chase
07d802d088 Don't raise error if parent not found (#6538)
Done so that you can pass in a run from the low level api
2023-06-26 22:57:52 -07:00
Leonid Ganeline
49c864fa18 docs: vectorstore upgrades 2 (#6796)
updated vectorstores/ notebooks; added new integrations into
ecosystem/integrations/
@dev2049
@rlancemartin, @eyurtsev
2023-06-26 22:55:04 -07:00
Zander Chase
d7dbf4aefe Clean up agent trajectory interface (#6799)
- Enable reference
- Enable not specifying tools at the start
- Add methods with keywords
2023-06-26 22:54:04 -07:00
Zander Chase
cc60fed3be Add a Pairwise Comparison Chain (#6703)
Notebook shows preference scoring between two chains and reports wilson
score interval + p value

I think I'll add the option to insert ground truth labels, but that
doesn't have to be in this PR
2023-06-26 20:47:41 -07:00
Hakan Tekgul
2928b080f6 Update arize_callback.py - bug fix (#6784)
- Description: Bug Fix - Added a step variable to keep track of prompts
- Issue: Bug from internal Arize testing - The prompts and responses
that are ingested were not mapped correctly
  - Dependencies: N/A
2023-06-26 16:49:46 -07:00
Zander Chase
c460b04c64 Update String Evaluator (#6615)
- Add protocol for `evaluate_strings` 
- Move the criteria evaluator out so it's not restricted to being
applied on traced runs
2023-06-26 14:16:14 -07:00
AaaCabbage
b3f8324de9 feat: fix the Chinese characters in the solution content will be conv… (#6734)
Fix an issue where Chinese characters in the solution content are
converted to ASCII encoding, resulting in an abnormally large number of
tokens
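
The PR text does not name the call involved. One common cause of this symptom in Python is `json.dumps`, which escapes non-ASCII characters by default; the sketch below illustrates the effect under that assumption, with a made-up payload.

```python
import json

solution = {"answer": "你好，世界"}

# Default ensure_ascii=True: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(solution)

# ensure_ascii=False keeps the characters as they are.
readable = json.dumps(solution, ensure_ascii=False)
```

The escaped form is several times longer for the same content, which is consistent with the inflated token count described above.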


Co-authored-by: qixin <qixin@fintec.ai>
2023-06-26 13:14:48 -07:00
Chris Pappalardo
70f7c2bb2e align chroma vectorstore get with chromadb to enable where filtering (#6686)
allows for where filtering on collection via get

- Description: aligns langchain chroma vectorstore get with underlying
[chromadb collection
get](https://github.com/chroma-core/chroma/blob/main/chromadb/api/models/Collection.py#L103)
allowing for where filtering, etc.
  - Issue: NA
  - Dependencies: none
  - Tag maintainer: @rlancemartin, @eyurtsev
  - Twitter handle: @pappanaka
2023-06-26 10:51:20 -07:00
Zander Chase
9ca3b4645e Add support for tags in chain group context manager (#6668)
Lets you specify local and inheritable tags in the group manager.

Also, add more verbose docstrings for our reference docs.
2023-06-26 10:37:33 -07:00
Harrison Chase
d1bcc58beb bump version to 216 (#6770) 2023-06-26 09:46:19 -07:00
Zander Chase
6d30acffcb Fix breaking tags (#6765)
Fix tags change that broke old way of initializing agent

Closes #6756
2023-06-26 09:28:11 -07:00
James Croft
ba622764cb Improve performance when retrieving Notion DB pages (#6710) 2023-06-26 05:46:09 -07:00
Richy Wang
ec8247ec59 Fixed bug in AnalyticDB Vector Store caused by upgrade SQLAlchemy version (#6736) 2023-06-26 05:35:25 -07:00
Santiago Delgado
d84a3bcf7a Office365 Tool (#6306)
#### Background
With the development of [structured
tools](https://blog.langchain.dev/structured-tools/), the LangChain team
expanded the platform's functionality to meet the needs of new
applications. The GMail tool, empowered by structured tools, now
supports multiple arguments and powerful search capabilities,
demonstrating LangChain's ability to interact with dynamic data sources
like email servers.

#### Challenge
The current GMail tool only supports GMail, while users often utilize
other email services like Outlook in Office365. Additionally, the
proposed calendar tool in PR
https://github.com/hwchase17/langchain/pull/652 only works with Google
Calendar, not Outlook.

#### Changes
This PR implements an Office365 integration for LangChain, enabling
seamless email and calendar functionality with a single authentication
process.

#### Future Work
With the core Office365 integration complete, future work could include
integrating other Office365 tools such as Tasks and Address Book.

#### Who can review?
@hwchase17 or @vowelparrot can review this PR

#### Appendix
@janscas, I utilized your [O365](https://github.com/O365/python-o365)
library extensively. Given the rising popularity of LangChain and
similar AI frameworks, the convergence of libraries like O365 and tools
like this one is likely. So, I wanted to keep you updated on our
progress.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-26 02:59:09 -07:00
Xiaochao Dong
a15afc102c Relax the action input check for actions that require no input (#6357)
When the tool requires no input, the LLM often gives something like
this:
```json
{
    "action": "just_do_it"
}
```
I have attempted to enhance the prompt, but it doesn't appear to be
functioning effectively. Therefore, I believe we should consider easing
the check a little bit.
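
A relaxed check of this kind could look like the sketch below. The function name `parse_action` and the empty-dict default are assumptions for illustration, not the parser code changed in this PR.

```python
import json


def parse_action(text: str):
    """Parse an LLM action blob, tolerating a missing "action_input" key."""
    payload = json.loads(text)
    if "action" not in payload:
        raise ValueError(f"Could not parse action from: {text!r}")
    # Tools that take no input often come back without "action_input",
    # so fall back to an empty mapping instead of raising.
    return payload["action"], payload.get("action_input", {})


action, action_input = parse_action('{"action": "just_do_it"}')
```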



Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
2023-06-26 02:30:17 -07:00
Ethan Bowen
cc33bde74f Confluence added (#6432)
Adding Confluence to Jira tool. Can create a page in Confluence with
this PR. If accepted, will extend functionality to Bitbucket and
additional Confluence features.



---------

Co-authored-by: Ethan Bowen <ethan.bowen@slalom.com>
2023-06-26 02:28:04 -07:00
Surya Nudurupati
2aeb8e7dbc Improved Documentation: Eliminating Redundancy in the Introduction.mdx (#6360)
When the documentation was originally written, the phrase "using the" was
typed twice. This PR removes the duplicate.
2023-06-26 02:27:36 -07:00
rajib
0f6ef048d2 The openai_info.py does not have gpt-35-turbo which is the underlying Azure Open AI model name (#6321)
Because this model name is missing from `MODEL_COST_PER_1K_TOKENS`,
`get_openai_callback()` does not report the token cost for the GPT-3.5 model on
Azure OpenAI. This PR fixes that issue.


#### Who can review?
 @hwchase17
 @agola11

Co-authored-by: rajib76 <rajib76@yahoo.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-26 02:16:39 -07:00
ArchimedesFTW
fe941cb54a Change tags(str) to tags(dict) in mlflow_callback.py docs (#6473)
Fixes #6472

#### Who can review?

@agola11
2023-06-26 02:12:23 -07:00
0xcrusher
9187d2f3a9 Fixed caching bug for Multiple Caching types by correctly checking types (#6746)
- Fixed an issue where some caching types check the wrong types, hence
not allowing caching to work


Maintainer responsibilities:
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
2023-06-26 01:14:32 -07:00
Harrison Chase
e9877ea8b1 Tiktoken override (#6697) 2023-06-26 00:49:32 -07:00
Gabriel Altay
f9771700e4 prevent DuckDuckGoSearchAPIWrapper from consuming top result (#6727)
remove the `next` call that checks for None on the results generator
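
The pitfall is generic to Python generators: peeking with `next` consumes the first item. A minimal sketch with hypothetical helper names and fake results (not the wrapper's actual code):

```python
from itertools import islice


def fake_results():
    """Stand-in for a search-results generator."""
    yield from ["first", "second", "third"]


def take_buggy(gen, n):
    # The emptiness check itself consumes the top result.
    if next(gen, None) is None:
        return []
    return list(islice(gen, n))


def take_fixed(gen, n):
    # islice already yields an empty list when the generator is empty.
    return list(islice(gen, n))


buggy = take_buggy(fake_results(), 2)  # the top result is lost
fixed = take_fixed(fake_results(), 2)  # the top result is kept
```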
2023-06-25 19:54:15 -07:00
Pau Ramon Revilla
87802c86d9 Added a MHTML document loader (#6311)
MHTML is an interesting format because it is used both for emails and for
archived webpages. Some scraping projects want to store pages on disk and
process them later; MHTML is well suited to that use case.

This is heavily inspired by the BeautifulSoup HTML loader, but it extracts the
HTML part from the MHTML file.
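
Because MHTML is a MIME multipart document, the HTML part can be pulled out with the standard library alone. The sketch below assumes a tiny hand-written sample and a hypothetical helper name; it is not the loader added in this PR, which also runs the HTML through BeautifulSoup.

```python
import email

# Hand-written stand-in for the contents of an .mhtml file
SAMPLE = "\n".join(
    [
        "MIME-Version: 1.0",
        'Content-Type: multipart/related; boundary="BOUND"',
        "",
        "--BOUND",
        "Content-Type: text/html; charset=utf-8",
        "",
        "<html><body><p>Hello</p></body></html>",
        "--BOUND--",
        "",
    ]
)


def html_from_mhtml(raw: str) -> str:
    """Return the first text/html part of an MHTML document, or '' if there is none."""
    message = email.message_from_string(raw)
    for part in message.walk():
        if part.get_content_type() == "text/html":
            # decode=True undoes quoted-printable / base64 transfer encodings
            payload = part.get_payload(decode=True)
            return payload.decode(part.get_content_charset() or "utf-8")
    return ""


html = html_from_mhtml(SAMPLE)
```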

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
2023-06-25 13:12:08 -07:00
Janos Tolgyesi
05eec99269 beautifulsoup get_text kwargs in WebBaseLoader (#6591)
# beautifulsoup get_text kwargs in WebBaseLoader

- Description: this PR introduces an optional `bs_get_text_kwargs`
parameter to `WebBaseLoader` constructor. It can be used to pass kwargs
to the downstream BeautifulSoup.get_text call. The most common usage
might be to pass a custom text separator, as seen also in
`BSHTMLLoader`.
  - Tag maintainer: @rlancemartin, @eyurtsev
  - Twitter handle: jtolgyesi
2023-06-25 12:42:27 -07:00
Matt Robinson
be68f6f8ce feat: Add UnstructuredRSTLoader (#6594)
### Summary

Adds an `UnstructuredRSTLoader` for loading
[reStructuredText](https://en.wikipedia.org/wiki/ReStructuredText) files.

### Testing

```python
from langchain.document_loaders import UnstructuredRSTLoader

loader = UnstructuredRSTLoader(
    file_path="example_data/README.rst", mode="elements"
)
docs = loader.load()
print(docs[0])
```

### Reviewers

- @hwchase17 
- @rlancemartin 
- @eyurtsev
2023-06-25 12:41:57 -07:00
Chip Davis
b32cc01c9f feat: added tqdm progress bar to UnstructuredURLLoader (#6600)
- Description: Adds a simple progress bar with tqdm when using
UnstructuredURLLoader. Exposes a new parameter, `show_progress_bar`. Very
simple PR.
- Issue: N/A
- Dependencies: N/A
- Tag maintainer: @rlancemartin @eyurtsev

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-25 12:41:25 -07:00
Augustine Theodore
afc292e58d Fix WhatsAppChatLoader : Enable parsing additional formats (#6663)
- Description: Updated the regex to support a new format observed in
exported WhatsApp chats.
  - Issue: #6654
  - Dependencies: No new dependencies
  - Tag maintainer: @rlancemartin, @eyurtsev
2023-06-25 12:08:43 -07:00
Sumanth Donthula
3e30a5d967 updated sql_database.py for returning sorted table names. (#6692)
Added code to get the tables info in sorted order in methods
get_usable_table_names and get_table_info.

Linked to Issue: #6640
2023-06-25 12:04:24 -07:00
刘 方瑞
9d1b3bab76 Fix Typo in LangChain MyScale Integration Doc (#6705)
- Description: Fix Typo in LangChain MyScale Integration  Doc

@hwchase17
2023-06-25 11:54:00 -07:00
sudolong
408c8d0178 fix chroma _similarity_search_with_relevance_scores missing kwargs … (#6708)
Issue: https://github.com/hwchase17/langchain/issues/6707
2023-06-25 11:53:42 -07:00
Zander Chase
d89e10d361 Fix Multi Functions Agent Tracing (#6702)
Confirmed it works now:
https://dev.langchain.plus/public/0dc32ce0-55af-432e-b09e-5a1a220842f5/r
2023-06-25 10:39:04 -07:00
Harrison Chase
1742db0c30 bump version to 215 (#6719) 2023-06-25 08:52:51 -07:00
Ankush Gola
e1b801be36 split up batch llm calls into separate runs (#5804) 2023-06-24 21:03:31 -07:00
Davis Chase
1da99ce013 bump v214 (#6694) 2023-06-24 14:23:11 -07:00
Lance Martin
dd36adc0f4 Make bs4 a local import in recursive_url_loader.py (#6693)
Resolve https://github.com/hwchase17/langchain/issues/6679
2023-06-24 13:54:10 -07:00
Harrison Chase
ef4c7b54ef bump to version 213 (#6688) 2023-06-24 11:56:37 -07:00
UmerHA
068142fce2 Add caching to BaseChatModel (issue #1644) (#5089)
#  Add caching to BaseChatModel
Fixes #1644

(Sidenote: While testing, I noticed we have multiple implementations of
Fake LLMs, used for testing. I consolidated them.)

## Who can review?
Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
Models
- @hwchase17
- @agola11

Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) | Discord:
RicChilligerDude#7589

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-24 11:45:09 -07:00
Harrison Chase
c289cc891a Harrison/optional ids opensearch (#6684)
Co-authored-by: taekimsmar <66041442+taekimsmar@users.noreply.github.com>
2023-06-24 09:19:57 -07:00
Hrag Balian
2518e6c95b Session deletion method in motorhead memory (#6609)
Motorhead Memory module didn't support deletion of a session. Added a
method to enable deletion.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-23 21:27:42 -07:00
Baichuan Sun
9fbe346860 Amazon API Gateway hosted LLM (#6673)
This PR adds a new LLM class for the Amazon API Gateway hosted LLM. The
PR also includes example notebooks for using the LLM class in an Agent
chain.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-23 21:27:25 -07:00
Davis Chase
fa1bb873e2 Fix openapi parameter parsing (#6676)
Ensure parameters are json serializable, related to #6671
2023-06-23 21:19:12 -07:00
Akash
b7e1c54947 Just corrected a small inconsistency on a doc page (#6603)
### Just corrected a small inconsistency on a doc page (not exactly a
typo, per se)
- Description: There was an inconsistency caused by the use of single quotes
in one place on the [Sequential
Chains](https://python.langchain.com/docs/modules/chains/foundational/sequential_chains)
page of the docs,
  - Issue: NA,
  - Dependencies: NA,
  - Tag maintainer: @dev2049,
  - Twitter handle: kambleakash0
2023-06-23 16:09:29 -07:00
Davis Chase
2da1aab50b Wiki loader lint (#6670) 2023-06-23 16:05:42 -07:00
Leonid Ganeline
1c81883d42 added docstrings where they missed (#6626)
This PR targets the `API Reference` documentation.
- Several classes and functions missed `docstrings`. These docstrings
were created.
- In several places this

```
except ImportError:
        raise ValueError(
```

was replaced with

```
except ImportError:
        raise ImportError(
```
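
The same pattern can be sketched as a small reusable helper. The helper name and the error wording are assumptions for illustration; the PR itself edits individual `except ImportError` blocks.

```python
import importlib


def import_optional(module_name: str, pip_name: str = ""):
    """Import an optional dependency, re-raising ImportError with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Raising ImportError (not ValueError) lets callers keep using
        # `except ImportError` to detect a missing dependency.
        raise ImportError(
            f"Could not import {module_name}. "
            f"Please install it with `pip install {pip_name or module_name}`."
        ) from exc


json_module = import_optional("json")
```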
2023-06-23 15:49:44 -07:00
Shashank
3364e5818b Changed generate_prompt.py (#6644)
Modified regex for Fix: ValueError: Could not parse output
2023-06-23 15:48:33 -07:00
Davis Chase
f1e1ac2a01 chroma nb close img tag (#6669) 2023-06-23 15:41:54 -07:00
eLafo
db8b13df4c adds doc_content_chars_max argument to WikipediaLoader (#6645)
# Description
It adds a new initialization param in `WikipediaLoader` so we can
override the `doc_content_chars_max` param used in `WikipediaAPIWrapper`
under the hood, e.g:

```python
from langchain.document_loaders import WikipediaLoader

# doc_content_chars_max is the new init param
loader = WikipediaLoader(query="python", doc_content_chars_max=90000)
```

## Decisions
`doc_content_chars_max` default value will be 4000, because it's the
current value
I have added pycode comments

# Issue
#6639

# Dependencies
None


# Twitter handle
[@elafo](https://twitter.com/elafo)
2023-06-23 15:22:09 -07:00
Davis Chase
5e5b30b74f openapi -> openai nit (#6667) 2023-06-23 15:09:02 -07:00
Jeff Huber
2acf109c4b update chroma notebook (#6664)
@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.
2023-06-23 15:03:06 -07:00
Eduard van Valkenburg
48381f1f78 PowerBI: catch outdated token (#6634)
This adds a small tweak to catch the error that says the token has
expired rather than retrying.
2023-06-23 15:01:08 -07:00
Piyush Jain
b1de927f1b Kendra retriever api (#6616)
## Description
Replaces [Kendra
Retriever](https://github.com/hwchase17/langchain/blob/master/langchain/retrievers/aws_kendra_index_retriever.py)
with an updated version that uses the new [retriever
API](https://docs.aws.amazon.com/kendra/latest/dg/searching-retrieve.html)
which is better suited for retrieval augmented generation (RAG) systems.

**Note**: This change requires the latest version (1.26.159) of boto3 to
work. `pip install -U boto3` to upgrade the boto3 version.

cc @hupe1980
cc @dev2049
2023-06-23 14:59:35 -07:00
ChrisLovejoy
4e5d78579b fix minor typo in vector_db_qa.mdx (#6604)
- Description: minor typo fixed - doesn't instead of does. No other
changes.
2023-06-23 14:57:37 -07:00
Ikko Eltociear Ashimine
73da193a4b Fix typo in myscale_self_query.ipynb (#6601) 2023-06-23 14:57:12 -07:00
Saarthak Maini
ba256b23f2 Fix Typo (#6595)
Resolves #6582
2023-06-23 14:56:54 -07:00
kourosh hakhamaneshi
f6fdabd20b Fix ray-project/Aviary integration (#6607)
- Description: The Aviary integration's URL has changed. This PR
fixes the integration for that change and also makes the input URL
optional in the API (since it can be set via environment variables).
  - Issue: N/A
  - Dependencies: N/A
  - Twitter handle: N/A

---------

Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
2023-06-23 14:49:53 -07:00
northern-64bit
dbe1d029ec Fix grammar mistake in base.py in planners (#6611)
Fix a typo in
`langchain/experimental/plan_and_execute/planners/base.py`, by changing
"Given input, decided what to do." to "Given input, decide what to do."

This is in the docstring for functions running LLM chains which shall
create a plan, "decided" does not make any sense in this context.
2023-06-23 14:47:10 -07:00
Aaron Pham
082976d8d0 fix(docs): broken link for OpenLLM (#6622)
This link for the notebook of OpenLLM is not migrated to the new format

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

2023-06-23 13:59:17 -07:00
Davis Chase
fe828185ed Dev2049/bump 212 (#6665) 2023-06-23 13:48:02 -07:00
Hassan Ouda
9e52134d30 ChatVertexAI broken - Fix error with sending context in params (#6652)
Vertex AI chat is currently broken because `context` is included in the
params, and `chat.send_message` does not accept it as a parameter.

- Closes issue [ChatVertexAI Error: _ChatSessionBase.send_message() got
an unexpected keyword argument 'context'
#6610](https://github.com/hwchase17/langchain/issues/6610)
2023-06-23 13:38:21 -07:00
Lance Martin
c2b25c17c5 Recursive URL loader (#6455)
We may want to load all URLs under a root directory.

For example, let's look at the [LangChain JS
documentation](https://js.langchain.com/docs/).

This has many interesting child pages that we may want to read in bulk.

Of course, the `WebBaseLoader` can load a list of pages. 

But, the challenge is traversing the tree of child pages and actually
assembling that list!
 
We do this using the `RecursiveUrlLoader`.

This also gives us the flexibility to exclude some children (e.g., the
`api` directory with > 800 child pages).
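
The traversal idea can be sketched without any network access by walking an in-memory link graph. Everything here (the `SITE` dict, `collect_child_urls`, the `exclude_dirs` parameter) is hypothetical and only illustrates the approach; the actual loader fetches and parses real pages.

```python
from collections import deque

# Hypothetical site: each URL maps to the links found on that page
SITE = {
    "https://example.com/docs/": [
        "https://example.com/docs/a",
        "https://example.com/docs/api/",
        "https://example.com/blog",
    ],
    "https://example.com/docs/a": [
        "https://example.com/docs/",
        "https://example.com/docs/a/b",
    ],
    "https://example.com/docs/a/b": [],
    "https://example.com/docs/api/": ["https://example.com/docs/api/x"],
    "https://example.com/docs/api/x": [],
}


def collect_child_urls(root, get_links, exclude_dirs=()):
    """Breadth-first walk of every URL under `root`, skipping excluded prefixes."""
    seen, queue, ordered = {root}, deque([root]), []
    while queue:
        url = queue.popleft()
        ordered.append(url)
        for link in get_links(url):
            if not link.startswith(root):
                continue  # stay under the root directory
            if any(link.startswith(prefix) for prefix in exclude_dirs):
                continue  # e.g. a very large `api` directory
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return ordered


urls = collect_child_urls(
    "https://example.com/docs/",
    lambda url: SITE.get(url, []),
    exclude_dirs=("https://example.com/docs/api/",),
)
```

The resulting list is what one would then hand to a page loader such as `WebBaseLoader`.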
2023-06-23 13:09:00 -07:00
Lance Martin
be02572d58 Add delete and ensure add_texts performs upsert (w/ ID optional) (#6126)
## Goal 

We want to ensure consistency across vectordbs:
1/ add `delete` by ID method to the base vectorstore class
2/ ensure `add_texts` performs `upsert` with ID optionally passed

## Testing
- [x] Pinecone: notebook test w/ `langchain_test` vectorstore.
- [x] Chroma: Review by @jeffchuber, notebook test w/ in memory
vectorstore.
- [x] Supabase: Review by @copple, notebook test w/ `langchain_test`
table.
- [x] Weaviate: Notebook test w/ `langchain_test` index. 
- [x] Elastic: Reviewed by @vestal. Notebook test w/ `langchain_test`
table.
- [ ] Redis: Asked for review from owner of recent `delete` method
https://github.com/hwchase17/langchain/pull/6222
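
The two goals above can be illustrated with a toy in-memory store. The class below is a hypothetical sketch (it stores raw texts and no embeddings), not the base vectorstore class or any of the integrations listed.

```python
import uuid


class ToyStore:
    """Toy store showing upsert-style `add_texts` and `delete` by ID."""

    def __init__(self):
        self._texts = {}

    def add_texts(self, texts, ids=None):
        texts = list(texts)
        if ids is None:
            # IDs are optional: generate them when the caller passes none
            ids = [str(uuid.uuid4()) for _ in texts]
        if len(ids) != len(texts):
            raise ValueError("ids and texts must have the same length")
        for doc_id, text in zip(ids, texts):
            self._texts[doc_id] = text  # an existing ID is overwritten (upsert)
        return list(ids)

    def delete(self, ids):
        for doc_id in ids:
            self._texts.pop(doc_id, None)

    def __len__(self):
        return len(self._texts)


store = ToyStore()
store.add_texts(["old text"], ids=["doc-1"])
store.add_texts(["new text"], ids=["doc-1"])  # same ID: replaced, not duplicated
```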
2023-06-23 13:03:10 -07:00
Lance Martin
393f469eb3 Create merge loader that combines documents from a set of loaders (#6659)
Simple utility loader that combines documents from a set of specified
loaders.
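
Conceptually, such a loader just concatenates the output of each wrapped loader's `load()`. A minimal sketch with hypothetical class names and plain strings in place of documents (not the class added in this PR):

```python
class ListLoader:
    """Stand-in loader that returns a fixed list of documents."""

    def __init__(self, docs):
        self.docs = docs

    def load(self):
        return list(self.docs)


class SketchMergedLoader:
    """Combine the documents returned by several loaders, in order."""

    def __init__(self, loaders):
        self.loaders = loaders

    def load(self):
        docs = []
        for loader in self.loaders:
            docs.extend(loader.load())
        return docs


merged = SketchMergedLoader([ListLoader(["a", "b"]), ListLoader(["c"])])
all_docs = merged.load()
```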
2023-06-23 13:02:48 -07:00
Davis Chase
6988039975 openapi_openai docstring (#6661) 2023-06-23 11:38:33 -07:00
Davis Chase
b25933b607 bump 211 (#6660) 2023-06-23 11:10:48 -07:00
Davis Chase
e013459b18 Openapi to openai (#6658) 2023-06-23 11:00:34 -07:00
Davis Chase
b062a3f938 bump 210 (#6656) 2023-06-23 09:37:58 -07:00
Alejandra De Luna
980c865174 fix: remove callbacks arg from Tool and StructuredTool inferred schema (#6483)
Fixes #5456 

This PR removes the `callbacks` argument from a tool's schema when
creating a `Tool` or `StructuredTool` with the `from_function` method
and `infer_schema` is set to `True`. The `callbacks` argument is now
removed in the `create_schema_from_function` and `_get_filtered_args`
methods. As suggested by @vowelparrot, this fix provides a
straightforward solution that minimally affects the existing
implementation.
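
The filtering idea can be sketched with `inspect.signature`. The helper and the example tool function below are hypothetical; the actual change lives in `create_schema_from_function` and `_get_filtered_args`.

```python
import inspect


def inferred_arg_names(func, exclude=("callbacks",)):
    """Names of the parameters a tool schema would expose, minus excluded ones."""
    return [
        name
        for name in inspect.signature(func).parameters
        if name not in exclude
    ]


def search(query: str, limit: int = 5, callbacks=None):
    """Example tool function that accepts a callbacks argument."""
    return f"{query}:{limit}"


arg_names = inferred_arg_names(search)  # `callbacks` is left out of the schema
```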

A test was added to verify that this change enables the expected use of
`Tool` and `StructuredTool` when using a `CallbackManager` and inferring
the tool's schema.

  - @hwchase17
2023-06-23 01:48:27 -07:00
Zander Chase
b4fe7f3a09 Session to project (#6249)
Sessions are being renamed to projects in the tracer
2023-06-23 01:11:01 -07:00
Zander Chase
9c09861946 Add tags in agent initialization (#6559)
Add better docstrings for agent executor as well

Inspo: https://github.com/hwchase17/langchainjs/pull/1722

![image](https://github.com/hwchase17/langchain/assets/130414180/d11662bc-0c0e-4166-9ff3-354d41a9144a)
2023-06-22 22:35:00 -07:00
Lance Martin
6e69bfbb28 Loader for OpenCityData and minor cleanups to Pandas, Airtable loaders (#6301)
Many cities have open data portals for events like crime, traffic, etc.

Socrata provides an API for many, including SF (e.g., see
[here](https://dev.socrata.com/foundry/data.sfgov.org/tmnf-yvry)).

This is a new data loader for city data that uses Socrata API.
2023-06-22 22:20:42 -07:00
Christoph Kahl
9d42621fa4 added redis method to delete entries by keys (#6222)
In addition to my last pr (return keys of added entries), we also need a
method to delete the entries by keys.

@dev2049
2023-06-22 13:26:47 -07:00
Tim Conkling
c28990d871 StreamlitCallbackHandler (#6315)
A new implementation of `StreamlitCallbackHandler`. It formats Agent
thoughts into Streamlit expanders.

You can see the handler in action here:
https://langchain-mrkl.streamlit.app/

Per a discussion with Harrison, we'll be adding a
`StreamlitCallbackHandler` implementation to an upcoming
[Streamlit](https://github.com/streamlit/streamlit) release as well, and
will be updating it as we add new LLM- and LangChain-specific features
to Streamlit.

The idea with this PR is that the LangChain `StreamlitCallbackHandler`
will "auto-update" in a way that keeps it forward- (and backward-)
compatible with Streamlit. If the user has an older Streamlit version
installed, the LangChain `StreamlitCallbackHandler` will be used; if
they have a newer Streamlit version that has an updated
`StreamlitCallbackHandler`, that implementation will be used instead.

(I'm opening this as a draft to get the conversation going and make sure
we're on the same page. We're really excited to land this into
LangChain!)

#### Who can review?

@agola11, @hwchase17
2023-06-22 13:14:28 -07:00
Nuno Campos
74ac6fb6b9 Allow callback handlers to opt into being run inline (#6424)
This is useful, e.g., for callback handlers that use context vars (like
OpenTelemetry).
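
One likely reason, assumed here rather than stated in the commit, is that a context variable set by the caller is not visible to code that runs on a separate worker thread. The sketch below uses only the standard library, and the names are hypothetical.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

request_id = contextvars.ContextVar("request_id", default="unset")


def handler():
    """Stand-in for a callback handler that reads a context variable."""
    return request_id.get()


def run_inline():
    return handler()


def run_in_worker():
    # On standard CPython builds a worker thread starts with its own context,
    # so it does not see values set by the submitting thread.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(handler).result()


request_id.set("abc123")
inline_value = run_inline()
worker_value = run_in_worker()
```

Running the handler inline keeps it in the caller's context, which is what tracing libraries built on context variables rely on.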

See https://github.com/hwchase17/langchain/pull/6095
2023-06-22 11:36:19 -07:00
Harrison Chase
a9108c1809 add mongo (HOLD) (#6437)
do not merge in
2023-06-22 11:08:12 -07:00
Lance Martin
30f7288082 MD header text splitter returns Documents (#6571)
Return `Documents` from MD header text splitter to simplify UX.
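
To illustrate the shape of the result, here is a simplified sketch that splits Markdown on headers and returns document-like objects carrying header metadata. The `Doc` dataclass, the function name, and the metadata keys are assumptions; the real splitter handles more cases (for example, code fences).

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Minimal stand-in for a Document: text plus metadata."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def split_on_headers(text, headers=(("#", "Header 1"), ("##", "Header 2"))):
    docs, current, lines = [], {}, []

    def flush():
        body = "\n".join(lines).strip()
        if body:
            docs.append(Doc(body, dict(current)))
        lines.clear()

    for line in text.splitlines():
        for marker, name in headers:
            if line.startswith(marker + " "):
                flush()
                # A new header resets itself and every deeper header level.
                for other_marker, other_name in headers:
                    if len(other_marker) >= len(marker):
                        current.pop(other_name, None)
                current[name] = line[len(marker) + 1:].strip()
                break
        else:
            lines.append(line)
    flush()
    return docs


docs = split_on_headers("# A\nintro\n## B\nbody\n# C\nend")
```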

Updates the test as well as example notebooks.
2023-06-22 09:25:38 -07:00
Rogério Chaves
3436da65a4 Fix callback forwarding in async plan method for OpenAI function agent (#6584)
The callback argument was missing, which prevented callbacks from
working properly when the agent was used asynchronously.
2023-06-22 08:18:31 -07:00
Davis Chase
b909bc8b58 bump 209 (#6593) 2023-06-22 08:18:19 -07:00
minhajul-clarifai
6e57306a13 Clarifai integration (#5954)
# Changes
This PR adds [Clarifai](https://www.clarifai.com/) integration to
LangChain. Clarifai is an end-to-end AI platform. It offers users
the ability to use many types of LLMs (OpenAI, Cohere, etc., as well as
open source models). In addition, a Clarifai app can be treated as a vector
database to upload and retrieve data. The integration includes:
- Clarifai LLM integration: Clarifai supports many types of language
models that users can utilize for their applications.
- Clarifai VectorDB: A Clarifai application can hold data and
embeddings. You can run semantic search with the embeddings.

#### Before submitting
- [x] Added integration test for LLM 
- [x] Added integration test for VectorDB 
- [x] Added notebook for LLM 
- [x] Added notebook for VectorDB 

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-22 08:00:15 -07:00
Jeroen Van Goey
7f6f5c2a6a Add missing word in comment (#6587)
Changed

```
# Do this so we can exactly what's going on under the hood
```
to
```
# Do this so we can see exactly what's going on under the hood
```
2023-06-22 07:54:28 -07:00
Davis Chase
d50de2728f Add AzureML endpoint LLM wrapper (#6580)
### Description

We have added a new LLM integration `azureml_endpoint` that allows users
to leverage models from the AzureML platform. Microsoft recently
announced the release of [Azure Foundation
Models](https://learn.microsoft.com/en-us/azure/machine-learning/concept-foundation-models?view=azureml-api-2)
which users can find in the AzureML Model Catalog. The Model Catalog
contains a variety of open source and Hugging Face models that users can
deploy on AzureML. The `azureml_endpoint` allows LangChain users to use
the deployed Azure Foundation Models.

### Dependencies

No added dependencies were required for the change.

### Tests

Integration tests were added in
`tests/integration_tests/llms/test_azureml_endpoint.py`.

### Notebook

A Jupyter notebook demonstrating how to use `azureml_endpoint` was added
to `docs/modules/llms/integrations/azureml_endpoint_example.ipynb`.

### Twitters

[Prakhar Gupta](https://twitter.com/prakhar_in)
[Matthew DeGuzman](https://twitter.com/matthew_d13)

---------

Co-authored-by: Matthew DeGuzman <91019033+matthewdeguzman@users.noreply.github.com>
Co-authored-by: prakharg-msft <75808410+prakharg-msft@users.noreply.github.com>
2023-06-22 01:46:01 -07:00
Davis Chase
4fabd02d25 Add OpenLLM wrapper(#6578)
LLM wrapper for models served with OpenLLM

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: Chaoyu <paranoyang@gmail.com>
2023-06-22 01:18:14 -07:00
Brendan Graham
d718f3b6d0 feat: interfaces for async embeddings, implement async openai (#6563)
Since it seems like #6111 will be blocked for a bit, I've forked
@tyree731's fork and implemented the requested changes.

This change adds two methods to the base Embeddings class,
`aembed_query` and `aembed_documents`, which are the async
equivalents of `embed_query` and `embed_documents` respectively.
This slightly rounds out async support within LangChain, with an
initial implementation provided for OpenAI.
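
The interface described above could be sketched as follows. The base class here is a simplified stand-in, and `ToyEmbeddings` (which "embeds" a text as its length) is purely hypothetical; only the four method names come from the description.

```python
import asyncio
from abc import ABC, abstractmethod


class Embeddings(ABC):
    """Simplified stand-in for an embeddings interface with async variants."""

    @abstractmethod
    def embed_documents(self, texts):
        ...

    @abstractmethod
    def embed_query(self, text):
        ...

    async def aembed_documents(self, texts):
        raise NotImplementedError  # subclasses opt in to async support

    async def aembed_query(self, text):
        raise NotImplementedError


class ToyEmbeddings(Embeddings):
    def embed_query(self, text):
        return [float(len(text))]

    def embed_documents(self, texts):
        return [self.embed_query(t) for t in texts]

    async def aembed_query(self, text):
        await asyncio.sleep(0)  # stands in for a non-blocking API call
        return self.embed_query(text)

    async def aembed_documents(self, texts):
        # Issue the per-text calls concurrently
        return list(await asyncio.gather(*(self.aembed_query(t) for t in texts)))


vectors = asyncio.run(ToyEmbeddings().aembed_documents(["a", "bcd"]))
```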

Implements https://github.com/hwchase17/langchain/issues/6109

---------

Co-authored-by: Stephen Tyree <tyree731@gmail.com>
2023-06-21 23:16:33 -07:00
ljeagle
ca24dc2d5f Upgrade the version of AwaDB and add some new interfaces (#6565)
1. upgrade the version of AwaDB
2. add some new interfaces
3. fix bug of packing page content error

@dev2049  please review, thanks!

---------

Co-authored-by: vincent <awadb.vincent@gmail.com>
2023-06-21 23:15:18 -07:00
Harrison Chase
937a7e93f2 add motherduck docs (#6572) 2023-06-21 23:13:45 -07:00
Muhammad Vaid
ae81b96b60 Detailed using the Twilio tool to send messages with 3rd party apps incl. WhatsApp (#6562)
Everything needed to support sending messages over WhatsApp Business
Platform (GA), Facebook Messenger (Public Beta) and Google Business
Messages (Private Beta) was present. Just added some details on
leveraging it.
2023-06-21 19:26:50 -07:00
Kenzie Mihardja
b8d78424ab Change Data Loader Namespace (#6568)
Description:
Update the artifact name of the xml file and the namespaces. Co-authored
with @tjaffri
Co-authored-by: Kenzie Mihardja <kenzie@docugami.com>
2023-06-21 19:24:04 -07:00
Gengliang Wang
0673245d0c Remove duplicate databricks entries in ecosystem integrations (#6569)
Currently, there are two Databricks entries in
https://python.langchain.com/docs/ecosystem/integrations/
<img width="277" alt="image"
src="https://github.com/hwchase17/langchain/assets/1097932/86ab4ad2-6bce-4459-9d56-1ab2fbb69f6d">

The reason is that there are duplicated notebooks for Databricks
integration:
*
https://github.com/hwchase17/langchain/blob/master/docs/extras/ecosystem/integrations/databricks.ipynb
*
https://github.com/hwchase17/langchain/blob/master/docs/extras/ecosystem/integrations/databricks/databricks.ipynb

This PR is to remove the second one for simplicity.
2023-06-21 19:14:33 -07:00
Suri Chen
14b9418cc5 Fix whatsappchatloader - enable parsing new datetime format on WhatsApp chat (#6555)
- Description: observed new format on WhatsApp exported chat - example:
`[2023/5/4, 16:17:13] ~ Carolina: 🥺`
  - Dependencies: no additional dependencies required
  - Tag maintainer: @rlancemartin, @eyurtsev
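
As an illustration of parsing a line in that format, here is a sketch with a hypothetical regex and helper. It is not the loader's actual pattern, which has to cover several other export formats too.

```python
import re

# Matches lines like "[2023/5/4, 16:17:13] ~ Carolina: hello"
LINE_RE = re.compile(
    r"^\[(?P<date>\d{4}/\d{1,2}/\d{1,2}), (?P<time>\d{1,2}:\d{2}:\d{2})\] "
    r"(?:~ )?(?P<sender>[^:]+): (?P<text>.*)$"
)


def parse_line(line: str):
    """Return a dict with date, time, sender and text, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


parsed = parse_line("[2023/5/4, 16:17:13] ~ Carolina: hello")
```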
2023-06-21 19:11:49 -07:00
Zander Chase
5322bac5fc Wait for all futures (#6554)
- Expose method to wait for all futures
- Wait for submissions in the run_on_dataset functions to ensure runs
are fully submitted before cleaning up
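
The idea of tracking submitted work and blocking until it finishes can be sketched with `concurrent.futures`. The `Submitter` class and its method names are hypothetical and only illustrate the pattern, not the client code changed here.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


class Submitter:
    """Tracks background submissions so callers can wait before cleaning up."""

    def __init__(self, max_workers: int = 4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = set()

    def submit(self, fn, *args):
        future = self._pool.submit(fn, *args)
        self._futures.add(future)
        return future

    def wait_for_futures(self):
        done, _ = wait(self._futures)  # blocks until every future has finished
        self._futures.clear()
        for future in done:
            future.result()  # re-raise any exception from a submission

    def close(self):
        self.wait_for_futures()
        self._pool.shutdown()


submitted = []


def record(run_id):
    time.sleep(0.01)  # pretend to send a run somewhere
    submitted.append(run_id)


submitter = Submitter()
for run_id in range(3):
    submitter.submit(record, run_id)
submitter.wait_for_futures()  # all runs are recorded past this point
submitter.close()
```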
2023-06-21 18:20:17 -07:00
HenriZuber
e0605b464b feat: faiss filter from list (#6537)
### Feature

Using FAISS on a retrievalQA task, I found myself wanting to allow in
multiple sources. From what I understood, the filter feature takes in a
dict of form {key: value} which then will check in the metadata for the
exact value linked to that key.
I added some logic to be able to pass a list which will be checked
against instead of an exact value. Passing an exact value will also
work.

Here's an example of how I could then use it in my own project:

```
    pdfs_to_filter_in = ["file_A", "file_B"]
    filter_dict = {
        "source": [f"source_pdfs/{pdf_name}.pdf" for pdf_name in pdfs_to_filter_in]
    }
    retriever = db.as_retriever()
    retriever.search_kwargs = {"filter": filter_dict}
```

I added an integration test based on the other ones I found in
`tests/integration_tests/vectorstores/test_faiss.py` under
`test_faiss_with_metadatas_and_list_filter()`.

It doesn't feel like this is worthy of its own notebook or doc, but I'm
open to suggestions if needed.

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-21 10:49:01 -07:00
Davis Chase
00a7403236 update pr tmpl (#6552) 2023-06-21 10:03:52 -07:00
Jeroen Van Goey
57b5f42847 Remove unintended double negation in docstring (#6541)
Small typo fix.

`ImportError: If importing vertexai SDK didn't not succeed.` ->
`ImportError: If importing vertexai SDK did not succeed.`.
2023-06-21 10:01:28 -07:00
Andrey E. Vedishchev
a2a0715bd4 Minor Grammar Fixes in Docs and Comments (#6536)
Just some grammar fixes: I found "retriver" instead of "retriever" in
several places across the documentation and code comments, and fixed
them.


Co-authored-by: andrey.vedishchev <andrey.vedishchev@rgigroup.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-21 09:53:31 -07:00
dirtysalt
57cc3d1d3d [Feature][VectorStore] Support StarRocks as vector db (#6119)
Here are some examples of using StarRocks as a vector DB:

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import StarRocks
from langchain.vectorstores.starrocks import StarRocksSettings

embeddings = OpenAIEmbeddings()

# configure StarRocks settings
settings = StarRocksSettings()
settings.port = 41003
settings.host = '127.0.0.1'
settings.username = 'root'
settings.password = ''
settings.database = 'zya'

# to fill new embeddings (split_docs is a list of already-split Documents)
docsearch = StarRocks.from_documents(split_docs, embeddings, config=settings)

# or to use already-built embeddings in the database
docsearch = StarRocks(embeddings, settings)
```
#### Who can review?

Tag maintainers/contributors who might be interested:

@dev2049 

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-21 09:02:33 -07:00
Zander Chase
7a4ff424fc Relax string input mapper check (#6544)
for run evaluator. It could be that an evaluator doesn't need the output
2023-06-21 08:01:42 -07:00
Harrison Chase
ace442b992 bump to ver 208 (#6540) 2023-06-21 07:32:36 -07:00
Harrison Chase
53c1f120a8 Harrison/multi tool (#6518) 2023-06-21 07:19:52 -07:00
Naman Modi
37a89918e0 Infino integration for simplified logs, metrics & search across LLM data & token usage (#6218)
### Integration of Infino with LangChain for Enhanced Observability

This PR aims to integrate [Infino](https://github.com/infinohq/infino),
an open source observability platform written in Rust for storing
metrics and logs at scale, with LangChain, providing users with a
streamlined and efficient method of tracking and recording LangChain
experiments. By incorporating Infino into LangChain, users will be able
to gain valuable insights and easily analyze the behavior of their
language models.

#### Please refer to the following files related to integration:
- `InfinoCallbackHandler`: A [callback
handler](https://github.com/naman-modi/langchain/blob/feature/infino-integration/langchain/callbacks/infino_callback.py)
specifically designed for storing chain responses within Infino.
- Example `infino.ipynb` file: A comprehensive notebook named
[infino.ipynb](https://github.com/naman-modi/langchain/blob/feature/infino-integration/docs/extras/modules/callbacks/integrations/infino.ipynb)
has been included to guide users on effectively leveraging Infino for
tracking LangChain requests.
- [Integration
Doc](https://github.com/naman-modi/langchain/blob/feature/infino-integration/docs/extras/ecosystem/integrations/infino.mdx)
for Infino integration.

By integrating Infino, LangChain users will gain access to powerful
visualization and debugging capabilities. Infino enables easy tracking
of inputs, outputs, token usage, and execution time of LLMs. This
comprehensive observability ensures a deeper understanding of individual
executions and facilitates effective debugging.

Co-authors: @vinaykakade @savannahar68
---------

Co-authored-by: Vinay Kakade <vinaykakade@gmail.com>
2023-06-21 01:38:20 -07:00
Elijah Tarr
e0f468f6c1 Update model token mappings/cost to include 0613 models (#6122)
Add `gpt-3.5-turbo-16k` to model token mappings, as per the following
new OpenAI blog post:
https://openai.com/blog/function-calling-and-other-api-updates

Fixes #6118 


Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-21 01:37:16 -07:00
Jakub Misiło
5d149e4d50 Fix issue with non-list To header in GmailSendMessage Tool (#6242)
Fixing the problem of feeding `str` instead of `List[str]` to the email
tool.
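The fix can be pictured with a minimal sketch, assuming the tool wants a list of recipients but may receive a single string. `normalize_recipients` is a hypothetical helper for illustration, not the function used in the PR.

```python
from typing import List, Union


def normalize_recipients(to: Union[str, List[str]]) -> List[str]:
    """Return the recipients as a list, wrapping a bare string."""
    if isinstance(to, str):
        # Without this branch, ", ".join(to) would join single characters.
        return [to]
    return list(to)


# Both input shapes produce the same header value.
header_from_str = ", ".join(normalize_recipients("a@example.com"))
header_from_list = ", ".join(normalize_recipients(["a@example.com"]))
```

Joining a bare string directly would split it into characters, which is the kind of malformed header this normalization avoids.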

Fixes #6234 
---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-21 01:25:49 -07:00
Anubhav Bindlish
94c7899257 Integrate Rockset as Vectorstore (#6216)
This PR adds Rockset as a vectorstore for langchain.
[Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/)
is a real time OLAP database which provides a fast and efficient vector
search functionality. Furthermore, since it is entirely schemaless, it can
store metadata in separate columns thereby allowing fast metadata
filters during vector similarity search (as opposed to storing the
entire metadata in a single JSON column). It currently supports three
distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and
`DOT_PRODUCT`.
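For reference, the three distance functions named above can be sketched in plain Python. This only illustrates the math; it is not how Rockset computes them, and the function names are chosen here for illustration.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by the product of the vector norms."""
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot_product(a, b) / norm
```

Note that higher is better for cosine similarity and dot product, while lower is better for Euclidean distance, so result ordering depends on the function chosen.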

This PR adds `rockset` client as an optional dependency. 

We would love a twitter shoutout, our handle is
https://twitter.com/RocksetCloud

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-21 01:22:27 -07:00
ElReyZero
ab7ecc9c30 Feat: Add a prompt template parameter to qa with structure chains (#6495)
This pull request introduces a new feature to the LangChain QA Retrieval
Chains with Structures. The change involves adding a prompt template as
an optional parameter for the RetrievalQA chains that utilize the
recently implemented OpenAI Functions.

The main purpose of this enhancement is to provide users with the
ability to input a more customizable prompt to the chain. By introducing
a prompt template as an optional parameter, users can tailor the prompt
to their specific needs and context, thereby improving the flexibility
and effectiveness of the RetrievalQA chains.

## Changes Made
- Created a new optional parameter, "prompt", for the RetrievalQA with
structure chains.
- Added an example to the RetrievalQA with sources notebook.

My twitter handle is @El_Rey_Zero

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-21 00:23:36 -07:00
Mircea Pasoi
2e024823d2 Add async support for HuggingFaceTextGenInference (#6507)
Adding support for async calls in `HuggingFaceTextGenInference`


Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-20 23:12:24 -07:00
Hassan Ouda
456ca3d587 Be able to use Codey models on Vertex AI (#6354)
Added the functionality to leverage 3 new Codey models from Vertex AI:
- code-bison - Code generation using the existing LLM integration
- code-gecko - Code completion using the existing LLM integration
- codechat-bison - Code chat using the existing chat_model integration

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-20 23:11:54 -07:00
囧囧
0fce8ef178 Add KuzuQAChain (#6454)
This PR adds `KuzuGraph` and `KuzuQAChain` for interacting with [Kùzu
database](https://github.com/kuzudb/kuzu). Kùzu is an in-process
property graph database management system (GDBMS) built for query speed
and scalability. The `KuzuGraph` and `KuzuQAChain` provide the same
functionality as the existing integrations with NebulaGraph and Neo4j and
enable query generation and question answering over a Kùzu database.

A notebook example and a simple test case have also been added.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-20 22:07:00 -07:00
Chanin Nantasenamat
6e07283dd5 Update index.mdx (#6326)
#### Fix
Added the mention of "store" amongst the tasks that the data connection
module can perform aside from the existing 3 (load, transform and
query). In particular, this implies the generation of embedding vectors
and the creation of vector stores.
2023-06-20 21:40:20 -07:00
Zander Chase
ffa4ff1a2e Export trajectory eval fn (#6509)
from the run_evaluators dir
2023-06-20 21:18:28 -07:00
TheOnlyWayUp
bb437646fc typo(llamacpp.ipynb): 'condiser' -> 'consider' (#6474) 2023-06-20 18:48:25 -07:00
northern-64bit
7492060525 Fix typo in docstring of format_tool_to_openai_function (#6479)
Fixes typo "open AI" to "OpenAI" in docstring of
`format_tool_to_openai_function` in
`langchain/tools/convert_to_openai.py`.
2023-06-20 18:42:30 -07:00
Davis Chase
b3c49e94a0 Make streamlit import optional (#6510) 2023-06-20 18:41:59 -07:00
Daniel McDonald
cece8c8bf0 Fixed: 'readible' -> readable (#6492)
Hello there👋

I have made a pull request to fix a small typo.
2023-06-20 18:39:59 -07:00
hsparmar
834c3378af Documentation Fix: Correct the example code output in the prompt templates doc (#6496)
Documentation is showing the wrong example output for the prompt
templates code snippet. This PR fixes that issue.
2023-06-20 17:21:09 -07:00
Davis Chase
c91cf68754 Fix link (#6501) 2023-06-20 14:44:22 -07:00
Davis Chase
3298bf4f00 docs/fix links (#6498) 2023-06-20 14:06:50 -07:00
Lance Martin
ae6196507d Update notebook for MD header splitter and create new cookbook (#6399)
Move MD header text splitter example to its own cookbook.
2023-06-20 13:53:41 -07:00
Stefano Lottini
22af93d851 Vector store support for Cassandra (#6426)
This addresses #6291 adding support for using Cassandra (and compatible
databases, such as DataStax Astra DB) as a [Vector
Store](https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-30%3A+Approximate+Nearest+Neighbor(ANN)+Vector+Search+via+Storage-Attached+Indexes).

A new class `Cassandra` is introduced, which complies with the contract
and interface for a vector store, along with the corresponding
integration test, a sample notebook and modified dependency toml.

Dependencies: the implementation relies on the library `cassio`, which
simplifies interacting with Cassandra for ML- and LLM-oriented
workloads. CassIO, in turn, uses the `cassandra-driver` low-level
drivers to communicate with the database. The former is added as
optional dependency (+ in `extended_testing`), the latter was already in
the project.

Integration testing relies on a locally-running instance of Cassandra.
[Here](https://cassio.org/more_info/#use-a-local-vector-capable-cassandra)
a detailed description can be found on how to compile and run it (at the
time of writing the feature has not made it yet to a release).

During development of the integration tests, I added a new "fake
embedding" class for what I consider a more controlled way of testing
the MMR search method. Likewise, I had to amend what looked like a
glitch in the behaviour of `ConsistentFakeEmbeddings` whereby an
`embed_query` call would have bypassed storage of the requested text in
the class cache for use in later repeated invocations.

@dev2049 might be the right person to tag here for a review. Thank you!

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
2023-06-20 10:46:20 -07:00
Harrison Chase
cac6e45a67 improve documentation on base chain (#6468)
Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-06-20 10:34:57 -07:00
Zeeland
ad7089a6d0 fix: change ddg to DDGS (#6480)
This commit updates the duckduckgo search utility by using a more
accurate name in the import statement.
2023-06-20 10:15:05 -07:00
Davis Chase
8cd5f65a6f release 207 (#6488) 2023-06-20 10:14:29 -07:00
zhaoshengbo
ab44c24333 Add Alibaba Cloud OpenSearch as a new vector store (#6154)
Hello Folks,

Thanks for creating and maintaining this great project. I'm excited to
submit this PR to add Alibaba Cloud OpenSearch as a new vector store.

OpenSearch is a one-stop platform to develop intelligent search
services. OpenSearch was built based on the large-scale distributed
search engine developed by Alibaba. OpenSearch serves more than 500
business cases in Alibaba Group and thousands of Alibaba Cloud
customers. OpenSearch helps develop search services in different search
scenarios, including e-commerce, O2O, multimedia, the content industry,
communities and forums, and big data query in enterprises.

OpenSearch provides the vector search feature. In specific scenarios,
especially test question search and image search scenarios, you can use
the vector search feature together with the multimodal search feature to
improve the accuracy of search results.


This PR includes:

- An `AlibabaCloudOpenSearch` class that can connect to an Alibaba Cloud
OpenSearch instance.
- Adding embeddings and metadata to an OpenSearch data source.
- Querying by squared Euclidean distance and metadata.
- Integration tests.
- An IPython notebook and docs.

I have read your contributing guidelines. And I have passed the tests
below

- [x]  make format
- [x]  make lint
- [x]  make coverage
- [x]  make test

---------

Co-authored-by: zhaoshengbo <shengbo.zsb@alibaba-inc.com>
2023-06-20 10:07:40 -07:00
Davis Chase
b7ad4c4c30 fix openai qa chain (#6487) 2023-06-20 10:01:13 -07:00
thehunmonkgroup
10adec5f1b add FunctionMessage support to _convert_dict_to_message() in OpenAI chat model (#6382)
Already supported in the reverse operation in
`_convert_message_to_dict()`, this just provides parity.

@hwchase17
@agola11

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-20 08:25:55 -07:00
Harrison Chase
7414e9d196 bump version to 206 (#6465) 2023-06-19 23:05:09 -07:00
Hubert
22601b0b63 fix neo4j schema query (#6381)
Fix issue #6380 

Fixes #6380

#### Who can review?

Tag maintainers/contributors who might be interested:
@hwchase17

---------

Co-authored-by: HubertKl <HubertKl>
2023-06-19 22:48:35 -07:00
Gavin
b0d80c4b3e Update serpapi.py Support baidu list type answer_box (#6386)
Support baidu list type answer_box

According to [this document](https://serpapi.com/baidu-answer-box), the
answer_box attribute returned by the Baidu interface is a list that
contains only one object, which causes an error in the current code.

So when answer_box is a list, we reset res["answer_box"] so that the
code can execute successfully.
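A minimal sketch of that normalization, assuming the reset takes the first element of the list. `normalize_answer_box` is a hypothetical helper, and the sketch returns a copy rather than modifying `res` in place.

```python
from typing import Any, Dict


def normalize_answer_box(res: Dict[str, Any]) -> Dict[str, Any]:
    """Unwrap a list-typed answer_box so later code can treat it as a dict."""
    box = res.get("answer_box")
    if isinstance(box, list) and box:
        # Assumption: the single object in the list is the one to keep.
        return {**res, "answer_box": box[0]}
    return res


baidu_style = {"answer_box": [{"answer": "42"}]}
dict_style = {"answer_box": {"answer": "42"}}
```

After normalization, both response shapes expose `res["answer_box"]["answer"]` in the same way.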
2023-06-19 22:48:18 -07:00
Bryce Drennan
384fa43fc3 fix: llm caching for replicate (#6396)
Caching wasn't accounting for which model was used so a result for the
first executed model would return for the same prompt on a different
model.

This was because `Replicate._identifying_params` did not include the
`model` parameter.
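The effect can be sketched as follows, assuming a cache keyed by the prompt plus a string built from the identifying parameters. `PromptCache` and `llm_string` are hypothetical names for illustration, not LangChain's own classes.

```python
from typing import Dict, Optional, Tuple


def llm_string(identifying_params: dict) -> str:
    """Build a stable string from the parameters that identify an LLM."""
    return str(sorted(identifying_params.items()))


class PromptCache:
    """Tiny in-memory cache keyed by (prompt, llm_string)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], str] = {}

    def lookup(self, prompt: str, key: str) -> Optional[str]:
        return self._store.get((prompt, key))

    def update(self, prompt: str, key: str, result: str) -> None:
        self._store[(prompt, key)] = result


cache = PromptCache()
key_a = llm_string({"model": "model-a"})
key_b = llm_string({"model": "model-b"})
cache.update("hi", key_a, "from a")
```

If `model` were missing from the identifying parameters, `key_a` and `key_b` would be identical and the second model would receive the first model's cached result.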

FYI
- @cbh123
- @hwchase17
- @agola11
2023-06-19 22:47:59 -07:00
Zeeland
8a604b93ab feat: use latest duckduckgo_search API to call (#6409)
# Provide the latest duckduckgo_search API

The Git commit contents involve two files related to some DuckDuckGo
query operations, and an upgrade of the DuckDuckGo module to version
3.8.3. A suitable commit message could be "Upgrade DuckDuckGo module to
version 3.8.3, including query operations". Specifically, in the
duckduckgo_search.py file, a DDGS() class instance is newly added to
replace the previous ddg() function, and the time parameter name in the
get_snippets() and results() methods is changed from "time" to
"timelimit" to accommodate recent changes. In the pyproject.toml file,
the duckduckgo-search module is upgraded to version 3.8.3.

[duckduckgo_search readme
attention](https://github.com/deedy5/duckduckgo_search): Versions before
v2.9.4 no longer work as of May 12, 2023

## Who can review?

@vowelparrot

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-19 22:47:39 -07:00
Harrison Chase
9eec7c3206 Harrison/unstructured page number (#6464)
Co-authored-by: Reza Sanaie <reza@sanaie.ca>
2023-06-19 22:31:43 -07:00
Alonso Silva Allende
b82ddf9cfb Improve error message (#6275)
When trying to use OpenAI models like 'text-davinci-002' or
'text-davinci-003', the agent doesn't work and the message is 'Only
supported with OpenAI models.' The error message should be 'Only
supported with ChatOpenAI models.'

My Twitter handle is @alonsosilva

#### Who can review?

Tag maintainers/contributors who might be interested:
@hwchase17

Co-authored-by: SILVA Alonso <alonso.silva@nokia-bell-labs.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-19 22:21:01 -07:00
zengbo
7e5f5ebf86 Fix the issue where ANTHROPIC_API_URL set in environment is not takin… (#6400)
I apologize for the error: the 'ANTHROPIC_API_URL' environment variable
doesn't take effect if the 'anthropic_api_url' parameter has a default
value.
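A minimal sketch of the pitfall and one way around it, assuming the parameter defaults to `None` so that the environment lookup can still happen. `resolve_api_url` and `DEFAULT_API_URL` are hypothetical; the placeholder URL is not the real default.

```python
import os
from typing import Mapping, Optional

# Placeholder value for illustration only.
DEFAULT_API_URL = "https://api.example.invalid"


def resolve_api_url(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Prefer an explicit argument, then the environment, then the default."""
    env = os.environ if env is None else env
    if explicit is not None:
        return explicit
    # If `explicit` defaulted to a real URL, this line would never be reached.
    return env.get("ANTHROPIC_API_URL", DEFAULT_API_URL)
```

The `env` argument exists only so the lookup can be exercised without touching the real process environment.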

#### Who can review?
  Models
  - @hwchase17
  - @agola11
2023-06-19 22:20:36 -07:00
Grayson Adkins
9f5f747dc3 Fix broken links in autonomous agents docs (#6398)
Fixes broken links here:  
https://python.langchain.com/docs/use_cases/autonomous_agents.html

#### Who can review?

Tag maintainers/contributors who might be interested:

  Agents / Tools / Toolkits
  - @hwchase17
2023-06-19 22:20:00 -07:00
volodymyr-memsql
d2e9b621ab Update SinglStoreDB vectorstore (#6423)
1. Introduced new distance strategies support: **DOT_PRODUCT** and
**EUCLIDEAN_DISTANCE** for enhanced flexibility.
2. Implemented a feature to filter results based on metadata fields.
3. Incorporated connection attributes specifying "langchain python sdk"
usage for enhanced traceability and debugging.
4. Expanded the suite of integration tests for improved code
reliability.
5. Updated the existing notebook with the usage example

@dev2049

---------

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-19 22:08:58 -07:00
Avinash Raj
6efd5fa2b9 Fix for #6431 - chatprompt template with partial variables giing validation error (#6456)
After recent changes, ChatPromptTemplate does not accept partial
variables. This PR should fix that issue.


Fixes #6431




#### Who can review?



  @hwchase17

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-19 22:08:15 -07:00
Harrison Chase
02c0a1e77e Harrison/functions in retrieval (#6463) 2023-06-19 22:07:58 -07:00
Swapnil Sharma
dc4ffa8d9b Incorrect argument count handling (#5543)
Throwing ToolException when incorrect arguments are passed to tools so
that the agent can course correct them.

# Incorrect argument count handling

I was facing an error where the agent passed incorrect arguments to
tools. As per the discussions going around, I started throwing
ToolException to allow the model to course correct.
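The idea can be sketched as follows, assuming a stand-in `ToolException` class and a hypothetical `call_tool` wrapper; neither is LangChain's actual implementation.

```python
import inspect
from typing import Any, Callable, Sequence


class ToolException(Exception):
    """Stand-in for an error that should be reported back to the agent."""


def call_tool(func: Callable[..., Any], *args: Any) -> Any:
    """Check the argument count before calling the tool."""
    try:
        inspect.signature(func).bind(*args)
    except TypeError as exc:
        raise ToolException(f"Incorrect arguments for {func.__name__}: {exc}") from exc
    return func(*args)


def run_with_feedback(func: Callable[..., Any], args: Sequence[Any]) -> Any:
    """Return the tool result, or an error string the model can react to."""
    try:
        return call_tool(func, *args)
    except ToolException as exc:
        return f"Tool error: {exc}"


def add(a: int, b: int) -> int:
    return a + b
```

Returning the error text as an observation, rather than crashing, is what gives the model a chance to retry with corrected arguments.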

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-19 22:06:20 -07:00
kYLe
3a58c4c3a0 Fixed a link typo /-/route -> /-/routes. and change endpoint format (#6186)
Fixes a link typo from `/-/route` to `/-/routes` and changes the
endpoint format from
`f"{self.anyscale_service_url}/{self.anyscale_service_route}"` to
`f"{self.anyscale_service_url}{self.anyscale_service_route}"`.
Also adds documentation about the format of the endpoint.
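A short sketch of the difference between the two formats, assuming the service URL has no trailing slash and the route begins with `/`. The function names and example values are hypothetical.

```python
def old_endpoint(service_url: str, service_route: str) -> str:
    """Previous format: always inserts a slash between URL and route."""
    return f"{service_url}/{service_route}"


def new_endpoint(service_url: str, service_route: str) -> str:
    """New format: the route is expected to carry its own leading slash."""
    return f"{service_url}{service_route}"


url = "https://svc.example.invalid"
route = "/predict"
```

With a route that already starts with `/`, the old format yields a double slash while the new one does not.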

#### Who can review?

Tag maintainers/contributors who might be interested:

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-19 22:05:54 -07:00
Leonid Ganeline
03b16ed2b1 docs retrievers fixes (#6299)
Fixed several inconsistencies:
- File names and notebook titles should be similar; otherwise the ToC on the
[retrievers
page](https://python.langchain.com/en/latest/modules/indexes/retrievers.html)
and the left ToC tab differ. For example, `Self-querying
with Chroma` is currently not sorted alphabetically because its file is
named `chroma_self_query.ipynb`.
- `Stringing compressors and document transformers...` was demoted from `#`
to `##`; otherwise, it appears in the ToC.
- Several formatting problems.

#### Who can review?

@hwchase17 
@dev2049

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-19 22:04:35 -07:00
M. Tolga Cangöz
bccee85c8f Update introduction.mdx (#6425)
Fix typo
2023-06-19 22:04:09 -07:00
Nir Gazit
95b77a5215 Fix Custom LLM Agent example (#6429)
The `CustomOutputParser` needs to throw `OutputParserException` when it
fails to parse the response from the agent, so that the executor can
[catch it and
retry](be9371ca8f/langchain/agents/agent.py (L767))
when `handle_parsing_errors=True`.
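The pattern can be sketched without LangChain, assuming a stand-in `OutputParserException` and a simplified `step` function in place of the executor. The regular expression and the fallback tuple are illustrative choices, not the library's.

```python
import re
from typing import Tuple


class OutputParserException(ValueError):
    """Stand-in for the exception the executor knows how to handle."""


def parse_action(text: str) -> Tuple[str, str]:
    """Extract (tool, tool_input) or raise OutputParserException."""
    match = re.search(
        r"Action\s*:\s*(.*?)\nAction Input\s*:\s*(.*)", text, re.DOTALL
    )
    if match is None:
        raise OutputParserException(f"Could not parse LLM output: {text!r}")
    return match.group(1).strip(), match.group(2).strip()


def step(text: str, handle_parsing_errors: bool = True) -> Tuple[str, str]:
    """Simplified executor step: recover from parse failures when allowed."""
    try:
        return parse_action(text)
    except OutputParserException:
        if not handle_parsing_errors:
            raise
        return ("_Exception", "Invalid or incomplete response")
```

A parser that raised a plain `ValueError` instead would bypass the `except` clause in a real executor, which is why the exception type matters.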

#### Who can review?

Tag maintainers/contributors who might be interested: @hwchase17

2023-06-19 22:03:58 -07:00
ykerus
b697bbb5b5 Remove backticks without clear purpose from docs (#6442)
#### Description

- Removed two backticks surrounding the phrase "chat messages as"
- This phrase stood out among other formatted words/phrases such as
`prompt`, `role`, `PromptTemplate`, etc., which all seem to have a clear
function.
- `chat messages as`, formatted as such, confused me while reading,
leading me to believe the backticks were misplaced.

#### Who can review?

@hwchase17
2023-06-19 22:03:38 -07:00
Dhruvil Shah
9494623869 Update web_base.ipynb (#6430)
Minor new line character in the markdown.

Also, this option is not yet in the latest version of LangChain
(0.0.190) from Conda. Maybe in the next update.

@eyurtsev
@hwchase17
2023-06-19 21:43:35 -07:00
Wenchen Li
76ae9da9db Add _similarity_search_with_relevance_scores in Pinecone (#6446)
Just so it is consistent with other `VectorStore` classes.

This is a follow-up to #6056, which also discussed the possibility of
adding `similarity_search_by_vector_returning_embeddings`; that
discussion will continue here.

potentially related: #6286 


#### Who can review?

Tag maintainers/contributors who might be interested: @rlancemartin 

2023-06-19 21:36:40 -07:00
Ismail Pelaseyed
d4e8e0f5ab Add example for question answering over documents with OpenAI Function Agent (#6448)
This PR adds an example of doing question answering over documents using
OpenAI Function Agents.

#### Who can review?

@hwchase17

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-19 21:35:45 -07:00
Andrey Avtomonov
68a675cc68 Remove extra word in the introduction documentation (#6450)
Removed an extra word in the introduction documentation, a simple typo
2023-06-19 21:31:17 -07:00
Ankush Gola
a9246333fd fix anthropic chat model mutating input list (#6457)
Fixes: ChatAnthropic was mutating the input message list during
formatting, which isn't ideal because you could be changing the behavior
for other chat models when using the same input.
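A minimal sketch of the non-mutating approach, assuming messages are simple `(role, text)` tuples and that formatting needs a trailing empty AI turn. Both assumptions and the function name are illustrative, not ChatAnthropic's actual code.

```python
from typing import List, Tuple

Message = Tuple[str, str]


def format_messages(messages: List[Message]) -> str:
    """Format a prompt without touching the caller's list."""
    padded = list(messages)  # shallow copy; the input stays unchanged
    if not padded or padded[-1][0] != "ai":
        padded.append(("ai", ""))
    return "\n".join(f"{role}: {text}".rstrip() for role, text in padded)


history: List[Message] = [("human", "hi")]
prompt = format_messages(history)
```

Because the padding happens on a copy, passing `history` to another chat model afterwards sees exactly the messages the caller built.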

#### Who can review?

Tag maintainers/contributors who might be interested:
2023-06-19 21:30:52 -07:00
Zander Chase
bc0af67aaf Add Trajectory Eval RunEvaluator (#6449) 2023-06-19 21:11:50 -07:00
Hakan Tekgul
6a157cf8bb Update arize_callback.py (#6433)
Arize released a new Generative LLM model type, so this PR adjusts the
callback function to the new logging.

Added Arize imports; please delete them if they are not necessary.

Specifically, this change makes sure that the prompt and response pairs
from LangChain agents are logged into Arize as a Generative LLM model,
instead of our previous categorical model. To do this, the callback
function collects the necessary data and passes it to Arize using the
Python Pandas SDK.

The Arize library, specifically pandas.logger, is an additional dependency.

Notebook For Test:
https://docs.arize.com/arize/resources/integrations/langchain

Who can review?
Tag maintainers/contributors who might be interested:

@hwchase17 - project lead

Tracing / Callbacks

@agola11
2023-06-19 18:33:49 -07:00
Zander Chase
00f276d23f Run eval in eval mode (#6447)
For the `run_on_dataset` sessions
2023-06-19 18:31:38 -07:00
Harrison Chase
1300a4bc8c expose docs chains (#6453) 2023-06-19 17:18:54 -07:00
Harrison Chase
286452c7f0 remove mongo 2023-06-19 10:04:14 -07:00
David Duong
be9371ca8f Include placeholder value for all secrets, not just kwargs (#6421)
Mirror PR for https://github.com/hwchase17/langchainjs/pull/1696

Secrets passed via environment variables should be present in the
serialised chain
2023-06-19 15:41:45 +01:00
Harrison Chase
df40cd233f bump version to 205 (#6410) 2023-06-18 23:21:26 -07:00
Harrison Chase
e9c2b280db Harrison/refactor functions (#6408) 2023-06-18 23:13:42 -07:00
Harrison Chase
6a4a950a3c changes to llm chain (#6328)
- return raw and full output (but keep run shortcut method functional)
- change output parser to take in generations (good for working with
messages)
- add output parser to base class, always run (default to same as
current)

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-06-18 22:49:47 -07:00
Davis Chase
d3c2eab0b3 Docs nit (#6350) 2023-06-18 20:58:12 -07:00
Davis Chase
af96de6552 fix prod docs build (#6402) 2023-06-18 20:56:12 -07:00
Fei Wang
50556f3b35 support memory for functions (#6165)
#### Before submitting
Add memory support for `OpenAIFunctionsAgent` like
`StructuredChatAgent`.


#### Who can review?
 @hwchase17

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-18 19:00:40 -07:00
Dhruvil Shah
b2b9ded12f Update web_base.py _fetch() method For SiteMapLoader (#6256)
A must-include for SitemapLoader to avoid the SSL verification error.
On some websites (for example https://researchadmin.asu.edu/sitemap.xml),
setting `verify` to False with
`sitemap_loader.requests_kwargs = {"verify": False}` does not bypass
SSL verification.

We need this merge to tell the Session to use a connector with a
specific argument about SSL:

    # For SiteMap SSL verification
    if not self.requests_kwargs['verify']:
        connector = aiohttp.TCPConnector(ssl=False)
    else:
        connector = None
 
Fixes #5483 

#### Who can review?

Tag maintainers/contributors who might be interested:

@hwchase17 
@eyurtsev

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-18 18:34:18 -07:00
Harrison Chase
10bff4ecc4 Harrison/chroma fix (#6390)
Co-authored-by: Junu Moon(Fran) <francomoon7@gmail.com>
2023-06-18 18:33:26 -07:00
Harrison Chase
5c1fa3e70e Harrison/typesense fix (#6391)
Co-authored-by: Gaurav Chauhan <2796gaurav@gmail.com>
Co-authored-by: gaurav <gaurav.chauhan1@rksv.in>
2023-06-18 18:33:15 -07:00
Harrison Chase
5ccebce777 rm pandas from arize (#6392) 2023-06-18 18:33:04 -07:00
matias-biatoz
3b7c4c51d5 Added gpt-3.5-turbo 0613 16k and 16k-0613 pricing (#6287)
@agola11 

Issue
#6193 

I added the new pricing for the new models.

Also, gpt-3.5-turbo pricing is now split into "input" and "output" rates. The
pricing code currently does not support that split.
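
As a rough illustration of what supporting the split would involve, here is a minimal sketch, assuming a hypothetical function `completion_cost` that bills prompt and completion tokens at separate per-1K rates. The rates used in the example call are placeholders, not real prices.

```python
def completion_cost(
    prompt_tokens: int,
    completion_tokens: int,
    input_rate_per_1k: float,
    output_rate_per_1k: float,
) -> float:
    """Cost of one call when input and output tokens are billed separately."""
    input_cost = prompt_tokens / 1000 * input_rate_per_1k
    output_cost = completion_tokens / 1000 * output_rate_per_1k
    return input_cost + output_cost


# Placeholder rates for illustration only; look up the real prices before use.
print(completion_cost(2000, 500, input_rate_per_1k=1.0, output_rate_per_1k=2.0))
```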
2023-06-18 18:32:20 -07:00
Ly Nguyen
1e0af59f69 - Fix pass system_message argument in new feature openai_functions_agent (#6297)
The system_message argument can't be passed; the prompt always shows the default
message "System: You are a helpful AI assistant."
```
system_message = SystemMessage(
    content="You are an AI that provides information to Human regarding documentation."
)
agent = initialize_agent(
    tools,
    llm=openai_llm_chat,
    agent=AgentType.OPENAI_FUNCTIONS,
    system_message=system_message,
    agent_kwargs={
        "system_message": system_message,
    },
    verbose=False,
)
```

2023-06-18 17:54:00 -07:00
georgian
e64bafed3a Fixes typo in Vectara.similarity_search (#6277)
Fixes a simple typo.

@hwchase17
@dev2049

Co-authored-by: Georgian Sarghi <georgian.sarghi@gmail.com>
2023-06-18 17:48:54 -07:00
Ted
112695e4da Iterate through filtered file types instead of all listed files (#6258)
# Iterate through filtered file types instead of all listed files

Fixes https://github.com/hwchase17/langchain/issues/6257

https://github.com/hwchase17/langchain/pull/4926 originally added the
functionality to filter by file type, storing the filtered files in
`_files`

https://github.com/hwchase17/langchain/pull/5220 removed the
functionality when adding code to filter trashed files by using the
`files` variables instead of the `_files` variable.

This PR simply adds the functionality back by using `_files` again.
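
The regression is easy to reproduce in isolation. Below is a minimal sketch, assuming a hypothetical function `ids_to_load` and file records shaped like dictionaries with `id`, `mimeType` and `trashed` keys (the key names are assumptions for illustration). The point is only that the trashed-file check has to iterate over the filtered list `_files`, not over the full listing `files`.

```python
from typing import Dict, List


def ids_to_load(files: List[Dict], file_types: List[str]) -> List[str]:
    """Return ids of files that match `file_types` and are not trashed."""
    # Filter by type first and keep the result in `_files`.
    _files = [f for f in files if f.get("mimeType") in file_types]
    # Iterate over `_files` (the filtered list), not over `files`.
    return [f["id"] for f in _files if not f.get("trashed", False)]


listing = [
    {"id": "a", "mimeType": "application/pdf", "trashed": False},
    {"id": "b", "mimeType": "text/plain", "trashed": False},
    {"id": "c", "mimeType": "application/pdf", "trashed": True},
]
print(ids_to_load(listing, ["application/pdf"]))
```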

#### Who can review?

@hwchase17 - project lead
@eyurtsev
2023-06-18 17:47:58 -07:00
Dhruvil Shah
ba90e3c990 Update web_base.ipynb for guiding purposes (#6248)
To bypass SSL verification errors during fetching, you can include the
`verify=False` parameter. This markdown proves useful, especially for
beginners in the field of web scraping.


Fixes #6079 

#### Who can review?

Tag maintainers/contributors who might be interested:
@hwchase17 
@eyurtsev

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-18 17:47:10 -07:00
Dhruvil Shah
92f05a67a4 Add markdown to specify important arguments (#6246)
To bypass SSL verification errors during web scraping, you can include
the ssl_verify=False parameter along with the headers parameter. This
combination of arguments proves useful, especially for beginners in the
field of web scraping.


Fixes #1829 


#### Who can review?

Tag maintainers/contributors who might be interested:
@hwchase17 @eyurtsev 

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-18 17:47:00 -07:00
ikebo
ca7a44d024 add max_context_size property in BaseOpenAI (#6239)
Hi, I made a small improvement to BaseOpenAI.

I added a max_context_size attribute to BaseOpenAI so that we can get
the max context size directly instead of only getting the maximum token
size of the prompt through the max_tokens_for_prompt method.
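
The relationship between the two values can be sketched as follows. This is a toy class, not the real `BaseOpenAI`: the model names and context sizes in the table are made up, and `max_tokens_for_prompt` here takes a token count rather than a prompt string to keep the example self-contained.

```python
class ToyOpenAI:
    """Toy stand-in used only to illustrate the new attribute."""

    # Hypothetical model names and sizes, for illustration only.
    _CONTEXT_SIZES = {"toy-small": 2048, "toy-large": 8192}

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name

    @property
    def max_context_size(self) -> int:
        """Total context window of the configured model."""
        return self._CONTEXT_SIZES[self.model_name]

    def max_tokens_for_prompt(self, num_prompt_tokens: int) -> int:
        """Tokens left for the completion after the prompt is counted."""
        return self.max_context_size - num_prompt_tokens


llm = ToyOpenAI("toy-small")
print(llm.max_context_size)
print(llm.max_tokens_for_prompt(48))
```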

Who can review?
@hwchase17 @agola11

I followed the [Common
Tasks](c7db9febb0/.github/CONTRIBUTING.md),
and all tests pass.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-18 17:46:35 -07:00
Jan Pawellek
3e3ed8c5c9 Fix LLM types so that they can be loaded from config dicts (#6235)
LLM configurations can be loaded from a Python dict (or JSON file
deserialized as dict) using the
[load_llm_from_config](8e1a7a8646/langchain/llms/loading.py (L12))
function.

However, the type string in the `type_to_cls_dict` lookup dict differs
from the type string defined in some LLM classes. This means that the
LLM object can be saved, but not loaded again, because the type strings
differ.
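
The failure mode can be shown with a small round trip. This is a minimal sketch with a toy class and a hypothetical `load_from_config` function; the `_type` key and the registry dictionary mimic the idea described above but are assumptions of this example, not the library's code.

```python
from typing import Any, Dict


class ToyLLM:
    """Stand-in for an LLM class; `_llm_type` is the string written on save."""

    _llm_type = "toy_llm"

    def __init__(self, temperature: float = 0.0) -> None:
        self.temperature = temperature

    def to_config(self) -> Dict[str, Any]:
        return {"_type": self._llm_type, "temperature": self.temperature}


def load_from_config(config: Dict[str, Any], registry: Dict[str, type]) -> Any:
    """Look up the class by its type string and rebuild the object."""
    params = dict(config)  # copy so the caller's dict is not mutated
    type_name = params.pop("_type")
    if type_name not in registry:
        raise ValueError(f"Loading {type_name} LLM not supported")
    return registry[type_name](**params)


saved = ToyLLM(temperature=0.7).to_config()

# Matching key: the round trip works.
restored = load_from_config(saved, {"toy_llm": ToyLLM})
print(restored.temperature)

# Mismatched key: the object was saved, but it cannot be loaded again.
try:
    load_from_config(saved, {"toy": ToyLLM})
except ValueError as err:
    print(err)
```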
2023-06-18 17:46:22 -07:00
Shu
46782ad79b Fixed an unhandled error that was raised when DynamoDB did not have any chat history. (#6141)

The current version of chat history with DynamoDB doesn't correctly handle the
case where a table has no chat history. This change adds the missing
error handling.
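
A DynamoDB `get_item` response carries no `"Item"` key when nothing matches the requested key, so code that indexes `response["Item"]` directly fails on an empty table. Below is a minimal sketch of the defensive read, using a plain dictionary in place of a real response; the function name and the `History` attribute name are assumptions for illustration.

```python
from typing import Any, Dict, List


def history_from_response(response: Dict[str, Any]) -> List[Any]:
    """Return the stored messages, or an empty list if no item exists yet."""
    item = response.get("Item")
    if item is None:
        # No chat history has been written for this session.
        return []
    return item.get("History", [])


print(history_from_response({}))
print(history_from_response({"Item": {"History": ["hi", "hello"]}}))
```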


Fixes https://github.com/hwchase17/langchain/issues/6088

#### Who can review?

Tag maintainers/contributors who might be interested:

@hwchase17

2023-06-18 17:39:19 -07:00
Cameron Vetter
2286204354 Correct AzureSearch Vector Store not applying search_kwargs when searching (#6132)
Fixes #6131 

Simply passes kwargs forward from similarity_search to helper functions
so that search_kwargs are applied to search as originally intended. See
bug for repro steps.
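
The shape of the fix can be sketched generically. The function names and the `filters` option below are hypothetical and stand in for the public search method and its helper; the only point is that keyword arguments accepted by the outer function must be forwarded to the inner one.

```python
from typing import Any, Dict


def _run_search(query: str, k: int = 4, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical helper; returns the request it would send."""
    return {"query": query, "k": k, **kwargs}


def similarity_search(query: str, k: int = 4, **kwargs: Any) -> Dict[str, Any]:
    # Forward **kwargs; dropping them here silently ignores search options.
    return _run_search(query, k=k, **kwargs)


print(similarity_search("hello", k=2, filters="category eq 'docs'"))
```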

#### Who can review?
  @hwchase17
  @dev2049 

Twitter: poshporcupine
2023-06-18 17:39:06 -07:00
Pierre Dulac
395a2a3724 Fix typo in the CAI critique prompt (#6123)
Very small typo in the Constitutional AI critique default prompt. The
negation "If there is *no* material critique of ..." is used two times,
should be used only on the first one.

Cheers,
Pierre
2023-06-18 17:38:56 -07:00
Hao Chen
38057f0d2e Fix latest clickhouse vector schema change (#6385)

Fixes https://github.com/hwchase17/langchain/issues/6208


#### Who can review?

Tag maintainers/contributors who might be interested:

 
 VectorStores / Retrievers / Memory
  - @dev2049
2023-06-18 17:34:53 -07:00
Davit Buniatyan
1ab9dc8293 [hotfix] Deep Lake fails on newer version due to hardcode (#6383)
Hot Fixes for Deep Lake [would highly appreciate expedited review]

* the deeplake version was hardcoded, so after deeplake was upgraded the
integration failed with a confusing error
* fixed an additional integration test that failed due to the embedding function
* fixed the code understanding links in the docs after the docs were
upgraded
* removed the public parameter from the notebook to make sure the code
understanding notebook works

#### Who can review?
  @hwchase17  @dev2049

---------

Co-authored-by: Davit Buniatyan <d@activeloop.ai>
2023-06-18 17:33:49 -07:00
hp0404
6aa7b04f79 Fix integration tests for Faiss vector store (#6281)
Fixes #5807 (issue)

#### Who can review?

Tag maintainers/contributors who might be interested: @dev2049

2023-06-18 17:25:49 -07:00
Chakib Benziane
ddd518a161 searx_search: updated tools and doc (#6276)
- Allows using the  same wrapper to create multiple tools
```python
wrapper = SearxSearchWrapper(searx_host="**")
github_tool = SearxSearchResults(name="Github",
                            wrapper=wrapper,
                            kwargs = {
                                "engines": ["github"],
                                })

arxiv_tool = SearxSearchResults(name="Arxiv",
                            wrapper=wrapper,
                            kwargs = {
                                "engines": ["arxiv"]
                                })
```

- Updated link to searx documentation

  Agents / Tools / Toolkits
  - @hwchase17
2023-06-18 17:23:12 -07:00
ju-bezdek
e2f36ee608 OpenAI functions dont work with async streaming... #6225 (#6226)
Related to this https://github.com/hwchase17/langchain/issues/6225

Just copied the implementation from `generate` function to `agenerate`
and tested it.

Didn't run any official tests, though.


Fixes #6225


#### Who can review?

Tag maintainers/contributors who might be interested:
  @hwchase17, @agola11

2023-06-18 17:05:16 -07:00
Jan Pawellek
ea6a5b03e0 Fix output final text for HuggingFaceTextGenInference when streaming (#6211)
The LLM integration
[HuggingFaceTextGenInference](https://github.com/hwchase17/langchain/blob/master/langchain/llms/huggingface_text_gen_inference.py)
already has streaming support.

However, when streaming is enabled, it always returns an empty string as
the final output text when the LLM is finished. This is because `text`
is instantiated with an empty string and never updated.

This PR fixes the collection of the final output text by concatenating
new tokens.
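
The bug and the fix can be sketched without any LLM. In this minimal example the function name and the callback are hypothetical; what matters is that `text` starts empty and must be extended with every streamed token, otherwise the final output stays empty.

```python
from typing import Callable, Iterable


def collect_stream(tokens: Iterable[str], on_new_token: Callable[[str], None]) -> str:
    """Forward each token to the callback and return the full text."""
    text = ""
    for token in tokens:
        on_new_token(token)
        # Concatenate the token; without this line `text` would remain "".
        text += token
    return text


seen = []
final_text = collect_stream(["Hel", "lo", "!"], seen.append)
print(final_text)
```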
2023-06-18 17:01:15 -07:00
Tomaz Bratanic
b3bccabc66 Add option to save/load graph cypher QA (#6219)
Similar to https://github.com/hwchase17/langchain/pull/5818

Added the functionality to save/load Graph Cypher QA Chain due to a user
reporting the following error

> raise NotImplementedError("Saving not supported for this chain
type.")\nNotImplementedError: Saving not supported for this chain
type.\n'
2023-06-18 17:00:27 -07:00
Harrison Chase
495128ba95 Harrison/functions docs improvements (#6389)
Co-authored-by: Sumanth Donthula <46747610+sumanthdonthula@users.noreply.github.com>
2023-06-18 16:57:33 -07:00
Leonid Ganeline
c7ca350cd3 Fix class promotion (#6187)
In LangChain, all module classes are enumerated in the `__init__.py`
file of the corresponding module. But some classes were missed and were
not included in the module `__init__.py`.

This PR:
- added the missing classes to the module `__init__.py` files
- sorted the `__init__.py:__all__` variable value (a list of the class
names)
- renamed `langchain.tools.sql_database.tool.QueryCheckerTool` to
`QuerySQLCheckerTool` because it conflicted with
`langchain.tools.spark_sql.tool.QueryCheckerTool`
- changes to `pyproject.toml`:
  - added `pgvector` to `pyproject.toml:extended_testing`
  - added `pandas` to
`pyproject.toml:[tool.poetry.group.test.dependencies]`
- commented out `streamlit` in `callbacks/__init__.py`, because
`streamlit` now requires Python >=3.7, !=3.9.7
- fixed duplicate names in `tools`
- fixed the corresponding unit tests

#### Who can review?
@hwchase17
@dev2049
2023-06-18 16:55:18 -07:00
Harrison Chase
c0c2fd0782 Harrison/zep mem (#6388)
Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>
2023-06-18 16:53:35 -07:00
Harrison Chase
b7159c15cc Harrison/metaphor search fix (#6387)
Co-authored-by: jeffzwang <jeffreyzhiyuanwang@gmail.com>
2023-06-18 16:53:24 -07:00
Harrison Chase
9bf5b0defa Harrison/myscale self query (#6376)
Co-authored-by: Fangrui Liu <fangruil@moqi.ai>
Co-authored-by: 刘 方瑞 <fangrui.liu@outlook.com>
Co-authored-by: Fangrui.Liu <fangrui.liu@ubc.ca>
2023-06-18 16:53:10 -07:00
Harrison Chase
bd8d418a95 Merge branch 'master' of github.com:hwchase17/langchain 2023-06-18 16:45:49 -07:00
Harrison Chase
3a75d59c3d searx - docs 2023-06-18 16:45:42 -07:00
MIDORIBIN
5be465bd86 Fixed PermissionError on windows (#6170)
Fixed PermissionError that occurred when downloading PDF files via http
in BasePDFLoader on windows.

When downloading PDF files via http in BasePDFLoader, NamedTemporaryFile
is used.
On **Windows**, a file created this way cannot be opened a second time
while it is still open. [Python
Doc](https://docs.python.org/3.9/library/tempfile.html#tempfile.NamedTemporaryFile)

So, we created a **temporary directory** with TemporaryDirectory and
placed the downloaded file there.
The temporary directory is deleted in the destructor.
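
The approach can be sketched with the standard library alone. The class name and file name below are hypothetical and the download is replaced by in-memory bytes; this is an illustration of the idea, not the loader's actual code.

```python
import os
import tempfile


class DownloadedFile:
    """Stores downloaded bytes in a file inside its own temporary directory."""

    def __init__(self, data: bytes, suffix: str = ".pdf") -> None:
        # The directory, not the file, is the temporary object, so the file
        # can be closed and reopened freely on every platform.
        self.temp_dir = tempfile.TemporaryDirectory()
        self.file_path = os.path.join(self.temp_dir.name, "download" + suffix)
        with open(self.file_path, "wb") as handle:
            handle.write(data)

    def __del__(self) -> None:
        # Remove the directory and its contents when the object goes away.
        if hasattr(self, "temp_dir"):
            self.temp_dir.cleanup()


pdf = DownloadedFile(b"%PDF-1.4 example bytes")
with open(pdf.file_path, "rb") as handle:  # reopening works because the first handle is closed
    print(len(handle.read()))
```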

Fixes #2698

#### Who can review?

Tag maintainers/contributors who might be interested:

  - @eyurtsev
  - @hwchase17
2023-06-18 16:39:57 -07:00
xleven
4fc7939848 fix link of callbacks on modules page (#6323)
The
[Callbacks](https://python.langchain.com/docs/modules/callbacks/getting_started/)
link on [Modules](https://python.langchain.com/docs/modules/) went to a "Page
Not Found".
2023-06-18 15:08:12 -07:00
Vijay
2b3b4e0f60 Add the ability to run the map_reduce chains process results step as async (#6181)
This will add the ability to add an AsyncCallbackManager (handler) for
the reducer chain, which would be able to stream the tokens via the
`async def on_llm_new_token` callback method



Fixes [#5532](https://github.com/hwchase17/langchain/issues/5532)


 @hwchase17  @agola11 
The following code snippet explains how this change would be used to
enable `reduce_llm` with streaming support in a `map_reduce` chain

I have tested this change and it works for the streaming use-case of
reducer responses. I am happy to share more information if this
solution makes sense.

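```
# AsyncHandler
# (import paths assumed for the LangChain version of this PR)
from typing import Any

from langchain.callbacks.base import AsyncCallbackHandler
from langchain.callbacks.manager import AsyncCallbackManager
from langchain.chains import ConversationalRetrievalChain
from langchain.chains.question_answering import load_qa_chain
from langchain.chat_models import ChatOpenAI
from langchain.llms import OpenAI

# ChatResponse is the application's own response schema, not a LangChain class.


class StreamingLLMCallbackHandler(AsyncCallbackHandler):
    """Callback handler for streaming LLM responses."""

    def __init__(self, websocket):
        self.websocket = websocket

    # This callback method is to be executed in async
    async def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        resp = ChatResponse(sender="bot", message=token, type="stream")
        await self.websocket.send_json(resp.dict())


# Chain
# The wrapper function and its parameters are assumptions: the original
# snippet takes these names from the surrounding application.
async def answer(websocket, manager, vectorstore, question_generator,
                 question, chat_history):
    stream_handler = StreamingLLMCallbackHandler(websocket)
    stream_manager = AsyncCallbackManager([stream_handler])

    streaming_llm = ChatOpenAI(
        streaming=True,
        callback_manager=stream_manager,
        verbose=False,
        temperature=0,
    )
    main_llm = OpenAI(
        temperature=0,
        verbose=False,
    )

    doc_chain = load_qa_chain(
        llm=main_llm,
        reduce_llm=streaming_llm,
        chain_type="map_reduce",
        callback_manager=manager,
    )
    qa_chain = ConversationalRetrievalChain(
        retriever=vectorstore.as_retriever(),
        combine_docs_chain=doc_chain,
        question_generator=question_generator,
        callback_manager=manager,
    )

    # Here `acall` will trigger `acombine_docs` on `map_reduce`, which should
    # then call `_aprocess_result`, which in turn will call
    # `self.combine_document_chain.arun`, hence the async callback will be awaited
    return await qa_chain.acall(
        {"question": question, "chat_history": chat_history}
    )
```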
2023-06-18 13:19:56 -07:00
Alvaro Bartolome
e0dea577ee Extend ArgillaCallbackHandler support (#6153)
Hi again @agola11! 🤗

## What's in this PR?

After playing around with different chains we noticed that some chains
were using different `output_key`s and we were only handling some of them, so
we've extended the support to any output, whether it's a Python list
or a string.
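
One way to picture this kind of handling is the sketch below. It is an illustration under assumptions, not the handler's actual code: the function name is hypothetical, and it simply walks every key of a chain's output dictionary and accepts either a string or a list of values.

```python
from typing import Any, Dict, List


def output_texts(outputs: Dict[str, Any]) -> List[str]:
    """Collect text from every output key, whether the value is a list or a string."""
    texts: List[str] = []
    for value in outputs.values():
        if isinstance(value, list):
            texts.extend(str(entry) for entry in value)
        else:
            texts.append(str(value))
    return texts


print(output_texts({"text": "a single answer"}))
print(output_texts({"answers": ["first", "second"]}))
```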

Kudos to @dvsrepo for spotting this!

---------

Co-authored-by: Daniel Vila Suero <daniel@argilla.io>
2023-06-18 11:18:33 -07:00
Harrison Chase
a8cb9ee013 Harrison/gdrive enhancements (#6375)
Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io>
2023-06-18 11:07:23 -07:00
rafael
ebfffaa38f Guardrails output parser: Pass LLM api for reasking (#6089)

Fixes https://github.com/ShreyaR/guardrails/issues/155 

Enables guardrails reasking by specifying an LLM api in the output
parser.
2023-06-18 10:50:20 -07:00
Davis Chase
ec850e607f bump 203 (#6372) 2023-06-18 09:20:47 -07:00
Lance Martin
370becdfc2 Add self query retriever example with MD header splitting (#6359)
Flesh out the notebook example for `MarkdownHeaderTextSplitter`
2023-06-17 21:40:20 -07:00
Lance Martin
2c97fbabbd Update MD header text splitter notebook (#6339)
Highlight use case for maintaining header groups when splitting.
2023-06-17 13:19:27 -07:00
Harrison Chase
a2bbe3dda4 Harrison/mmr support for opensearch (#6349)
Co-authored-by: Mehmet Öner Yalçın <oneryalcin@gmail.com>
2023-06-17 12:22:37 -07:00
Davis Chase
2eea5d4cb4 Add ignore vercel preview script (#6320)
Skip building the docs preview for any branch that doesn't start
with `__docs__`. Will eventually update to look at code diff directories,
but patching for now.
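
The rule itself is a simple prefix check. The following is a minimal sketch of that decision as a Python predicate; the function name is hypothetical and the actual script may be written differently.

```python
def should_build_preview(branch_name: str) -> bool:
    """Build the docs preview only for branches whose name starts with `__docs__`."""
    return branch_name.startswith("__docs__")


for branch in ["__docs__fix-links", "master", "feature/__docs__"]:
    print(branch, should_build_preview(branch))
```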
2023-06-17 11:17:08 -07:00
Harrison Chase
7a48d9ee82 Merge branch 'master' of github.com:hwchase17/langchain 2023-06-17 11:16:19 -07:00
Kenny
e30fdffd1e Add new openai 0613 model costs (#6110)
Added costs for gpt-4-32k-0613, gpt-4-0613, gpt-3.5-turbo-16k,
gpt-3.5-turbo-0613, and gpt-3.5-turbo-16k-0613 to openai_info callback
based on this [OpenAI
post](https://openai.com/blog/function-calling-and-other-api-updates)

@agola11
2023-06-17 11:11:47 -07:00
Dhruvil Shah
2eec687474 update web_base.py to have verify option (#6107)
We propose an enhancement to the web-based loader initialize method by
introducing a "verify" option. This enhancement addresses the issue of
SSL verification errors encountered on certain web pages. By providing
users with the option to set the verify parameter to False, we offer
greater flexibility and control.

### Fixes #6079 

#### Who can review?
@eyurtsev @hwchase17

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-17 11:10:48 -07:00
Harrison Chase
680d6bbbf8 fix titles in documentation 2023-06-17 11:09:11 -07:00
Nuno Campos
e194dc5306 Make lckwargs private (#6344)
2023-06-17 19:08:25 +01:00
Harrison Chase
8cfb52ddbb fix spelling 2023-06-17 11:06:54 -07:00
zengbo
5d5298087f Custom Anthropic API URL (#6221)
[Feature] Users can customize the Anthropic API URL

#### Who can review?

Tag maintainers/contributors who might be interested:

  Models
  - @hwchase17
  - @agola11
2023-06-17 11:01:29 -07:00
Harrison Chase
61e4a1adf9 Harrison/faiss score (#6341)
Co-authored-by: Frank Stein <16441059+simonfromla@users.noreply.github.com>
Co-authored-by: Sims Juju <sims@Ju.lan>
2023-06-17 11:00:47 -07:00
Harrison Chase
42a28ac1ba Harrison/error zero tools (#6340)
Co-authored-by: Juhee Kim <46583939+juppytt@users.noreply.github.com>
2023-06-17 11:00:35 -07:00
Slawomir Gonet
eef62bf4e9 qdrant: search by vector (#6043)

Added support for `search_by_vector` to the Qdrant vector store.



### Who can review
VectorStores / Retrievers / Memory
- @dev2049
2023-06-17 09:44:28 -07:00
Mark
b7ba7e8a7b Allow GoogleDrive to authenticate via application default credentials on Cloud Run/GCE etc without service key (#6035)
@eyurtsev

The existing GoogleDrive implementation always needs a service account
to be available at the credentials location. When running on GCP
services such as Cloud Run, a service account already exists in the
metadata of the service, so no physical key is necessary. This change
adds a check to see if it is running in such an environment, and uses
that authentication instead.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-17 09:44:17 -07:00
lonestriker
6f36f0f930 Add oobabooga/text-generation-webui support as a llm (#5997)
Add oobabooga/text-generation-webui support as an LLM. Currently
supports text-generation-webui's non-streaming API interface.
Allows users who already have text-gen running to use the same models
with langchain.

#### Before submitting

Simple usage, similar to the existing supported LLMs:

```
from langchain.llms import TextGen
llm = TextGen(model_url = "http://localhost:5000")
```
#### Who can review?

 @hwchase17 - project lead

---------

Co-authored-by: Hien Ngo <Hien.Ngo@adia.ae>
2023-06-17 09:42:15 -07:00
Richy Wang
444ca3f669 Improve AnalyticDB Vector Store implementation without affecting user (#6086)
Hi there:

My earlier implementation of the AnalyticDB VectorStore used two tables to
store documents. Using a single table seems to be a better approach, so this
commit improves the AnalyticDB VectorStore implementation without
affecting user behavior:

1. Streamline the `post_init` behavior by creating a single table with
vector indexing.
2. Update the `add_texts` API for document insertion.
3. Optimize `similarity_search_with_score_by_vector` to retrieve results
directly from the table.
4. Implement `_similarity_search_with_relevance_scores`.
5. Add an `embedding_dimension` parameter to support embedding functions
with different dimensions.

Users can continue using the API as before.
The test cases added earlier are sufficient to cover this commit.
2023-06-17 09:36:31 -07:00
Ja-sonYun
cdd1d78bf2 make modelname_to_contextsize as a staticmethod (#6040)

Fixes #6039


#### Who can review?

Tag maintainers/contributors who might be interested:
@hwchase17 @agola11
2023-06-17 09:13:08 -07:00
Saba Sturua
427551eabf DocArray as a Retriever (#6031)
## DocArray as a Retriever

[DocArray](https://github.com/docarray/docarray) is an open-source tool
for managing your multi-modal data. It offers flexibility to store and
search through your data using various document index backends. This PR
introduces `DocArrayRetriever` - which works with any available backend
and serves as a retriever for Langchain apps.

Also, I added 2 notebooks:
DocArray Backends - intro to all 5 currently supported backends, how to
initialize, index, and use them as a retriever
DocArray Usage - showcasing what additional search parameters you can
pass to create versatile retrievers

Example:
```python
from docarray.index import InMemoryExactNNIndex
from docarray import BaseDoc, DocList
from docarray.typing import NdArray
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.retrievers import DocArrayRetriever


# define document schema
class MyDoc(BaseDoc):
    description: str
    description_embedding: NdArray[1536]


embeddings = OpenAIEmbeddings()
# create documents
descriptions = ["description 1", "description 2"]
desc_embeddings = embeddings.embed_documents(texts=descriptions)
docs = DocList[MyDoc](
    [
        MyDoc(description=desc, description_embedding=embedding)
        for desc, embedding in zip(descriptions, desc_embeddings)
    ]
)

# initialize document index with data
db = InMemoryExactNNIndex[MyDoc](docs)

# create a retriever
retriever = DocArrayRetriever(
    index=db,
    embeddings=embeddings,
    search_field="description_embedding",
    content_field="description",
)

# find the relevant document
doc = retriever.get_relevant_documents("action movies")
print(doc)
```

#### Who can review?

@dev2049

---------

Signed-off-by: jupyterjazz <saba.sturua@jina.ai>
2023-06-17 09:09:33 -07:00
Masafumi Mori
7bb437146d fix links to prompt templates and example selectors (#6332)
Fixes invalid links to prompt templates and example selectors on the
[Prompts](https://python.langchain.com/docs/modules/model_io/prompts/)
page.

#### Before submitting
Just a small note: before opening this PR I tried to run `make docs_clean`
and the other related commands described
[here](https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md#build-documentation-locally),
but it gives me an error:
```bash
langchain % make docs_clean
Traceback (most recent call last):
  File "/Users/masafumi/Downloads/langchain/.venv/bin/make", line 5, in <module>
    from scripts.proto import main
ModuleNotFoundError: No module named 'scripts'
make: *** [docs_clean] Error 1
# Poetry (version 1.5.1)
# Python 3.9.13
```
I couldn't figure out how to fix this, so I didn't run those commands.
The links should work, though.

#### Who can review?

Tag maintainers/contributors who might be interested:
@hwchase17

Similar issue #6323

Co-authored-by: masafumimori <m.masafumimori@outlook.com>
2023-06-17 09:07:14 -07:00
Francisco Ingham
83eea230f3 changed height in the nb example (#6327)
changed height in the example to a more reasonable number (from 9 feet
to 6 feet)
2023-06-17 00:05:48 -07:00
James O'Dwyer
0475d015fe Handle Managed Motorhead Data Key (#6169)
# Handle Managed Motorhead Data Key
Managed Motorhead returns a payload with a `data` key. We need to
handle this to properly access messages from the server.
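
A minimal sketch of the unwrapping described above, assuming the managed service nests the usual response body under `data` while a self-hosted server returns it at the top level. The function name and the `messages` key are illustrative assumptions, not taken from the PR:

```python
from typing import Any, Dict, List


def extract_messages(payload: Dict[str, Any]) -> List[Any]:
    """Return the message list from a Motorhead-style response.

    Hypothetical helper: if the payload is wrapped in a `data` key,
    unwrap it first; otherwise use the payload as-is.
    """
    body = payload.get("data", payload)
    return body.get("messages", [])


# Both response shapes yield the same messages.
wrapped = {"data": {"messages": [{"role": "Human", "content": "hi"}]}}
plain = {"messages": [{"role": "Human", "content": "hi"}]}
assert extract_messages(wrapped) == extract_messages(plain)
```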
2023-06-16 20:36:18 -07:00
Luke Stanley
364f8e7b5d Better Entity Memory code documentation (#6318)
Just adds some comments and docstring improvements.

There was some behaviour that was quite unclear to me at first like:
- "when do things get updated?"
- "why are there only entity names and no summaries?"
- "why do the entity names disappear?" 

These points should now be much clearer to readers of the code.

I am lukestanley on Twitter.
2023-06-16 18:08:44 -07:00
Harrison Chase
af18413d97 Harrison/deeplake new features (#6263)
Co-authored-by: adilkhan <adilkhan.sarsen@nu.edu.kz>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-16 17:53:55 -07:00
Davis Chase
6640293087 fix eval guide links (#6319) 2023-06-16 17:53:46 -07:00
ljeagle
ad324a39ae Improve the performance of add_texts interface and upgrade the AwaDB from 0.3.2 to 0.3.3 (#6316)
1. Changed the implementation of the `add_texts` interface for the AwaDB
vector store to improve performance
2. Upgraded AwaDB from 0.3.2 to 0.3.3

---------

Co-authored-by: vincent <awadb.vincent@gmail.com>
2023-06-16 16:50:01 -07:00
Davis Chase
24b2af5218 nit (#6305) 2023-06-16 16:21:27 -07:00
Pierre Alexandre SCHEMBRI
9ca11c06b7 Fixes #6282 (#6283)
Fixes #6282 

One-line fix for default HTTP headers not being passed by `LLMRequestsChain`
2023-06-16 16:21:01 -07:00
Davis Chase
23cdebddc4 Del linkcheck readme (#6317) 2023-06-16 16:18:45 -07:00
Brigit Murtaugh
ccd916babe Update dev container (#6189)
Fixes https://github.com/hwchase17/langchain/issues/6172

As described in https://github.com/hwchase17/langchain/issues/6172, I'd
love to help update the dev container in this project.

**Summary of changes:**
- Dev container now builds (the current container in this repo won't
  build for me)
- Dockerfile updates
  - Update image to our [currently-maintained Python
    image](https://github.com/devcontainers/images/tree/main/src/python/.devcontainer)
    (`mcr.microsoft.com/devcontainers/python`) rather than the deprecated
    image from vscode-dev-containers
  - Move Dockerfile to root of repo - in order for `COPY` to work
    properly, it needs the files (in this case, `pyproject.toml` and
    `poetry.toml`) in the same directory
- devcontainer.json updates
  - Removed `customizations` and `remoteUser` since they should be covered
    by the updated image in the Dockerfile
  - Update comments
- Update docker-compose.yaml to properly point to updated Dockerfile
- Add a .gitattributes to avoid line ending conversions, which can
  result in hundreds of pending changes
  ([info](https://code.visualstudio.com/docs/devcontainers/tips-and-tricks#_resolving-git-line-ending-issues-in-containers-resulting-in-many-modified-files))
- Add a README in the .devcontainer folder and info on the dev container
  in the contributing.md

**Outstanding questions:**
- Is it expected for `poetry install` to take some time? It takes about
30 minutes for this dev container to finish building in a Codespace, but
a user should only have to experience this once. Through some online
investigation, this doesn't seem unusual
- Versions of poetry newer than 1.3.2 failed every time - based on some
of the guidance in contributing.md and other online resources, it seemed
changing poetry versions might be a good solution. 1.3.2 is from Jan
2023

---------

Co-authored-by: bamurtaugh <brmurtau@microsoft.com>
Co-authored-by: Samruddhi Khandale <samruddhikhandale@github.com>
2023-06-16 15:42:14 -07:00
Davis Chase
03b5891cf7 more redirect (#6314) 2023-06-16 14:43:59 -07:00
Davis Chase
eaee492dbc basic redirect (#6309) 2023-06-16 13:39:58 -07:00
Davis Chase
d2243757a3 update readme (#6304) 2023-06-16 12:27:16 -07:00
Davis Chase
2f47e5c766 update api link (#6303) 2023-06-16 12:18:17 -07:00
Davis Chase
d558bcfad8 rm ignore_vercel (#6302) 2023-06-16 12:06:58 -07:00
Davis Chase
87e502c6bc Doc refactor (#6300)
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-16 11:52:56 -07:00
Harrison Chase
94c82a189d bump to 202 (#6262) 2023-06-16 06:52:36 -07:00
hp0404
b01cf0dd54 ArxivAPIWrapper - doc_content_chars_max (#6063)
This PR refactors the ArxivAPIWrapper class making
`doc_content_chars_max` parameter optional. Additionally, tests have
been added to ensure the functionality of the doc_content_chars_max
parameter.
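
As a rough illustration of an optional character limit, here is a sketch under the assumption that `None` means "do not truncate". The helper name is hypothetical and this is not the wrapper's actual code:

```python
from typing import Optional


def truncate_content(text: str, doc_content_chars_max: Optional[int] = None) -> str:
    """Cut `text` to at most `doc_content_chars_max` characters.

    When the limit is None, the text is returned unchanged.
    """
    if doc_content_chars_max is None:
        return text
    return text[:doc_content_chars_max]
```

Making the limit explicit as `Optional[int]` keeps the "no limit" case distinct from a limit of zero, which would return an empty string.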

Fixes #6027 (issue)
2023-06-15 22:16:42 -07:00
Daniel King
a9b97aa6f4 Update output format of MosaicML endpoint to be more flexible (#6060)
There will likely be another change or two coming over the next couple
weeks as we stabilize the API, but putting this one in now which just
makes the integration a bit more flexible with the response output
format.

```
(langchain) danielking@MML-1B940F4333E2 langchain % pytest tests/integration_tests/llms/test_mosaicml.py tests/integration_tests/embeddings/test_mosaicml.py 
=================================================================================== test session starts ===================================================================================
platform darwin -- Python 3.10.11, pytest-7.3.1, pluggy-1.0.0
rootdir: /Users/danielking/github/langchain
configfile: pyproject.toml
plugins: asyncio-0.20.3, mock-3.10.0, dotenv-0.5.2, cov-4.0.0, anyio-3.6.2
asyncio: mode=strict
collected 12 items                                                                                                                                                                        

tests/integration_tests/llms/test_mosaicml.py ......                                                                                                                                [ 50%]
tests/integration_tests/embeddings/test_mosaicml.py ......                                                                                                                          [100%]

=================================================================================== slowest 5 durations ===================================================================================
4.76s call     tests/integration_tests/llms/test_mosaicml.py::test_retry_logic
4.74s call     tests/integration_tests/llms/test_mosaicml.py::test_mosaicml_llm_call
4.13s call     tests/integration_tests/llms/test_mosaicml.py::test_instruct_prompt
0.91s call     tests/integration_tests/llms/test_mosaicml.py::test_short_retry_does_not_loop
0.66s call     tests/integration_tests/llms/test_mosaicml.py::test_mosaicml_extra_kwargs
=================================================================================== 12 passed in 19.70s ===================================================================================
```

#### Who can review?

  @hwchase17
  @dev2049
2023-06-15 22:15:39 -07:00
JaysonAlbert
50d9c7d5a4 Fix: change the chatgpt plugin retriever metadata format (#5920)
The current implementation uses the doc itself as the metadata, but the
document returned by the ChatGPT plugin retriever already has a `metadata`
field, so it's better to use that instead.

The original code throws the following exception when using
`RetrievalQAWithSourcesChain`, because the expected `source` metadata is
missing:

```python
Exception has occurred: ValueError       (note: full exception trace is shown but execution is paused at: _run_module_as_main)
Document prompt requires documents to have metadata variables: ['source']. Received document with missing metadata: ['source'].
  File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/base.py", line 27, in format_document
    raise ValueError(
  File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/stuff.py", line 65, in <listcomp>
    doc_strings = [format_document(doc, self.document_prompt) for doc in docs]
  File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/stuff.py", line 65, in _get_inputs
    doc_strings = [format_document(doc, self.document_prompt) for doc in docs]
  File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/stuff.py", line 85, in combine_docs
    inputs = self._get_inputs(docs, **kwargs)
  File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/base.py", line 84, in _call
    output, extra_return_dict = self.combine_docs(
  File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/base.py", line 140, in __call__
    raise e
```

Additionally, the `metadata` field returned by the ChatGPT plugin retriever
has these fields by default:
```json
{
    "source":  "file",   //email, file or chat
    "source_id": "filename.docx", // the filename
    "url": "", 
    ...
}
```
So we should rename `source_id` to `source` in the LangChain metadata.

```python
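# Prefer the plugin's own metadata; fall back to the whole dict if absent.
metadata = d.pop("metadata", d)
# Rename `source_id` to `source` so source-aware chains can find it.
if metadata.get("source_id"):
    metadata["source"] = metadata.pop("source_id")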
```

#### Who can review?
@dev2049

---------

Co-authored-by: wangjie <wangjie@htffund.com>
2023-06-15 22:04:45 -07:00
Harrison Chase
e67b26eee9 Harrison/openai functions (#6261)
Co-authored-by: Francisco Ingham <24279597+fpingham@users.noreply.github.com>
2023-06-15 21:54:39 -07:00
Harrison Chase
6aafb46807 Harrison/openai functions (#6223)
Co-authored-by: Francisco Ingham <24279597+fpingham@users.noreply.github.com>
2023-06-15 21:43:33 -07:00
Zander Chase
bc9b8c8239 Improve Error Message for failed callback (#6247)
Include the handler class name in the warning
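
A small sketch of what naming the handler in the warning can look like. The function names and the message wording are assumptions for illustration, not the code from this PR:

```python
import logging

logger = logging.getLogger(__name__)


def format_callback_warning(handler: object, event_name: str, error: Exception) -> str:
    """Build a warning that names the failing handler's class."""
    return f"Error in {handler.__class__.__name__}.{event_name} callback: {error}"


def safe_dispatch(handlers: list, event_name: str, *args: object) -> None:
    """Call `event_name` on each handler, logging failures instead of raising."""
    for handler in handlers:
        try:
            getattr(handler, event_name)(*args)
        except Exception as error:  # a failing handler must not break the run
            logger.warning(format_callback_warning(handler, event_name, error))
```

With several handlers registered, the class name makes it clear which one failed without needing a full traceback.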
2023-06-15 19:18:37 -07:00
Alon Roth
0013256e81 Support chat history persistence in AutoGPT (#5716)
**Short Description**
Added a new argument to the AutoGPT class that allows persisting the chat
history to a file.

**Changes**
1. Removed the `self.full_message_history: List[BaseMessage] = []` attribute
2. Replaced it with `chat_history_memory`, which can take any subclass
of `BaseChatMessageHistory`
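
The idea behind the change can be sketched without LangChain. Everything below (`ChatHistory`, `InMemoryHistory`, `FileHistory`, `Agent`) is a hypothetical stand-in for the real classes; only the `chat_history_memory` argument name comes from the PR description:

```python
import json
from abc import ABC, abstractmethod
from pathlib import Path
from typing import List, Optional


class ChatHistory(ABC):
    """Stand-in for a pluggable message-history interface (hypothetical)."""

    @abstractmethod
    def add_message(self, message: str) -> None: ...

    @abstractmethod
    def messages(self) -> List[str]: ...


class InMemoryHistory(ChatHistory):
    """Default behaviour: history is lost when the process exits."""

    def __init__(self) -> None:
        self._messages: List[str] = []

    def add_message(self, message: str) -> None:
        self._messages.append(message)

    def messages(self) -> List[str]:
        return list(self._messages)


class FileHistory(ChatHistory):
    """Stores messages as a JSON list so they survive restarts."""

    def __init__(self, path: str) -> None:
        self._path = Path(path)
        if not self._path.exists():
            self._path.write_text("[]")

    def add_message(self, message: str) -> None:
        stored = self.messages()
        stored.append(message)
        self._path.write_text(json.dumps(stored))

    def messages(self) -> List[str]:
        return json.loads(self._path.read_text())


class Agent:
    """Toy agent that delegates history storage instead of owning a list."""

    def __init__(self, chat_history_memory: Optional[ChatHistory] = None) -> None:
        if chat_history_memory is None:
            chat_history_memory = InMemoryHistory()
        self.chat_history_memory = chat_history_memory

    def record(self, message: str) -> None:
        self.chat_history_memory.add_message(message)
```

Passing the history in as a dependency means the agent code stays the same whether messages are kept in memory, in a file, or in some other store.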

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-15 17:49:03 -07:00
Martin Antos
1913320cbe Feature/add acreom loader (#5780)
Adds a new loader for [acreom](https://acreom.com) vaults. It's based on
the Obsidian loader, with some additional text processing for
acreom-specific markdown elements.

 @eyurtsev please take a look!

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
2023-06-15 11:53:00 -07:00
Zander Chase
ae76e473e1 Add Tags for LLMs (#6229)
- [x] Add tracing tags to LLMs + Chat Models (both inheritable and
local)
- [x] Add tags for the run_on_dataset helper function(s)
2023-06-15 11:24:11 -07:00
Harrison Chase
8e1a7a8646 bump version to 201 (#6233) 2023-06-15 08:28:47 -07:00
Harrison Chase
e82687ddf4 Harrison/use functions agent (#6185)
Co-authored-by: Francisco Ingham <24279597+fpingham@users.noreply.github.com>
2023-06-15 08:18:50 -07:00
Ryo Kanazawa
7d2b946d0b Fix typo pandocs to pandoc (#6203)
Fixes https://github.com/hwchase17/langchain/issues/6204

### Context

A typo: `pandocs` should be `pandoc`.

#### Who can review?
@hwchase17
2023-06-15 08:18:27 -07:00
Kyle Roth
c7db9febb0 count tokens for new OpenAI model versions (#6195)
Trying to call `ChatOpenAI.get_num_tokens_from_messages` returns the
following error for the newly announced models `gpt-3.5-turbo-0613` and
`gpt-4-0613`:

```
NotImplementedError: get_num_tokens_from_messages() is not presently implemented for model gpt-3.5-turbo-0613.See https://github.com/openai/openai-python/blob/main/chatml.md for information on how messages are converted to tokens.
```

This adds support for counting tokens for those models, by counting
tokens the same way they're counted for the previous versions of
`gpt-3.5-turbo` and `gpt-4`.
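
One way to treat dated snapshots like their base models is prefix matching on the model name. The sketch below is an assumption about how that could be done, not the code from this PR, and the function name is hypothetical:

```python
def resolve_token_counting_family(model: str) -> str:
    """Map a versioned model name to the family whose counting rules apply."""
    if model.startswith("gpt-3.5-turbo"):
        return "gpt-3.5-turbo"
    if model.startswith("gpt-4"):
        return "gpt-4"
    raise NotImplementedError(
        f"Token counting is not implemented for model {model}."
    )
```

Prefix matching avoids having to list each snapshot by name, at the cost of assuming that future snapshots keep the same message format.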

#### reviewers

  - @hwchase17
  - @agola11
2023-06-15 06:16:03 -07:00
xu0o0
7ad13cdbdb feat: add content_format param to ConfluenceLoader.load() (#5922)
The Confluence API supports different formats for page content. The storage
format is the raw XML representation used for storage. The view format is the
HTML representation for viewing, with macros rendered as though the page were
viewed by users.

Add the `content_format` parameter to `ConfluenceLoader.load()` to
specify the content format; it is set to `ContentFormat.STORAGE` by
default.
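
To make the two formats concrete, here is a self-contained sketch of an enum that selects which body representation to read. Only the name `ContentFormat.STORAGE` appears in the PR description; the enum values, the `get_content` helper, and the shape of the `page` dictionary are assumptions for illustration:

```python
from enum import Enum
from typing import Any, Dict


class ContentFormat(str, Enum):
    """Page body representations; values name the body section to read."""

    STORAGE = "body.storage"
    VIEW = "body.view"

    def get_content(self, page: Dict[str, Any]) -> str:
        # "body.storage" selects page["body"]["storage"]["value"]
        _, representation = self.value.split(".")
        return page["body"][representation]["value"]


# Hypothetical page payload carrying both representations.
page = {
    "body": {
        "storage": {"value": "<ac:structured-macro ac:name='toc'/>"},
        "view": {"value": "<div class='toc'>Contents</div>"},
    }
}
raw_xml = ContentFormat.STORAGE.get_content(page)
rendered_html = ContentFormat.VIEW.get_content(page)
```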

#### Who can review?

Tag maintainers/contributors who might be interested: @eyurtsev

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-14 16:56:28 -07:00
0xJordan
c5a46e7435 feat: Add support for the Solidity language (#6054)
## Add Solidity programming language support for code splitter.

Twitter: @0xjord4n_

#### Who can review?

Tag maintainers/contributors who might be interested:
@hwchase17

2023-06-14 14:25:02 -07:00
Nuno Campos
17c4ec4812 Add docs for tags (#6155)
2023-06-14 14:01:58 -07:00
thiswillbeyourgithub
4a649e3b14 typo: 'following following' to 'following' (#6163)
Co-authored-by: thiswillbeyourgithub <github@32mail.33mail.com>
2023-06-14 10:58:47 -07:00
Maciej Bryński
8a44c879c6 Update readthedocs_documentation.ipynb (#6148)
Minor fix in documentation. 
Change URL in wget call to proper one.
2023-06-14 07:21:48 -07:00
Zander Chase
e0e3ef1c57 Update Name (#6136) 2023-06-13 22:25:36 -07:00
Zander Chase
4555ad5d1f Add Run Collector Callback (#6133)
Add a callback handler that can collect nested run objects. Useful for
evaluation.
2023-06-13 22:17:37 -07:00
Harrison Chase
6ac120f299 bump ver to 200 (#6130) 2023-06-13 19:33:51 -07:00
Harrison Chase
e41f0b341c add functions agent (#6113) 2023-06-13 18:51:01 -07:00
Zander Chase
b3b155d488 Return session name in runner response (#6112)
Makes it easier to then run evals w/o thinking about specifying a
session
2023-06-13 16:59:43 -07:00
Harrison Chase
e74733ab9e support streaming for functions (#6115) 2023-06-13 15:26:26 -07:00
Nuno Campos
11ab0be11a Add support for tags (#5898)
2023-06-13 12:30:59 -07:00
Harrison Chase
1281fdf0f2 Harrison/notebook functions (#6103) 2023-06-13 10:52:54 -07:00
1727 changed files with 95199 additions and 46901 deletions

37
.devcontainer/README.md Normal file
View File

@@ -0,0 +1,37 @@
# Dev container
This project includes a [dev container](https://containers.dev/), which lets you use a container as a full-featured dev environment.
You can use the dev container configuration in this folder to build and run the app without needing to install any of its tools locally! You can use it in [GitHub Codespaces](https://github.com/features/codespaces) or the [VS Code Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers).
## GitHub Codespaces
[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/hwchase17/langchain)
You may use the button above, or follow these steps to open this repo in a Codespace:
1. Click the **Code** drop-down menu at the top of https://github.com/hwchase17/langchain.
1. Click on the **Codespaces** tab.
1. Click **Create codespace on master** .
For more info, check out the [GitHub documentation](https://docs.github.com/en/free-pro-team@latest/github/developing-online-with-codespaces/creating-a-codespace#creating-a-codespace).
## VS Code Dev Containers
[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/hwchase17/langchain)
If you already have VS Code and Docker installed, you can use the button above to get started. This will cause VS Code to automatically install the Dev Containers extension if needed, clone the source code into a container volume, and spin up a dev container for use.
You can also follow these steps to open this repo in a container using the VS Code Dev Containers extension:
1. If this is your first time using a development container, please ensure your system meets the pre-reqs (i.e. have Docker installed) in the [getting started steps](https://aka.ms/vscode-remote/containers/getting-started).
2. Open a locally cloned copy of the code:
- Clone this repository to your local filesystem.
- Press <kbd>F1</kbd> and select the **Dev Containers: Open Folder in Container...** command.
- Select the cloned copy of this folder, wait for the container to start, and try things out!
You can learn more in the [Dev Containers documentation](https://code.visualstudio.com/docs/devcontainers/containers).
## Tips and tricks
* If you are working with the same repository folder in a container and Windows, you'll want consistent line endings (otherwise you may see hundreds of changes in the SCM view). The `.gitattributes` file in the root of this repo will disable line ending conversion and should prevent this. See [tips and tricks](https://code.visualstudio.com/docs/devcontainers/tips-and-tricks#_resolving-git-line-ending-issues-in-containers-resulting-in-many-modified-files) for more info.
* If you'd like to review the contents of the image used in this dev container, you can check it out in the [devcontainers/images](https://github.com/devcontainers/images/tree/main/src/python) repo.

View File

@@ -1,24 +1,26 @@
// For format details, see https://aka.ms/devcontainer.json. For config options, see the
// README at: https://github.com/devcontainers/templates/tree/main/src/docker-existing-dockerfile
// README at: https://github.com/devcontainers/templates/tree/main/src/docker-existing-docker-compose
{
"dockerComposeFile": "./docker-compose.yaml",
"service": "langchain",
"workspaceFolder": "/workspaces/langchain",
// Name for the dev container
"name": "langchain",
"customizations": {
"vscode": {
"extensions": [
"ms-python.python"
],
"settings": {
"python.defaultInterpreterPath": "/home/vscode/langchain-py-env/bin/python3.11"
}
}
},
// Features to add to the dev container. More info: https://containers.dev/features.
"features": {},
// Point to a Docker Compose file
"dockerComposeFile": "./docker-compose.yaml",
// Required when using Docker Compose. The name of the service to connect to once running
"service": "langchain",
// The optional 'workspaceFolder' property is the path VS Code should open by default when
// connected. This is typically a file mount in .devcontainer/docker-compose.yml
"workspaceFolder": "/workspaces/${localWorkspaceFolderBasename}",
// Prevent the container from shutting down
"overrideCommand": true
// Features to add to the dev container. More info: https://containers.dev/features
// "features": {
// "ghcr.io/devcontainers-contrib/features/poetry:2": {}
// }
// Use 'forwardPorts' to make a list of ports inside the container available locally.
// "forwardPorts": [],
@@ -26,8 +28,9 @@
// Uncomment the next line to run commands after the container is created.
// "postCreateCommand": "cat /etc/os-release",
// Uncomment to connect as an existing user other than the container default. More info: https://aka.ms/dev-containers-non-root.
// "remoteUser": "devcontainer"
"remoteUser": "vscode",
"overrideCommand": true
// Configure tool-specific properties.
// "customizations": {},
// Uncomment to connect as root instead. More info: https://aka.ms/dev-containers-non-root.
// "remoteUser": "root"
}

View File

@@ -2,10 +2,11 @@ version: '3'
services:
langchain:
build:
dockerfile: .devcontainer/Dockerfile
context: ../
dockerfile: dev.Dockerfile
context: ..
volumes:
- ../:/workspaces/langchain
# Update this to wherever you want VS Code to mount the folder of your project
- ..:/workspaces:cached
networks:
- langchain-network
# environment:

3
.gitattributes vendored Normal file
View File

@@ -0,0 +1,3 @@
* text=auto eol=lf
*.{cmd,[cC][mM][dD]} text eol=crlf
*.{bat,[bB][aA][tT]} text eol=crlf

View File

@@ -59,6 +59,8 @@ we do not want these to get in the way of getting good code into the codebase.
## 🚀 Quick Start
> **Note:** You can run this repository locally (which is described below) or in a [development container](https://containers.dev/) (which is described in the [.devcontainer folder](https://github.com/hwchase17/langchain/tree/master/.devcontainer)).
This project uses [Poetry](https://python-poetry.org/) as a dependency manager. Check out Poetry's [documentation on how to install it](https://python-poetry.org/docs/#installation) on your system before proceeding.
❗Note: If you use `Conda` or `Pyenv` as your environment / package manager, avoid dependency conflicts by doing the following first:

View File

@@ -1,56 +1,26 @@
<!--
Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution.
<!-- Thank you for contributing to LangChain!
Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change.
Replace this comment with:
- Description: a description of the change,
- Issue: the issue # it fixes (if applicable),
- Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer (see below),
- Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out!
After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost.
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on network access,
2. an example notebook showing its use.
Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle!
-->
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: @baskaryan
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
- Models / Prompts: @hwchase17, @baskaryan
- Memory: @hwchase17
- Agents / Tools / Toolkits: @hinthornw
- Tracing / Callbacks: @agola11
- Async: @agola11
<!-- Remove if not applicable -->
Fixes # (issue)
#### Before submitting
<!-- If you're adding a new integration, please include:
1. a test for the integration - favor unit tests that does not rely on network access.
2. an example notebook showing its use
See contribution guidelines for more information on how to write tests, lint
etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
#### Who can review?
Tag maintainers/contributors who might be interested:
<!-- For a quicker response, figure out the right person to tag with @
@hwchase17 - project lead
Tracing / Callbacks
- @agola11
Async
- @agola11
DataLoaders
- @eyurtsev
Models
- @hwchase17
- @agola11
Agents / Tools / Toolkits
- @hwchase17
VectorStores / Retrievers / Memory
- @dev2049
If no one reviews your PR within a few days, feel free to @-mention the same people again.
See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

View File

@@ -1,38 +0,0 @@
name: linkcheck
on:
push:
branches: [master]
pull_request:
paths:
- 'docs/**'
env:
POETRY_VERSION: "1.4.2"
jobs:
build:
runs-on: ubuntu-latest
strategy:
matrix:
python-version:
- "3.11"
steps:
- uses: actions/checkout@v3
- name: Install poetry
run: |
pipx install poetry==$POETRY_VERSION
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
cache: poetry
- name: Install dependencies
run: |
poetry install --with docs
- name: Build the docs
run: |
make docs_build
- name: Analyzing the docs with linkcheck
run: |
make docs_linkcheck

14
.gitignore vendored
View File

@@ -73,6 +73,7 @@ instance/
# Sphinx documentation
docs/_build/
docs/docs/_build/
# PyBuilder
target/
@@ -152,4 +153,15 @@ data_map*
\[('_type', 'fake'), ('stop', None)]
# Replit files
*replit*
*replit*
node_modules
docs/.yarn/
docs/node_modules/
docs/.docusaurus/
docs/.cache-loader/
docs/_dist
docs/api_reference/_build
docs/docs_skeleton/build
docs/docs_skeleton/node_modules
docs/docs_skeleton/yarn.lock

4
.gitmodules vendored Normal file

@@ -0,0 +1,4 @@
[submodule "docs/_docs_skeleton"]
path = docs/_docs_skeleton
url = https://github.com/langchain-ai/langchain-shared-docs
branch = main


@@ -9,10 +9,13 @@ build:
os: ubuntu-22.04
tools:
python: "3.11"
jobs:
pre_build:
- python docs/api_reference/create_api_rst.py
# Build documentation in the docs/ directory with Sphinx
sphinx:
configuration: docs/conf.py
configuration: docs/api_reference/conf.py
# If using Sphinx, optionally build your docs in additional formats such as PDF
# formats:
@@ -23,4 +26,4 @@ python:
install:
- requirements: docs/requirements.txt
- method: pip
path: .
path: .


@@ -10,6 +10,9 @@ coverage:
clean: docs_clean
docs_compile:
poetry run nbdoc_build --srcdir $(srcdir)
docs_build:
cd docs && poetry run make html


@@ -5,7 +5,6 @@
[![Release Notes](https://img.shields.io/github/release/hwchase17/langchain)](https://github.com/hwchase17/langchain/releases)
[![lint](https://github.com/hwchase17/langchain/actions/workflows/lint.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/lint.yml)
[![test](https://github.com/hwchase17/langchain/actions/workflows/test.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/test.yml)
[![linkcheck](https://github.com/hwchase17/langchain/actions/workflows/linkcheck.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/linkcheck.yml)
[![Downloads](https://static.pepy.tech/badge/langchain/month)](https://pepy.tech/project/langchain)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI)](https://twitter.com/langchainai)
@@ -36,22 +35,22 @@ This library aims to assist in the development of those types of applications. C
**❓ Question Answering over specific documents**
- [Documentation](https://langchain.readthedocs.io/en/latest/use_cases/question_answering.html)
- [Documentation](https://python.langchain.com/docs/use_cases/question_answering/)
- End-to-end Example: [Question Answering over Notion Database](https://github.com/hwchase17/notion-qa)
**💬 Chatbots**
- [Documentation](https://langchain.readthedocs.io/en/latest/use_cases/chatbots.html)
- [Documentation](https://python.langchain.com/docs/use_cases/chatbots/)
- End-to-end Example: [Chat-LangChain](https://github.com/hwchase17/chat-langchain)
**🤖 Agents**
- [Documentation](https://langchain.readthedocs.io/en/latest/modules/agents.html)
- [Documentation](https://python.langchain.com/docs/modules/agents/)
- End-to-end Example: [GPT+WolframAlpha](https://huggingface.co/spaces/JavaFXpert/Chat-GPT-LangChain)
## 📖 Documentation
Please see [here](https://langchain.readthedocs.io/en/latest/?) for full documentation on:
Please see [here](https://python.langchain.com) for full documentation on:
- Getting started (installation, setting up the environment, simple examples)
- How-To examples (demos, integrations, helper functions)
@@ -87,7 +86,7 @@ Memory refers to persisting state between calls of a chain/agent. LangChain prov
[BETA] Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.
For more information on these concepts, please see our [full documentation](https://langchain.readthedocs.io/en/latest/).
For more information on these concepts, please see our [full documentation](https://python.langchain.com).
## 💁 Contributing


@@ -1,15 +1,15 @@
# This is a Dockerfile for Developer Container
# This is a Dockerfile for the Development Container
# Use the Python base image
ARG VARIANT="3.11-bullseye"
FROM mcr.microsoft.com/vscode/devcontainers/python:0-${VARIANT} AS langchain-dev-base
FROM mcr.microsoft.com/devcontainers/python:0-${VARIANT} AS langchain-dev-base
USER vscode
# Define the version of Poetry to install (default is 1.4.2)
# Define the directory of python virtual environment
ARG PYTHON_VIRTUALENV_HOME=/home/vscode/langchain-py-env \
POETRY_VERSION=1.4.2
POETRY_VERSION=1.3.2
ENV POETRY_VIRTUALENVS_IN_PROJECT=false \
POETRY_NO_INTERACTION=true
@@ -35,8 +35,7 @@ FROM langchain-dev-base AS langchain-dev-dependencies
ARG PYTHON_VIRTUALENV_HOME
# Copy only the dependency files for installation
COPY pyproject.toml poetry.lock poetry.toml ./
COPY pyproject.toml poetry.toml ./
# Install the Poetry dependencies (this layer will be cached as long as the dependencies don't change)
RUN poetry install --no-interaction --no-ansi --with dev,test,docs
RUN poetry install --no-interaction --no-ansi --with dev,test,docs

12
docs/.local_build.sh Executable file

@@ -0,0 +1,12 @@
mkdir _dist
cp -r {docs_skeleton,snippets} _dist
mkdir -p _dist/docs_skeleton/static/api_reference
cd api_reference
poetry run make html
cp -r _build/* ../_dist/docs_skeleton/static/api_reference
cd ..
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
poetry run nbdoc_build
yarn install
yarn start


@@ -1,57 +0,0 @@
# Tracing
By enabling tracing in your LangChain runs, you'll be able to more effectively visualize, step through, and debug your chains and agents.
First, you should install tracing and set up your environment properly.
You can use either a locally hosted version of this (uses Docker) or a cloud hosted version (in closed alpha).
If you're interested in using the hosted platform, please fill out the form [here](https://forms.gle/tRCEMSeopZf6TE3b6).
- [Locally Hosted Setup](../tracing/local_installation.md)
- [Cloud Hosted Setup](../tracing/hosted_installation.md)
## Tracing Walkthrough
When you first access the UI, you should see a page with your tracing sessions.
An initial one "default" should already be created for you.
A session is just a way to group traces together.
If you click on a session, it will take you to a page with no recorded traces that says "No Runs."
You can create a new session with the new session form.
![](../tracing/homepage.png)
If we click on the `default` session, we can see that to start we have no traces stored.
![](../tracing/default_empty.png)
If we now start running chains and agents with tracing enabled, we will see data show up here.
To do so, we can run [this notebook](../tracing/agent_with_tracing.ipynb) as an example.
After running it, we will see an initial trace show up.
![](../tracing/first_trace.png)
From here we can explore the trace at a high level by clicking on the arrow to show nested runs.
We can keep on clicking further and further down to explore deeper and deeper.
![](../tracing/explore.png)
We can also click on the "Explore" button of the top level run to dive even deeper.
Here, we can see the inputs and outputs in full, as well as all the nested traces.
![](../tracing/explore_trace.png)
We can keep on exploring each of these nested traces in more detail.
For example, here is the lowest level trace with the exact inputs/outputs to the LLM.
![](../tracing/explore_llm.png)
## Changing Sessions
1. To initially record traces to a session other than `"default"`, you can set the `LANGCHAIN_SESSION` environment variable to the name of the session you want to record to:
```python
import os
os.environ["LANGCHAIN_TRACING"] = "true"
os.environ["LANGCHAIN_SESSION"] = "my_session" # Make sure this session actually exists. You can create a new session in the UI.
```
2. To switch sessions mid-script or mid-notebook, do NOT set the `LANGCHAIN_SESSION` environment variable. Instead: `langchain.set_tracing_callback_manager(session_name="my_session")`


@@ -1,90 +0,0 @@
# YouTube
This is a collection of `LangChain` videos on `YouTube`.
### ⛓️[Official LangChain YouTube channel](https://www.youtube.com/@LangChain)⛓️
### Introduction to LangChain with Harrison Chase, creator of LangChain
- [Building the Future with LLMs, `LangChain`, & `Pinecone`](https://youtu.be/nMniwlGyX-c) by [Pinecone](https://www.youtube.com/@pinecone-io)
- [LangChain and Weaviate with Harrison Chase and Bob van Luijt - Weaviate Podcast #36](https://youtu.be/lhby7Ql7hbk) by [Weaviate • Vector Database](https://www.youtube.com/@Weaviate)
- [LangChain Demo + Q&A with Harrison Chase](https://youtu.be/zaYTXQFR0_s?t=788) by [Full Stack Deep Learning](https://www.youtube.com/@FullStackDeepLearning)
- [LangChain Agents: Build Personal Assistants For Your Data (Q&A with Harrison Chase and Mayo Oshin)](https://youtu.be/gVkF8cwfBLI) by [Chat with data](https://www.youtube.com/@chatwithdata)
- ⛓️ [LangChain "Agents in Production" Webinar](https://youtu.be/k8GNCCs16F4) by [LangChain](https://www.youtube.com/@LangChain)
## Videos (sorted by views)
- [Building AI LLM Apps with LangChain (and more?) - LIVE STREAM](https://www.youtube.com/live/M-2Cj_2fzWI?feature=share) by [Nicholas Renotte](https://www.youtube.com/@NicholasRenotte)
- [First look - `ChatGPT` + `WolframAlpha` (`GPT-3.5` and Wolfram|Alpha via LangChain by James Weaver)](https://youtu.be/wYGbY811oMo) by [Dr Alan D. Thompson](https://www.youtube.com/@DrAlanDThompson)
- [LangChain explained - The hottest new Python framework](https://youtu.be/RoR4XJw8wIc) by [AssemblyAI](https://www.youtube.com/@AssemblyAI)
- [Chatbot with INFINITE MEMORY using `OpenAI` & `Pinecone` - `GPT-3`, `Embeddings`, `ADA`, `Vector DB`, `Semantic`](https://youtu.be/2xNzB7xq8nk) by [David Shapiro ~ AI](https://www.youtube.com/@DavidShapiroAutomator)
- [LangChain for LLMs is... basically just an Ansible playbook](https://youtu.be/X51N9C-OhlE) by [David Shapiro ~ AI](https://www.youtube.com/@DavidShapiroAutomator)
- [Build your own LLM Apps with LangChain & `GPT-Index`](https://youtu.be/-75p09zFUJY) by [1littlecoder](https://www.youtube.com/@1littlecoder)
- [`BabyAGI` - New System of Autonomous AI Agents with LangChain](https://youtu.be/lg3kJvf1kXo) by [1littlecoder](https://www.youtube.com/@1littlecoder)
- [Run `BabyAGI` with Langchain Agents (with Python Code)](https://youtu.be/WosPGHPObx8) by [1littlecoder](https://www.youtube.com/@1littlecoder)
- [How to Use Langchain With `Zapier` | Write and Send Email with GPT-3 | OpenAI API Tutorial](https://youtu.be/p9v2-xEa9A0) by [StarMorph AI](https://www.youtube.com/@starmorph)
- [Use Your Locally Stored Files To Get Response From GPT - `OpenAI` | Langchain | Python](https://youtu.be/NC1Ni9KS-rk) by [Shweta Lodha](https://www.youtube.com/@shweta-lodha)
- [`Langchain JS` | How to Use GPT-3, GPT-4 to Reference your own Data | `OpenAI Embeddings` Intro](https://youtu.be/veV2I-NEjaM) by [StarMorph AI](https://www.youtube.com/@starmorph)
- [The easiest way to work with large language models | Learn LangChain in 10min](https://youtu.be/kmbS6FDQh7c) by [Sophia Yang](https://www.youtube.com/@SophiaYangDS)
- [4 Autonomous AI Agents: “Westworld” simulation `BabyAGI`, `AutoGPT`, `Camel`, `LangChain`](https://youtu.be/yWbnH6inT_U) by [Sophia Yang](https://www.youtube.com/@SophiaYangDS)
- [AI CAN SEARCH THE INTERNET? Langchain Agents + OpenAI ChatGPT](https://youtu.be/J-GL0htqda8) by [tylerwhatsgood](https://www.youtube.com/@tylerwhatsgood)
- [Query Your Data with GPT-4 | Embeddings, Vector Databases | Langchain JS Knowledgebase](https://youtu.be/jRnUPUTkZmU) by [StarMorph AI](https://www.youtube.com/@starmorph)
- [`Weaviate` + LangChain for LLM apps presented by Erika Cardenas](https://youtu.be/7AGj4Td5Lgw) by [`Weaviate` • Vector Database](https://www.youtube.com/@Weaviate)
- [Langchain Overview — How to Use Langchain & `ChatGPT`](https://youtu.be/oYVYIq0lOtI) by [Python In Office](https://www.youtube.com/@pythoninoffice6568)
- [Langchain Overview - How to Use Langchain & `ChatGPT`](https://youtu.be/oYVYIq0lOtI) by [Python In Office](https://www.youtube.com/@pythoninoffice6568)
- [Custom langchain Agent & Tools with memory. Turn any `Python function` into langchain tool with Gpt 3](https://youtu.be/NIG8lXk0ULg) by [echohive](https://www.youtube.com/@echohive)
- [LangChain: Run Language Models Locally - `Hugging Face Models`](https://youtu.be/Xxxuw4_iCzw) by [Prompt Engineering](https://www.youtube.com/@engineerprompt)
- [`ChatGPT` with any `YouTube` video using langchain and `chromadb`](https://youtu.be/TQZfB2bzVwU) by [echohive](https://www.youtube.com/@echohive)
- [How to Talk to a `PDF` using LangChain and `ChatGPT`](https://youtu.be/v2i1YDtrIwk) by [Automata Learning Lab](https://www.youtube.com/@automatalearninglab)
- [Langchain Document Loaders Part 1: Unstructured Files](https://youtu.be/O5C0wfsen98) by [Merk](https://www.youtube.com/@merksworld)
- [LangChain - Prompt Templates (what all the best prompt engineers use)](https://youtu.be/1aRu8b0XNOQ) by [Nick Daigler](https://www.youtube.com/@nick_daigs)
- [LangChain. Crear aplicaciones Python impulsadas por GPT](https://youtu.be/DkW_rDndts8) by [Jesús Conde](https://www.youtube.com/@0utKast)
- [Easiest Way to Use GPT In Your Products | LangChain Basics Tutorial](https://youtu.be/fLy0VenZyGc) by [Rachel Woods](https://www.youtube.com/@therachelwoods)
- [`BabyAGI` + `GPT-4` Langchain Agent with Internet Access](https://youtu.be/wx1z_hs5P6E) by [tylerwhatsgood](https://www.youtube.com/@tylerwhatsgood)
- [Learning LLM Agents. How does it actually work? LangChain, AutoGPT & OpenAI](https://youtu.be/mb_YAABSplk) by [Arnoldas Kemeklis](https://www.youtube.com/@processusAI)
- [Get Started with LangChain in `Node.js`](https://youtu.be/Wxx1KUWJFv4) by [Developers Digest](https://www.youtube.com/@DevelopersDigest)
- [LangChain + `OpenAI` tutorial: Building a Q&A system w/ own text data](https://youtu.be/DYOU_Z0hAwo) by [Samuel Chan](https://www.youtube.com/@SamuelChan)
- [Langchain + `Zapier` Agent](https://youtu.be/yribLAb-pxA) by [Merk](https://www.youtube.com/@merksworld)
- [Connecting the Internet with `ChatGPT` (LLMs) using Langchain And Answers Your Questions](https://youtu.be/9Y0TBC63yZg) by [Kamalraj M M](https://www.youtube.com/@insightbuilder)
- [Build More Powerful LLM Applications for Businesss with LangChain (Beginners Guide)](https://youtu.be/sp3-WLKEcBg) by [No Code Blackbox](https://www.youtube.com/@nocodeblackbox)
- ⛓️ [LangFlow LLM Agent Demo for 🦜🔗LangChain](https://youtu.be/zJxDHaWt-6o) by [Cobus Greyling](https://www.youtube.com/@CobusGreylingZA)
- ⛓️ [Chatbot Factory: Streamline Python Chatbot Creation with LLMs and Langchain](https://youtu.be/eYer3uzrcuM) by [Finxter](https://www.youtube.com/@CobusGreylingZA)
- ⛓️ [LangChain Tutorial - ChatGPT mit eigenen Daten](https://youtu.be/0XDLyY90E2c) by [Coding Crashkurse](https://www.youtube.com/@codingcrashkurse6429)
- ⛓️ [Chat with a `CSV` | LangChain Agents Tutorial (Beginners)](https://youtu.be/tjeti5vXWOU) by [GoDataProf](https://www.youtube.com/@godataprof)
- ⛓️ [Introdução ao Langchain - #Cortes - Live DataHackers](https://youtu.be/fw8y5VRei5Y) by [Prof. João Gabriel Lima](https://www.youtube.com/@profjoaogabriellima)
- ⛓️ [LangChain: Level up `ChatGPT` !? | LangChain Tutorial Part 1](https://youtu.be/vxUGx8aZpDE) by [Code Affinity](https://www.youtube.com/@codeaffinitydev)
- ⛓️ [KI schreibt krasses Youtube Skript 😲😳 | LangChain Tutorial Deutsch](https://youtu.be/QpTiXyK1jus) by [SimpleKI](https://www.youtube.com/@simpleki)
- ⛓️ [Chat with Audio: Langchain, `Chroma DB`, OpenAI, and `Assembly AI`](https://youtu.be/Kjy7cx1r75g) by [AI Anytime](https://www.youtube.com/@AIAnytime)
- ⛓️ [QA over documents with Auto vector index selection with Langchain router chains](https://youtu.be/9G05qybShv8) by [echohive](https://www.youtube.com/@echohive)
- ⛓️ [Build your own custom LLM application with `Bubble.io` & Langchain (No Code & Beginner friendly)](https://youtu.be/O7NhQGu1m6c) by [No Code Blackbox](https://www.youtube.com/@nocodeblackbox)
- ⛓️ [Simple App to Question Your Docs: Leveraging `Streamlit`, `Hugging Face Spaces`, LangChain, and `Claude`!](https://youtu.be/X4YbNECRr7o) by [Chris Alexiuk](https://www.youtube.com/@chrisalexiuk)
- ⛓️ [LANGCHAIN AI- `ConstitutionalChainAI` + Databutton AI ASSISTANT Web App](https://youtu.be/5zIU6_rdJCU) by [Avra](https://www.youtube.com/@Avra_b)
- ⛓️ [LANGCHAIN AI AUTONOMOUS AGENT WEB APP - 👶 `BABY AGI` 🤖 with EMAIL AUTOMATION using `DATABUTTON`](https://youtu.be/cvAwOGfeHgw) by [Avra](https://www.youtube.com/@Avra_b)
- ⛓️ [The Future of Data Analysis: Using A.I. Models in Data Analysis (LangChain)](https://youtu.be/v_LIcVyg5dk) by [Absent Data](https://www.youtube.com/@absentdata)
- ⛓️ [Memory in LangChain | Deep dive (python)](https://youtu.be/70lqvTFh_Yg) by [Eden Marco](https://www.youtube.com/@EdenMarco)
- ⛓️ [9 LangChain UseCases | Beginner's Guide | 2023](https://youtu.be/zS8_qosHNMw) by [Data Science Basics](https://www.youtube.com/@datasciencebasics)
- ⛓️ [Use Large Language Models in Jupyter Notebook | LangChain | Agents & Indexes](https://youtu.be/JSe11L1a_QQ) by [Abhinaw Tiwari](https://www.youtube.com/@AbhinawTiwariAT)
- ⛓️ [How to Talk to Your Langchain Agent | `11 Labs` + `Whisper`](https://youtu.be/N4k459Zw2PU) by [VRSEN](https://www.youtube.com/@vrsen)
- ⛓️ [LangChain Deep Dive: 5 FUN AI App Ideas To Build Quickly and Easily](https://youtu.be/mPYEPzLkeks) by [James NoCode](https://www.youtube.com/@jamesnocode)
- ⛓️ [BEST OPEN Alternative to OPENAI's EMBEDDINGs for Retrieval QA: LangChain](https://youtu.be/ogEalPMUCSY) by [Prompt Engineering](https://www.youtube.com/@engineerprompt)
- ⛓️ [LangChain 101: Models](https://youtu.be/T6c_XsyaNSQ) by [Mckay Wrigley](https://www.youtube.com/@realmckaywrigley)
- ⛓️ [LangChain with JavaScript Tutorial #1 | Setup & Using LLMs](https://youtu.be/W3AoeMrg27o) by [Leon van Zyl](https://www.youtube.com/@leonvanzyl)
- ⛓️ [LangChain Overview & Tutorial for Beginners: Build Powerful AI Apps Quickly & Easily (ZERO CODE)](https://youtu.be/iI84yym473Q) by [James NoCode](https://www.youtube.com/@jamesnocode)
- ⛓️ [LangChain In Action: Real-World Use Case With Step-by-Step Tutorial](https://youtu.be/UO699Szp82M) by [Rabbitmetrics](https://www.youtube.com/@rabbitmetrics)
- ⛓️ [Summarizing and Querying Multiple Papers with LangChain](https://youtu.be/p_MQRWH5Y6k) by [Automata Learning Lab](https://www.youtube.com/@automatalearninglab)
- ⛓️ [Using Langchain (and `Replit`) through `Tana`, ask `Google`/`Wikipedia`/`Wolfram Alpha` to fill out a table](https://youtu.be/Webau9lEzoI) by [Stian Håklev](https://www.youtube.com/@StianHaklev)
- ⛓️ [Langchain PDF App (GUI) | Create a ChatGPT For Your `PDF` in Python](https://youtu.be/wUAUdEw5oxM) by [Alejandro AO - Software & Ai](https://www.youtube.com/@alejandro_ao)
- ⛓️ [Auto-GPT with LangChain 🔥 | Create Your Own Personal AI Assistant](https://youtu.be/imDfPmMKEjM) by [Data Science Basics](https://www.youtube.com/@datasciencebasics)
- ⛓️ [Create Your OWN Slack AI Assistant with Python & LangChain](https://youtu.be/3jFXRNn2Bu8) by [Dave Ebbelaar](https://www.youtube.com/@daveebbelaar)
- ⛓️ [How to Create LOCAL Chatbots with GPT4All and LangChain [Full Guide]](https://youtu.be/4p1Fojur8Zw) by [Liam Ottley](https://www.youtube.com/@LiamOttley)
- ⛓️ [Build a `Multilingual PDF` Search App with LangChain, `Cohere` and `Bubble`](https://youtu.be/hOrtuumOrv8) by [Menlo Park Lab](https://www.youtube.com/@menloparklab)
- ⛓️ [Building a LangChain Agent (code-free!) Using `Bubble` and `Flowise`](https://youtu.be/jDJIIVWTZDE) by [Menlo Park Lab](https://www.youtube.com/@menloparklab)
- ⛓️ [Build a LangChain-based Semantic PDF Search App with No-Code Tools Bubble and Flowise](https://youtu.be/s33v5cIeqA4) by [Menlo Park Lab](https://www.youtube.com/@menloparklab)
- ⛓️ [LangChain Memory Tutorial | Building a ChatGPT Clone in Python](https://youtu.be/Cwq91cj2Pnc) by [Alejandro AO - Software & Ai](https://www.youtube.com/@alejandro_ao)
- ⛓️ [ChatGPT For Your DATA | Chat with Multiple Documents Using LangChain](https://youtu.be/TeDgIDqQmzs) by [Data Science Basics](https://www.youtube.com/@datasciencebasics)
- ⛓️ [`Llama Index`: Chat with Documentation using URL Loader](https://youtu.be/XJRoDEctAwA) by [Merk](https://www.youtube.com/@merksworld)
- ⛓️ [Using OpenAI, LangChain, and `Gradio` to Build Custom GenAI Applications](https://youtu.be/1MsmqMg3yUc) by [David Hundley](https://www.youtube.com/@dkhundley)
---------------------
⛓ icon marks a new video [last update 2023-05-15]

File diff suppressed because it is too large


@@ -11,13 +11,14 @@
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))
import os
import sys
import toml
with open("../pyproject.toml") as f:
sys.path.insert(0, os.path.abspath("."))
with open("../../pyproject.toml") as f:
data = toml.load(f)
# -- Project information -----------------------------------------------------
@@ -45,26 +46,34 @@ extensions = [
"sphinx.ext.napoleon",
"sphinx.ext.viewcode",
"sphinxcontrib.autodoc_pydantic",
"myst_nb",
"sphinx_copybutton",
"sphinx_panels",
"IPython.sphinxext.ipython_console_highlighting",
]
source_suffix = [".ipynb", ".html", ".md", ".rst"]
source_suffix = [".rst"]
autodoc_pydantic_model_show_json = False
autodoc_pydantic_field_list_validators = False
autodoc_pydantic_config_members = False
autodoc_pydantic_model_show_config_summary = False
autodoc_pydantic_model_show_validator_members = False
autodoc_pydantic_model_show_field_summary = False
autodoc_pydantic_model_members = False
autodoc_pydantic_model_undoc_members = False
# autodoc_typehints = "signature"
# autodoc_typehints = "description"
autodoc_pydantic_model_show_validator_summary = False
autodoc_pydantic_model_signature_prefix = "class"
autodoc_pydantic_field_signature_prefix = "param"
autodoc_member_order = "groupwise"
autoclass_content = "both"
autodoc_typehints_format = "short"
autodoc_default_options = {
"members": True,
"show-inheritance": True,
"inherited-members": "BaseModel",
"undoc-members": True,
"special-members": "__call__",
}
# autodoc_typehints = "description"
# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]
templates_path = ["templates"]
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
@@ -77,20 +86,24 @@ exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = "sphinx_book_theme"
html_theme = "scikit-learn-modern"
html_theme_path = ["themes"]
html_theme_options = {
"path_to_docs": "docs",
"repository_url": "https://github.com/hwchase17/langchain",
"use_repository_button": True,
# redirects dictionary maps from old links to new links
html_additional_pages = {}
redirects = {
"index": "api_reference",
}
for old_link in redirects:
html_additional_pages[old_link] = "redirects.html"
html_context = {
"display_github": True, # Integrate GitHub
"github_user": "hwchase17", # Username
"github_repo": "langchain", # Repo name
"github_version": "master", # Version
"conf_py_path": "/docs/", # Path in the checkout to the docs root
"conf_py_path": "/docs/api_reference", # Path in the checkout to the docs root
"redirects": redirects,
}
# Add any paths that contain custom static files (such as style sheets) here,
@@ -103,10 +116,9 @@ html_static_path = ["_static"]
html_css_files = [
"css/custom.css",
]
html_use_index = False
html_js_files = [
"js/mendablesearch.js",
]
nb_execution_mode = "off"
myst_enable_extensions = ["colon_fence"]
# generate autosummary even if no references
autosummary_generate = True


@@ -0,0 +1,94 @@
"""Script for auto-generating api_reference.rst"""
import glob
import re
from pathlib import Path

ROOT_DIR = Path(__file__).parents[2].absolute()
PKG_DIR = ROOT_DIR / "langchain"
WRITE_FILE = Path(__file__).parent / "api_reference.rst"


def load_members() -> dict:
    # Maps each top-level subpackage to its public classes and functions.
    members: dict = {}
    for py in glob.glob(str(PKG_DIR) + "/**/*.py", recursive=True):
        module = py[len(str(PKG_DIR)) + 1 :].replace(".py", "").replace("/", ".")
        top_level = module.split(".")[0]
        if top_level not in members:
            members[top_level] = {"classes": [], "functions": []}
        with open(py, "r") as f:
            for line in f.readlines():
                # Only unindented definitions whose names do not start with "_".
                cls = re.findall(r"^class ([^_].*)\(", line)
                members[top_level]["classes"].extend([module + "." + c for c in cls])
                func = re.findall(r"^def ([^_].*)\(", line)
                members[top_level]["functions"].extend([module + "." + f for f in func])
    return members


def construct_doc(members: dict) -> str:
    full_doc = """\
.. _api_reference:

=============
API Reference
=============

"""
    for module, _members in sorted(members.items(), key=lambda kv: kv[0]):
        classes = _members["classes"]
        functions = _members["functions"]
        if not (classes or functions):
            continue

        module_title = module.replace("_", " ").title()
        if module_title == "Llms":
            module_title = "LLMs"
        section = f":mod:`langchain.{module}`: {module_title}"
        full_doc += f"""\
{section}
{'=' * (len(section) + 1)}

.. automodule:: langchain.{module}
    :no-members:
    :no-inherited-members:

"""

        if classes:
            cstring = "\n    ".join(sorted(classes))
            full_doc += f"""\
Classes
--------------
.. currentmodule:: langchain

.. autosummary::
    :toctree: {module}
    :template: class.rst

    {cstring}

"""
        if functions:
            fstring = "\n    ".join(sorted(functions))
            full_doc += f"""\
Functions
--------------
.. currentmodule:: langchain

.. autosummary::
    :toctree: {module}

    {fstring}

"""
    return full_doc


def main() -> None:
    members = load_members()
    full_doc = construct_doc(members)
    with open(WRITE_FILE, "w") as f:
        f.write(full_doc)


if __name__ == "__main__":
    main()
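
The member discovery in `load_members()` relies on two line-level regular expressions rather than on importing the package. The sketch below applies the same two patterns to an in-memory string so their behavior can be checked in isolation. The helper `public_names`, the module name `chains.example`, and the sample source are hypothetical and used only for illustration; they are not part of the script.

```python
import re

# Same patterns as in load_members(): an unindented `class`/`def` whose
# name does not start with an underscore and is followed by "(".
CLASS_RE = r"^class ([^_].*)\("
FUNC_RE = r"^def ([^_].*)\("


def public_names(source: str, module: str) -> dict:
    """Collect dotted names of public top-level classes and functions."""
    found = {"classes": [], "functions": []}
    for line in source.splitlines():
        found["classes"] += [f"{module}.{c}" for c in re.findall(CLASS_RE, line)]
        found["functions"] += [f"{module}.{f}" for f in re.findall(FUNC_RE, line)]
    return found


# Hypothetical module contents, for illustration only.
sample = '''\
class PublicChain(Chain):
    def method(self):
        pass

class _PrivateHelper(object):
    pass

def load_thing(path):
    pass

def _internal():
    pass
'''

result = public_names(sample, "chains.example")
print(result)
```

Indented methods and underscore-prefixed names are skipped, so only `PublicChain` and `load_thing` are collected. Because both patterns require an opening parenthesis, a class declared without a base list (for example `class Foo:`) would not be picked up by this approach.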


@@ -0,0 +1,8 @@
=============
LangChain API
=============
.. toctree::
:maxdepth: 2
api_reference.rst


@@ -0,0 +1,27 @@
Copyright (c) 2007-2023 The scikit-learn developers.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


@@ -0,0 +1,28 @@
:mod:`{{module}}`.{{objname}}
{{ underline }}==============
.. currentmodule:: {{ module }}
.. autoclass:: {{ objname }}
{% block methods %}
{% if methods %}
.. rubric:: {{ _('Methods') }}
.. autosummary::
{% for item in methods %}
~{{ name }}.{{ item }}
{%- endfor %}
{% endif %}
{% endblock %}
{% block attributes %}
{% if attributes %}
.. rubric:: {{ _('Attributes') }}
.. autosummary::
{% for item in attributes %}
~{{ name }}.{{ item }}
{%- endfor %}
{% endif %}
{% endblock %}


@@ -0,0 +1,15 @@
{% set redirect = pathto(redirects[pagename]) %}
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="Refresh" content="0; url={{ redirect }}" />
<meta name="Description" content="scikit-learn: machine learning in Python">
<link rel="canonical" href="{{ redirect }}" />
<title>scikit-learn: machine learning in Python</title>
</head>
<body>
<p>You will be automatically redirected to the <a href="{{ redirect }}">new location of this page</a>.</p>
</body>
</html>
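
The redirect pages are wired up in two places: `conf.py` registers each old page name in `html_additional_pages` with the `redirects.html` template, and the template looks up its target through `redirects[pagename]`. The following is a minimal sketch of that data flow outside Sphinx. The helper `redirect_target` is hypothetical and stands in for the template lookup only; the real template also passes the result through Sphinx's `pathto`.

```python
# Mapping from old page names to new ones, as in the conf.py hunk.
redirects = {"index": "api_reference"}

# Each old page is rendered from the redirects.html template.
html_additional_pages = {}
for old_link in redirects:
    html_additional_pages[old_link] = "redirects.html"

# The mapping is exposed to templates through html_context.
html_context = {"redirects": redirects}


def redirect_target(pagename: str) -> str:
    """Hypothetical stand-in for the template lookup `redirects[pagename]`."""
    return html_context["redirects"][pagename]


print(html_additional_pages)
print(redirect_target("index"))
```

Adding another entry to `redirects` is then enough to generate one more redirect page, since the loop and the template both read from the same dictionary.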


@@ -0,0 +1,27 @@
Copyright (c) 2007-2023 The scikit-learn developers.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


@@ -0,0 +1,67 @@
<script>
$(document).ready(function() {
/* Add a [>>>] button on the top-right corner of code samples to hide
* the >>> and ... prompts and the output and thus make the code
* copyable. */
var div = $('.highlight-python .highlight,' +
'.highlight-python3 .highlight,' +
'.highlight-pycon .highlight,' +
'.highlight-default .highlight')
var pre = div.find('pre');
// get the styles from the current theme
pre.parent().parent().css('position', 'relative');
var hide_text = 'Hide prompts and outputs';
var show_text = 'Show prompts and outputs';
// create and add the button to all the code blocks that contain >>>
div.each(function(index) {
var jthis = $(this);
if (jthis.find('.gp').length > 0) {
var button = $('<span class="copybutton">&gt;&gt;&gt;</span>');
button.attr('title', hide_text);
button.data('hidden', 'false');
jthis.prepend(button);
}
// tracebacks (.gt) contain bare text elements that need to be
// wrapped in a span to work with .nextUntil() (see later)
jthis.find('pre:has(.gt)').contents().filter(function() {
return ((this.nodeType == 3) && (this.data.trim().length > 0));
}).wrap('<span>');
});
// define the behavior of the button when it's clicked
$('.copybutton').click(function(e){
e.preventDefault();
var button = $(this);
if (button.data('hidden') === 'false') {
// hide the code output
button.parent().find('.go, .gp, .gt').hide();
button.next('pre').find('.gt').nextUntil('.gp, .go').css('visibility', 'hidden');
button.css('text-decoration', 'line-through');
button.attr('title', show_text);
button.data('hidden', 'true');
} else {
// show the code output
button.parent().find('.go, .gp, .gt').show();
button.next('pre').find('.gt').nextUntil('.gp, .go').css('visibility', 'visible');
button.css('text-decoration', 'none');
button.attr('title', hide_text);
button.data('hidden', 'false');
}
});
/*** Add permalink buttons next to glossary terms ***/
$('dl.glossary > dt[id]').append(function() {
return ('<a class="headerlink" href="#' +
this.getAttribute('id') +
'" title="Permalink to this term">¶</a>');
});
});
</script>
{%- if pagename != 'index' and pagename != 'documentation' %}
{% if theme_mathjax_path %}
<script id="MathJax-script" async src="{{ theme_mathjax_path }}"></script>
{% endif %}
{%- endif %}

View File

@@ -0,0 +1,142 @@
{# TEMPLATE VAR SETTINGS #}
{%- set url_root = pathto('', 1) %}
{%- if url_root == '#' %}{% set url_root = '' %}{% endif %}
{%- if not embedded and docstitle %}
{%- set titlesuffix = " &mdash; "|safe + docstitle|e %}
{%- else %}
{%- set titlesuffix = "" %}
{%- endif %}
{%- set lang_attr = 'en' %}
<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="{{ lang_attr }}" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="{{ lang_attr }}" > <!--<![endif]-->
<head>
<meta charset="utf-8">
{{ metatags }}
<meta name="viewport" content="width=device-width, initial-scale=1.0">
{% block htmltitle %}
<title>{{ title|striptags|e }}{{ titlesuffix }}</title>
{% endblock %}
<link rel="canonical" href="http://scikit-learn.org/stable/{{pagename}}.html" />
{% if favicon_url %}
<link rel="shortcut icon" href="{{ favicon_url|e }}"/>
{% endif %}
<link rel="stylesheet" href="{{ pathto('_static/css/vendor/bootstrap.min.css', 1) }}" type="text/css" />
{%- for css in css_files %}
{%- if css|attr("rel") %}
<link rel="{{ css.rel }}" href="{{ pathto(css.filename, 1) }}" type="text/css"{% if css.title is not none %} title="{{ css.title }}"{% endif %} />
{%- else %}
<link rel="stylesheet" href="{{ pathto(css, 1) }}" type="text/css" />
{%- endif %}
{%- endfor %}
<link rel="stylesheet" href="{{ pathto('_static/' + style, 1) }}" type="text/css" />
<script id="documentation_options" data-url_root="{{ pathto('', 1) }}" src="{{ pathto('_static/documentation_options.js', 1) }}"></script>
<script src="{{ pathto('_static/jquery.js', 1) }}"></script>
{%- block extrahead %} {% endblock %}
</head>
<body>
{% include "nav.html" %}
{%- block content %}
<div class="d-flex" id="sk-doc-wrapper">
<input type="checkbox" name="sk-toggle-checkbox" id="sk-toggle-checkbox">
<label id="sk-sidemenu-toggle" class="sk-btn-toggle-toc btn sk-btn-primary" for="sk-toggle-checkbox">Toggle Menu</label>
<div id="sk-sidebar-wrapper" class="border-right">
<div class="sk-sidebar-toc-wrapper">
<div class="btn-group w-100 mb-2" role="group" aria-label="rellinks">
{%- if prev %}
<a href="{{ prev.link|e }}" role="button" class="btn sk-btn-rellink py-1" sk-rellink-tooltip="{{ prev.title|striptags }}">Prev</a>
{%- else %}
<a href="#" role="button" class="btn sk-btn-rellink py-1 disabled">Prev</a>
{%- endif %}
{%- if parents -%}
<a href="{{ parents[-1].link|e }}" role="button" class="btn sk-btn-rellink py-1" sk-rellink-tooltip="{{ parents[-1].title|striptags }}">Up</a>
{%- else %}
<a href="#" role="button" class="btn sk-btn-rellink disabled py-1">Up</a>
{%- endif %}
{%- if next %}
<a href="{{ next.link|e }}" role="button" class="btn sk-btn-rellink py-1" sk-rellink-tooltip="{{ next.title|striptags }}">Next</a>
{%- else %}
<a href="#" role="button" class="btn sk-btn-rellink py-1 disabled">Next</a>
{%- endif %}
</div>
{%- if pagename != "install" %}
<div class="alert alert-warning p-1 mb-2" role="alert">
<p class="text-center mb-0">
<strong>LangChain {{ release }}</strong><br/>
</p>
</div>
{%- endif %}
{%- if meta and meta['parenttoc']|tobool %}
<div class="sk-sidebar-toc">
{% set nav = get_nav_object(maxdepth=3, collapse=True, numbered=True) %}
<ul>
{% for main_nav_item in nav %}
{% if main_nav_item.active %}
<li>
<a href="{{ main_nav_item.url }}" class="sk-toc-active">{{ main_nav_item.title }}</a>
</li>
<ul>
{% for nav_item in main_nav_item.children %}
<li>
<a href="{{ nav_item.url }}" class="{% if nav_item.active %}sk-toc-active{% endif %}">{{ nav_item.title }}</a>
{% if nav_item.children %}
<ul>
{% for inner_child in nav_item.children %}
<li class="sk-toctree-l3">
<a href="{{ inner_child.url }}">{{ inner_child.title }}</a>
</li>
{% endfor %}
</ul>
{% endif %}
</li>
{% endfor %}
</ul>
{% endif %}
{% endfor %}
</ul>
</div>
{%- elif meta and meta['globalsidebartoc']|tobool %}
<div class="sk-sidebar-toc sk-sidebar-global-toc">
{{ toctree(maxdepth=2, titles_only=True) }}
</div>
{%- else %}
<div class="sk-sidebar-toc">
{{ toc }}
</div>
{%- endif %}
</div>
</div>
<div id="sk-page-content-wrapper">
<div class="sk-page-content container-fluid body px-md-3" role="main">
{% block body %}{% endblock %}
</div>
<div class="container">
<footer class="sk-content-footer">
{%- if pagename != 'index' %}
{%- if show_copyright %}
{%- if hasdoc('copyright') %}
{% trans path=pathto('copyright'), copyright=copyright|e %}&copy; {{ copyright }}.{% endtrans %}
{%- else %}
{% trans copyright=copyright|e %}&copy; {{ copyright }}.{% endtrans %}
{%- endif %}
{%- endif %}
{%- if last_updated %}
{% trans last_updated=last_updated|e %}Last updated on {{ last_updated }}.{% endtrans %}
{%- endif %}
{%- if show_source and has_source and sourcename %}
<a href="{{ pathto('_sources/' + sourcename, true)|e }}" rel="nofollow">{{ _('Show this page source') }}</a>
{%- endif %}
{%- endif %}
</footer>
</div>
</div>
</div>
{%- endblock %}
<script src="{{ pathto('_static/js/vendor/bootstrap.min.js', 1) }}"></script>
{% include "javascript.html" %}
</body>
</html>

View File

@@ -0,0 +1,85 @@
{%- if pagename != 'index' and pagename != 'documentation' %}
{%- set nav_bar_class = "sk-docs-navbar" %}
{%- set top_container_cls = "sk-docs-container" %}
{%- else %}
{%- set nav_bar_class = "sk-landing-navbar" %}
{%- set top_container_cls = "sk-landing-container" %}
{%- endif %}
{% if theme_link_to_live_contributing_page|tobool %}
{# Link to development page for live builds #}
{%- set development_link = "https://scikit-learn.org/dev/developers/index.html" %}
{# Open on a new development page in new window/tab for live builds #}
{%- set development_attrs = 'target="_blank" rel="noopener noreferrer"' %}
{%- else %}
{%- set development_link = pathto('developers/index') %}
{%- set development_attrs = '' %}
{%- endif %}
{# title, link, link_attrs #}
{%- set drop_down_navigation = [
('Getting Started', pathto('getting_started'), ''),
('Tutorial', pathto('tutorial/index'), ''),
("What's new", pathto('whats_new/v' + version), ''),
('Glossary', pathto('glossary'), ''),
('Development', development_link, development_attrs),
('FAQ', pathto('faq'), ''),
('Support', pathto('support'), ''),
('Related packages', pathto('related_projects'), ''),
('Roadmap', pathto('roadmap'), ''),
('Governance', pathto('governance'), ''),
('About us', pathto('about'), ''),
('GitHub', 'https://github.com/scikit-learn/scikit-learn', ''),
('Other Versions and Download', 'https://scikit-learn.org/dev/versions.html', '')]
-%}
<nav id="navbar" class="{{ nav_bar_class }} navbar navbar-expand-md navbar-light bg-light py-0">
<div class="container-fluid {{ top_container_cls }} px-0">
{%- if logo_url %}
<a class="navbar-brand py-0" href="{{ pathto('index') }}">
<img
class="sk-brand-img"
src="{{ logo_url|e }}"
alt="logo"/>
</a>
{%- endif %}
<button
id="sk-navbar-toggler"
class="navbar-toggler"
type="button"
data-toggle="collapse"
data-target="#navbarSupportedContent"
aria-controls="navbarSupportedContent"
aria-expanded="false"
aria-label="Toggle navigation"
>
<span class="navbar-toggler-icon"></span>
</button>
<div class="sk-navbar-collapse collapse navbar-collapse" id="navbarSupportedContent">
<ul class="navbar-nav mr-auto">
<li class="nav-item">
<a class="sk-nav-link nav-link" href="{{ pathto('api_reference') }}">API</a>
</li>
<li class="nav-item">
<a class="sk-nav-link nav-link" target="_blank" rel="noopener noreferrer" href="https://python.langchain.com/">Python Docs</a>
</li>
{%- for title, link, link_attrs in drop_down_navigation %}
<li class="nav-item">
<a class="sk-nav-link nav-link nav-more-item-mobile-items" href="{{ link }}" {{ link_attrs }}>{{ title }}</a>
</li>
{%- endfor %}
</ul>
{%- if pagename != "search"%}
<div id="searchbox" role="search">
<div class="searchformwrapper">
<form class="search" action="{{ pathto('search') }}" method="get">
<input class="sk-search-text-input" type="text" name="q" aria-labelledby="searchlabel" />
<input class="sk-search-text-btn" type="submit" value="{{ _('Go') }}" />
</form>
</div>
</div>
{%- endif %}
</div>
</div>
</nav>

View File

@@ -0,0 +1,16 @@
{%- extends "basic/search.html" %}
{% block extrahead %}
<script type="text/javascript" src="{{ pathto('_static/underscore.js', 1) }}"></script>
<script type="text/javascript" src="{{ pathto('searchindex.js', 1) }}" defer></script>
<script type="text/javascript" src="{{ pathto('_static/doctools.js', 1) }}"></script>
<script type="text/javascript" src="{{ pathto('_static/language_data.js', 1) }}"></script>
<script type="text/javascript" src="{{ pathto('_static/searchtools.js', 1) }}"></script>
<!-- <script type="text/javascript" src="{{ pathto('_static/sphinx_highlight.js', 1) }}"></script> -->
<script type="text/javascript">
$(document).ready(function() {
if (!Search.out) {
Search.init();
}
});
</script>
{% endblock %}

View File

@@ -0,0 +1,8 @@
[theme]
inherit = basic
pygments_style = default
stylesheet = css/theme.css
[options]
link_to_live_contributing_page = false
mathjax_path =

View File

@@ -1,231 +0,0 @@
# Dependents
Dependents stats for `hwchase17/langchain`
[![](https://img.shields.io/static/v1?label=Used%20by&message=7484&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)
[![](https://img.shields.io/static/v1?label=Used%20by%20(public)&message=212&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)
[![](https://img.shields.io/static/v1?label=Used%20by%20(private)&message=7272&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)
[![](https://img.shields.io/static/v1?label=Used%20by%20(stars)&message=19095&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)
[update: 2023-06-05; only dependent repositories with Stars > 100]
| Repository | Stars |
| :-------- | -----: |
|[openai/openai-cookbook](https://github.com/openai/openai-cookbook) | 38024 |
|[LAION-AI/Open-Assistant](https://github.com/LAION-AI/Open-Assistant) | 33609 |
|[microsoft/TaskMatrix](https://github.com/microsoft/TaskMatrix) | 33136 |
|[hpcaitech/ColossalAI](https://github.com/hpcaitech/ColossalAI) | 30032 |
|[imartinez/privateGPT](https://github.com/imartinez/privateGPT) | 28094 |
|[reworkd/AgentGPT](https://github.com/reworkd/AgentGPT) | 23430 |
|[openai/chatgpt-retrieval-plugin](https://github.com/openai/chatgpt-retrieval-plugin) | 17942 |
|[jerryjliu/llama_index](https://github.com/jerryjliu/llama_index) | 16697 |
|[mindsdb/mindsdb](https://github.com/mindsdb/mindsdb) | 16410 |
|[mlflow/mlflow](https://github.com/mlflow/mlflow) | 14517 |
|[GaiZhenbiao/ChuanhuChatGPT](https://github.com/GaiZhenbiao/ChuanhuChatGPT) | 10793 |
|[databrickslabs/dolly](https://github.com/databrickslabs/dolly) | 10155 |
|[openai/evals](https://github.com/openai/evals) | 10076 |
|[AIGC-Audio/AudioGPT](https://github.com/AIGC-Audio/AudioGPT) | 8619 |
|[logspace-ai/langflow](https://github.com/logspace-ai/langflow) | 8211 |
|[imClumsyPanda/langchain-ChatGLM](https://github.com/imClumsyPanda/langchain-ChatGLM) | 8154 |
|[PromtEngineer/localGPT](https://github.com/PromtEngineer/localGPT) | 6853 |
|[StanGirard/quivr](https://github.com/StanGirard/quivr) | 6830 |
|[PipedreamHQ/pipedream](https://github.com/PipedreamHQ/pipedream) | 6520 |
|[go-skynet/LocalAI](https://github.com/go-skynet/LocalAI) | 6018 |
|[arc53/DocsGPT](https://github.com/arc53/DocsGPT) | 5643 |
|[e2b-dev/e2b](https://github.com/e2b-dev/e2b) | 5075 |
|[langgenius/dify](https://github.com/langgenius/dify) | 4281 |
|[nsarrazin/serge](https://github.com/nsarrazin/serge) | 4228 |
|[zauberzeug/nicegui](https://github.com/zauberzeug/nicegui) | 4084 |
|[madawei2699/myGPTReader](https://github.com/madawei2699/myGPTReader) | 4039 |
|[wenda-LLM/wenda](https://github.com/wenda-LLM/wenda) | 3871 |
|[GreyDGL/PentestGPT](https://github.com/GreyDGL/PentestGPT) | 3837 |
|[zilliztech/GPTCache](https://github.com/zilliztech/GPTCache) | 3625 |
|[csunny/DB-GPT](https://github.com/csunny/DB-GPT) | 3545 |
|[gkamradt/langchain-tutorials](https://github.com/gkamradt/langchain-tutorials) | 3404 |
|[mmabrouk/chatgpt-wrapper](https://github.com/mmabrouk/chatgpt-wrapper) | 3303 |
|[postgresml/postgresml](https://github.com/postgresml/postgresml) | 3052 |
|[marqo-ai/marqo](https://github.com/marqo-ai/marqo) | 3014 |
|[MineDojo/Voyager](https://github.com/MineDojo/Voyager) | 2945 |
|[PrefectHQ/marvin](https://github.com/PrefectHQ/marvin) | 2761 |
|[project-baize/baize-chatbot](https://github.com/project-baize/baize-chatbot) | 2673 |
|[hwchase17/chat-langchain](https://github.com/hwchase17/chat-langchain) | 2589 |
|[whitead/paper-qa](https://github.com/whitead/paper-qa) | 2572 |
|[Azure-Samples/azure-search-openai-demo](https://github.com/Azure-Samples/azure-search-openai-demo) | 2366 |
|[GerevAI/gerev](https://github.com/GerevAI/gerev) | 2330 |
|[OpenGVLab/InternGPT](https://github.com/OpenGVLab/InternGPT) | 2289 |
|[ParisNeo/gpt4all-ui](https://github.com/ParisNeo/gpt4all-ui) | 2159 |
|[OpenBMB/BMTools](https://github.com/OpenBMB/BMTools) | 2158 |
|[guangzhengli/ChatFiles](https://github.com/guangzhengli/ChatFiles) | 2005 |
|[h2oai/h2ogpt](https://github.com/h2oai/h2ogpt) | 1939 |
|[Farama-Foundation/PettingZoo](https://github.com/Farama-Foundation/PettingZoo) | 1845 |
|[OpenGVLab/Ask-Anything](https://github.com/OpenGVLab/Ask-Anything) | 1749 |
|[IntelligenzaArtificiale/Free-Auto-GPT](https://github.com/IntelligenzaArtificiale/Free-Auto-GPT) | 1740 |
|[Unstructured-IO/unstructured](https://github.com/Unstructured-IO/unstructured) | 1628 |
|[hwchase17/notion-qa](https://github.com/hwchase17/notion-qa) | 1607 |
|[NVIDIA/NeMo-Guardrails](https://github.com/NVIDIA/NeMo-Guardrails) | 1544 |
|[SamurAIGPT/privateGPT](https://github.com/SamurAIGPT/privateGPT) | 1543 |
|[paulpierre/RasaGPT](https://github.com/paulpierre/RasaGPT) | 1526 |
|[yanqiangmiffy/Chinese-LangChain](https://github.com/yanqiangmiffy/Chinese-LangChain) | 1485 |
|[Kav-K/GPTDiscord](https://github.com/Kav-K/GPTDiscord) | 1402 |
|[vocodedev/vocode-python](https://github.com/vocodedev/vocode-python) | 1387 |
|[Chainlit/chainlit](https://github.com/Chainlit/chainlit) | 1336 |
|[lunasec-io/lunasec](https://github.com/lunasec-io/lunasec) | 1323 |
|[psychic-api/psychic](https://github.com/psychic-api/psychic) | 1248 |
|[agiresearch/OpenAGI](https://github.com/agiresearch/OpenAGI) | 1208 |
|[jina-ai/thinkgpt](https://github.com/jina-ai/thinkgpt) | 1193 |
|[thomas-yanxin/LangChain-ChatGLM-Webui](https://github.com/thomas-yanxin/LangChain-ChatGLM-Webui) | 1182 |
|[ttengwang/Caption-Anything](https://github.com/ttengwang/Caption-Anything) | 1137 |
|[jina-ai/dev-gpt](https://github.com/jina-ai/dev-gpt) | 1135 |
|[greshake/llm-security](https://github.com/greshake/llm-security) | 1086 |
|[keephq/keep](https://github.com/keephq/keep) | 1063 |
|[juncongmoo/chatllama](https://github.com/juncongmoo/chatllama) | 1037 |
|[richardyc/Chrome-GPT](https://github.com/richardyc/Chrome-GPT) | 1035 |
|[visual-openllm/visual-openllm](https://github.com/visual-openllm/visual-openllm) | 997 |
|[mmz-001/knowledge_gpt](https://github.com/mmz-001/knowledge_gpt) | 995 |
|[jina-ai/langchain-serve](https://github.com/jina-ai/langchain-serve) | 949 |
|[irgolic/AutoPR](https://github.com/irgolic/AutoPR) | 936 |
|[microsoft/X-Decoder](https://github.com/microsoft/X-Decoder) | 908 |
|[poe-platform/api-bot-tutorial](https://github.com/poe-platform/api-bot-tutorial) | 902 |
|[peterw/Chat-with-Github-Repo](https://github.com/peterw/Chat-with-Github-Repo) | 875 |
|[cirediatpl/FigmaChain](https://github.com/cirediatpl/FigmaChain) | 822 |
|[homanp/superagent](https://github.com/homanp/superagent) | 806 |
|[seanpixel/Teenage-AGI](https://github.com/seanpixel/Teenage-AGI) | 800 |
|[chatarena/chatarena](https://github.com/chatarena/chatarena) | 796 |
|[hashintel/hash](https://github.com/hashintel/hash) | 795 |
|[SamurAIGPT/Camel-AutoGPT](https://github.com/SamurAIGPT/Camel-AutoGPT) | 786 |
|[rlancemartin/auto-evaluator](https://github.com/rlancemartin/auto-evaluator) | 770 |
|[corca-ai/EVAL](https://github.com/corca-ai/EVAL) | 769 |
|[101dotxyz/GPTeam](https://github.com/101dotxyz/GPTeam) | 755 |
|[noahshinn024/reflexion](https://github.com/noahshinn024/reflexion) | 706 |
|[eyurtsev/kor](https://github.com/eyurtsev/kor) | 695 |
|[cheshire-cat-ai/core](https://github.com/cheshire-cat-ai/core) | 681 |
|[e-johnstonn/BriefGPT](https://github.com/e-johnstonn/BriefGPT) | 656 |
|[run-llama/llama-lab](https://github.com/run-llama/llama-lab) | 635 |
|[griptape-ai/griptape](https://github.com/griptape-ai/griptape) | 583 |
|[namuan/dr-doc-search](https://github.com/namuan/dr-doc-search) | 555 |
|[getmetal/motorhead](https://github.com/getmetal/motorhead) | 550 |
|[kreneskyp/ix](https://github.com/kreneskyp/ix) | 543 |
|[hwchase17/chat-your-data](https://github.com/hwchase17/chat-your-data) | 510 |
|[Anil-matcha/ChatPDF](https://github.com/Anil-matcha/ChatPDF) | 501 |
|[whyiyhw/chatgpt-wechat](https://github.com/whyiyhw/chatgpt-wechat) | 497 |
|[SamurAIGPT/ChatGPT-Developer-Plugins](https://github.com/SamurAIGPT/ChatGPT-Developer-Plugins) | 496 |
|[microsoft/PodcastCopilot](https://github.com/microsoft/PodcastCopilot) | 492 |
|[debanjum/khoj](https://github.com/debanjum/khoj) | 485 |
|[akshata29/chatpdf](https://github.com/akshata29/chatpdf) | 485 |
|[langchain-ai/langchain-aiplugin](https://github.com/langchain-ai/langchain-aiplugin) | 462 |
|[jina-ai/agentchain](https://github.com/jina-ai/agentchain) | 460 |
|[alexanderatallah/window.ai](https://github.com/alexanderatallah/window.ai) | 457 |
|[yeagerai/yeagerai-agent](https://github.com/yeagerai/yeagerai-agent) | 451 |
|[mckaywrigley/repo-chat](https://github.com/mckaywrigley/repo-chat) | 446 |
|[michaelthwan/searchGPT](https://github.com/michaelthwan/searchGPT) | 446 |
|[mpaepper/content-chatbot](https://github.com/mpaepper/content-chatbot) | 441 |
|[freddyaboulton/gradio-tools](https://github.com/freddyaboulton/gradio-tools) | 439 |
|[ruoccofabrizio/azure-open-ai-embeddings-qna](https://github.com/ruoccofabrizio/azure-open-ai-embeddings-qna) | 429 |
|[StevenGrove/GPT4Tools](https://github.com/StevenGrove/GPT4Tools) | 422 |
|[jonra1993/fastapi-alembic-sqlmodel-async](https://github.com/jonra1993/fastapi-alembic-sqlmodel-async) | 407 |
|[msoedov/langcorn](https://github.com/msoedov/langcorn) | 405 |
|[amosjyng/langchain-visualizer](https://github.com/amosjyng/langchain-visualizer) | 395 |
|[ajndkr/lanarky](https://github.com/ajndkr/lanarky) | 384 |
|[mtenenholtz/chat-twitter](https://github.com/mtenenholtz/chat-twitter) | 376 |
|[steamship-core/steamship-langchain](https://github.com/steamship-core/steamship-langchain) | 371 |
|[langchain-ai/auto-evaluator](https://github.com/langchain-ai/auto-evaluator) | 365 |
|[xuwenhao/geektime-ai-course](https://github.com/xuwenhao/geektime-ai-course) | 358 |
|[continuum-llms/chatgpt-memory](https://github.com/continuum-llms/chatgpt-memory) | 357 |
|[opentensor/bittensor](https://github.com/opentensor/bittensor) | 347 |
|[showlab/VLog](https://github.com/showlab/VLog) | 345 |
|[daodao97/chatdoc](https://github.com/daodao97/chatdoc) | 345 |
|[logan-markewich/llama_index_starter_pack](https://github.com/logan-markewich/llama_index_starter_pack) | 332 |
|[poe-platform/poe-protocol](https://github.com/poe-platform/poe-protocol) | 320 |
|[explosion/spacy-llm](https://github.com/explosion/spacy-llm) | 312 |
|[andylokandy/gpt-4-search](https://github.com/andylokandy/gpt-4-search) | 311 |
|[alejandro-ao/langchain-ask-pdf](https://github.com/alejandro-ao/langchain-ask-pdf) | 310 |
|[jupyterlab/jupyter-ai](https://github.com/jupyterlab/jupyter-ai) | 294 |
|[BlackHC/llm-strategy](https://github.com/BlackHC/llm-strategy) | 283 |
|[itamargol/openai](https://github.com/itamargol/openai) | 281 |
|[momegas/megabots](https://github.com/momegas/megabots) | 279 |
|[personoids/personoids-lite](https://github.com/personoids/personoids-lite) | 277 |
|[yvann-hub/Robby-chatbot](https://github.com/yvann-hub/Robby-chatbot) | 267 |
|[Anil-matcha/Website-to-Chatbot](https://github.com/Anil-matcha/Website-to-Chatbot) | 266 |
|[Cheems-Seminar/grounded-segment-any-parts](https://github.com/Cheems-Seminar/grounded-segment-any-parts) | 260 |
|[sullivan-sean/chat-langchainjs](https://github.com/sullivan-sean/chat-langchainjs) | 248 |
|[bborn/howdoi.ai](https://github.com/bborn/howdoi.ai) | 245 |
|[daveebbelaar/langchain-experiments](https://github.com/daveebbelaar/langchain-experiments) | 240 |
|[MagnivOrg/prompt-layer-library](https://github.com/MagnivOrg/prompt-layer-library) | 237 |
|[ur-whitelab/exmol](https://github.com/ur-whitelab/exmol) | 234 |
|[conceptofmind/toolformer](https://github.com/conceptofmind/toolformer) | 234 |
|[recalign/RecAlign](https://github.com/recalign/RecAlign) | 226 |
|[OpenBMB/AgentVerse](https://github.com/OpenBMB/AgentVerse) | 220 |
|[alvarosevilla95/autolang](https://github.com/alvarosevilla95/autolang) | 219 |
|[JohnSnowLabs/nlptest](https://github.com/JohnSnowLabs/nlptest) | 216 |
|[kaleido-lab/dolphin](https://github.com/kaleido-lab/dolphin) | 215 |
|[truera/trulens](https://github.com/truera/trulens) | 208 |
|[NimbleBoxAI/ChainFury](https://github.com/NimbleBoxAI/ChainFury) | 208 |
|[airobotlab/KoChatGPT](https://github.com/airobotlab/KoChatGPT) | 207 |
|[monarch-initiative/ontogpt](https://github.com/monarch-initiative/ontogpt) | 200 |
|[paolorechia/learn-langchain](https://github.com/paolorechia/learn-langchain) | 195 |
|[shaman-ai/agent-actors](https://github.com/shaman-ai/agent-actors) | 185 |
|[Haste171/langchain-chatbot](https://github.com/Haste171/langchain-chatbot) | 184 |
|[plchld/InsightFlow](https://github.com/plchld/InsightFlow) | 182 |
|[su77ungr/CASALIOY](https://github.com/su77ungr/CASALIOY) | 180 |
|[jbrukh/gpt-jargon](https://github.com/jbrukh/gpt-jargon) | 177 |
|[benthecoder/ClassGPT](https://github.com/benthecoder/ClassGPT) | 174 |
|[billxbf/ReWOO](https://github.com/billxbf/ReWOO) | 170 |
|[filip-michalsky/SalesGPT](https://github.com/filip-michalsky/SalesGPT) | 168 |
|[hwchase17/langchain-streamlit-template](https://github.com/hwchase17/langchain-streamlit-template) | 168 |
|[radi-cho/datasetGPT](https://github.com/radi-cho/datasetGPT) | 164 |
|[hardbyte/qabot](https://github.com/hardbyte/qabot) | 164 |
|[gia-guar/JARVIS-ChatGPT](https://github.com/gia-guar/JARVIS-ChatGPT) | 158 |
|[plastic-labs/tutor-gpt](https://github.com/plastic-labs/tutor-gpt) | 154 |
|[yasyf/compress-gpt](https://github.com/yasyf/compress-gpt) | 154 |
|[fengyuli-dev/multimedia-gpt](https://github.com/fengyuli-dev/multimedia-gpt) | 154 |
|[ethanyanjiali/minChatGPT](https://github.com/ethanyanjiali/minChatGPT) | 153 |
|[hwchase17/chroma-langchain](https://github.com/hwchase17/chroma-langchain) | 153 |
|[edreisMD/plugnplai](https://github.com/edreisMD/plugnplai) | 148 |
|[chakkaradeep/pyCodeAGI](https://github.com/chakkaradeep/pyCodeAGI) | 145 |
|[ccurme/yolopandas](https://github.com/ccurme/yolopandas) | 145 |
|[shamspias/customizable-gpt-chatbot](https://github.com/shamspias/customizable-gpt-chatbot) | 144 |
|[realminchoi/babyagi-ui](https://github.com/realminchoi/babyagi-ui) | 143 |
|[PradipNichite/Youtube-Tutorials](https://github.com/PradipNichite/Youtube-Tutorials) | 140 |
|[gustavz/DataChad](https://github.com/gustavz/DataChad) | 140 |
|[Klingefjord/chatgpt-telegram](https://github.com/Klingefjord/chatgpt-telegram) | 140 |
|[Jaseci-Labs/jaseci](https://github.com/Jaseci-Labs/jaseci) | 139 |
|[handrew/browserpilot](https://github.com/handrew/browserpilot) | 137 |
|[jmpaz/promptlib](https://github.com/jmpaz/promptlib) | 137 |
|[SamPink/dev-gpt](https://github.com/SamPink/dev-gpt) | 135 |
|[menloparklab/langchain-cohere-qdrant-doc-retrieval](https://github.com/menloparklab/langchain-cohere-qdrant-doc-retrieval) | 135 |
|[hirokidaichi/wanna](https://github.com/hirokidaichi/wanna) | 135 |
|[steamship-core/vercel-examples](https://github.com/steamship-core/vercel-examples) | 134 |
|[pablomarin/GPT-Azure-Search-Engine](https://github.com/pablomarin/GPT-Azure-Search-Engine) | 133 |
|[ibiscp/LLM-IMDB](https://github.com/ibiscp/LLM-IMDB) | 133 |
|[shauryr/S2QA](https://github.com/shauryr/S2QA) | 133 |
|[jerlendds/osintbuddy](https://github.com/jerlendds/osintbuddy) | 132 |
|[yuanjie-ai/ChatLLM](https://github.com/yuanjie-ai/ChatLLM) | 132 |
|[yasyf/summ](https://github.com/yasyf/summ) | 132 |
|[WongSaang/chatgpt-ui-server](https://github.com/WongSaang/chatgpt-ui-server) | 130 |
|[peterw/StoryStorm](https://github.com/peterw/StoryStorm) | 127 |
|[Teahouse-Studios/akari-bot](https://github.com/Teahouse-Studios/akari-bot) | 126 |
|[vaibkumr/prompt-optimizer](https://github.com/vaibkumr/prompt-optimizer) | 125 |
|[preset-io/promptimize](https://github.com/preset-io/promptimize) | 124 |
|[homanp/vercel-langchain](https://github.com/homanp/vercel-langchain) | 124 |
|[petehunt/langchain-github-bot](https://github.com/petehunt/langchain-github-bot) | 123 |
|[eunomia-bpf/GPTtrace](https://github.com/eunomia-bpf/GPTtrace) | 118 |
|[nicknochnack/LangchainDocuments](https://github.com/nicknochnack/LangchainDocuments) | 116 |
|[jiran214/GPT-vup](https://github.com/jiran214/GPT-vup) | 112 |
|[rsaryev/talk-codebase](https://github.com/rsaryev/talk-codebase) | 112 |
|[zenml-io/zenml-projects](https://github.com/zenml-io/zenml-projects) | 112 |
|[microsoft/azure-openai-in-a-day-workshop](https://github.com/microsoft/azure-openai-in-a-day-workshop) | 112 |
|[davila7/file-gpt](https://github.com/davila7/file-gpt) | 112 |
|[prof-frink-lab/slangchain](https://github.com/prof-frink-lab/slangchain) | 111 |
|[aurelio-labs/arxiv-bot](https://github.com/aurelio-labs/arxiv-bot) | 110 |
|[fixie-ai/fixie-examples](https://github.com/fixie-ai/fixie-examples) | 108 |
|[miaoshouai/miaoshouai-assistant](https://github.com/miaoshouai/miaoshouai-assistant) | 105 |
|[flurb18/AgentOoba](https://github.com/flurb18/AgentOoba) | 103 |
|[solana-labs/chatgpt-plugin](https://github.com/solana-labs/chatgpt-plugin) | 102 |
|[Significant-Gravitas/Auto-GPT-Benchmarks](https://github.com/Significant-Gravitas/Auto-GPT-Benchmarks) | 102 |
|[kaarthik108/snowChat](https://github.com/kaarthik108/snowChat) | 100 |
_Generated by [github-dependents-info](https://github.com/nvuillam/github-dependents-info)_
`github-dependents-info --repo hwchase17/langchain --markdownfile dependents.md --minstars 100 --sort stars`

7
docs/docs_skeleton/.gitignore vendored Normal file
View File

@@ -0,0 +1,7 @@
.yarn/
node_modules/
.docusaurus
.cache-loader
docs/api

View File

@@ -0,0 +1,49 @@
# Website
This website is built using [Docusaurus 2](https://docusaurus.io/), a modern static website generator.
### Installation
```
$ yarn
```
### Local Development
```
$ yarn start
```
This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.
### Build
```
$ yarn build
```
This command generates static content into the `build` directory, which can then be served by any static content hosting service.
### Deployment
Using SSH:
```
$ USE_SSH=true yarn deploy
```
Not using SSH:
```
$ GIT_USER=<Your GitHub username> yarn deploy
```
If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the `gh-pages` branch.
### Continuous Integration
Some common defaults for linting/formatting have been set for you. If you integrate your project with an open source Continuous Integration system (e.g. Travis CI, CircleCI), you may check for issues using the following command.
```
$ yarn ci
```

View File

@@ -0,0 +1,12 @@
/**
* Copyright (c) Meta Platforms, Inc. and affiliates.
*
* This source code is licensed under the MIT license found in the
* LICENSE file in the root directory of this source tree.
*
* @format
*/
module.exports = {
presets: [require.resolve("@docusaurus/core/lib/babel/preset")],
};

View File

@@ -0,0 +1,76 @@
/* eslint-disable prefer-template */
/* eslint-disable no-param-reassign */
// eslint-disable-next-line import/no-extraneous-dependencies
const babel = require("@babel/core");
const path = require("path");
const fs = require("fs");
/**
*
* @param {string|Buffer} content Content of the resource file
* @param {object} [map] SourceMap data consumable by https://github.com/mozilla/source-map
* @param {any} [meta] Meta data, could be anything
*/
async function webpackLoader(content, map, meta) {
const cb = this.async();
if (!this.resourcePath.endsWith(".ts")) {
cb(null, JSON.stringify({ content, imports: [] }), map, meta);
return;
}
try {
const result = await babel.parseAsync(content, {
sourceType: "module",
filename: this.resourcePath,
});
const imports = [];
result.program.body.forEach((node) => {
if (node.type === "ImportDeclaration") {
const source = node.source.value;
if (!source.startsWith("langchain")) {
return;
}
node.specifiers.forEach((specifier) => {
if (specifier.type === "ImportSpecifier") {
const local = specifier.local.name;
const imported = specifier.imported.name;
imports.push({ local, imported, source });
} else {
throw new Error("Unsupported import type");
}
});
}
});
imports.forEach((imp) => {
const { imported, source } = imp;
const moduleName = source.split("/").slice(1).join("_");
const docsPath = path.resolve(__dirname, "docs", "api", moduleName);
const available = fs.readdirSync(docsPath, { withFileTypes: true });
const found = available.find(
(dirent) =>
dirent.isDirectory() &&
fs.existsSync(path.resolve(docsPath, dirent.name, imported + ".md"))
);
if (found) {
imp.docs =
"/" + path.join("docs", "api", moduleName, found.name, imported);
} else {
throw new Error(
`Could not find docs for ${source}.${imported} in docs/api/`
);
}
});
cb(null, JSON.stringify({ content, imports }), map, meta);
} catch (err) {
cb(err);
}
}
module.exports = webpackLoader;
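
The loader above derives a docs folder name from each `langchain/...` import source before searching it for `<imported>.md`. A minimal sketch of just that mapping follows. The helper name `docsFolderFor` is hypothetical and not part of the original file, and it uses `path.posix.join` with a relative path (rather than the loader's `path.resolve(__dirname, ...)`) so the result does not depend on the machine it runs on.

```javascript
const path = require("path");

// Hypothetical helper (not in the original loader): mirrors how the loader
// turns an import source into the docs/api subfolder it searches.
function docsFolderFor(source) {
  // Drop the leading "langchain" segment and join the rest with "_",
  // e.g. "langchain/chat_models/openai" becomes "chat_models_openai".
  const moduleName = source.split("/").slice(1).join("_");
  // Assumption: a relative POSIX path is enough for illustration.
  return path.posix.join("docs", "api", moduleName);
}

console.log(docsFolderFor("langchain/chat_models/openai"));
console.log(docsFolderFor("langchain/llms"));
```

In the loader itself, a missing folder or a missing `<imported>.md` file surfaces as an error passed to the webpack callback, so a mapping like this only succeeds when the generated API docs already exist under `docs/api`.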

View File

@@ -0,0 +1,21 @@
pre {
white-space: break-spaces;
}
@media (min-width: 1200px) {
.container,
.container-lg,
.container-md,
.container-sm,
.container-xl {
max-width: 2560px !important;
}
}
#my-component-root *, #headlessui-portal-root * {
z-index: 10000;
}
.content-container p {
margin: revert;
}

View File

@@ -37,7 +37,6 @@ document.addEventListener('DOMContentLoaded', () => {
style: { darkMode: false, accentColor: '#010810' },
floatingButtonStyle: { color: '#ffffff', backgroundColor: '#010810' },
anon_key: '82842b36-3ea6-49b2-9fb8-52cfc4bde6bf', // Mendable Search Public ANON key, ok to be public
cmdShortcutKey:'j',
messageSettings: {
openSourcesInNewTab: false,
prettySources: true // Prettify the sources displayed now

View File

@@ -0,0 +1,8 @@
---
sidebar_position: 0
---
# Integrations
import DocCardList from "@theme/DocCardList";
<DocCardList />

View File

@@ -0,0 +1,5 @@
# Installation
import Installation from "@snippets/get_started/installation.mdx"
<Installation/>

View File

@@ -0,0 +1,65 @@
---
sidebar_position: 0
---
# Introduction
**LangChain** is a framework for developing applications powered by language models. It enables applications that are:
- **Data-aware**: connect a language model to other sources of data
- **Agentic**: allow a language model to interact with its environment
The main value props of LangChain are:
1. **Components**: abstractions for working with language models, along with a collection of implementations for each abstraction. Components are modular and easy-to-use, whether you are using the rest of the LangChain framework or not
2. **Off-the-shelf chains**: a structured assembly of components for accomplishing specific higher-level tasks
Off-the-shelf chains make it easy to get started. For more complex applications and nuanced use-cases, components make it easy to customize existing chains or build new ones.
## Get started
[Here's](/docs/get_started/installation.html) how to install LangChain, set up your environment, and start building.
We recommend following our [Quickstart](/docs/get_started/quickstart.html) guide to familiarize yourself with the framework by building your first LangChain application.
_**Note**: These docs are for the LangChain [Python package](https://github.com/hwchase17/langchain). For documentation on [LangChain.js](https://github.com/hwchase17/langchainjs), the JS/TS version, [head here](https://js.langchain.com/docs)._
## Modules
LangChain provides standard, extendable interfaces and external integrations for the following modules, listed from least to most complex:
#### [Model I/O](/docs/modules/model_io/)
Interface with language models
#### [Data connection](/docs/modules/data_connection/)
Interface with application-specific data
#### [Chains](/docs/modules/chains/)
Construct sequences of calls
#### [Agents](/docs/modules/agents/)
Let chains choose which tools to use given high-level directives
#### [Memory](/docs/modules/memory/)
Persist application state between runs of a chain
#### [Callbacks](/docs/modules/callbacks/)
Log and stream intermediate steps of any chain
## Examples, ecosystem, and resources
### [Use cases](/docs/use_cases/)
Walkthroughs and best-practices for common end-to-end use cases, like:
- [Chatbots](/docs/use_cases/chatbots/)
- [Answering questions using sources](/docs/use_cases/question_answering/)
- [Analyzing structured data](/docs/use_cases/tabular.html)
- and much more...
### [Guides](/docs/guides/)
Learn best practices for developing with LangChain.
### [Ecosystem](/docs/ecosystem/)
LangChain is part of a rich ecosystem of tools that integrate with our framework and build on top of it. Check out our growing list of [integrations](/docs/ecosystem/integrations/) and [dependent repos](/docs/ecosystem/dependents.html).
### [Additional resources](/docs/additional_resources/)
Our community is full of prolific developers, creative builders, and fantastic teachers. Check out [YouTube tutorials](/docs/additional_resources/youtube.html) for great tutorials from folks in the community, and [Gallery](https://github.com/kyrolabs/awesome-langchain) for a list of awesome LangChain projects, compiled by the folks at [KyroLabs](https://kyrolabs.com).
<h3><span style={{color:"#2e8555"}}> Support </span></h3>
Join us on [GitHub](https://github.com/hwchase17/langchain) or [Discord](https://discord.gg/6adMQxSpJS) to ask questions, share feedback, meet other developers building with LangChain, and dream about the future of LLMs.
## API reference
Head to the [reference](https://api.python.langchain.com) section for full documentation of all classes and methods in the LangChain Python package.

View File

@@ -0,0 +1,158 @@
# Quickstart
## Installation
To install LangChain run:
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import Install from "@snippets/get_started/quickstart/installation.mdx"
<Install/>
For more details, see our [Installation guide](/docs/get_started/installation.html).
## Environment setup
Using LangChain will usually require integrations with one or more model providers, data stores, APIs, etc. For this example, we'll use OpenAI's model APIs.
import OpenAISetup from "@snippets/get_started/quickstart/openai_setup.mdx"
<OpenAISetup/>
## Building an application
Now we can start building our language model application. LangChain provides many modules that can be used to build language model applications. Modules can be used on their own in simple applications, or combined for more complex use cases.
## LLMs
#### Get predictions from a language model
The basic building block of LangChain is the LLM, which takes in text and generates more text.
As an example, suppose we're building an application that generates a company name based on a company description. In order to do this, we need to initialize an OpenAI model wrapper. In this case, since we want the outputs to be MORE random, we'll initialize our model with a HIGH temperature.
import LLM from "@snippets/get_started/quickstart/llm.mdx"
<LLM/>
## Chat models
Chat models are a variation on language models. While chat models use language models under the hood, the interface they expose is a bit different: rather than expose a "text in, text out" API, they expose an interface where "chat messages" are the inputs and outputs.
You can get chat completions by passing one or more messages to the chat model. The response will be a message. The types of messages currently supported in LangChain are `AIMessage`, `HumanMessage`, `SystemMessage`, and `ChatMessage` -- `ChatMessage` takes in an arbitrary role parameter. Most of the time, you'll just be dealing with `HumanMessage`, `AIMessage`, and `SystemMessage`.
import ChatModel from "@snippets/get_started/quickstart/chat_model.mdx"
<ChatModel/>
## Prompt templates
Most LLM applications do not pass user input directly into an LLM. Usually they will add the user input to a larger piece of text, called a prompt template, that provides additional context on the specific task at hand.
In the previous example, the text we passed to the model contained instructions to generate a company name. For our application, it'd be great if the user only had to provide the description of a company/product, without having to worry about giving the model instructions.
import PromptTemplateLLM from "@snippets/get_started/quickstart/prompt_templates_llms.mdx"
import PromptTemplateChatModel from "@snippets/get_started/quickstart/prompt_templates_chat_models.mdx"
<Tabs>
<TabItem value="llms" label="LLMs" default>
With PromptTemplates this is easy! In this case our template would be very simple:
<PromptTemplateLLM/>
</TabItem>
<TabItem value="chat_models" label="Chat models">
Similar to LLMs, you can make use of templating by using a `MessagePromptTemplate`. You can build a `ChatPromptTemplate` from one or more `MessagePromptTemplate`s. You can use `ChatPromptTemplate`'s `format_messages` method to generate the formatted messages.
Because this is generating a list of messages, it is slightly more complex than the normal prompt template which is generating only a string. Please see the detailed guides on prompts to understand more options available to you here.
<PromptTemplateChatModel/>
</TabItem>
</Tabs>
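The idea behind a prompt template can be sketched in plain Python. The `SimplePromptTemplate` class below is a hypothetical stand-in written for illustration, not LangChain's own `PromptTemplate`; it only shows how user input is slotted into a larger instruction string.

```python
from dataclasses import dataclass


@dataclass
class SimplePromptTemplate:
    """Hypothetical stand-in for a prompt template (not LangChain's class)."""

    template: str

    def format(self, **kwargs: str) -> str:
        # Fill the named placeholders in the template with user-supplied values.
        return self.template.format(**kwargs)


prompt = SimplePromptTemplate(
    template="What is a good name for a company that makes {product}?"
)
formatted_prompt = prompt.format(product="colorful socks")
print(formatted_prompt)
```

The user supplies only the product description; the instructions to the model live in the template.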
## Chains
Now that we've got a model and a prompt template, we'll want to combine the two. Chains give us a way to link (or chain) together multiple primitives, like models, prompts, and other chains.
import ChainLLM from "@snippets/get_started/quickstart/chains_llms.mdx"
import ChainChatModel from "@snippets/get_started/quickstart/chains_chat_models.mdx"
<Tabs>
<TabItem value="llms" label="LLMs" default>
The simplest and most common type of chain is an LLMChain, which passes an input first to a PromptTemplate and then to an LLM. We can construct an LLM chain from our existing model and prompt template.
<ChainLLM/>
There we go, our first chain! Understanding how this simple chain works will set you up well for working with more complex chains.
</TabItem>
<TabItem value="chat_models" label="Chat models">
The `LLMChain` can be used with chat models as well:
<ChainChatModel/>
</TabItem>
</Tabs>
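Conceptually, a chain of this kind is function composition: format the prompt, then call the model. The sketch below assumes a hypothetical `fake_llm` function in place of a real model and a hypothetical `make_chain` helper; neither is part of LangChain.

```python
from typing import Callable


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call; a real chain would query an LLM."""
    return f"[model output for: {prompt}]"


def make_chain(template: str, llm: Callable[[str], str]) -> Callable[..., str]:
    """Return a function that formats the template, then passes it to the model."""

    def run(**inputs: str) -> str:
        prompt = template.format(**inputs)  # step 1: build the prompt
        return llm(prompt)  # step 2: call the model

    return run


chain = make_chain(
    "What is a good name for a company that makes {product}?", fake_llm
)
result = chain(product="colorful socks")
print(result)
```

Because the chain is just a callable, it could itself be used as a step inside a larger chain.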
## Agents
import AgentLLM from "@snippets/get_started/quickstart/agents_llms.mdx"
import AgentChatModel from "@snippets/get_started/quickstart/agents_chat_models.mdx"
Our first chain ran a pre-determined sequence of steps. To handle complex workflows, we need to be able to dynamically choose actions based on inputs.
Agents do just this: they use a language model to determine which actions to take and in what order. Agents are given access to tools, and they repeatedly choose a tool, run the tool, and observe the output until they come up with a final answer.
To load an agent, you need to choose a(n):
- LLM/Chat model: The language model powering the agent.
- Tool(s): A function that performs a specific duty, such as a Google search, a database lookup, a Python REPL, or another chain. For a list of predefined tools and their specifications, see the [Tools documentation](/docs/modules/agents/tools/).
- Agent name: A string that references a supported agent class. An agent class is largely parameterized by the prompt the language model uses to determine which action to take. Because this notebook focuses on the simplest, highest level API, this only covers using the standard supported agents. If you want to implement a custom agent, see [here](/docs/modules/agents/how_to/custom_agent.html). For a list of supported agents and their specifications, see [here](/docs/modules/agents/agent_types/).
For this example, we'll be using SerpAPI to query a search engine.
You'll need to install the SerpAPI Python package:
```bash
pip install google-search-results
```
And set the `SERPAPI_API_KEY` environment variable.
<Tabs>
<TabItem value="llms" label="LLMs" default>
<AgentLLM/>
</TabItem>
<TabItem value="chat_models" label="Chat models">
Agents can also be used with chat models; you can initialize one using `AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION` as the agent type.
<AgentChatModel/>
</TabItem>
</Tabs>
## Memory
The chains and agents we've looked at so far have been stateless, but for many applications it's necessary to reference past interactions. This is clearly the case with a chatbot for example, where you want it to understand new messages in the context of past messages.
The Memory module gives you a way to maintain application state. The base Memory interface is simple: it lets you update state given the latest run inputs and outputs and it lets you modify (or contextualize) the next input using the stored state.
There are a number of built-in memory systems. The simplest of these is a buffer memory which just prepends the last few inputs/outputs to the current input - we will use this in the example below.
import MemoryLLM from "@snippets/get_started/quickstart/memory_llms.mdx"
import MemoryChatModel from "@snippets/get_started/quickstart/memory_chat_models.mdx"
<Tabs>
<TabItem value="llms" label="LLMs" default>
<MemoryLLM/>
</TabItem>
<TabItem value="chat_models" label="Chat models">
You can use Memory with chains and agents initialized with chat models. The main difference between this and Memory for LLMs is that rather than trying to condense all previous messages into a string, we can keep them as their own unique memory object.
<MemoryChatModel/>
</TabItem>
</Tabs>
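The buffer memory described above can be sketched in a few lines of plain Python. The `BufferMemory` class, its method names, and the window size `k` are assumptions made for illustration; they are not LangChain's memory classes.

```python
from collections import deque
from typing import Deque, Tuple


class BufferMemory:
    """Hypothetical buffer memory: keeps the last `k` input/output pairs."""

    def __init__(self, k: int = 2) -> None:
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=k)

    def save_context(self, user_input: str, output: str) -> None:
        # Update state with the latest run's input and output.
        self.turns.append((user_input, output))

    def contextualize(self, next_input: str) -> str:
        # Prepend the stored turns to the next input.
        history = "".join(f"Human: {i}\nAI: {o}\n" for i, o in self.turns)
        return f"{history}Human: {next_input}"


memory = BufferMemory(k=2)
memory.save_context("Hi, I'm Bob.", "Hello Bob!")
memory.save_context("What's the weather?", "I can't check that.")
memory.save_context("Tell me a joke.", "Why did the chain cross the road?")
prompt_with_history = memory.contextualize("What is my name?")
print(prompt_with_history)
```

With `k=2` the oldest turn has already been dropped, so the model would no longer see the user's name. That trade-off between context length and recall is why other memory strategies exist.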

View File

@@ -1,42 +1,40 @@
# Tutorials
⛓ icon marks a new addition [last update 2023-05-15]
⛓ icon marks a new addition [last update 2023-06-20]
---------------------
### DeepLearning.AI course
[LangChain for LLM Application Development](https://learn.deeplearning.ai/langchain) by Harrison Chase presented by [Andrew Ng](https://en.wikipedia.org/wiki/Andrew_Ng)
[LangChain for LLM Application Development](https://learn.deeplearning.ai/langchain) by Harrison Chase and [Andrew Ng](https://en.wikipedia.org/wiki/Andrew_Ng)
### Handbook
[LangChain AI Handbook](https://www.pinecone.io/learn/langchain/) By **James Briggs** and **Francisco Ingham**
### Tutorials
[LangChain Tutorials](https://www.youtube.com/watch?v=FuqdVNB_8c0&list=PL9V0lbeJ69brU-ojMpU1Y7Ic58Tap0Cw6) by [Edrick](https://www.youtube.com/@edrickdch):
- ⛓ [LangChain, Chroma DB, OpenAI Beginner Guide | ChatGPT with your PDF](https://youtu.be/FuqdVNB_8c0)
- ⛓ [LangChain 101: The Complete Beginner's Guide](https://youtu.be/P3MAbZ2eMUI)
[LangChain Crash Course: Build an AutoGPT app in 25 minutes](https://youtu.be/MlK6SIjcjE8) by [Nicholas Renotte](https://www.youtube.com/@NicholasRenotte)
### Short Tutorials
[LangChain Crash Course - Build apps with language models](https://youtu.be/LbT1yp6quS8) by [Patrick Loeber](https://www.youtube.com/@patloeber)
[LangChain Crash Course: Build an AutoGPT app in 25 minutes](https://youtu.be/MlK6SIjcjE8) by [Nicholas Renotte](https://www.youtube.com/@NicholasRenotte)
[LangChain Explained in 13 Minutes | QuickStart Tutorial for Beginners](https://youtu.be/aywZrzNaKjs) by [Rabbitmetrics](https://www.youtube.com/@rabbitmetrics)
###
[LangChain for Gen AI and LLMs](https://www.youtube.com/playlist?list=PLIUOU7oqGTLieV9uTIFMm6_4PXg-hlN6F) by [James Briggs](https://www.youtube.com/@jamesbriggs):
## Tutorials
### [LangChain for Gen AI and LLMs](https://www.youtube.com/playlist?list=PLIUOU7oqGTLieV9uTIFMm6_4PXg-hlN6F) by [James Briggs](https://www.youtube.com/@jamesbriggs):
- #1 [Getting Started with `GPT-3` vs. Open Source LLMs](https://youtu.be/nE2skSRWTTs)
- #2 [Prompt Templates for `GPT 3.5` and other LLMs](https://youtu.be/RflBcK0oDH0)
- #3 [LLM Chains using `GPT 3.5` and other LLMs](https://youtu.be/S8j9Tk0lZHU)
- #4 [Chatbot Memory for `Chat-GPT`, `Davinci` + other LLMs](https://youtu.be/X05uK0TZozM)
- #5 [Chat with OpenAI in LangChain](https://youtu.be/CnAgB3A5OlU)
- #6 [Fixing LLM Hallucinations with Retrieval Augmentation in LangChain](https://youtu.be/kvdVduIJsc8)
- #7 [LangChain Agents Deep Dive with GPT 3.5](https://youtu.be/jSP-gSEyVeI)
- #8 [Create Custom Tools for Chatbots in LangChain](https://youtu.be/q-HNphrWsDE)
- #9 [Build Conversational Agents with Vector DBs](https://youtu.be/H6bCqqw9xyI)
- #6 [Fixing LLM Hallucinations with Retrieval Augmentation in LangChain](https://youtu.be/kvdVduIJsc8)
- #7 [LangChain Agents Deep Dive with GPT 3.5](https://youtu.be/jSP-gSEyVeI)
- #8 [Create Custom Tools for Chatbots in LangChain](https://youtu.be/q-HNphrWsDE)
- #9 [Build Conversational Agents with Vector DBs](https://youtu.be/H6bCqqw9xyI)
- ⛓ #10 [Using NEW `MPT-7B` in Hugging Face and LangChain](https://youtu.be/DXpk9K7DgMo)
###
[LangChain 101](https://www.youtube.com/playlist?list=PLqZXAkvF1bPNQER9mLmDbntNfSpzdDIU5) by [Data Independent](https://www.youtube.com/@DataIndependent):
### [LangChain 101](https://www.youtube.com/playlist?list=PLqZXAkvF1bPNQER9mLmDbntNfSpzdDIU5) by [Greg Kamradt (Data Indy)](https://www.youtube.com/@DataIndependent):
- [What Is LangChain? - LangChain + `ChatGPT` Overview](https://youtu.be/_v_fgW2SkkQ)
- [Quickstart Guide](https://youtu.be/kYRB-vJFy38)
- [Beginner Guide To 7 Essential Concepts](https://youtu.be/2xxziIWmaSA)
@@ -52,12 +50,15 @@
- [Structured Output From `OpenAI` (Clean Dirty Data)](https://youtu.be/KwAXfey-xQk)
- [Connect `OpenAI` To +5,000 Tools (LangChain + `Zapier`)](https://youtu.be/7tNm0yiDigU)
- [Use LLMs To Extract Data From Text (Expert Mode)](https://youtu.be/xZzvwR9jdPA)
- [Extract Insights From Interview Transcripts Using LLMs](https://youtu.be/shkMOHwJ4SM)
- [5 Levels Of LLM Summarizing: Novice to Expert](https://youtu.be/qaPMdcCqtWk)
- [Extract Insights From Interview Transcripts Using LLMs](https://youtu.be/shkMOHwJ4SM)
- [5 Levels Of LLM Summarizing: Novice to Expert](https://youtu.be/qaPMdcCqtWk)
- ⛓ [Control Tone & Writing Style Of Your LLM Output](https://youtu.be/miBG-a3FuhU)
- ⛓ [Build Your Own `AI Twitter Bot` Using LLMs](https://youtu.be/yLWLDjT01q8)
- ⛓ [ChatGPT made my interview questions for me (`Streamlit` + LangChain)](https://youtu.be/zvoAMx0WKkw)
- ⛓ [Function Calling via ChatGPT API - First Look With LangChain](https://youtu.be/0-zlUy7VUjg)
###
[LangChain How to and guides](https://www.youtube.com/playlist?list=PL8motc6AQftk1Bs42EW45kwYbyJ4jOdiZ) by [Sam Witteveen](https://www.youtube.com/@samwitteveenai):
### [LangChain How to and guides](https://www.youtube.com/playlist?list=PL8motc6AQftk1Bs42EW45kwYbyJ4jOdiZ) by [Sam Witteveen](https://www.youtube.com/@samwitteveenai):
- [LangChain Basics - LLMs & PromptTemplates with Colab](https://youtu.be/J_0qvRt4LNk)
- [LangChain Basics - Tools and Chains](https://youtu.be/hI2BY7yl_Ac)
- [`ChatGPT API` Announcement & Code Walkthrough with LangChain](https://youtu.be/phHqvLHCwH4)
@@ -75,39 +76,41 @@
- [Talk to your `CSV` & `Excel` with LangChain](https://youtu.be/xQ3mZhw69bc)
- [`BabyAGI`: Discover the Power of Task-Driven Autonomous Agents!](https://youtu.be/QBcDLSE2ERA)
- [Improve your `BabyAGI` with LangChain](https://youtu.be/DRgPyOXZ-oE)
- [Master `PDF` Chat with LangChain - Your essential guide to queries on documents](https://youtu.be/ZzgUqFtxgXI)
- [Using LangChain with `DuckDuckGO` `Wikipedia` & `PythonREPL` Tools](https://youtu.be/KerHlb8nuVc)
- [Building Custom Tools and Agents with LangChain (gpt-3.5-turbo)](https://youtu.be/biS8G8x8DdA)
- [LangChain Retrieval QA Over Multiple Files with `ChromaDB`](https://youtu.be/3yPBVii7Ct0)
- [LangChain Retrieval QA with Instructor Embeddings & `ChromaDB` for PDFs](https://youtu.be/cFCGUjc33aU)
- [LangChain + Retrieval Local LLMs for Retrieval QA - No OpenAI!!!](https://youtu.be/9ISVjh8mdlA)
- [Master `PDF` Chat with LangChain - Your essential guide to queries on documents](https://youtu.be/ZzgUqFtxgXI)
- [Using LangChain with `DuckDuckGO` `Wikipedia` & `PythonREPL` Tools](https://youtu.be/KerHlb8nuVc)
- [Building Custom Tools and Agents with LangChain (gpt-3.5-turbo)](https://youtu.be/biS8G8x8DdA)
- [LangChain Retrieval QA Over Multiple Files with `ChromaDB`](https://youtu.be/3yPBVii7Ct0)
- [LangChain Retrieval QA with Instructor Embeddings & `ChromaDB` for PDFs](https://youtu.be/cFCGUjc33aU)
- [LangChain + Retrieval Local LLMs for Retrieval QA - No OpenAI!!!](https://youtu.be/9ISVjh8mdlA)
- ⛓ [`Camel` + LangChain for Synthetic Data & Market Research](https://youtu.be/GldMMK6-_-g)
- ⛓ [Information Extraction with LangChain & `Kor`](https://youtu.be/SW1ZdqH0rRQ)
- ⛓ [Converting a LangChain App from OpenAI to OpenSource](https://youtu.be/KUDn7bVyIfc)
- ⛓ [Using LangChain `Output Parsers` to get what you want out of LLMs](https://youtu.be/UVn2NroKQCw)
- ⛓ [Building a LangChain Custom Medical Agent with Memory](https://youtu.be/6UFtRwWnHws)
- ⛓ [Understanding `ReACT` with LangChain](https://youtu.be/Eug2clsLtFs)
- ⛓ [`OpenAI Functions` + LangChain : Building a Multi Tool Agent](https://youtu.be/4KXK6c6TVXQ)
- ⛓ [What can you do with 16K tokens in LangChain?](https://youtu.be/z2aCZBAtWXs)
- ⛓ [Tagging and Extraction - Classification using `OpenAI Functions`](https://youtu.be/a8hMgIcUEnE)
###
[LangChain](https://www.youtube.com/playlist?list=PLVEEucA9MYhOu89CX8H3MBZqayTbcCTMr) by [Prompt Engineering](https://www.youtube.com/@engineerprompt):
### [LangChain](https://www.youtube.com/playlist?list=PLVEEucA9MYhOu89CX8H3MBZqayTbcCTMr) by [Prompt Engineering](https://www.youtube.com/@engineerprompt):
- [LangChain Crash Course — All You Need to Know to Build Powerful Apps with LLMs](https://youtu.be/5-fc4Tlgmro)
- [Working with MULTIPLE `PDF` Files in LangChain: `ChatGPT` for your Data](https://youtu.be/s5LhRdh5fu4)
- [`ChatGPT` for YOUR OWN `PDF` files with LangChain](https://youtu.be/TLf90ipMzfE)
- [Talk to YOUR DATA without OpenAI APIs: LangChain](https://youtu.be/wrD-fZvT6UI)
- ⛓ [CHATGPT For WEBSITES: Custom ChatBOT](https://youtu.be/RBnuhhmD21U)
- ⛓ [Langchain: PDF Chat App (GUI) | ChatGPT for Your PDF FILES](https://youtu.be/RIWbalZ7sTo)
- ⛓ [LangFlow: Build Chatbots without Writing Code](https://youtu.be/KJ-ux3hre4s)
- ⛓ [LangChain: Giving Memory to LLMs](https://youtu.be/dxO6pzlgJiY)
- ⛓ [BEST OPEN Alternative to `OPENAI's EMBEDDINGs` for Retrieval QA: LangChain](https://youtu.be/ogEalPMUCSY)
###
LangChain by [Chat with data](https://www.youtube.com/@chatwithdata)
### LangChain by [Chat with data](https://www.youtube.com/@chatwithdata)
- [LangChain Beginner's Tutorial for `Typescript`/`Javascript`](https://youtu.be/bH722QgRlhQ)
- [`GPT-4` Tutorial: How to Chat With Multiple `PDF` Files (~1000 pages of Tesla's 10-K Annual Reports)](https://youtu.be/Ix9WIZpArm0)
- [`GPT-4` & LangChain Tutorial: How to Chat With A 56-Page `PDF` Document (w/`Pinecone`)](https://youtu.be/ih9PBGVVOO4)
- [LangChain & Supabase Tutorial: How to Build a ChatGPT Chatbot For Your Website](https://youtu.be/R2FMzcsmQY8)
- [LangChain & Supabase Tutorial: How to Build a ChatGPT Chatbot For Your Website](https://youtu.be/R2FMzcsmQY8)
- ⛓ [LangChain Agents: Build Personal Assistants For Your Data (Q&A with Harrison Chase and Mayo Oshin)](https://youtu.be/gVkF8cwfBLI)
###
[Get SH\*T Done with Prompt Engineering and LangChain](https://www.youtube.com/watch?v=muXbPpG_ys4&list=PLEJK-H61Xlwzm5FYLDdKt_6yibO33zoMW) by [Venelin Valkov](https://www.youtube.com/@venelin_valkov)
- [Getting Started with LangChain: Load Custom Data, Run OpenAI Models, Embeddings and `ChatGPT`](https://www.youtube.com/watch?v=muXbPpG_ys4)
- [Loaders, Indexes & Vectorstores in LangChain: Question Answering on `PDF` files with `ChatGPT`](https://www.youtube.com/watch?v=FQnvfR8Dmr0)
- [LangChain Models: `ChatGPT`, `Flan Alpaca`, `OpenAI Embeddings`, Prompt Templates & Streaming](https://www.youtube.com/watch?v=zy6LiK5F5-s)
- [LangChain Chains: Use `ChatGPT` to Build Conversational Agents, Summaries and Q&A on Text With LLMs](https://www.youtube.com/watch?v=h1tJZQPcimM)
- [Analyze Custom CSV Data with `GPT-4` using Langchain](https://www.youtube.com/watch?v=Ew3sGdX8at4)
- ⛓ [Build ChatGPT Chatbots with LangChain Memory: Understanding and Implementing Memory in Conversations](https://youtu.be/CyuUlf54wTs)
---------------------
⛓ icon marks a new addition [last update 2023-05-15]
⛓ icon marks a new addition [last update 2023-06-20]

View File

@@ -0,0 +1,13 @@
# Conversational
This walkthrough demonstrates how to use an agent optimized for conversation. Other agents are often optimized for using tools to figure out the best response, which is not ideal in a conversational setting where you may want the agent to be able to chat with the user as well.
import Example from "@snippets/modules/agents/agent_types/conversational_agent.mdx"
<Example/>
import ChatExample from "@snippets/modules/agents/agent_types/chat_conversation_agent.mdx"
## Using a chat model
<ChatExample/>

View File

@@ -0,0 +1,57 @@
---
sidebar_position: 0
---
# Agent types
## Action agents
Agents use an LLM to determine which actions to take and in what order.
An action can either be using a tool and observing its output, or returning a response to the user.
Here are the agents available in LangChain.
### [Zero-shot ReAct](/docs/modules/agents/agent_types/react.html)
This agent uses the [ReAct](https://arxiv.org/pdf/2210.03629.pdf) framework to determine which tool to use
based solely on the tool's description. Any number of tools can be provided.
This agent requires that a description is provided for each tool.
**Note**: This is the most general purpose action agent.
### [Structured input ReAct](/docs/modules/agents/agent_types/structured_chat.html)
The structured tool chat agent is capable of using multi-input tools.
Older agents are configured to specify an action input as a single string, but this agent can use a tool's argument
schema to create a structured action input. This is useful for more complex tool usage, like precisely
navigating around a browser.
### [OpenAI Functions](/docs/modules/agents/agent_types/openai_functions_agent.html)
Certain OpenAI models (like gpt-3.5-turbo-0613 and gpt-4-0613) have been explicitly fine-tuned to detect when a
function should be called and respond with the inputs that should be passed to the function.
The OpenAI Functions Agent is designed to work with these models.
### [Conversational](/docs/modules/agents/agent_types/chat_conversation_agent.html)
This agent is designed to be used in conversational settings.
The prompt is designed to make the agent helpful and conversational.
It uses the ReAct framework to decide which tool to use, and uses memory to remember the previous conversation interactions.
### [Self ask with search](/docs/modules/agents/agent_types/self_ask_with_search.html)
This agent utilizes a single tool that should be named `Intermediate Answer`.
This tool should be able to look up factual answers to questions. This agent
is equivalent to the original [self ask with search paper](https://ofir.io/self-ask.pdf),
where a Google search API was provided as the tool.
### [ReAct document store](/docs/modules/agents/agent_types/react_docstore.html)
This agent uses the ReAct framework to interact with a docstore. Two tools must
be provided: a `Search` tool and a `Lookup` tool (they must be named exactly that).
The `Search` tool should search for a document, while the `Lookup` tool should look up
a term in the most recently found document.
This agent is equivalent to the
original [ReAct paper](https://arxiv.org/pdf/2210.03629.pdf), specifically the Wikipedia example.
## [Plan-and-execute agents](/docs/modules/agents/agent_types/plan_and_execute.html)
Plan-and-execute agents accomplish an objective by first planning what to do, then executing the subtasks. This idea is largely inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) and the ["Plan-and-Solve" paper](https://arxiv.org/abs/2305.04091).

View File

@@ -0,0 +1,11 @@
# OpenAI functions
Certain OpenAI models (like gpt-3.5-turbo-0613 and gpt-4-0613) have been fine-tuned to detect when a function should be called and respond with the inputs that should be passed to the function.
In an API call, you can describe functions and have the model intelligently choose to output a JSON object containing arguments to call those functions.
The goal of the OpenAI Function APIs is to more reliably return valid and useful function calls than a generic text completion or chat API.
The OpenAI Functions Agent is designed to work with these models.
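The flow described above (describe a function, receive a JSON object of arguments, call the function) can be sketched without any network access. In this sketch the model's reply is simulated, and `get_weather` is a hypothetical tool. The field names mirror the general shape of a function-calling request and response, but should be treated as assumptions rather than a definitive API reference.

```python
import json

# A function description in the JSON Schema style used by function-calling APIs.
get_weather_spec = {
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would call a weather service."""
    return f"It is sunny in {city}."


# Simulated model output: the function name plus its arguments as a JSON string.
simulated_function_call = {"name": "get_weather", "arguments": '{"city": "Paris"}'}

available_functions = {"get_weather": get_weather}
arguments = json.loads(simulated_function_call["arguments"])
tool_output = available_functions[simulated_function_call["name"]](**arguments)
print(tool_output)
```

Since the arguments arrive as a JSON string, the caller is responsible for parsing and validating them before invoking the function.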
import Example from "@snippets/modules/agents/agent_types/openai_functions_agent.mdx";
<Example/>

View File

@@ -0,0 +1,11 @@
# Plan and execute
Plan-and-execute agents accomplish an objective by first planning what to do, then executing the subtasks. This idea is largely inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) and the ["Plan-and-Solve" paper](https://arxiv.org/abs/2305.04091).
The planning is almost always done by an LLM.
The execution is usually done by a separate agent (equipped with tools).
import Example from "@snippets/modules/agents/agent_types/plan_and_execute.mdx"
<Example/>

View File

@@ -0,0 +1,15 @@
# ReAct
This walkthrough showcases using an agent to implement the [ReAct](https://react-lm.github.io/) logic.
import Example from "@snippets/modules/agents/agent_types/react.mdx"
<Example/>
## Using chat models
You can also create ReAct agents that use chat models instead of LLMs as the agent driver.
import ChatExample from "@snippets/modules/agents/agent_types/react_chat.mdx"
<ChatExample/>

View File

@@ -0,0 +1,10 @@
# Structured tool chat
The structured tool chat agent is capable of using multi-input tools.
Older agents are configured to specify an action input as a single string, but this agent can use the provided tools' `args_schema` to populate the action input.
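To make the difference concrete, here is a minimal sketch of a multi-input tool called with a structured action input rather than a single string. The `move_mouse` tool, the `ARGS_SCHEMA` mapping, and `call_structured` are hypothetical names for illustration; LangChain's actual `args_schema` mechanism is not reproduced here.

```python
from typing import Any, Dict


def move_mouse(x: int, y: int) -> str:
    """Hypothetical multi-input tool."""
    return f"moved to ({x}, {y})"


# Hypothetical argument schema: parameter name -> expected type.
ARGS_SCHEMA: Dict[str, type] = {"x": int, "y": int}


def call_structured(action_input: Dict[str, Any]) -> str:
    # Validate the structured input against the schema before calling the tool.
    for name, expected in ARGS_SCHEMA.items():
        if not isinstance(action_input.get(name), expected):
            raise TypeError(f"argument {name!r} must be {expected.__name__}")
    return move_mouse(**action_input)


result = call_structured({"x": 10, "y": 20})
print(result)
```

A single-string action input would force the tool to parse something like `"10, 20"` itself; a structured input keeps each argument named and typed.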
import Example from "@snippets/modules/agents/agent_types/structured_chat.mdx"
<Example/>

View File

@@ -0,0 +1,2 @@
label: 'How-to'
position: 1

View File

@@ -0,0 +1,14 @@
# Custom LLM Agent
This notebook goes through how to create your own custom LLM agent.
An LLM agent consists of four parts:
- PromptTemplate: This is the prompt template that can be used to instruct the language model on what to do
- LLM: This is the language model that powers the agent
- `stop` sequence: Instructs the LLM to stop generating as soon as this string is found
- OutputParser: This determines how to parse the LLMOutput into an AgentAction or AgentFinish object
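The output parser is the least obvious of these parts, so here is a minimal sketch of one. It assumes the model emits ReAct-style text with `Action:`, `Action Input:`, and `Final Answer:` markers; the `Action` and `Finish` dataclasses are simplified, hypothetical stand-ins for the AgentAction and AgentFinish objects mentioned above.

```python
import re
from dataclasses import dataclass
from typing import Union


@dataclass
class Action:
    tool: str
    tool_input: str


@dataclass
class Finish:
    output: str


def parse_llm_output(text: str) -> Union[Action, Finish]:
    """Parse ReAct-style text into an action or a final answer."""
    if "Final Answer:" in text:
        return Finish(output=text.split("Final Answer:")[-1].strip())
    match = re.search(r"Action:\s*(.*?)\s*\nAction Input:\s*(.*)", text, re.DOTALL)
    if match is None:
        raise ValueError(f"Could not parse LLM output: {text!r}")
    return Action(tool=match.group(1).strip(), tool_input=match.group(2).strip())


step = parse_llm_output(
    "Thought: I should search.\nAction: Search\nAction Input: LangChain docs"
)
final = parse_llm_output("Thought: I know the answer.\nFinal Answer: 42")
print(step)
print(final)
```

The `stop` sequence matters here: if the model kept generating past the action input, it could invent its own observation, and the parser would pick that text up as part of the tool input.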
import Example from "@snippets/modules/agents/how_to/custom_llm_agent.mdx"
<Example/>

View File

@@ -0,0 +1,14 @@
# Custom LLM Agent (with a ChatModel)
This notebook goes through how to create your own custom agent based on a chat model.
An LLM chat agent consists of four parts:
- PromptTemplate: This is the prompt template that can be used to instruct the language model on what to do
- ChatModel: This is the language model that powers the agent
- `stop` sequence: Instructs the LLM to stop generating as soon as this string is found
- OutputParser: This determines how to parse the LLMOutput into an AgentAction or AgentFinish object
import Example from "@snippets/modules/agents/how_to/custom_llm_chat_agent.mdx"
<Example/>

View File

@@ -0,0 +1,16 @@
# Replicating MRKL
This walkthrough demonstrates how to replicate the [MRKL](https://arxiv.org/pdf/2205.00445.pdf) system using agents.
This uses the example Chinook database.
To set it up, follow the instructions at https://database.guide/2-sample-databases-sqlite/, placing the `.db` file in a notebooks folder at the root of this repository.
import Example from "@snippets/modules/agents/how_to/mrkl.mdx"
<Example/>
## With a chat model
import ChatExample from "@snippets/modules/agents/how_to/mrkl_chat.mdx"
<ChatExample/>

View File

@@ -0,0 +1,51 @@
---
sidebar_position: 4
---
# Agents
Some applications require a flexible chain of calls to LLMs and other tools based on user input. The **Agent** interface provides the flexibility for such applications. An agent has access to a suite of tools, and determines which ones to use depending on the user input. Agents can use multiple tools, and use the output of one tool as the input to the next.
There are two main types of agents:
- **Action agents**: at each timestep, decide on the next action using the outputs of all previous actions
- **Plan-and-execute agents**: decide on the full sequence of actions up front, then execute them all without updating the plan
Action agents are suitable for small tasks, while plan-and-execute agents are better for complex or long-running tasks that require maintaining long-term objectives and focus. Often the best approach is to combine the dynamism of an action agent with the planning abilities of a plan-and-execute agent by letting the plan-and-execute agent use action agents to execute plans.
For a full list of agent types see [agent types](/docs/modules/agents/agent_types/). Additional abstractions involved in agents are:
- [**Tools**](/docs/modules/agents/tools/): the actions an agent can take. Which tools you give an agent depends heavily on what you want the agent to do
- [**Toolkits**](/docs/modules/agents/toolkits/): wrappers around collections of tools that can be used together for a specific use case. For example, in order for an agent to
interact with a SQL database it will likely need one tool to execute queries and another to inspect tables
## Action agents
At a high level, an action agent:
1. Receives user input
2. Decides which tool, if any, to use and the tool input
3. Calls the tool and records the output (also known as an "observation")
4. Decides the next step using the history of tools, tool inputs, and observations
5. Repeats steps 3-4 until it determines it can respond directly to the user
Action agents are wrapped in **agent executors**, which are responsible for calling the agent, getting back an action and action input, calling the tool that the action references with the generated input, getting the output of the tool, and then passing all that information back into the agent to get the next action it should take.
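The loop above can be sketched in plain Python. This is only an illustration of the control flow, not LangChain's agent executor: the names `run_action_agent`, `decide`, and `toy_decide` are hypothetical, and `toy_decide` stands in for the prompt template, language model, and output parser combined.

```python
from typing import Callable, Dict, List, Tuple, Union

# Hypothetical types for illustration; these are not LangChain classes.
Action = Tuple[str, str]      # (tool name, tool input)
Step = Tuple[str, str, str]   # (tool name, tool input, observation)


def run_action_agent(
    decide: Callable[[str, List[Step]], Union[Action, str]],
    tools: Dict[str, Callable[[str], str]],
    user_input: str,
    max_steps: int = 5,
) -> str:
    """Call `decide` repeatedly until it returns a final answer (a plain string)."""
    history: List[Step] = []
    for _ in range(max_steps):
        decision = decide(user_input, history)
        if isinstance(decision, str):
            # The agent determined it can respond directly to the user.
            return decision
        tool_name, tool_input = decision
        observation = tools[tool_name](tool_input)  # call the chosen tool
        history.append((tool_name, tool_input, observation))
    return "Stopped: step limit reached."


def toy_decide(user_input: str, history: List[Step]) -> Union[Action, str]:
    """Toy stand-in for prompt + model + parser: use one tool, then answer."""
    if not history:
        return ("word_count", user_input)
    return f"The input has {history[-1][2]} words."


tools = {"word_count": lambda text: str(len(text.split()))}
answer = run_action_agent(toy_decide, tools, "agents pick tools at run time")
print(answer)
```

The `max_steps` cap is an assumption added here so the sketch cannot loop forever if the decision function never produces a final answer.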
Although an agent can be constructed in many ways, it typically involves these components:
- **Prompt template**: Responsible for taking the user input and previous steps and constructing a prompt to send to the language model
- **Language model**: Takes the prompt with user input and action history and decides what to do next
- **Output parser**: Takes the output of the language model and parses it into the next action or a final answer
## Plan-and-execute agents
At a high level, a plan-and-execute agent:
1. Receives user input
2. Plans the full sequence of steps to take
3. Executes the steps in order, passing the outputs of past steps as inputs to future steps
The most typical implementation is to have the planner be a language model, and the executor be an action agent. Read more [here](/docs/modules/agents/agent_types/plan_and_execute.html).
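As a rough sketch of the three steps above, assuming the planner and executor are plain callables (the names `plan_and_execute`, `toy_planner`, and `toy_executor` are hypothetical, not LangChain APIs):

```python
from typing import Callable, List


def plan_and_execute(
    planner: Callable[[str], List[str]],
    executor: Callable[[str, List[str]], str],
    user_input: str,
) -> List[str]:
    """Plan every step up front, then run the steps in order without re-planning."""
    steps = planner(user_input)
    outputs: List[str] = []
    for step in steps:
        # Each step sees the outputs of all earlier steps.
        outputs.append(executor(step, list(outputs)))
    return outputs


def toy_planner(user_input: str) -> List[str]:
    """Stand-in for a language-model planner."""
    return [f"research {user_input}", f"summarize {user_input}"]


def toy_executor(step: str, previous: List[str]) -> str:
    """Stand-in for an action agent that executes one step."""
    return f"done: {step} (saw {len(previous)} earlier outputs)"


results = plan_and_execute(toy_planner, toy_executor, "llamas")
for line in results:
    print(line)
```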
## Get started
import GetStarted from "@snippets/modules/agents/get_started.mdx"
<GetStarted/>

@@ -0,0 +1,10 @@
---
sidebar_position: 3
---
# Toolkits
Toolkits are collections of tools that are designed to be used together for specific tasks and have convenience loading methods.
import DocCardList from "@theme/DocCardList";
<DocCardList />

@@ -0,0 +1,2 @@
label: 'How-to'
position: 0

@@ -0,0 +1,17 @@
---
sidebar_position: 2
---
# Tools
Tools are interfaces that an agent can use to interact with the world.
## Get started
A tool can be a generic utility (e.g. search), another chain, or even another agent.
Currently, tools can be loaded with the following snippet:
import GetStarted from "@snippets/modules/agents/tools/get_started.mdx"
<GetStarted/>

@@ -0,0 +1 @@
label: 'Integrations'

@@ -0,0 +1,2 @@
label: 'How-to'
position: 0

@@ -0,0 +1,10 @@
---
sidebar_position: 5
---
# Callbacks
LangChain provides a callbacks system that allows you to hook into the various stages of your LLM application. This is useful for logging, monitoring, streaming, and other tasks.
import GetStarted from "@snippets/modules/callbacks/get_started.mdx"
<GetStarted/>
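To make the idea of hooking into stages concrete, here is a minimal sketch in plain Python. The handler class and its method names (`on_start`, `on_end`) are hypothetical and chosen for illustration; LangChain's own handler interface has its own method names and signatures.

```python
from typing import Callable, List


class RecordingHandler:
    """Hypothetical handler that records events; useful for logging or monitoring."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def on_start(self, prompt: str) -> None:
        self.events.append(f"start: {prompt}")

    def on_end(self, output: str) -> None:
        self.events.append(f"end: {output}")


def run_with_callbacks(
    prompt: str,
    llm: Callable[[str], str],
    handlers: List[RecordingHandler],
) -> str:
    """Notify every handler before and after the model call."""
    for handler in handlers:
        handler.on_start(prompt)
    output = llm(prompt)
    for handler in handlers:
        handler.on_end(output)
    return output


handler = RecordingHandler()
output = run_with_callbacks("hi", lambda p: p.upper(), [handler])
print(handler.events)
```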

@@ -0,0 +1 @@
label: 'Integrations'

@@ -0,0 +1,7 @@
# Analyze Document
The AnalyzeDocumentChain can be used as an end-to-end chain. This chain takes in a single document, splits it up, and then runs it through a CombineDocumentsChain.
import Example from "@snippets/modules/chains/additional/analyze_document.mdx"
<Example/>

@@ -0,0 +1,7 @@
# Self-critique chain with constitutional AI
The ConstitutionalChain is a chain that ensures the output of a language model adheres to a predefined set of constitutional principles. By incorporating specific rules and guidelines, the ConstitutionalChain filters and modifies the generated content to align with these principles, thus providing more controlled, ethical, and contextually appropriate responses. This mechanism helps maintain the integrity of the output while minimizing the risk of generating content that may violate guidelines, be offensive, or deviate from the desired context.
import Example from "@snippets/modules/chains/additional/constitutional_chain.mdx"
<Example/>
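One way to picture the filter-and-modify behavior described above is a critique-then-revise loop over the principles. The sketch below is an assumption about the general shape, not the ConstitutionalChain implementation: `apply_principles`, `toy_critique`, and `toy_revise` are hypothetical, and the toy "principles" are just banned words.

```python
from typing import Callable, List


def apply_principles(
    initial_response: str,
    principles: List[str],
    critique: Callable[[str, str], str],
    revise: Callable[[str, str], str],
) -> str:
    """For each principle, critique the response and revise it if a problem is found."""
    response = initial_response
    for principle in principles:
        problem = critique(response, principle)
        if problem:
            response = revise(response, problem)
    return response


def toy_critique(response: str, principle: str) -> str:
    """Toy critique: the 'principle' is a banned word; report it if present."""
    return principle if principle in response.lower() else ""


def toy_revise(response: str, problem: str) -> str:
    """Toy revision: remove the offending word."""
    return response.replace(problem, "[removed]")


revised = apply_principles(
    "That idea is stupid and dumb.", ["stupid", "dumb"], toy_critique, toy_revise
)
print(revised)
```

In a real setting both the critique and the revision would be language model calls.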

@@ -0,0 +1,8 @@
---
sidebar_position: 4
---
# Additional
import DocCardList from "@theme/DocCardList";
<DocCardList />

@@ -0,0 +1,8 @@
# Moderation
This notebook walks through examples of how to use a moderation chain and several common ways of doing so. Moderation chains are useful for detecting text that could be hateful, violent, etc. They can be applied both to user input and to the output of a language model. Some API providers, like OpenAI, [specifically prohibit](https://beta.openai.com/docs/usage-policies/use-case-policy) you, or your end users, from generating some types of harmful content. To comply with this (and more generally to prevent your application from being harmful), you may often want to append a moderation chain to any LLMChains, to make sure that any output the LLM generates is not harmful.
If the content passed into the moderation chain is harmful, there is no single best way to handle it; the right choice probably depends on your application. Sometimes you may want to throw an error in the Chain (and have your application handle that). Other times, you may want to return something to the user explaining that the text was harmful. There could even be other ways to handle it. We will cover all these ways in this walkthrough.
import Example from "@snippets/modules/chains/additional/moderation.mdx"
<Example/>
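The two handling strategies mentioned above (raise an error, or return an explanatory message) can be sketched as follows. Everything here is hypothetical: `moderate`, `ModerationError`, and the keyword-based `toy_flag` check are illustrative stand-ins for a real moderation service.

```python
from typing import Callable


class ModerationError(ValueError):
    """Raised when text is flagged and the caller asked for an error."""


def moderate(
    text: str,
    is_flagged: Callable[[str], bool],
    raise_on_flag: bool = False,
) -> str:
    """Pass clean text through; otherwise raise or return an explanation."""
    if not is_flagged(text):
        return text
    if raise_on_flag:
        raise ModerationError("Text was flagged by the moderation check.")
    return "This text was flagged as potentially harmful and was not returned."


def toy_flag(text: str) -> bool:
    """Toy check: flag any text containing the word 'forbidden'."""
    return "forbidden" in text.lower()


print(moderate("A harmless sentence.", toy_flag))
print(moderate("Some forbidden content.", toy_flag))
```

Which branch is appropriate depends on whether the application or the end user should handle the flagged text.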

@@ -0,0 +1,7 @@
# Dynamically selecting from multiple prompts
This notebook demonstrates how to use the `RouterChain` paradigm to create a chain that dynamically selects the prompt to use for a given input. Specifically we show how to use the `MultiPromptChain` to create a question-answering chain that selects the prompt which is most relevant for a given question, and then answers the question using that prompt.
import Example from "@snippets/modules/chains/additional/multi_prompt_router.mdx"
<Example/>
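The routing idea can be sketched without any library. In the sketch below a simple keyword-overlap score picks the prompt, which is an assumption made to keep the example self-contained; in `MultiPromptChain` the selection is made by a language model. The names `PROMPTS`, `DESCRIPTIONS`, `route`, and `build_prompt` are hypothetical.

```python
from typing import Dict

# Candidate prompts and a short description of what each is good at.
PROMPTS: Dict[str, str] = {
    "physics": "You are a physics professor. Answer concisely:\n{input}",
    "history": "You are a historian. Answer with dates:\n{input}",
}
DESCRIPTIONS: Dict[str, str] = {
    "physics": "force energy mass velocity quantum",
    "history": "war empire century revolution king",
}
DEFAULT_PROMPT = "Answer the question:\n{input}"


def route(question: str) -> str:
    """Return the name of the best-matching prompt, or 'default' if none match."""
    words = set(question.lower().split())
    scores = {
        name: len(words & set(description.split()))
        for name, description in DESCRIPTIONS.items()
    }
    best = max(scores, key=lambda name: scores[name])
    return best if scores[best] > 0 else "default"


def build_prompt(question: str) -> str:
    """Fill the selected prompt template with the question."""
    template = PROMPTS.get(route(question), DEFAULT_PROMPT)
    return template.format(input=question)


print(build_prompt("what is the velocity of light"))
```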

@@ -0,0 +1,7 @@
# Dynamically selecting from multiple retrievers
This notebook demonstrates how to use the `RouterChain` paradigm to create a chain that dynamically selects which retrieval system to use. Specifically we show how to use the `MultiRetrievalQAChain` to create a question-answering chain that selects the retrieval QA chain which is most relevant for a given question, and then answers the question using it.
import Example from "@snippets/modules/chains/additional/multi_retrieval_qa_router.mdx"
<Example/>

@@ -0,0 +1,13 @@
# Document QA
Here we walk through how to use LangChain for question answering over a list of documents. Under the hood we'll be using our [Document chains](/docs/modules/chains/document/).
import Example from "@snippets/modules/chains/additional/question_answering.mdx"
<Example/>
## Document QA with sources
import ExampleWithSources from "@snippets/modules/chains/additional/qa_with_sources.mdx"
<ExampleWithSources/>

@@ -0,0 +1,16 @@
---
sidebar_position: 2
---
# Documents
These are the core chains for working with Documents. They are useful for summarizing documents, answering questions over documents, extracting information from documents, and more.
These chains all implement a common interface:
import Interface from "@snippets/modules/chains/document/combine_docs.mdx"
<Interface/>
import DocCardList from "@theme/DocCardList";
<DocCardList />

@@ -0,0 +1,5 @@
# Map reduce
The map reduce documents chain first applies an LLM chain to each document individually (the Map step), treating the chain output as a new document. It then passes all the new documents to a separate combine documents chain to get a single output (the Reduce step). It can optionally first compress, or collapse, the mapped documents to make sure that they fit in the combine documents chain (which will often pass them to an LLM). This compression step is performed recursively if necessary.
![map_reduce_diagram](/img/map_reduce.jpg)
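A minimal sketch of the Map, collapse, and Reduce steps described above, in plain Python. It assumes character counts as the size limit (a real chain would count tokens) and collapses the mapped documents in pairs; `map_reduce`, `map_fn`, `reduce_fn`, and `max_chars` are hypothetical names.

```python
from typing import Callable, List


def map_reduce(
    docs: List[str],
    map_fn: Callable[[str], str],
    reduce_fn: Callable[[List[str]], str],
    max_chars: int,
) -> str:
    """Map each document, collapse while the results are too large, then reduce."""
    mapped = [map_fn(doc) for doc in docs]  # the Map step
    # Collapse step: repeat until the mapped documents fit or only one is left.
    while sum(len(m) for m in mapped) > max_chars and len(mapped) > 1:
        mapped = [reduce_fn(mapped[i:i + 2]) for i in range(0, len(mapped), 2)]
    return reduce_fn(mapped)  # the Reduce step


docs = ["alpha one", "beta two", "gamma three"]
first_word = lambda doc: doc.split()[0]   # stand-in for an LLM chain
join_all = lambda parts: "+".join(parts)  # stand-in for a combine chain

print(map_reduce(docs, first_word, join_all, max_chars=100))
```

The loop always terminates because each collapse pass at least halves the number of mapped documents.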

@@ -0,0 +1,5 @@
# Map re-rank
The map re-rank documents chain runs an initial prompt on each document. The prompt not only tries to complete a task but also asks for a score indicating how certain the model is of its answer. The highest-scoring response is returned.
![map_rerank_diagram](/img/map_rerank.jpg)
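The selection logic can be sketched as follows, assuming each per-document call returns an answer together with a numeric score. `map_rerank` and `toy_answer` are hypothetical; the toy scorer simply counts keyword occurrences in place of a model's self-reported certainty.

```python
from typing import Callable, List, Tuple


def map_rerank(
    docs: List[str],
    answer_with_score: Callable[[str], Tuple[str, float]],
) -> str:
    """Run the prompt on every document and return the highest-scoring answer."""
    scored = [answer_with_score(doc) for doc in docs]
    best_answer, _ = max(scored, key=lambda pair: pair[1])
    return best_answer


def toy_answer(doc: str) -> Tuple[str, float]:
    """Toy stand-in for an LLM call: score by how often 'paris' appears."""
    return doc, float(doc.lower().count("paris"))


docs = [
    "Berlin is in Germany.",
    "Paris is the capital of France. Paris is large.",
    "Paris has a tower.",
]
best = map_rerank(docs, toy_answer)
print(best)
```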

@@ -0,0 +1,12 @@
---
sidebar_position: 1
---
# Refine
The refine documents chain constructs a response by looping over the input documents and iteratively updating its answer. For each document, it passes all non-document inputs, the current document, and the latest intermediate answer to an LLM chain to get a new answer.
Since the Refine chain only passes a single document to the LLM at a time, it is well-suited for tasks that require analyzing more documents than can fit in the model's context.
The obvious tradeoff is that this chain will make far more LLM calls than, for example, the Stuff documents chain.
There are also certain tasks which are difficult to accomplish iteratively. For example, the Refine chain can perform poorly when documents frequently cross-reference one another or when a task requires detailed information from many documents.
![refine_diagram](/img/refine.jpg)
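The iterative update can be sketched in a few lines of plain Python. The names `refine`, `initial_fn`, and `refine_fn` are hypothetical; the two callables stand in for LLM chain calls, and the call log shows why the number of calls grows with the number of documents.

```python
from typing import Callable, List


def refine(
    docs: List[str],
    initial_fn: Callable[[str], str],
    refine_fn: Callable[[str, str], str],
) -> str:
    """Build an answer from the first document, then update it once per remaining document."""
    if not docs:
        raise ValueError("refine needs at least one document")
    answer = initial_fn(docs[0])
    for doc in docs[1:]:
        # Only the current document and the latest answer are passed along.
        answer = refine_fn(answer, doc)
    return answer


calls: List[str] = []  # records one entry per simulated LLM call


def toy_initial(doc: str) -> str:
    calls.append("initial")
    return doc


def toy_refine(answer: str, doc: str) -> str:
    calls.append("refine")
    return f"{answer}; {doc}"


result = refine(["a", "b", "c"], toy_initial, toy_refine)
print(result, len(calls))
```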

@@ -0,0 +1,12 @@
---
sidebar_position: 0
---
# Stuff
The stuff documents chain ("stuff" as in "to stuff" or "to fill") is the most straightforward of the document chains. It takes a list of documents, inserts them all into a prompt and passes that prompt to an LLM.
This chain is well-suited for applications where documents are small and only a few are passed in for most calls.
![stuff_diagram](/img/stuff.jpg)
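In plain Python, the "stuffing" step amounts to joining the documents into one prompt. The function name `stuff_prompt`, the separator, and the prompt wording below are assumptions made for illustration.

```python
from typing import List


def stuff_prompt(docs: List[str], question: str, separator: str = "\n\n") -> str:
    """Insert every document into a single prompt."""
    context = separator.join(docs)
    return (
        "Use the context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


prompt = stuff_prompt(["Doc one.", "Doc two."], "What do the docs say?")
print(prompt)
```

Because every document lands in the same prompt, the total size grows with the number and length of documents, which is why this approach suits small inputs.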

@@ -0,0 +1,8 @@
---
sidebar_position: 1
---
# Foundational
import DocCardList from "@theme/DocCardList";
<DocCardList />

@@ -0,0 +1,11 @@
# LLM
An LLMChain is a simple chain that adds some functionality around language models. It is used widely throughout LangChain, including in other chains and agents.
An LLMChain consists of a PromptTemplate and a language model (either an LLM or chat model). It formats the prompt template using the input key values provided (and also memory key values, if available), passes the formatted string to the LLM, and returns the LLM output.
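The format-then-call behavior can be sketched with the standard library alone. This is not the LLMChain class: `run_llm_chain` is a hypothetical helper, Python's `str.format` stands in for a PromptTemplate, and the "model" is a toy function.

```python
from typing import Callable, Dict, Optional


def run_llm_chain(
    template: str,
    llm: Callable[[str], str],
    inputs: Dict[str, str],
    memory: Optional[Dict[str, str]] = None,
) -> str:
    """Format the template with memory and input values, then call the model."""
    values = {**(memory or {}), **inputs}  # explicit inputs override memory values
    prompt = template.format(**values)
    return llm(prompt)


toy_llm = lambda prompt: prompt.upper()  # stand-in for a language model

output = run_llm_chain("Tell me about {topic}.", toy_llm, {"topic": "bees"})
print(output)
```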
## Get started
import Example from "@snippets/modules/chains/foundational/llm_chain.mdx"
<Example/>

@@ -0,0 +1,14 @@
# Sequential
<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! Instead, edit the notebook w/the location & name as this file. -->
The next step after calling a language model is to make a series of calls to a language model. This is particularly useful when you want to take the output from one call and use it as the input to another.
In this notebook we will walk through some examples of how to do this, using sequential chains. Sequential chains allow you to connect multiple chains and compose them into pipelines that execute some specific scenario. There are two types of sequential chains:
- `SimpleSequentialChain`: The simplest form of sequential chains, where each step has a singular input/output, and the output of one step is the input to the next.
- `SequentialChain`: A more general form of sequential chains, allowing for multiple inputs/outputs.
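The difference between the two forms can be sketched in plain Python. These helpers are hypothetical and only mirror the data flow: `simple_sequential` threads a single string through each step, while `sequential` keeps a dictionary of named values so steps can read several inputs and add new outputs.

```python
from typing import Callable, Dict, List, Tuple


def simple_sequential(steps: List[Callable[[str], str]], text: str) -> str:
    """Single input/output: the output of one step is the input to the next."""
    for step in steps:
        text = step(text)
    return text


# Each general step is (function, names of its inputs, name of its output).
NamedStep = Tuple[Callable[..., str], List[str], str]


def sequential(steps: List[NamedStep], inputs: Dict[str, str]) -> Dict[str, str]:
    """Multiple inputs/outputs: every step reads from and writes to a shared dict."""
    values = dict(inputs)
    for fn, input_keys, output_key in steps:
        values[output_key] = fn(*(values[key] for key in input_keys))
    return values


shout = simple_sequential([str.strip, str.upper], "  hi ")

named_steps: List[NamedStep] = [
    (lambda title, era: f"Synopsis of {title} set in {era}", ["title", "era"], "synopsis"),
    (lambda synopsis: f"Review: {synopsis}", ["synopsis"], "review"),
]
result = sequential(named_steps, {"title": "Dune", "era": "the future"})
print(shout, "|", result["review"])
```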
import Example from "@snippets/modules/chains/foundational/sequential_chains.mdx"
<Example/>

@@ -0,0 +1,8 @@
# Debugging chains
It can be hard to debug a `Chain` object solely from its output, as most `Chain` objects involve a fair amount of input prompt preprocessing and LLM output post-processing.
import Example from "@snippets/modules/chains/how_to/debugging.mdx"
<Example/>

@@ -0,0 +1,8 @@
---
sidebar_position: 0
---
# How to
import DocCardList from "@theme/DocCardList";
<DocCardList />

@@ -0,0 +1,10 @@
# Adding memory (state)
Chains can be initialized with a Memory object, which will persist data across calls to the chain. This makes a Chain stateful.
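What "stateful" means here can be shown with a small plain-Python sketch. The class `StatefulChain` is hypothetical and is not a LangChain Memory object: it keeps a transcript between calls and prepends it to each new prompt.

```python
from typing import Callable, List


class StatefulChain:
    """Toy chain that remembers earlier turns across calls."""

    def __init__(self, llm: Callable[[str], str]) -> None:
        self.llm = llm
        self.history: List[str] = []  # persisted between calls

    def __call__(self, user_input: str) -> str:
        prompt = "\n".join(self.history + [f"Human: {user_input}", "AI:"])
        reply = self.llm(prompt)
        self.history += [f"Human: {user_input}", f"AI: {reply}"]
        return reply


# Stand-in model: reports how many human turns it can see in the prompt.
toy_llm = lambda prompt: f"turn {prompt.count('Human:')}"

chain = StatefulChain(toy_llm)
first = chain("hello")
second = chain("hello again")
print(first, second)
```

The second call sees the first exchange in its prompt, which is the effect of persisting data across calls.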
## Get started
import GetStarted from "@snippets/modules/chains/how_to/memory.mdx"
<GetStarted/>

@@ -0,0 +1,33 @@
---
sidebar_position: 2
---
# Chains
Using an LLM in isolation is fine for simple applications,
but more complex applications require chaining LLMs - either with each other or with other components.
LangChain provides the **Chain** interface for such "chained" applications. We define a Chain very generically as a sequence of calls to components, which can include other chains. The base interface is simple:
import BaseClass from "@snippets/modules/chains/base_class.mdx"
<BaseClass/>
Composing components in a chain is a simple but powerful idea. It greatly simplifies the implementation of complex applications and makes them more modular, which in turn makes them much easier to debug, maintain, and improve.
For more specifics check out:
- [How-to](/docs/modules/chains/how_to/) for walkthroughs of different chain features
- [Foundational](/docs/modules/chains/foundational/) to get acquainted with core building block chains
- [Document](/docs/modules/chains/document/) to learn how to incorporate documents into chains
- [Popular](/docs/modules/chains/popular/) chains for the most common use cases
- [Additional](/docs/modules/chains/additional/) to see some of the more advanced chains and integrations that you can use out of the box
## Why do we need chains?
Chains allow us to combine multiple components together to create a single, coherent application. For example, we can create a chain that takes user input, formats it with a PromptTemplate, and then passes the formatted prompt to an LLM. We can build more complex chains by combining multiple chains together, or by combining chains with other components.
## Get started
import GetStarted from "@snippets/modules/chains/get_started.mdx"
<GetStarted/>

@@ -0,0 +1,9 @@
---
sidebar_position: 0
---
# API chains
APIChain enables using LLMs to interact with APIs to retrieve relevant information. Construct the chain with the API documentation, then ask it a question relevant to that documentation.
import Example from "@snippets/modules/chains/popular/api.mdx"
<Example/>

Some files were not shown because too many files have changed in this diff