Compare commits

..

344 Commits

Author SHA1 Message Date
Bagatur
a8be207ea3 bump 248 (#8518) 2023-07-31 07:14:45 -07:00
Harrison Chase
6556a8fcfd add initial anthropic agent (#8468)
Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-07-30 21:30:49 -07:00
os1ma
a795c3d860 Fix GitLoader to handle repeated load calls (#8412)
**Description: a description of the change**

In this pull request, GitLoader has been updated to handle multiple load
calls, provided the same repository is being cloned. Previously, calling
`load` multiple times would raise an error if a clone URL was provided.

Additionally, a check has been added to raise a ValueError when
attempting to clone a different repository into an existing path.

New tests have also been introduced to verify the correct behavior of
the GitLoader class when `load` is called multiple times.

Lastly, the GitPython package, a dependency for the GitLoader class, has
been added to the project dependencies (pyproject.toml and poetry.lock).

**Issue: the issue # it fixes (if applicable)**

None

**Dependencies: any dependencies required for this change**

GitPython

**Tag maintainer: for a quicker response, tag the relevant maintainer
(see below)**

- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
2023-07-30 21:27:20 -07:00
Muhammed Al-Dulaimi
9975ba4124 Fix ChromaDB integration -> docker container instructions (#8447)
## Description
This PR handles modifying the Chroma DB integration's documentation.
It modifies the **Docker container** example to fix the instructions
mentioned in the documentation.
In the current documentation, the below `client.reset()` line causes a
runtime error:

```py
...
client = chromadb.HttpClient(settings=Settings(allow_reset=True))
client.reset()  # resets the database
collection = client.create_collection("my_collection")
...
```

`Exception: {"error":"ValueError('Resetting is not allowed by this
configuration')"}`

This is due to the Chroma DB server needing to have the `allow_reset`
flag set to `true` there as well.
This is fixed by adding the `ALLOW_RESET=TRUE` to the `docker-compose`
file environment variable to the docker container before spinning it

## Issue
This fixes the runtime error that occurs when running the docker
container example code

## Tag Maintainer
@rlancemartin, @eyurtsev
2023-07-30 21:11:56 -07:00
Nicolas Raoul
7f9c6c3baa Fixed typo: papaer -> paper (#8500) 2023-07-30 21:08:11 -07:00
Piyush Jain
b2f8a5bae9 Fixed exports for NeptuneOpenCypherQAChain (#8439)
## Description
The imports for `NeptuneOpenCypherQAChain` are failing. This PR adds the
chain class to the `__init__.py` file to fix this issue.

## Maintainers
@dev2049 
@krlawrence
2023-07-30 20:36:22 -07:00
Eugene Yurtsev
e98e2b2b81 ChatPromptTemplate: clean up doc-string (#8473)
Minor doc-string clean up

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-30 20:11:04 -07:00
Eugene Yurtsev
529cb2e30c Update doc-string in few shot template (#8474)
Partial update of doc-string, need to update other instances in
documentation
2023-07-30 19:39:14 -07:00
Bharat Raghunathan
04ebdbe98f doc(prompts): Add redirects in Prompt subcategories pages (#8478)
- Description: Fixes broken links in some Prompts subcategories in
documentation (Example Selectors, Prompt Templates)
  - Issue: #8477 (Fixes #8477)
  - Dependencies: None
  - Tag maintainer: @baskaryan
  - Twitter handle: [@BharatR123](https://twitter.com/BharatR123)
2023-07-30 19:38:52 -07:00
Ludwig Hubert
08f5e6b801 Fix documentation for from_documents signature (#8482)
Docs for from_documents() were outdated as seen in
https://github.com/langchain-ai/langchain/issues/8457 .

fixes #8457 

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-30 13:24:44 -07:00
Muneeb Ahmad
4923cf029a Added Proper Documentation for faiss-gpu Installation (#8492)
### Description
In the LangChain Documentation and Comments, I've Noticed that `pip
install faiss` was mentioned, instead of `pip install faiss-gpu`, since
installing `pip install faiss` results in an error. I've gone ahead and
updated the Documentation, and `faiss.ipynb`. This Change will ensure
ease of use for the end user, trying to install `faiss-gpu`.

### Issue: 
Documentation / Comments Related.

### Dependencies:
No Dependencies we're changed only updated the files with the wrong
reference.

### Tag maintainer:
 @rlancemartin, @eyurtsev (Thank You for your contributions 😄 )
2023-07-30 13:24:30 -07:00
shibuiwilliam
549720ae51 add test to ensure values in time weighted retriever are updated (#8479)
# What
- add test to ensure values in time weighted retriever are updated

<!-- Thank you for contributing to LangChain!

Replace this comment with:
- Description: add test to ensure values in time weighted retriever are
updated
  - Issue: None
  - Dependencies: None
  - Tag maintainer: @rlancemartin, @eyurtsev
  - Twitter handle: @MlopsJ


Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-30 11:42:25 -07:00
Harrison Chase
18a2452121 prompt cleanup (#8470) 2023-07-30 10:47:31 -07:00
Harrison Chase
4d526c49ed bump experimental to 008 (#8490) 2023-07-30 07:28:18 -07:00
Harrison Chase
8f14ddefdf add anthropic functions wrapper (#8475)
a cheeky wrapper around claude that adds in function calling support
(kind of, hence it going in experimental)
2023-07-30 07:23:46 -07:00
Harrison Chase
490ad93b3c fix links generation (#8471) 2023-07-29 18:31:33 -07:00
Nuno Campos
b65a9414bb runnable.bind().bind() should combine kwargs, instead of nesting wrappers (#8467)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-29 15:48:30 -07:00
Harrison Chase
ae4638aa35 improve notebooks (#8461) 2023-07-29 12:49:11 -07:00
Nuno Campos
872abb4198 Implement Runnable for Tools (#8460)
- Make _arun optional
- Pass run_manager to inner chains in tools that have them

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-29 10:01:18 -07:00
Harrison Chase
412fa4e1db add guide notebook (#8258)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

---------

Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-07-29 09:42:59 -07:00
William FH
b7c0eb9ecb Wfh/ref links (#8454) 2023-07-29 08:44:32 -07:00
Harrison Chase
13b4f465e2 log output parser (#8446) 2023-07-29 07:53:45 +01:00
William FH
7d79178827 Wfh/update guide imports (#8452) 2023-07-28 23:12:10 -07:00
William FH
d935573362 Partial formatting for chat messages (#8450) 2023-07-28 23:08:33 -07:00
William FH
3314f54383 Update supabase docstrings (#8443) 2023-07-28 23:08:14 -07:00
Harrison Chase
f63240649c cr 2023-07-28 17:47:00 -07:00
Harrison Chase
17953ab61f add notebook for sql query (#8442) 2023-07-28 17:44:59 -07:00
Harrison Chase
2448043b84 bump and fix (#8441) 2023-07-28 17:16:51 -07:00
Zack Proser
3892cefac6 Minor fixes to enhance notebook usability: (#8389)
- Install langchain
- Set Pinecone API key and environment as env vars
- Create Pinecone index if it doesn't already exist
---
- Description: Fix a couple minor issues I came across when running this
notebook,
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: none,
  - Tag maintainer: @rlancemartin @eyurtsev,
  - Twitter handle: @zackproser (certainly not necessary!)
2023-07-28 17:10:03 -07:00
Amélie
8ee56b9a5b Feature: Add support for meilisearch vectorstore (#7649)
**Description:**

Add support for Meilisearch vector store.
Resolve #7603 

- No external dependencies added
- A notebook has been added

@rlancemartin

https://twitter.com/meilisearch

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-28 17:06:54 -07:00
Bearnardd
b7d6e1909c fix empty ids when metadatas is provided (#8127)
Fixes https://github.com/hwchase17/langchain/issues/7865 and
https://github.com/hwchase17/langchain/issues/8061

- [x] fixes returning empty ids when metadatas argument is provided

@baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-28 16:17:31 -07:00
Bharat Raghunathan
62b8b459c6 doc(prompts): Add redirect to fix broken link on Prompts Page (#8408)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-28 16:08:06 -07:00
Bagatur
2311d57df4 mv dropbox (#8438) 2023-07-28 16:07:56 -07:00
Luis Valencia
7124377524 Devcontainer README -> Clarification. (#8414)
- Description: The contribution guidlelines using devcontainer refer to
the main repo and not the forked repo. We should create our changes in
our own forked repo, not on langchain/main
  - Issue: Just documentation
  - Dependencies: N/A,
  - Tag maintainer: @baskaryan
  - Twitter handle: @levalencia
2023-07-28 15:09:42 -07:00
lvisdd
abe4c361f9 update get_num_tokens_from_messages model (#8431)
(#8430)

Co-authored-by: Kano Kunihiko <kkano@heroz.co.jp>
2023-07-28 15:07:03 -07:00
Jeffrey Wang
e0de62f6da Add RoPE Scaling params from llamacpp (#8422)
Description:
Just adding parameters from `llama-python-cpp` that support RoPE
scaling.
@hwchase17, @baskaryan

sources:
papers and explanation:
https://kaiokendev.github.io/context
llamacpp conversation:
https://github.com/ggerganov/llama.cpp/discussions/1965 
Supports models like:
https://huggingface.co/conceptofmind/LLongMA-2-13b
2023-07-28 14:42:41 -07:00
Bagatur
2db2987b1b add experimental ref (#8435) 2023-07-28 14:26:47 -07:00
Harrison Chase
fab24457bc remove code (#8425) 2023-07-28 13:19:44 -07:00
Harrison Chase
3a78450883 update experimental (#8402)
some changes were made to experimental, porting them over
2023-07-28 13:01:36 -07:00
Harrison Chase
af7e70d4af expose function for converting messages to messages (#8426) 2023-07-28 13:00:54 -07:00
Eugene Yurtsev
06bdbe06fe PromptTemplate update documentation and expand kwarg (#8423)
# PromptTemplate

* Update documentation to highlight the classmethod for instantiating a
prompt template.
* Expand kwargs in the classmethod to make parameters easier to discover

This PR got reverted here:
https://github.com/langchain-ai/langchain/pull/8395/files
2023-07-28 14:11:49 -04:00
Eugene Yurtsev
e62a1686e2 ChatPromptTemplate: minor fix in doc string (#8424)
Minor fix in doc-string to use `ai` rather than `assistant`
2023-07-28 13:01:13 -04:00
Eugene Yurtsev
760c278fe0 ChatPromptTemplate: Expand support for message formats and documentation (#8244)
* Expands support for a variety of message formats in the
`from_messages` classmethod. Ideally, we could deprecate the other
on-ramps to reduce the amount of classmethods users need to know about.
* Expand documentation with code examples.
2023-07-28 12:48:08 -04:00
Bagatur
61dd92f821 bump 246 (#8410) 2023-07-28 01:18:37 -07:00
Harrison Chase
394b67ab92 add kwargs to llm runnables (#8388) 2023-07-28 09:13:11 +01:00
HeTaoPKU
d5884017a9 Add Minimax llm model to langchain (#7645)
- Description: Minimax is a great AI startup from China, recently they
released their latest model and chat API, and the API is widely-spread
in China. As a result, I'd like to add the Minimax llm model to
Langchain.
- Tag maintainer: @hwchase17, @baskaryan

---------

Co-authored-by: the <tao.he@hulu.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-27 22:53:23 -07:00
James Campbell
0ad2d5f27a [nit] Add default value for ChatOpenAI client (#7939)
Micro convenience PR to avoid warning regarding missing `client`
parameter. It is always set during initialization.

@baskaryan

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-27 22:38:32 -07:00
Harrison Chase
82df923f37 Merge branch 'master' of github.com:hwchase17/langchain 2023-07-27 22:01:20 -07:00
Harrison Chase
1b0bfa54cf cr 2023-07-27 22:00:52 -07:00
Jeff Vestal
c7ff5f19a8 ElasticKnnSearch rewrite - bug fix - return Document (#8180)
Fixes: 
https://github.com/hwchase17/langchain/issues/7117
https://github.com/hwchase17/langchain/issues/5760

Adding back `create_index` , `add_texts`, `from_texts` to
ElasticKnnSearch

`from_texts` matches standard `from_texts` methods as quick start up
method

`knn_search` and `hybrid_result` return a list of [`Document()`,
`score`,]

# Test `from_texts` for quick start
```
# create new index using from_text

from langchain.vectorstores.elastic_vector_search import ElasticKnnSearch
from langchain.embeddings import ElasticsearchEmbeddings

model_id = "sentence-transformers__all-distilroberta-v1" 
dims = 768
es_cloud_id = ""
es_user = ""
es_password = ""
test_index = "knn_test_index_305"

embeddings = ElasticsearchEmbeddings.from_credentials(
    model_id,
    #input_field=input_field,
    es_cloud_id=es_cloud_id,
    es_user=es_user,
    es_password=es_password,
)

# add texts and create class instance
texts = ["This is a test document", "This is another test document"]
knnvectorsearch = ElasticKnnSearch.from_texts(
    texts=texts,
    embedding=embeddings,
    index_name= test_index,
    vector_query_field='vector',
    query_field='text',
    model_id=model_id,
    dims=dims,
	es_cloud_id=es_cloud_id, 
	es_user=es_user, 
	es_password=es_password
)

# Test `add_texts` method
texts2 = ["Hello, world!", "Machine learning is fun.", "I love Python."]
knnvectorsearch.add_texts(texts2)

query = "Hello"
knn_result = knnvectorsearch.knn_search(query = query, model_id= model_id, k=2)

hybrid_result = knnvectorsearch.knn_hybrid_search(query = query, model_id= model_id, k=2)

```

The  mapping is as follows:
```
{
  "knn_test_index_012": {
    "mappings": {
      "properties": {
        "text": {
          "type": "text"
        },
        "vector": {
          "type": "dense_vector",
          "dims": 768,
          "index": true,
          "similarity": "dot_product"
        }
      }
    }
  }
}
```

# Check response type
```
>>> hybrid_result
[(Document(page_content='Hello, world!', metadata={}), 0.94232327), (Document(page_content='I love Python.', metadata={}), 0.5321523)]

>>> hybrid_result[0]
(Document(page_content='Hello, world!', metadata={}), 0.94232327)

>>> hybrid_result[0][0]
Document(page_content='Hello, world!', metadata={})

>>> type(hybrid_result[0][0])
<class 'langchain.schema.document.Document'>
```

# Test with existing Index
```
from langchain.vectorstores.elastic_vector_search import ElasticKnnSearch
from langchain.embeddings import ElasticsearchEmbeddings

## Initialize ElasticsearchEmbeddings
model_id = "sentence-transformers__all-distilroberta-v1" 
dims = 768
es_cloud_id = 
es_user = ""
es_password = ""
test_index = "knn_test_index_012"

embeddings = ElasticsearchEmbeddings.from_credentials(
    model_id,
    es_cloud_id=es_cloud_id,
    es_user=es_user,
    es_password=es_password,
)

## Initialize ElasticKnnSearch
knn_search = ElasticKnnSearch(
	es_cloud_id=es_cloud_id, 
	es_user=es_user, 
	es_password=es_password, 
	index_name= test_index, 
	embedding= embeddings
)


## Test adding vectors

### Test `add_texts` method when index created
texts = ["Hello, world!", "Machine learning is fun.", "I love Python."]
knn_search.add_texts(texts)

```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-27 22:00:18 -07:00
Harrison Chase
a221a9ced0 Harrison/sql query (#8370)
Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-07-27 21:55:17 -07:00
Bagatur
a1a650c743 Bagatur/from texts bug fix (#8394)
---------

Co-authored-by: Davit Buniatyan <davit@loqsh.com>
Co-authored-by: Davit Buniatyan <d@activeloop.ai>
Co-authored-by: adilkhan <adilkhan.sarsen@nu.edu.kz>
Co-authored-by: Ivo Stranic <istranic@gmail.com>
2023-07-27 21:52:38 -07:00
Jiayi Ni
1efb9bae5f FEAT: Integrate Xinference LLMs and Embeddings (#8171)
- [Xorbits
Inference(Xinference)](https://github.com/xorbitsai/inference) is a
powerful and versatile library designed to serve language, speech
recognition, and multimodal models. Xinference supports a variety of
GGML-compatible models including chatglm, whisper, and vicuna, and
utilizes heterogeneous hardware and a distributed architecture for
seamless cross-device and cross-server model deployment.
- This PR integrates Xinference models and Xinference embeddings into
LangChain.
- Dependencies: To install the depenedencies for this integration, run
    
    `pip install "xinference[all]"`
    
- Example Usage:

To start a local instance of Xinference, run `xinference`.

To deploy Xinference in a distributed cluster, first start an Xinference
supervisor using `xinference-supervisor`:

`xinference-supervisor -H "${supervisor_host}"`

Then, start the Xinference workers using `xinference-worker` on each
server you want to run them on.

`xinference-worker -e "http://${supervisor_host}:9997"`

To use Xinference with LangChain, you also need to launch a model. You
can use command line interface (CLI) to do so. Fo example: `xinference
launch -n vicuna-v1.3 -f ggmlv3 -q q4_0`. This launches a model named
vicuna-v1.3 with `model_format="ggmlv3"` and `quantization="q4_0"`. A
model UID is returned for you to use.

Now you can use Xinference with LangChain:

```python
from langchain.llms import Xinference

llm = Xinference(
    server_url="http://0.0.0.0:9997", # suppose the supervisor_host is "0.0.0.0"
    model_uid = {model_uid} # model UID returned from launching a model
)

llm(
    prompt="Q: where can we visit in the capital of France? A:",
    generate_config={"max_tokens": 1024},
)
```

You can also use RESTful client to launch a model:
```python
from xinference.client import RESTfulClient

client = RESTfulClient("http://0.0.0.0:9997")

model_uid = client.launch_model(model_name="vicuna-v1.3", model_size_in_billions=7, quantization="q4_0")
```

The following code block demonstrates how to use Xinference embeddings
with LangChain:
```python
from langchain.embeddings import XinferenceEmbeddings

xinference = XinferenceEmbeddings(
    server_url="http://0.0.0.0:9997",
    model_uid = model_uid
)
```

```python
query_result = xinference.embed_query("This is a test query")
```

```python
doc_result = xinference.embed_documents(["text A", "text B"])
```

Xinference is still under rapid development. Feel free to [join our
Slack
community](https://xorbitsio.slack.com/join/shared_invite/zt-1z3zsm9ep-87yI9YZ_B79HLB2ccTq4WA)
to get the latest updates!

- Request for review: @hwchase17, @baskaryan
- Twitter handle: https://twitter.com/Xorbitsio

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-27 21:23:19 -07:00
Bagatur
877d384bc9 Revert "PromptTemplate update documentation and expand kwargs (#8234)" (#8395)
fyi @eyurtsev was failing a unit test
2023-07-27 21:11:10 -07:00
Gordon Clark
e66759cc9d Github add "Create PR" tool + Docs update (#8235)
Added a new tool to the Github toolkit called **Create Pull Request.**
Now we can make our own langchain contributor in langchain 😁

In order to have somewhere to pull from, I also added a new env var,
"GITHUB_BASE_BRANCH." This will allow the existing env var,
"GITHUB_BRANCH," to be a working branch for the bot (so that it doesn't
have to always commit on the main/master). For example, if you want the
bot to work in a branch called `bot_dev` and your repo base is `main`,
you would set up the vars like:
```
GITHUB_BASE_BRANCH = "main"
GITHUB_BRANCH = "bot_dev"
``` 

Maintainer responsibilities:
  - Agents / Tools / Toolkits: @hinthornw
2023-07-27 19:19:44 -07:00
William FH
ecd4aae818 Few Shot Chat Prompt (#8038)
Proposal for a few shot chat message example selector

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-07-27 18:46:10 -07:00
Eugene Yurtsev
6dd18eee26 PromptTemplate update documentation and expand kwargs (#8234)
# PromptTemplate

* Update documentation to highlight the classmethod for instantiating a
prompt template.
* Expand kwargs in the classmethod to make parameters easier to discover
2023-07-27 18:11:39 -07:00
Karan V
a003a0baf6 fix(petals) allows to run models that aren't Bloom (Support for LLama and newer models) (#8356)
In this PR:

- Removed restricted model loading logic for Petals-Bloom
- Removed petals imports (DistributedBloomForCausalLM,
BloomTokenizerFast)
- Instead imported more generalized versions of loader
(AutoDistributedModelForCausalLM, AutoTokenizer)
- Updated the Petals example notebook to allow for a successful
installation of Petals in Apple Silicon Macs

- Tag maintainer: @hwchase17, @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-27 18:01:04 -07:00
lars.gersmann
e758e9e7f5 fix(openapi): openapi chain will work without/empty description/summa… (#8351)
Description: 

This PR will enable the Open API chain to work with valid Open API
specifications missing `description` and `summary` properties for path
and operation nodes in open api specs.

Since both `description` and `summary` property are declared optional we
cannot be sure they are defined. This PR resolves this problem by
providing an empty (`''`) description as fallback.

The previous behavior of the Open API chain was that the underlying LLM
(OpenAI) throw ed an exception since `None` is not of type string:

```
openai.error.InvalidRequestError: None is not of type 'string' - 'functions.0.description'
```

Using this PR the Open API chain will succeed also using Open API specs
lacking `description` and `summary` properties for path and operation
nodes.

Thanks for your amazing work !

Tag maintainer: @baskaryan

---------

Co-authored-by: Lars Gersmann <lars.gersmann@cm4all.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-27 17:58:43 -07:00
ljeagle
caa6caeb8a Upgrade the AwaDB from v0.3.7 to v0.3.9 and change the default embeddings (#8281)
1. Upgrade the AwaDB from v0.3.7 to v0.3.9
2. Change the default embedding to AwaEmbedding

---------

Co-authored-by: ljeagle <awadb.vincent@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-27 17:20:50 -07:00
Harrison Chase
25b8cc7e3d Harrison/update memory docs (#8384)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-27 17:18:19 -07:00
Holt Skinner
d7e6770de8 refactor: Code refactoring & simplification for Google Cloud Enterprise Search retriever (#8369)
Followup to https://github.com/langchain-ai/langchain/pull/7857

- Changes `_convert_search_response()` to use object attributes instead
of converting to dictionary
- Simplifies logic for readability
2023-07-27 17:13:49 -07:00
Taozhi Wang
594f195e54 Add embeddings for AwaEmbedding (#8353)
- Description: Adds AwaEmbeddings class for embeddings, which provides
users with a convenient way to do fine-tuning, as well as the potential
need for multimodality

  - Tag maintainer: @baskaryan

Create `Awa.ipynb`: an example notebook for AwaEmbeddings class
Modify `embeddings/__init__.py`: Import the class
Create `embeddings/awa.py`: The embedding class
Create `embeddings/test_awa.py`: The test file.

---------

Co-authored-by: taozhiwang <taozhiwa@gmail.com>
2023-07-27 17:08:00 -07:00
thehunmonkgroup
ba4e82bb47 fix missing _identifying_params() in _VertexAICommon (#8303)
Full set of params are missing from Vertex* LLMs when `dict()` method is
called.

```
>>> from langchain.chat_models.vertexai import ChatVertexAI
>>> from langchain.llms.vertexai import VertexAI
>>> chat_llm = ChatVertexAI()
l>>> llm = VertexAI()
>>> chat_llm.dict()
{'_type': 'vertexai'}
>>> llm.dict()
{'_type': 'vertexai'}
```

This PR just uses the same mechanism used elsewhere to expose the full
params.

Since `_identifying_params()` is on the `_VertexAICommon` class, it
should cover the chat and non-chat cases.
2023-07-27 16:59:10 -07:00
bheroder
dc3ca44e05 Add an example for azure ml managed feature store (#8324)
We are adding an example of how one can connect to azure ml managed
feature store and use such a prompt template in a llm chain. @baskaryan
2023-07-27 16:56:06 -07:00
Caitlin2694
b2e4b9dca4 Fix exception caused by restrictions in OWL (#8341)
Description: Fix exception caused by restrictions in OWL
Issue: #8331
Dependencies: none
Maintainer: @baskaryan
2023-07-27 16:51:32 -07:00
Harrison Chase
cddd8ae83d update release yml (#8364)
only do the step that tags and adds release notes if its langchain
2023-07-27 16:49:04 -07:00
Nikita Pokidyshev
f499e6ea6a Add FunctionMessage to _message_from_dict (#8374)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-27 16:45:27 -07:00
evelynmitchell
539574670c Update tot.ipynb (#8387)
Spelling error fix

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-27 16:44:41 -07:00
emarco177
2ab13ab743 added unit tests for mrkl output_parser.py (#8321)
- Description: added unit tests for mrkl output_parser.py, 
  - Tag maintainer: @hinthornw
  - Twitter handle: EdenEmarco177
2023-07-27 13:46:06 -07:00
Sachin Varghese
01217b2247 Update sql database agent example (#8354)
This PR fixes a minor documentation issue on the SQL database toolkit
example notebook.
2023-07-27 13:44:02 -07:00
Bagatur
55beab326c cleanup warnings (#8379) 2023-07-27 13:43:05 -07:00
William FH
41524304bf Update local script for docs build (#8377) 2023-07-27 13:13:59 -07:00
Harrison Chase
f5bf893035 rename to str output parser (#8373) 2023-07-27 12:57:34 -07:00
William FH
0e9e5b5202 Retry events on any run type (#8375) 2023-07-27 12:56:46 -07:00
Bagatur
68763bd25f mv popular and additional chains to use cases (#8242) 2023-07-27 12:55:13 -07:00
William FH
ff98fad2d9 Add Retry Events (#8053)
![image](https://github.com/hwchase17/langchain/assets/13333726/59a5c3b4-4367-47e6-9f58-5b6557576a8a)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-27 12:39:39 -07:00
William FH
94a693e2ee Link to use cases from tutorials (#8371) 2023-07-27 11:54:04 -07:00
Nuno Campos
0eca3e7d90 Add Runnable.bind method to attach kwargs to a Runnable that will be passed to all invoke/stream/batch calls when it is run (#8368)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-27 11:16:30 -07:00
Harrison Chase
cf608f876b update link 2023-07-27 09:47:57 -07:00
Nuno Campos
1bbadde77b Support using RunnableMap directly (#8317)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-27 17:24:29 +01:00
Bagatur
944321c6ab bump 245 (#8359) 2023-07-27 06:53:24 -07:00
Rubén Barragán
ef6332ead6 Support loading files from Dropbox (#8271)
## Description
This commit introduces the `DropboxLoader` class, a new document loader
that allows loading files from Dropbox into the application. The loader
relies on a Dropbox app, which requires creating an app on Dropbox,
obtaining the necessary scope permissions, and generating an access
token. Additionally, the dropbox Python package is required.

The `DropboxLoader` class is designed to be used as a document loader
for processing various file types, including text files, PDFs, and
Dropbox Paper files.

## Dependencies
`pip install dropbox` and `pip install unstructured` for PDF reading.

## Tag maintainer
@rlancemartin, @eyurtsev (from Data Loaders). I'd appreciate some
feedback here 🙏 .

## Social Networks
https://github.com/rubenbarragan
https://www.linkedin.com/in/rgbarragan/
https://twitter.com/RubenBarraganP

---------

Co-authored-by: Ruben Barragan <rbarragan@Rubens-MacBook-Air.local>
2023-07-27 06:36:08 -07:00
Pranay Chandekar
41bb3a6f9b fixed the bug #8343 (#8345)
- Issue: #8343

Signed-off-by: Pranay Chandekar <pranayc6@gmail.com>
2023-07-27 06:33:15 -07:00
Ikko Eltociear Ashimine
934ea80780 Fix typo in Etherscan.ipynb (#8340)
specifc  -> specific
2023-07-27 01:57:19 -07:00
Martin Krasser
93260a9922 Fix broken make targets format_diff and lint_diff (#8344)
Since the refactoring into sub-projects `libs/langchain` and
`libs/experimental`, the `make` targets `format_diff` and `lint_diff` do
not work anymore when running `make` from these subdirectories. Reason
is that

```
PYTHON_FILES=$(shell git diff --name-only --diff-filter=d master | grep -E '\.py$$|\.ipynb$$')
```

generates paths from the project's root directory instead of the
corresponding subdirectories. This PR fixes this by adding a
`--relative` command line option.

- Tag maintainer: @baskaryan
2023-07-27 01:56:55 -07:00
Harrison Chase
ae78ef7fe6 bump experimental to 005 (#8339) 2023-07-26 21:46:28 -07:00
Vadim Gubergrits
e7e5cb9d08 Tree of Thought introducing a new ToTChain. (#5167)
# [WIP] Tree of Thought introducing a new ToTChain.

This PR adds a new chain called ToTChain that implements the ["Large
Language Model Guided
Tree-of-Though"](https://arxiv.org/pdf/2305.08291.pdf) paper.

There's a notebook example `docs/modules/chains/examples/tot.ipynb` that
shows how to use it.


Implements #4975


## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

- @hwchase17
- @vowelparrot

---------

Co-authored-by: Vadim Gubergrits <vgubergrits@outbox.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-26 21:29:39 -07:00
William FH
412e29d436 Fix notebook that 'cannot convert' via nbdoc_build (#8333) 2023-07-26 18:54:23 -07:00
William FH
9eb7e6e27f Delete Old Evals Examples (#8252)
Still retain:
- Comparison Examples
- Data + QA walkthrough
- QA (but really minimize it)
2023-07-26 18:46:54 -07:00
Saurabh Misra
db9d5b213a Optimize the cosine_similarity_top_k function performance (#8151)
Optimizing important numerical code and making it run faster.

Performance went up by 1.48x (148%). Runtime went down from 138715us to
56020us

Optimization explanation:

The `cosine_similarity_top_k` function is where we made the most
significant optimizations.
Instead of sorting the entire score_array which needs considering all
elements, `np.argpartition` is utilized to find the top_k largest scores
indices, this operation has a time complexity of O(n), higher
performance than sorting. Remember, `np.argpartition` doesn't guarantee
the order of the values. So we need to use argsort() to get the indices
that would sort our top-k values after partitioning, which is much more
efficient because it only sorts the top-K elements, not the entire
array. Then to get the row and column indices of sorted top_k scores in
the original score array, we use `np.unravel_index`. This operation is
more efficient and cleaner than a list comprehension.

The code has been tested for correctness by running the following
snippet on both the original function and the optimized function and
averaged over 5 times.
```
def test_cosine_similarity_top_k_large_matrices():
    X = np.random.rand(1000, 1000)
    Y = np.random.rand(1000, 1000)
    top_k = 100
    score_threshold = 0.5
    gc.disable()
    counter = time.perf_counter_ns()
    return_value = cosine_similarity_top_k(X, Y, top_k, score_threshold)
    duration = time.perf_counter_ns() - counter
    gc.enable()
```

@hwaking @hwchase17 @jerwelborn 

Unit tests pass, I also generated more regression tests which all
passed.
2023-07-26 18:03:49 -07:00
Fabrizio Ruocco
ddc353a768 Azure Cognitive Search: Custom index and scoring profile support (#6843)
Description: Adding support for custom index and scoring profile support
in Azure Cognitive Search
@hwchase17

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 17:58:01 -07:00
Leonid Ganeline
ed24de8467 removed namespace title (#8208)
This change compacts the left-side Navbar (ToC) of the [API
Reference](https://api.python.langchain.com/en/latest/api_reference.html).
Now almost each namespace item is split into two lines. For example
`langchain.chat_models: Chat Models`
We remove the `Chat Models` and leave one the `langchain.chat_models`. 
This effectively compacts the navbar and increases the main page's
usability. On my screen, it reduces # of lines in Toc from 28 t to 18,
which is huge.

Removing the namespace "title" (like `Chat Models`) does not remove any
information because the title is composed directly from the namespace.
API Reference users are developers. Usability for them is very
important. We see less text => we find faster.
2023-07-26 16:45:23 -07:00
Kacper Łukawski
c5988c1d4b Implement async support for Cohere (#8237)
This PR introduces async API support for Cohere, both LLM and
embeddings. It requires updating `cohere` package to `^4`.

Tagging @hwchase17, @baskaryan, @agola11

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 15:51:18 -07:00
Daniel Alexander Brenot
bf1357f584 Added async support to PlanAndExecute Chain (#8239)
- Description: Adds async support to the PlanAndExecute Chain

Maintainer responsibilities:
  - Async: @agola11

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 15:16:07 -07:00
Bastin Florian
a3ac9b23eb feat(confluence): add markdown format option (#8246)
# Description:
**Add the possibility to keep text as Markdown in the ConfluenceLoader**
Add a bool variable that allows to keep the Markdown format of the
Confluence pages.
It is useful because it allows to use MarkdownHeaderTextSplitter as a
DataSplitter.
If this variable in set to True in the load() method, the pages are
extracted using the markdownify library.

  # Issue: 
[4407](https://github.com/langchain-ai/langchain/issues/4407)
  # Dependencies: 
Add the markdownify library
  # Tag maintainer:
 @rlancemartin, @eyurtsev
  # Twitter handle:
 FloBastinHeyI - https://twitter.com/FloBastinHeyI

---------

Co-authored-by: Florian Bastin <florian.bastin@octo.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 15:00:27 -07:00
Leonid Ganeline
ee6ff96e28 docstrings cleanup (#8311)
- added missed docstrings
 - changed docstrings into consistent format
  
@baskaryan
2023-07-26 14:13:10 -07:00
Bagatur
ceab0a7c1f update api ref style (#8318) 2023-07-26 14:12:44 -07:00
Rohit Gupta
e5dba8978a Avoid re-computation of embedding in weaviate similarity search (#8284)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 13:31:55 -07:00
William FH
01a9b06400 Add api cross ref linking (#8275)
Example of how it would show up in our python docs:


![image](https://github.com/langchain-ai/langchain/assets/13333726/0f0a88cc-ba4a-4778-bc47-118c66807f15)


Examples added to the reference docs:

https://api.python.langchain.com/en/wfh-api_crosslink/vectorstores/langchain.vectorstores.chroma.Chroma.html#langchain.vectorstores.chroma.Chroma


![image](https://github.com/langchain-ai/langchain/assets/13333726/dcd150de-cb56-4d42-b49a-a76a002a5a52)
2023-07-26 12:38:58 -07:00
Nuno Campos
a612800ef0 Runnable single protocol (#7800)
Objects implementing Runnable: BasePromptTemplate, LLM, ChatModel,
Chain, Retriever, OutputParser

- [x] Implement Runnable in base Retriever
- [x] Raise TypeError in operator methods for unsupported things 
- [x] Implement dict which calls values in parallel and outputs dict
with results
- [x] Merge in `+` for prompts
- [x] Confirm precedence order for operators, ideal would be `+` `|`,
https://docs.python.org/3/reference/expressions.html#operator-precedence
- [x] Add support for openai functions, ie. Chat Models must return
messages
- [x] Implement BaseMessageChunk return type for BaseChatModel, a
subclass of BaseMessage which implements __add__ to return
BaseMessageChunk, concatenating all str args
- [x] Update implementation of stream/astream for llm and chat models to
use new `_stream`, `_astream` optional methods, with default
implementation in base class `raise NotImplementedError` use
https://stackoverflow.com/a/59762827 to see if it is implemented in base
class
- [x] Delete the IteratorCallbackHandler (leave the async one because
people using)
- [x] Make BaseLLMOutputParser implement Runnable, accepting either str
or BaseMessage
---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-07-26 12:16:46 -07:00
Bharat
04a4d3e312 Fixes #8310 Fix maximum recursion depth exceeded error (#8313)
ElasticsearchVectorStore.as_retriever() method is returning 
`RecursionError: maximum recursion depth exceeded` 
because of incorrect field reference in
 `embeddings()` method

  - Description: Fix RecursionError because of a typo
  - Issue: the issue #8310 
  - Dependencies: None,
  - Tag maintainer: @eyurtsev
  - Twitter handle: bpatel
2023-07-26 12:15:37 -07:00
Caitlin2694
b9db3dd09b Fix "missing key op" RDFGraph OWL serialization (#8276)
Replace this comment with:
- Description: Fix "missing key op" error in RDFGraph OWL Serialization
  - Issue: #8263
  - Dependencies: None
  - Tag maintainer: @baskaryan
2023-07-26 12:14:56 -07:00
Eugene Yurtsev
862e9aed66 ChatPromptTemplate: Update doc-strings, update from_role_strings behavior (#8308)
* Update doc-strings in ChatPromptTemplate
* Update from_role_strings classmethod to use well known roles
2023-07-26 15:02:36 -04:00
Bagatur
2c2fd9ff13 bump 244 (#8314) 2023-07-26 11:58:26 -07:00
Lance Martin
77c0582243 Clean queries prior to search (#8309)
With some search tools, we see no results returned if the query is a
numeric list.

E.g., if we pass:
```
'1. "LangChain vs LangSmith: How do they differ?"'
```

We see:
```
No good Google Search Result was found
```

Local testing w/ Streamlit:

![image](https://github.com/langchain-ai/langchain/assets/122662504/0a7e3dca-59e8-415e-8df6-bd9e4ea962ee)
2023-07-26 11:48:28 -07:00
shibuiwilliam
6b88fbd9bb add test for embedding distance evaluation (#8285)
Add tests for embedding distance evaluation

  - Description: Add tests for embedding distance evaluation
  - Issue: None
  - Dependencies: None
  - Tag maintainer: @baskaryan
  - Twitter handle: @MlopsJ
2023-07-26 11:45:50 -07:00
Riche Akparuorji
f3d2fdd54c Fix for code snippet in documentation (#8290)
- Description: I fixed an issue in the code snippet related to the
variable name and the evaluation of its length. The original code used
the variable "docs," but the correct variable name is "docs_svm" after
using the SVMRetriever.
- maintainer: @baskaryan
- Twitter handle: @iamreechi_

Co-authored-by: iamreechi <richieakparuorji>
2023-07-26 11:31:08 -07:00
Bagatur
f27176930a fix geopandas link (#8305) 2023-07-26 11:30:17 -07:00
Timon Palm
70604e590f DuckDuckGoSearch News Tool (#8292)
Description: 
I wanted to use the DuckDuckGoSearch tool in an agent to let him get the
latest news for a topic. DuckDuckGoSearch has already an implemented
function for retrieving news articles. But there wasn't a tool to use
it. I simply adapted the SearchResult class with an extra argument
"backend". You can set it to "news" to only get news articles.

Furthermore, I added an example to the DuckDuckGo Notebook on how to
further customize the results by using the DuckDuckGoSearchAPIWrapper.

Dependencies: no new dependencies
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 11:30:01 -07:00
Aarav Borthakur
8ce661d5a1 Docs: Fix Rockset links (#8214)
Fix broken Rockset links.

Right now links at
https://python.langchain.com/docs/integrations/providers/rockset are
broken.
2023-07-26 10:38:37 -07:00
Byron Saltysiak
61347bd322 giving path to the copy command for *.toml files (#8294)
Description: in the .devcontainer, docker-compose build is currently
failing due to the src paths in the COPY command. This change adds the
full path to the pyproject.toml and poetry.toml to allow the build to
run.
Issue: 

You can see the issue if you try to build the dev docker image with:
```
cd .devcontainer
docker-compose build
```

Dependencies: none
Twitter handle: byronsalty
2023-07-26 10:37:03 -07:00
happyxhw
6384c1ec8f fix: ElasticVectorSearch.from_documents failed #8293 (#8296)
- Description: fix ElasticVectorSearch.from_documents with
elasticsearch_url param,
- Issue: ElasticVectorSearch.from_documents failed #8293 # it fixes (if
applicable),


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 10:33:52 -07:00
Jon Bennion
ad38eb2d50 correction to reference to code (#8301)
- Description: fixes typo referencing code

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-26 10:33:18 -07:00
jacobswe
83a53e2126 Bug Fix: AzureChatOpenAI streaming with function calls (#8300)
- Description: During streaming, the first chunk may only contain the
name of an OpenAI function and not any arguments. In this case, the
current code presumes there is a streaming response and tries to append
to it, but gets a KeyError. This fixes that case by checking if the
arguments key exists, and if not, creates a new entry instead of
appending.
  - Issue: Related to #6462

Sample Code:
```python
llm = AzureChatOpenAI(
    deployment_name=deployment_name,
    model_name=model_name,
    streaming=True
)

tools = [PythonREPLTool()]
callbacks = [StreamingStdOutCallbackHandler()]

agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    callbacks=callbacks
)

agent('Run some python code to test your interpreter')
```

Previous Result:
```
File ...langchain/chat_models/openai.py:344, in ChatOpenAI._generate(self, messages, stop, run_manager, **kwargs)
    342         function_call = _function_call
    343     else:
--> 344         function_call["arguments"] += _function_call["arguments"]
    345 if run_manager:
    346     run_manager.on_llm_new_token(token)

KeyError: 'arguments'
```

New Result:
```python
{'input': 'Run some python code to test your interpreter',
 'output': "The Python code `print('Hello, World!')` has been executed successfully, and the output `Hello, World!` has been printed."}
```

Co-authored-by: jswe <jswe@polencapital.com>
2023-07-26 10:11:50 -07:00
German Martin
457a4730b2 Fix the mangling issue on several VectorStores child classes. (#8274)
- Description: Fix mangling issue affecting a couple of VectorStore
classes including Redis.
  - Issue: https://github.com/langchain-ai/langchain/issues/8185
  - @rlancemartin 
  
This is a simple issue but I lack of some context in the original
implementation.
My changes perhaps are not the definitive fix but to start a quick
discussion.

@hinthornw Tagging you since one of your changes introduced this
[here.](c38965fcba)
2023-07-26 09:48:55 -07:00
Alec Flett
4da43f77e5 Add ability to load (deserialize) objects from other namespaces (#7726)
I have some Prompt subclasses in my project that I'd like to be able to
deserialize in callbacks. Right now `loads()`/`load()` will bail when it
encounters my object, but I know I can trust the objects because they're
in my own projects.

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-26 16:59:28 +01:00
Bagatur
5c6dcb1960 bump 243 (#8289) 2023-07-26 05:41:56 -07:00
William FH
adf019724f unpack later (#8278)
Fix https://github.com/langchain-ai/langchain/issues/8272
2023-07-26 01:53:22 -07:00
Naveen Tatikonda
9cbefcc56c [ OpenSearch ] : Add AOSS Support to OpenSearch (#8256)
### Description

This PR includes the following changes:

- Adds AOSS (Amazon OpenSearch Service Serverless) support to
OpenSearch. Please refer to the documentation on how to use it.
- While creating an index, AOSS only supports Approximate Search with
`nmslib` and `faiss` engines. During Search, only Approximate Search and
Script Scoring (on doc values) are supported.
- This PR also adds support to `efficient_filter` which can be used with
`faiss` and `lucene` engines.
- The `lucene_filter` is deprecated. Instead please use the
`efficient_filter` for the lucene engine.


Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
2023-07-25 23:59:36 -07:00
Lance Martin
7a00f17033 Web research retriever (#8102)
Given a user question, this will -
* Use LLM to generate a set of queries.
* Query for each.
* The URLs from search results are stored in self.urls.
* A check is performed for any new URLs that haven't been processed yet
(not in self.url_database).
* Only these new URLs are loaded, transformed, and added to the
vectorstore.
* The vectorstore is queried for relevant documents based on the
questions generated by the LLM.
* Only unique documents are returned as the final result.

This code will avoid reprocessing of URLs across multiple runs of
similar queries, which should improve the performance of the retriever.
It also keeps track of all URLs that have been processed, which could be
useful for debugging or understanding the retriever's behavior.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-25 19:58:00 -07:00
Rithwik Ediga Lakhamsani
d1d691caa4 Added Databricks support to MLflow Callback (#7906)
Added a quick check to make integration easier with Databricks; another
option would be to make a new class, but this seemed more
straightfoward.

cc: @liangz1 Can this be done in a more straightfoward way?
2023-07-25 18:23:54 -07:00
William FH
479cc086ba Rm Github Import (#8257)
It's not a required dep but would break peoples builds

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-25 18:20:58 -07:00
Byron Saltysiak
68a906bb31 added lxml to the pip install example since it is required (#8260)
- Description: The trello dataloader example didn't work without an
additional dependency installed - lxml
  - Issue: na
2023-07-25 18:16:07 -07:00
Emory Petermann
7734a2b5ab update golden-query notebook and fix typo in golden docs (#8253)
updating the documentation to be consistent for Golden query tool and
have a better introduction to the tool
2023-07-25 18:15:48 -07:00
Erick Friis
c14571ab37 New enterprise support form (#8254) 2023-07-25 15:43:27 -07:00
William FH
dd87275dde Add LLMChain example of memory with chat models (#8250) 2023-07-25 15:20:32 -07:00
William FH
1f40d3e094 Update Broken Links (#8247) 2023-07-25 12:26:39 -07:00
Eugene Yurtsev
ec069381fb Remove operator overloading for BaseMessage (#8245)
This PR removes operator overloading for base message.

Removing the `+` operating from base message will help make sure that:

1) There's no need to re-define `+` for message chunks
2) That there's no unexpected behavior in terms of types changing
(adding two messages yields a ChatPromptTemplate which is not a message)
2023-07-25 20:12:19 +01:00
William FH
30c2d3cd06 Update references (#8243) 2023-07-25 11:49:25 -07:00
jacobswe
0af48b06d0 Bug Fix #6462 (#8241)
- Description: Small change to fix broken Azure streaming. More complete
migration probably still necessary once the new API behavior is
finalized.
- Issue: Implements fix by @rock-you in #6462 
- Dependencies: N/A

There don't seem to be any tests specifically for this, and I was having
some trouble adding some. This is just a small temporary fix to allow
for the new API changes that OpenAI are releasing without breaking any
other code.

---------

Co-authored-by: Jacob Swe <jswe@polencapital.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-25 11:30:22 -07:00
Bagatur
c1ea8da9bc bump 242 (#8238) 2023-07-25 08:01:37 -07:00
shibuiwilliam
af788b7cf0 Add/faiss test score threshold (#8224)
# What
- This is to add test for faiss vector store with score threshold

<!-- Thank you for contributing to LangChain!

Replace this comment with:
- Description: This is to add test for faiss vector store with score
threshold
  - Issue: None
  - Dependencies: None
  - Tag maintainer: @rlancemartin, @eyurtsev
  - Twitter handle: @MlopsJ

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-25 09:56:29 -04:00
shibuiwilliam
bed8eb978e use logger instead of logging (#8225)
# What
- Use `logger` instead of using logging directly.

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: Use `logger` instead of using logging directly.
  - Issue: None
  - Dependencies: None
  - Tag maintainer: @baskaryan
  - Twitter handle: @MlopsJ

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-25 09:55:30 -04:00
Leonid Ganeline
afc55a4fee Refactored requests (#8203)
Refactored `requests.py`. The same as
https://github.com/langchain-ai/langchain/pull/7961 #8098 #8099
requests.py is in the root code folder. This creates the
`langchain.requests: Requests` group on the API Reference navigation
ToC, on the same level as Chains and Agents which is incorrect.

Refactoring:

- copied requests.py content into utils/requests.py
- I added the backwards compatibility ref in the original requests.py. 
- updated imports to requests objects

@hwchase17, @baskaryan
2023-07-24 21:23:59 -07:00
William FH
0a16b3d84b Update Integrations links (#8206) 2023-07-24 21:20:32 -07:00
Alex Stachowiak
a7efa95775 Update base chain type hints (#7680)
Addresses #7578. `run()` can return dictionaries, Pydantic objects or
strings, so the type hints should reflect that. See the chain from
`create_structured_output_chain` for an example of a non-string return
type from `run()`.

I've updated the BaseLLMChain return type hint from `str` to `Any`.
Although, the differences between `run()` and `__call__()` seem less
clear now.

CC: @baskaryan

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-24 21:16:41 -07:00
Ani peter benjamin
e58b1d7073 feat: temp fixed Could not parse LLM output on agents folder (#7746)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-24 19:20:37 -07:00
Dayuan Jiang
125ae6d9de add Hybrid retriever that not require any external service (#8108)
- Until now, hybrid search was limited to modules requiring external
services, such as Weaviate/Pinecone Hybrid Search. However, I have
developed a hybrid retriever that can merge a list of retrievers using
the [Reciprocal Rank
Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf)
algorithm. This new approach, similar to Weaviate hybrid search, does
not require the initialization of any external service.
  - Dependencies: No  - Twitter handle: dayuanjian21687

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-24 19:16:10 -07:00
Dario Ruben
04e45f9cde Fixed grammar in LLM models documentation (#8210)
Description: I fixed a typo in the documentation related to LLMs
(https://python.langchain.com/docs/modules/model_io/models/llms/)
2023-07-24 19:14:32 -07:00
earonesty
59a7c5877a Update supabase.py, add filter to query (matches latest supabase docs & js) (#7721)
- Description: Update supabase to support optional filter argument (if
present, used, if not, doesn't break things)
- Tag maintainer: @rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-24 19:13:52 -07:00
Aditya S
00de334f81 Fixed sparql SELECT and UPDATE query function (#7758)
- Description: Changed "SELECT" and "UPDTAE" intent check from "=" to
"in",
- Issue: Based on my own testing, most of the LLM (StarCoder, NeoGPT3,
etc..) doesn't return a single word response ("SELECT" / "UPDATE")
through this modification, we can accomplish the same output without
curated prompt engineering.
  - Dependencies: None
  - Tag maintainer: @baskaryan
  - Twitter handle: @aditya_0290


Thank you for maintaining this library, Keep up the good efforts.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-24 18:29:30 -07:00
William FH
3662aca7d4 Add async support for transform chain (#8205) 2023-07-24 17:45:17 -07:00
Taqi Jaffri
8f158b72fc Added stop sequence support to replicate (#8107)
Stop sequences are useful if you are doing long-running completions and
need to early-out rather than running for the full max_length... not
only does this save inference cost on Replicate, it is also much faster
if you are going to truncate the output later anyway.

Other LLMs support stop sequences natively (e.g. OpenAI) but I didn't
see this for Replicate so adding this via their prediction cancel
method.

Housekeeping: I ran `make format` and `make lint`, no issues reported in
the files I touched.

I did update the replicate integration test and ran `poetry run pytest
tests/integration_tests/llms/test_replicate.py` successfully.

Finally, I am @tjaffri https://twitter.com/tjaffri for feature
announcement tweets... or if you could please tag @docugami
https://twitter.com/docugami we would really appreciate that :-)

Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
2023-07-24 17:34:13 -07:00
glaze
f7ad14acfa Add etherscan document loader (#7943)
@rlancemartin 
The modification includes:
* etherscanLoader
* test_etherscan
* document ipynb

I have run the test, lint, format, and spell check. I do encounter a
linting error on ipynb, I am not sure how to address that.
```
docs/extras/modules/data_connection/document_loaders/integrations/Etherscan.ipynb:55: error: Name "null" is not defined  [name-defined]
docs/extras/modules/data_connection/document_loaders/integrations/Etherscan.ipynb:76: error: Name "null" is not defined  [name-defined]
Found 2 errors in 1 file (checked 1 source file)
```
- Description: The Etherscan loader uses etherscan api to load
transaction histories under specific accounts on Ethereum Mainnet.
- No dependency is introduced by this PR.
- Twitter handle: glazecl

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-24 17:09:16 -07:00
Julien Salinas
73d5cba308 Allow user to modify the GPU and language settings when using NLP Cloud (#7985)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-24 17:08:56 -07:00
Bagatur
483f6c2fe3 mv eval docs (#8209) 2023-07-24 16:31:20 -07:00
Liu Ming
24f889f2bc Change with_history option to False for ChatGLM by default (#8076)
ChatGLM LLM integration will by default accumulate conversation
history(with_history=True) to ChatGLM backend api, which is not expected
in most cases. This PR set with_history=False by default, user should
explicitly set llm.with_history=True to turn this feature on. Related
PR: #8048 #7774

---------

Co-authored-by: mlot <limpo2000@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-24 15:46:02 -07:00
Mahip Soni
1f055775f8 Fixing issue with MSSQL connection (#8040)
My team recently faced an issue while using MSSQL and passing a schema
name.

We noticed that "SET search_path TO {self.schema}" is being called for
us, which is not a valid ms-sql query, and is specific to postgresql
dialect.

We were able to run it locally after this fix.


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-24 15:45:40 -07:00
Anthony Mahanna
76102971c0 ArangoDB/AQL support for Graph QA Chain (#7880)
**Description**: Serves as an introduction to LangChain's support for
[ArangoDB](https://github.com/arangodb/arangodb), similar to
https://github.com/hwchase17/langchain/pull/7165 and
https://github.com/hwchase17/langchain/pull/4881

**Issue**: No issue has been created for this feature

**Dependencies**: `python-arango` has been added as an optional
dependency via the `CONTRIBUTING.md` guidelines
 
**Twitter handle**: [at]arangodb

- Integration test has been added
- Notebook has been added:
[graph_arangodb_qa.ipynb](https://github.com/amahanna/langchain/blob/master/docs/extras/modules/chains/additional/graph_arangodb_qa.ipynb)

[![Open In
Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/amahanna/langchain/blob/master/docs/extras/modules/chains/additional/graph_arangodb_qa.ipynb)

```
docker run -p 8529:8529 -e ARANGO_ROOT_PASSWORD= arangodb/arangodb
```

```
pip install git+https://github.com/amahanna/langchain.git
```

```python
from arango import ArangoClient

from langchain.chat_models import ChatOpenAI
from langchain.graphs import ArangoGraph
from langchain.chains import ArangoGraphQAChain

db = ArangoClient(hosts="localhost:8529").db(name="_system", username="root", password="", verify=True)

graph = ArangoGraph(db)

chain = ArangoGraphQAChain.from_llm(ChatOpenAI(temperature=0), graph=graph)

chain.run("Is Ned Stark alive?")
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-24 15:16:52 -07:00
Adilkhan Sarsen
3e7d2a1b64 SelfQuery support for deeplake (#7888)
Added support SelfQuery for Deeplake
2023-07-24 14:22:33 -07:00
Leonid Ganeline
c580c81cca docstrings experimental (#7969)
- added/changed docstring for `experimental`
- added/changed docstrings for different artifacts
- 
@baskaryan
2023-07-24 14:21:48 -07:00
Leonid Ganeline
3eb4112a1f Refactored example_generator (#8099)
Refactored `example_generator.py`. The same as #7961 
`example_generator.py` is in the root code folder. This creates the
`langchain.example_generator: Example Generator ` group on the API
Reference navigation ToC, on the same level as `Chains` and `Agents`
which is not correct.

Refactoring:
- moved `example_generator.py` content into
`chains/example_generator.py` (not in `utils` because the
`example_generator` has dependencies on other LangChain classes. It also
doesn't work for moving into `utilities/`)
- added the backwards compatibility ref in the original
`example_generator.py`

@hwchase17
2023-07-24 13:36:44 -07:00
Juan José Torres
1cc7d4c9eb Update SageMaker Endpoint Embeddings docs to be up to date with current requirements (#8103)
- **Description:** Simple change of the Class that ContentHandler
inherits from. To create an object of type SagemakerEndpointEmbeddings,
the property content_handler must be of type EmbeddingsContentHandler
not ContentHandlerBase anymore,
  - **Twitter handle:** @Juanjo_Torres11

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-24 13:35:06 -07:00
Leonid Ganeline
7cbe28ba9b Refactored input (#8202)
Refactored `input.py`. The same as
https://github.com/langchain-ai/langchain/pull/7961 #8098 #8099
input.py is in the root code folder. This creates the `langchain.input:
Input` group on the API Reference navigation ToC, on the same level as
Chains and Agents which is incorrect.

Refactoring:

- copied input.py file into utils/input.py
- I added the backwards compatibility ref in the original input.py. 
- changed several imports to a new ref

@hwchase17, @baskaryan
2023-07-24 13:10:03 -07:00
Monty Evans
72eb4fa4e8 Change WebBaseLoader metadata parsing to set missing metadata to descriptive string instead of None (#8175)
Solves #8174 & #3542

Co-authored-by: mevans <mevans@palantir.com>
2023-07-24 12:17:49 -07:00
Bagatur
1a7d8667c8 Bagatur/gateway chat (#8198)
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: dbczumar <corey.zumar@databricks.com>
2023-07-24 12:17:00 -07:00
Ettore Di Giacinto
ae28568e2a Add embeddings for LocalAI (#8134)
Description:

This PR adds embeddings for LocalAI (
https://github.com/go-skynet/LocalAI ), a self-hosted OpenAI drop-in
replacement. As LocalAI can re-use OpenAI clients it is mostly following
the lines of the OpenAI embeddings, however when embedding documents, it
just uses string instead of sending tokens as sending tokens is
best-effort depending on the model being used in LocalAI. Sending tokens
is also tricky as token id's can mismatch with the model - so it's safer
to just send strings in this case.

Partly related to: https://github.com/hwchase17/langchain/issues/5256

Dependencies: No new dependencies

Twitter: @mudler_it
---------

Signed-off-by: mudler <mudler@localai.io>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-24 12:16:49 -07:00
Mike Nitsenko
d983046f90 Extend Cube Semantic Loader functionality (#8186)
**PR Description:**

This pull request introduces several enhancements and new features to
the `CubeSemanticLoader`. The changes include the following:

1. Added imports for the `json` and `time` modules.
2. Added new constructor parameters: `load_dimension_values`,
`dimension_values_limit`, `dimension_values_max_retries`, and
`dimension_values_retry_delay`.
3. Updated the class documentation with descriptions for the new
constructor parameters.
4. Added a new private method `_get_dimension_values()` to retrieve
dimension values from Cube's REST API.
5. Modified the `load()` method to load dimension values for string
dimensions if `load_dimension_values` is set to `True`.
6. Updated the API endpoint in the `load()` method from the base URL to
the metadata endpoint.
7. Refactored the code to retrieve metadata from the response JSON.
8. Added the `column_member_type` field to the metadata dictionary to
indicate if a column is a measure or a dimension.
9. Added the `column_values` field to the metadata dictionary to store
the dimension values retrieved from Cube's API.
10. Modified the `page_content` construction to include the column title
and description instead of the table name, column name, data type,
title, and description.

These changes improve the functionality and flexibility of the
`CubeSemanticLoader` class by allowing the loading of dimension values
and providing more detailed metadata for each document.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-24 12:11:58 -07:00
Bagatur
82b8d8596c bump lc241 exp3 (#8193) 2023-07-24 11:52:44 -07:00
Leonid Ganeline
848454d1e7 Refactored formatting (#8191)
Refactored `formatting.py`. The same as
https://github.com/langchain-ai/langchain/pull/7961 #8098 #8099
formatting.py is in the root code folder. This creates the
`langchain.formatting: Formatting` group on the API Reference navigation
ToC, on the same level as Chains and Agents which is incorrect.

Refactoring:

- moved formatting.py content into utils/formatting.py
- I did not add the backwards compatibility ref in the original
formatting.py. It seems unnecessary.
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-24 11:34:15 -07:00
Bagatur
4928f7a9f5 undo bump (#8192) 2023-07-24 11:32:17 -07:00
Bagatur
14aa27b5f4 redirect (#8189) 2023-07-24 10:45:12 -07:00
Bagatur
e7d64f8b15 Bagatur/vercel test 3 (#8188) 2023-07-24 10:11:54 -07:00
Leonid Ganeline
120cdf813d docstrings memory (#8018)
docstrings `memory`:
- added module summary
- added missed docstrings
- updated docstrings into consistent format
- 
@baskaryan
2023-07-24 10:05:36 -07:00
Bagatur
026269bfa9 redirects (#8183) 2023-07-24 08:32:49 -07:00
Bagatur
d5689d58ab Bagatur/bump 241 (#8182) 2023-07-24 07:47:40 -07:00
Harrison Chase
3caccf304c Harrison/hugginggpt (#8162)
Co-authored-by: Yongliang Shen <withsyl@163.com>
2023-07-24 07:36:24 -07:00
rajib
f3908627ed changed to mlflow-ai-gateway in llms/__init__.py (#8114)
- Description: In the llms/__init__.py, the key name is wrong for
mlflowaigateway. It should be mlflow-ai-gateway
  - Issue: NA
  - Dependencies: NA
  - Tag maintainer: @hwchase17, @baskaryan
  - Twitter handle: na

Without this fix, when we run the code for mlflowaigateway, we will get
error as below

ValueError: Loading mlflow-ai-gateway LLM not supported

---------

Co-authored-by: rajib76 <rajib76@yahoo.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-23 23:30:46 -07:00
Bagatur
c8c8635dc9 mv module integrations docs (#8101) 2023-07-23 23:23:16 -07:00
Adarsh Shirawalmath
8ea840432f Generalize Comment on Streaming Support for LLM Implementations and add examples (#8115)
The example provided demonstrates the usage of the
HuggingFaceTextGenInference implementation with streaming enabled.
2023-07-23 22:59:59 -07:00
Gordon Clark
80b3ec5869 GitHub toolkit improvements (#8121)
Fixes an issue with the github tool where the API returned special
objects but the tool was expecting dictionaries.

Also added proper docstrings to the GitHubAPIWraper methods and a (very
basic) integration test.

Maintainer responsibilities:
  - Agents / Tools / Toolkits: @hinthornw

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-23 20:17:53 -07:00
Harrison Chase
33fd6184ba beef up getting started (#8139)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-23 19:57:43 -07:00
Lawrence Lim
fa8906a9b7 fix typo: Entity Summary Memory documentation (#8145)
Fixed a small typo I came across in the Memory documentation.
2023-07-23 19:36:50 -07:00
shibuiwilliam
8f5000146c add faiss test for score threshold (#8143)
# What
- Add faiss vector search test for score threshold
- Fix failing faiss vector search test; filtering with list value is
wrong.

<!-- Thank you for contributing to LangChain!

Replace this comment with:
- Description: Add faiss vector search test for score threshold; Fix
failing faiss vector search test; filtering with list value is wrong.
  - Issue: None
  - Dependencies: None
  - Tag maintainer: @rlancemartin, @eyurtsev
  - Twitter handle: @MlopsJ

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-23 19:36:38 -07:00
Nolan
7686dabd36 Unbreak devcontainer (#8154)
Codespaces and devcontainer was broken by the [repo
restructure](https://github.com/langchain-ai/langchain/discussions/8043).



- Description: Add libs/langchain to container so it can be built
without error.
  - Issue: -
  - Dependencies: -
  - Tag maintainer: @hwchase17 @baskaryan 
  - Twitter handle: @finnless

The failed build log says:
```
#10 [langchain-dev-dependencies 2/2] RUN poetry install --no-interaction --no-ansi --with dev,test,docs
#10 sha256:e850ee99fc966158bfd2d85e82b7c57244f47ecbb1462e75bd83b981a56a1929
2023-07-23 23:30:33.692Z: #10 0.827 
#10 0.827 Directory libs/langchain does not exist
2023-07-23 23:30:33.738Z: #10 ERROR: executor failed running [/bin/sh -c poetry install --no-interaction --no-ansi --with dev,test,docs]: exit code: 1
```

The new pyproject.toml imports from libs/langchain:

77bf75c236/pyproject.toml (L14-L16)

But libs/langchain is never added to the dev.Dockerfile:


77bf75c236/libs/langchain/dev.Dockerfile (L37-L39)
2023-07-23 19:33:47 -07:00
Fielding Johnston
fb62f2be70 nit: small typo in evaluation module docs (#8155)
Hopefully, this doesn't come across as nitpicky! That isn't the
intention. I only noticed it, because I enjoy reading the documentation
and when I hit a mental road bump it is usually due to a missing word or
something =)

@baskaryan
2023-07-23 18:25:14 -07:00
Harrison Chase
9205919ad2 actually use input key (#8136) 2023-07-23 18:02:45 -07:00
Leonid Ganeline
670304a8b3 simplified nmspace (#8152)
recreated #7894 (it is easy to recreate than resolve conflicts)
A small refactoring to improve the API Reference Agents table
 @baskaryan
2023-07-23 18:02:20 -07:00
William FH
c5b50be225 Function calling logging fixup (#8153)
Fix bad overwriting of "functions" arg in invocation params.
Cleanup precedence in the dict
Clean up some inappropriate types (mapping should be dict)


Example:
https://dev.smith.langchain.com/public/9a7a6817-1679-49d8-8775-c13916975aae/r


![image](https://github.com/langchain-ai/langchain/assets/13333726/94cd0775-b6ef-40c3-9e5a-3ab65e466ab9)
2023-07-23 18:01:33 -07:00
SlapDrone
961a0e200f Implement AgentExecutorIterator (#6929)
- Description: Implements a `.iter()` method for the `AgentExecutor`
class. This allows hooking into and intercepting intermediate agent
steps.
  - Issue: #6925 
  - Dependencies: None
  - Tag maintainer: @vowelparrot @agola11 
  - Twitter handle: @SlapDron3 @lacicocodes

---------

Co-authored-by: Lacico <Lacicocodes@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-23 18:00:22 -07:00
Harrison Chase
77bf75c236 bump experimental to 002 (#8150) 2023-07-23 09:22:39 -07:00
Harrison Chase
e46126eac6 add llamaapi (#8140) 2023-07-23 09:16:16 -07:00
Harrison Chase
f0eb5db670 Harrison/agent intro (#8138)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-22 22:14:59 -07:00
Harrison Chase
cbf2fc8af8 prompt ergonomics (#7799) 2023-07-22 14:19:17 -07:00
Samuel Berthe
d81d6e874f doc(sqldatabasechain): use views when jsonb column description is not available (#8133)
I think the PR diff is self explaining ;)

@baskaryan
2023-07-22 11:30:04 -07:00
Harrison Chase
506b21bfc2 Update MIGRATE.md 2023-07-22 09:11:43 -07:00
Harrison Chase
9854d9e5cb cr 2023-07-22 09:07:26 -07:00
Harrison Chase
9f3073d418 bump versions (#8129) 2023-07-22 08:46:37 -07:00
Harrison Chase
86946a47a8 Harrison/add back in experimental (#8128) 2023-07-22 08:27:29 -07:00
Karthik Raja A
8b08687fc4 MultiOn client toolkit (#8110)
Addition of MultiOn Client Agent Toolkit
Dependencies: multion pip package
This PR consists of the following:
- MultiOn utility,tools and integration with agent
- sample jupyter notebook.
Request @hwchase17 , @hinthornw

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-22 08:19:01 -07:00
Harrison Chase
aa0e69bc98 Harrison/official pre release (#8106) 2023-07-21 18:44:32 -07:00
Philip Kiely - Baseten
95bcf68802 add kwargs support for Baseten models (#8091)
This bugfix PR adds kwargs support to Baseten model invocations so that
e.g. the following script works properly:

```python
chatgpt_chain = LLMChain(
    llm=Baseten(model="MODEL_ID"),
    prompt=prompt,
    verbose=False,
    memory=ConversationBufferWindowMemory(k=2),
    llm_kwargs={"max_length": 4096}
)
```
2023-07-21 13:56:27 -07:00
Harrison Chase
8dcabd9205 bump releases rc0 (#8097) 2023-07-21 13:54:57 -07:00
Bagatur
58f65fcf12 use top nav docs (#8090) 2023-07-21 13:52:03 -07:00
Harrison Chase
0faba034b1 add experimental release action (#8096) 2023-07-21 13:38:35 -07:00
Harrison Chase
d353d668e4 remove CVEs (#8092)
This PR aims to move all code with CVEs into `langchain.experimental`.
Note that we are NOT yet removing from the core `langchain` package - we
will give people a week to migrate here.

See MIGRATE.md for how to migrate

Zero changes to functionality

Vulnerabilities this addresses:

PALChain:
- https://security.snyk.io/vuln/SNYK-PYTHON-LANGCHAIN-5752409
- https://security.snyk.io/vuln/SNYK-PYTHON-LANGCHAIN-5759265

SQLDatabaseChain
- https://security.snyk.io/vuln/SNYK-PYTHON-LANGCHAIN-5759268

`load_prompt` (Python files only)
- https://security.snyk.io/vuln/SNYK-PYTHON-LANGCHAIN-5725807
2023-07-21 13:32:39 -07:00
Bagatur
08c658d3f8 fix api ref (#8083) 2023-07-21 12:37:21 -07:00
Harrison Chase
344cbd9c90 update contributor guide (#8088) 2023-07-21 12:01:05 -07:00
Harrison Chase
17c06ee456 cr 2023-07-21 10:48:00 -07:00
Harrison Chase
da04760de1 Harrison/move experimental (#8084) 2023-07-21 10:36:28 -07:00
Harrison Chase
f35db9f43e (WIP) set up experimental (#7959) 2023-07-21 09:20:24 -07:00
c-bata
623b321e75 Fix allowed_search_types in VectorStoreRetriever (#8064)
Unexpectedly changed at
6792a3557d

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

I guess `allowed_search_types` is unexpectedly changed in
6792a3557d,
so that we cannot specify `similarity_score_threshold` here.

```python
class VectorStoreRetriever(BaseRetriever):
    ...
    allowed_search_types: ClassVar[Collection[str]] = (
        "similarity",
        "similarityatscore_threshold",
        "mmr",
    )

    @root_validator()
    def validate_search_type(cls, values: Dict) -> Dict:
        """Validate search type."""
        search_type = values["search_type"]
        if search_type not in cls.allowed_search_types:
            raise ValueError(...)
        if search_type == "similarity_score_threshold":
            ... # UNREACHABLE CODE
```

VectorStores Maintainers: @rlancemartin @eyurtsev
2023-07-21 08:39:36 -07:00
Bagatur
95e369b38d bump 239 (#8077) 2023-07-21 07:31:14 -07:00
William FH
c38965fcba Add embedding and vectorstore provider info as tags (#8027)
Example:
https://smith.langchain.com/public/bcd3714d-abba-4790-81c8-9b5718535867/r


The vectorstore implementations aren't super standardized yet, so just
adding an optional embeddings property to pass in.
2023-07-20 22:40:01 -07:00
Mohammad Mohtashim
355b7d8b86 Getting SQL cmd directly from SQLDatabase Chain. (#7940)
- Description: Get SQL Cmd directly generated by SQL-Database Chain
without executing it in the DB engine.
- Issue: #4853 
- Tag maintainer: @hinthornw,@baskaryan

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-20 22:36:55 -07:00
Lance Martin
5a084e1b20 Async HTML loader and HTML2Text transformer (#8036)
New HTML loader that asynchronously loader a list of urls. 
 
New transformer using [HTML2Text](https://github.com/Alir3z4/html2text/)
for HTML to clean, easy-to-read plain ASCII text (valid Markdown).
2023-07-20 22:30:59 -07:00
Wey Gu
cf60cff1ef feat: Add with_history option for chatglm (#8048)
In certain 0-shot scenarios, the existing stateful language model can
unintentionally send/accumulate the .history.

This commit adds the "with_history" option to chatglm, allowing users to
control the behavior of .history and prevent unintended accumulation.

Possible reviewers @hwchase17 @baskaryan @mlot

Refer to discussion over this thread:
https://twitter.com/wey_gu/status/1681996149543276545?s=20
2023-07-20 22:25:37 -07:00
Harrison Chase
1f3b987860 Harrison/GitHub toolkit (#8047)
Co-authored-by: Trevor Dobbertin <trevordobbertin@gmail.com>
2023-07-20 22:24:55 -07:00
Leonid Ganeline
ae8bc9e830 Refactored sql_database (#7945)
The `sql_database.py` is unnecessarily placed in the root code folder.
A similar code is usually placed in the `utilities/`.
As a byproduct of this placement, the sql_database is [placed on the top
level of classes in the API
Reference](https://api.python.langchain.com/en/latest/api_reference.html#module-langchain.sql_database)
which is confusing and not correct.


- moved the `sql_database.py` from the root code folder to the
`utilities/`

@baskaryan

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-20 22:17:55 -07:00
William FH
dc9d6cadab Dedup methods (#8049) 2023-07-20 22:13:22 -07:00
Harrison Chase
f99f497b2c Harrison/predibase (#8046)
Co-authored-by: Abhay Malik <32989166+Abhay-765@users.noreply.github.com>
2023-07-20 19:26:50 -07:00
Jacob Lee
56c6ab1715 Fix bad docs sidebar header (#7966)
Quick fix for:

<img width="283" alt="Screenshot 2023-07-19 at 2 49 44 PM"
src="https://github.com/hwchase17/langchain/assets/6952323/91e4868c-b75e-413d-9f8f-d34762abf164">

CC @baskaryan
2023-07-20 19:06:57 -07:00
Wian Stipp
ebc5ff2948 HuggingFaceTextGenInference bug fix: Multiple values for keyword argument (#8044)
Fixed the bug causing: `TypeError: generate() got multiple values for
keyword argument 'stop_sequences'`

```python
res = await self.async_client.generate(
                prompt,
                **self._default_params,
                stop_sequences=stop,
                **kwargs,
            )
```
The above throws an error because stop_sequences is in also in the
self._default_params.
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 19:05:08 -07:00
Kacper Łukawski
ed6a5532ac Implement async support in Qdrant local mode (#8001)
I've extended the support of async API to local Qdrant mode. It is faked
but allows prototyping without spinning a container. The tests are
improved to test the in-memory case as well.

@baskaryan @rlancemartin @eyurtsev @agola11
2023-07-20 19:04:33 -07:00
Bagatur
7717c24fc4 fix redis cache chat model (#8041)
Redis cache currently stores model outputs as strings. Chat generations
have Messages which contain more information than just a string. Until
Redis cache supports fully storing messages, cache should not interact
with chat generations.
2023-07-20 19:00:05 -07:00
Taqi Jaffri
973593c5c7 Added streaming support to Replicate (#8045)
Streaming support is useful if you are doing long-running completions or
need interactivity e.g. for chat... adding it to replicate, using a
similar pattern to other LLMs that support streaming.

Housekeeping: I ran `make format` and `make lint`, no issues reported in
the files I touched.

I did update the replicate integration test but ran into some issues,
specifically:

1. The original test was failing for me due to the model argument not
being specified... perhaps this test is not regularly run? I fixed it by
adding a call to the lightweight hello world model which should not be
burdensome for replicate infra.
2. I couldn't get the `make integration_tests` command to pass... a lot
of failures in other integration tests due to missing dependencies...
however I did make sure the particluar test file I updated does pass, by
running `poetry run pytest
tests/integration_tests/llms/test_replicate.py`

Finally, I am @tjaffri https://twitter.com/tjaffri for feature
announcement tweets... or if you could please tag @docugami
https://twitter.com/docugami we would really appreciate that :-)

Tagging model maintainers @hwchase17  @baskaryan 

Thank for all the awesome work you folks are doing.

---------

Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
2023-07-20 18:59:54 -07:00
Piyush Jain
31b7ddc12c Neptune graph and openCypher QA Chain (#8035)
## Description
This PR adds a graph class and an openCypher QA chain to work with the
Amazon Neptune database.

## Dependencies
`requests` which is included in the LangChain dependencies.

## Maintainers for Review
@krlawrence
@baskaryan

### Twitter handle
pjain7
2023-07-20 18:56:47 -07:00
Leonid Ganeline
995220b797 Refactored math_utils (#7961)
`math_utils.py` is in the root code folder. This creates the
`langchain.math_utils: Math Utils` group on the API Reference navigation
ToC, on the same level with `Chains` and `Agents` which is not correct.

Refactoring:
- created the `utils/` folder
- moved `math_utils.py` to `utils/math.py`
- moved `utils.py` to `utils/utils.py`
- split `utils.py` into `utils.py, env.py, strings.py`
- added module description

@baskaryan
2023-07-20 18:55:43 -07:00
Paolo Picello
5137f40dd6 Update mongodb_atlas.py docstrings (#8033)
Hi all, I just added the "index_name" parameter to the docstrings for
mongodb_atlas.py (it is missing in the [public doc
page](https://api.python.langchain.com/en/latest/vectorstores/langchain.vectorstores.mongodb_atlas.MongoDBAtlasVectorSearch.html#langchain-vectorstores-mongodb-atlas-mongodbatlasvectorsearch).

Thanks
2023-07-20 17:35:07 -07:00
felixocker
9226fda58b fix: create schema description from URIs and str w/out rdflib warnings (#8025)
- Description: fix to avoid rdflib warnings when concatenating URIs and
strings to create the text snippet for the knowledge graph's schema.
@marioscrock pointed this out in a comment related to #7165
- Issue: None, but the problem was mentioned as a comment in #7165
- Dependencies: None
- Tag maintainer: Related to memory -> @hwchase17, maybe @baskaryan as
it is a fix
2023-07-20 15:55:19 -07:00
Emory Petermann
7239d57a53 Update Golden integration documentation (#8030)
fixes some typos and cleans up onboarding for golden, thank you!

@hinthornw
2023-07-20 15:53:44 -07:00
Jonathon Belotti
021bb9be84 Update Modal.com integration docs (#8014)
Hey, I'm a Modal Labs engineer and I'm making this docs update after
getting a user question in [our beta Slack
space](https://join.slack.com/t/modalbetatesters/shared_invite/zt-1xl9gbob8-1QDgUY7_PRPg6dQ49hqEeQ)
about the Langchain integration docs.

🔗 [Modal beta-testers link to docs discussion
thread](https://modalbetatesters.slack.com/archives/C031Z7DBQFL/p1689777700594819?thread_ts=1689775859.855849&cid=C031Z7DBQFL)
2023-07-20 15:53:06 -07:00
Jeffrey Wang
62d0475c29 Add Metaphor new field and reformat docs (#8022)
This PR reformats our python notebook example and also adds a new field
we have.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-07-20 15:50:54 -07:00
William FH
e2a99bd169 Different error strings (#8010) 2023-07-20 09:58:25 -07:00
Bagatur
ec4f93b629 bump 238 (#8012) 2023-07-20 09:21:15 -07:00
vrushankportkey
5f10d2ea1d Add Portkey LLMOps integration (#7877)
Integrating Portkey, which adds production features like caching,
tracing, tagging, retries, etc. to langchain apps.

  - Dependencies: None
  - Twitter handle: https://twitter.com/portkeyai
  - test_portkey.py added for tests
  - example notebook added in new utilities folder in modules
  
 Also fixed a bug with OpenAIEmbeddings where headers weren't passing.

cc @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 09:08:44 -07:00
Boris Nieuwenhuis
095937ad52 Add google place ID to google places tool response (#7789)
- Description: this change will add the google place ID of the found
location to the response of the GooglePlacesTool
  - Issue: Not applicable
  - Dependencies: no dependencies
  - Tag maintainer: @hinthornw
  - Twitter handle: Not applicable
2023-07-20 09:04:31 -07:00
Bagatur
7c24a6b9d1 Bagatur/apify (#8008)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

---------

Co-authored-by: Jiří Moravčík <jiri.moravcik@gmail.com>
Co-authored-by: Jan Čurn <jan.curn@gmail.com>
2023-07-20 08:36:01 -07:00
Aiden Le
1d7414a371 Feature: Add openai_api_model attribute to Doctran models (#7868)
- Description: Added the ability to define the open AI model.
- Issue: Currently the Doctran instance uses gpt-4 by default, this does
not work if the user has no access to gpt -4.
  - rlancemartin, @eyurtsev, @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 07:27:56 -07:00
Dwai Banerjee
d8c40253c3 Adding endpoint_url to embeddings/bedrock.py and updated docs (#7927)
BedrockEmbeddings does not have endpoint_url so that switching to custom
endpoint is not possible. I have access to Bedrock custom endpoint and
cannot use BedrockEmbeddings

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 07:25:59 -07:00
Bagatur
ea028b66ab undo vectstore memory bug (#8007) 2023-07-20 07:25:23 -07:00
Mohammad Mohtashim
453d4c3a99 VectorStoreRetrieverMemory exclude additional input keys feature (#7941)
- Description: Added a parameter in VectorStoreRetrieverMemory which
filters the input given by the key when constructing the buffering the
document for Vector. This feature is helpful if you have certain inputs
apart from the VectorMemory's own memory_key that needs to be ignored
e.g when using combined memory, we might need to filter the memory_key
of the other memory, Please see the issue.
  - Issue: #7695
  - Tag maintainer: @rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 07:23:27 -07:00
Constantin Musca
d593833e4d Add Golden Query Tool (#7930)
**Description:** Golden Query is a wrapper on top of the [Golden Query
API](https://docs.golden.com/reference/query-api) which enables
programmatic access to query results on entities across Golden's
Knowledge Base. For more information about Golden API, please see the
[Golden API Getting
Started](https://docs.golden.com/reference/getting-started) page.
**Issue:** None
**Dependencies:** requests(already present in project)
**Tag maintainer:** @hinthornw

Signed-off-by: Constantin Musca <constantin.musca@gmail.com>
2023-07-20 07:03:20 -07:00
eahova
aea97efe8b Adding code to allow pandas to show all columns instead of truncating… (#7901)
- Description: Adding code to set pandas dataframe to display all the
columns. Otherwise, some data get truncated (it puts a "..." in the
middle and just shows the first 4 and last 4 columns) and the LLM
doesn't realize it isn't getting the full data. Default value is 8, so
this helps Dataframes larger than that.
  - Issue: none
  - Dependencies: none
  - Tag maintainer: @hinthornw 
  - Twitter handle: none
2023-07-20 07:02:01 -07:00
Santiago Delgado
c416dbe8e0 Amadeus Flight and Travel Search Tool (#7890)
## Background
With the addition on email and calendar tools, LangChain is continuing
to complete its functionality to automate business processes.

## Challenge
One of the pieces of business functionality that LangChain currently
doesn't have is the ability to search for flights and travel in order to
book business travel.

## Changes
This PR implements an integration with the
[Amadeus](https://developers.amadeus.com/) travel search API for
LangChain, enabling seamless search for flights with a single
authentication process.

## Who can review?
@hinthornw

## Appendix
@tsolakoua and @minjikarin, I utilized your
[amadeus-python](https://github.com/amadeus4dev/amadeus-python) library
extensively. Given the rising popularity of LangChain and similar AI
frameworks, the convergence of libraries like amadeus-python and tools
like this one is likely. So, I wanted to keep you updated on our
progress.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 06:59:29 -07:00
Hanit
ea149dbd89 Allowing outside parameters for Qdrant. (#7910)
@baskaryan @rlancemartin, @eyurtsev
2023-07-20 06:58:54 -07:00
Sheik Irfan Basha
d6493590da Add Verbose support (#7982) (#7984)
- Description: Add verbose support for the extraction_chain
- Issue: Fixes #7982 
- Dependencies: NA
- Twitter handle: sheikirfanbasha
@hwchase17 and @agola11

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 06:52:13 -07:00
Junlin Zhou
812a1643db chore(hf-text-gen): extract default params for reusing (#7929)
This PR extract common code (default generation params) for
`HuggingFaceTextGenInference`.

Co-authored-by: Junlin Zhou <jlzhou@zjuici.com>
2023-07-20 06:49:12 -07:00
Yun Kim
54e02e4392 Add datadog-langchain integration doc (#7955)
## Description
Added a doc about the [Datadog APM integration for
LangChain](https://github.com/DataDog/dd-trace-py/pull/6137).
Note that the integration is on `ddtrace`'s end and so no code is
introduced/required by this integration into the langchain library. For
that reason I've refrained from adding an example notebook (although
I've added setup instructions for enabling the integration in the doc)
as no code is technically required to enable the integration.

Tagging @baskaryan as reviewer on this PR, thank you very much!

## Dependencies
Datadog APM users will need to have `ddtrace` installed, but the
integration is on `ddtrace` end and so does not introduce any external
dependencies to the LangChain project.


Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 06:44:58 -07:00
Wian Stipp
0ffb7fc10c One Line Fix: missing text output with huggingface TGI LLM (#7972)
Small bug fix. The async _call method was missing a line to return the
generated text.

@baskaryan
2023-07-20 06:44:29 -07:00
Jithin James
493cbc9410 docs: fix a couple of small indentation errors in the strings (#7951)
Fixed a few indentations I came across in the docs @baskaryan
2023-07-20 06:34:01 -07:00
Bhashithe Abeysinghe
73901ef132 Added windows specific instructions to Llama.cpp documentation. (#8000)
- Description: Added windows specific instructions on llama.cpp in the
notebook file
  - Issue: #6356 
  - Dependencies: None
  - Tag maintainer: @baskaryan
2023-07-20 06:31:25 -07:00
Leonid Ganeline
24b26a922a docstrings for embeddings (#7973)
Added/updated docstrings for the `embeddings`

@baskaryan
2023-07-20 06:26:44 -07:00
Leonid Ganeline
0613ed5b95 docstrings for LLMs (#7976)
docstrings for the `llms/`:
- added missed docstrings
- update existing docstrings to consistent format (no `Wrappers`!)
@baskaryan
2023-07-20 06:26:16 -07:00
Jeff Huber
5694e7b8cf Update chroma notebook (#7978)
Fix up the Chroma notebook
- remove `.persist()` -- this is no longer in Chroma as of `0.4.0`
- update output to match `0.4.0`
- other cleanup work
2023-07-20 06:25:31 -07:00
Harutaka Kawamura
4a5894db47 Fix incorrect field name in MLflow AI Gateway config example (#7983) 2023-07-20 06:24:59 -07:00
Kacper Łukawski
19e8472521 Add async Qdrant to async_agent.ipynb (#7993)
I added Qdrant to the async API docs. This is the only vector store that
supports full async API.

@baskaryan @rlancemartin, @eyurtsev
2023-07-20 06:23:15 -07:00
Nuno Campos
8edb1db9dc Fix key errors in weaviate hybrid retriever init (#7988) 2023-07-20 06:22:18 -07:00
Harrison Chase
df84e1bb64 pass callbacks along baby ai (#7908) 2023-07-19 22:40:33 -07:00
William FH
a4c5914c9a Bump LS Version (#7970) 2023-07-19 17:12:16 -07:00
Bagatur
5d021c0962 nb fix (#7962) 2023-07-19 15:27:43 -07:00
Julien Salinas
3adab5e5be Integrate NLP Cloud embeddings endpoint (#7931)
Add embeddings for [NLPCloud](https://docs.nlpcloud.com/#embeddings).

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Lance Martin <lance@langchain.dev>
2023-07-19 15:27:34 -07:00
Bagatur
854a2be0ca Add debugging guide (#7956) 2023-07-19 14:15:11 -07:00
Brendan Collins
9aef79c2e3 Add Geopandas.GeoDataFrame Document Loader (#3817)
Work in Progress.
WIP
Not ready...

Adds Document Loader support for
[Geopandas.GeoDataFrames](https://geopandas.org/)

Example:
- [x] stub out `GeoDataFrameLoader` class
- [x] stub out integration tests
- [ ] Experiment with different geometry text representations
- [ ] Verify CRS is successfully added in metadata
- [ ] Test effectiveness of searches on geometries
- [ ] Test with different geometry types (point, line, polygon with
multi-variants).
- [ ] Add documentation

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Lance Martin <122662504+rlancemartin@users.noreply.github.com>
2023-07-19 12:14:41 -07:00
Lance Martin
dfc533aa74 Add llama-v2 to local document QA (#7952) 2023-07-19 11:15:47 -07:00
Bagatur
d9b5bcd691 bump (#7948) 2023-07-19 10:23:21 -07:00
Bagatur
f97535b33e fix (#7947) 2023-07-19 10:23:10 -07:00
Adilkhan Sarsen
7bb843477f Removed kwargs from add_texts (#7595)
Removing **kwargs argument from add_texts method in DeepLake vectorstore
as it confuses users and doesn't fail when user is typing incorrect
parameters.

Also added small test to ensure the change is applies correctly.

Guys could pls take a look: @rlancemartin, @eyurtsev, this is a small
PR.

Thx so much!
2023-07-19 09:23:49 -07:00
Bagatur
4d8b48bdb3 bump 236 (#7938) 2023-07-19 07:51:40 -07:00
Harutaka Kawamura
f6839a8682 Add integration for MLflow AI Gateway (#7113)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->


- Adds integration for MLflow AI Gateway (this will be shipped in MLflow
2.5 this week).


Manual testing:

```sh
# Move to mlflow repo
cd /path/to/mlflow

# install langchain
pip install git+https://github.com/harupy/langchain.git@gateway-integration

# launch gateway service
mlflow gateway start --config-path examples/gateway/openai/config.yaml

# Then, run the examples in this PR
```
2023-07-19 07:40:55 -07:00
David Preti
6792a3557d Update openai.py compatibility with azure 2023-07-01-preview (#7937)
Fixed missing "content" field in azure. 
Added a check for "content" in _dict (missing for azure
api=2023-07-01-preview)
@baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-19 07:31:18 -07:00
王斌(Bin Wang)
b65102bdb2 fix: pgvector search_type of similarity_score_threshold not working (#7771)
- Description: VectorStoreRetriever->similarity_score_threshold with
search_type of "similarity_score_threshold" not working with the
following two minor issues,
- Issue: 1. In line 237 of `vectorstores/base.py`, "score_threshold" is
passed to `_similarity_search_with_relevance_scores` as in the kwargs,
while score_threshold is not a valid argument of this method. As a fix,
before calling `_similarity_search_with_relevance_scores`,
score_threshold is popped from kwargs. 2. In line 596 to 607 of
`vectorstores/pgvector.py`, it's checking the distance_strategy against
the string in Enum. However, self.distance_strategy will get the
property of distance_strategy from line 316, where the callable function
is passed. To solve this issue, self.distance_strategy is changed to
self._distance_strategy to avoid calling the property method.,
  - Dependencies: No,
  - Tag maintainer: @rlancemartin, @eyurtsev,
  - Twitter handle: No

---------

Co-authored-by: Bin Wang <bin@arcanum.ai>
2023-07-19 07:20:52 -07:00
William FH
9d7e57f5c0 Docs Nit (#7918) 2023-07-18 21:47:28 -07:00
Wilson Leao Neto
8bb33f2296 Exposes Kendra result item DocumentAttributes in the document metadata (#7781)
- Description: exposes the ResultItem DocumentAttributes as document
metadata with key 'document_attributes' and refactors
AmazonKendraRetriever by providing a ResultItem base class in order to
avoid duplicate code;
- Tag maintainer: @3coins @hupe1980 @dev2049 @baskaryan
- Twitter handle: wilsonleao

### Why?
Some use cases depend on specific document attributes returned by the
retriever in order to improve the quality of the overall completion and
adjust what will be displayed to the user. For the sake of consistency,
we need to expose the DocumentAttributes as document metadata so we are
sure that we are using the values returned by the kendra request issued
by langchain.

I would appreciate your review @3coins @hupe1980 @dev2049. Thank you in
advance!

### References
- [Amazon Kendra
DocumentAttribute](https://docs.aws.amazon.com/kendra/latest/APIReference/API_DocumentAttribute.html)
- [Amazon Kendra
DocumentAttributeValue](https://docs.aws.amazon.com/kendra/latest/APIReference/API_DocumentAttributeValue.html)

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
2023-07-18 18:46:38 -07:00
Wilson Leao Neto
efa67ed0ef fix #7782: check title and excerpt separately for page_content (#7783)
- Description: check title and excerpt separately for page_content so
that if title is empty but excerpt is present, the page_content will
only contain the excerpt
  - Issue: #7782 
  - Tag maintainer: @3coins @baskaryan 
  - Twitter handle: wilsonleao
2023-07-18 18:46:23 -07:00
Leonid Ganeline
d92926cbc2 docstrings chains (#7892)
Added/updated docstrings.
2023-07-18 18:25:42 -07:00
Leonid Ganeline
4a810756f8 docstrings chains (#7892)
Added/updated docstrings.

@baskaryan
2023-07-18 18:25:27 -07:00
Jarek Kazmierczak
f2ef3ff54a Google Cloud Enterprise Search retriever (#7857)
Added a retriever that encapsulated Google Cloud Enterprise Search.


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 18:24:08 -07:00
Alonso Silva Allende
1152f4d48b Allow chat models that do not return token usage (#7907)
- Description: It allows to use chat models that do not return token
usage
- Issue: [#7900](https://github.com/hwchase17/langchain/issues/7900)
- Dependencies: None
- Tag maintainer: @agola11 @hwchase17 
- Twitter handle: @alonsosilva

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
2023-07-18 18:12:09 -07:00
Zizhong Zhang
bdf0c2267f docs(custom_chain) fix typo (#7898)
Fix typo in the document of custom_chain
2023-07-18 18:03:19 -07:00
Jeff Huber
2139d0197e upgrade chroma to 0.4.0 (#7749)
** This should land Monday the 17th ** 

Chroma is upgrading from `0.3.29` to `0.4.0`. `0.4.0` is easier to
build, more durable, faster, smaller, and more extensible. This comes
with a few changes:

1. A simplified and improved client setup. Instead of having to remember
weird settings, users can just do `EphemeralClient`, `PersistentClient`
or `HttpClient` (the underlying direct `Client` implementation is also
still accessible)

2. We migrated data stores away from `duckdb` and `clickhouse`. This
changes the api for the `PersistentClient` that used to reference
`chroma_db_impl="duckdb+parquet"`. Now we simply set
`is_persistent=true`. `is_persistent` is set for you to `true` if you
use `PersistentClient`.

3. Because we migrated away from `duckdb` and `clickhouse` - this also
means that users need to migrate their data into the new layout and
schema. Chroma is committed to providing extension notification and
tooling around any schema and data migrations (for example - this PR!).

After upgrading to `0.4.0` - if users try to access their data that was
stored in the previous regime, the system will throw an `Exception` and
instruct them how to use the migration assistant to migrate their data.
The migration assitant is a pip installable CLI: `pip install
chroma_migrate`. And is runnable by calling `chroma_migrate`

-- TODO ADD here is a short video demonstrating how it works. 

Please reference the readme at
[chroma-core/chroma-migrate](https://github.com/chroma-core/chroma-migrate)
to see a full write-up of our philosophy on migrations as well as more
details about this particular migration.

Please direct any users facing issues upgrading to our Discord channel
called
[#get-help](https://discord.com/channels/1073293645303795742/1129200523111841883).
We have also created a [email
listserv](https://airtable.com/shrHaErIs1j9F97BE) to notify developers
directly in the future about breaking changes.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 17:20:54 -07:00
Gergely Papp
10246375a5 Gpapp/chromadb (#7891)
- Description: version check to make sure chromadb >=0.4.0 does not
throw an error, and uses the default sqlite persistence engine when the
directory is set,
  - Issue: the issue #7887 

For attention of
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 17:03:42 -07:00
Lance Martin
41c841ec85 Add Llama-v2 to Llama.cpp notebook (#7913) 2023-07-18 15:13:27 -07:00
Bagatur
b9639f6067 fix docs (#7911) 2023-07-18 14:25:45 -07:00
Jeff Huber
dc8b790214 Improve vector store onboarding exp (#6698)
This PR
- fixes the `similarity_search_by_vector` example, makes the code run
and adds the example to mirror `similarity_search`
- reverts back to chroma from faiss to remove sharp edges / create a
happy path for new developers. (1) real metadata filtering, (2) expected
functionality like `update`, `delete`, etc to serve beyond the most
trivial use cases

@hwchase17

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 13:48:42 -07:00
Bagatur
25a2bdfb70 add pr template instructions (#7904) 2023-07-18 13:22:28 -07:00
Hanit
0d23c0c82a Allowing additional params for OpenAIEmbeddings. (#7752)
(#7654)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 12:14:51 -07:00
Lance Martin
862268175e Add llama-v2 to docs (#7893) 2023-07-18 12:09:09 -07:00
TRY-ER
21d1c988a9 Try er/redis index retrieval retry00 (#7773)
Replace this comment with:
- Description: Modified the code to return the document id from the
redis document search as metadata.
  - Issue: the issue # it fixes retrieval of id as metadata as string 
  - Tag maintainer: @rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 10:49:50 -07:00
shibuiwilliam
177baef3a1 Add test for svm retriever (#7768)
# What
- This is to add unit test for svm retriever.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 09:57:24 -07:00
Filip Michalsky
69b9db2b5e Notebook update: sales agent with tools (#7753)
- Description: This is an update to a previously published notebook. 
Sales Agent now has access to tools, and this notebook shows how to use
a Product Knowledge base
  to reduce hallucinations and act as a better sales person!
  - Issue: N/A
  - Dependencies: `chromadb openai tiktoken`
  - Tag maintainer:  @baskaryan @hinthornw
  - Twitter handle: @FilipMichalsky
2023-07-18 09:53:12 -07:00
shibuiwilliam
f29a5d4bcc add test for knn retriever (#7769)
# What
- This is to add test for knn retriever.
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 09:52:11 -07:00
Orgil
75d3f1e5e6 remove unused import in voice assistant doc (#7757)
Description: Removed unused import in voice_assistant doc. 
Tag maintainer: @baskaryan
2023-07-18 09:51:28 -07:00
maciej-skorupka
c6d1d6d7fc feat: moving azure OpenAI API version to the latest 2023-05-15 (#7764)
Moving to the latest non-preview Azure OpenAI API version=2023-05-15.
The previous 2023-03-15-preview doesn't have support, SLA etc. For
instance, OpenAI SDK has moved to this version
https://github.com/openai/openai-python/releases/tag/v0.27.7

@baskaryan
2023-07-18 09:50:15 -07:00
satorioh
259a409998 docs(zilliz): connection_args add token description for serverless cl… (#7810)
Description:

Currently, Zilliz only support dedicated clusters using a pair of
username and password for connection. Regarding serverless clusters,
they can connect to them by using API keys( [ see official note
detail](https://docs.zilliz.com/docs/manage-cluster-credentials)), so I
add API key(token) description in Zilliz docs to make it more obvious
and convenient for this group of users to better utilize Zilliz. No
changes done to code.

---------

Co-authored-by: Robin.Wang <3Jg$94sbQ@q1>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 09:31:39 -07:00
shibuiwilliam
235264a246 Add/test faiss (#7809)
# What
- Add missing test cases to faiss vectore stores
2023-07-18 08:30:35 -07:00
maciej-skorupka
5de7815310 docs: added comment from azure llm to azure chat about GPT-4 (#7884)
Azure GPT-4 models can't be accessed via LLM model. It's easy to miss
that and a lot of discussions about that are on the Internet. Therefore
I added a comment in Azure LLM docs that mentions that and points to
Azure Chat OpenAI docs.
@baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 08:05:41 -07:00
Leonid Ganeline
4a05b7f772 docstrings prompts (#7844)
Added missed docstrings in `prompts`
@baskaryan
2023-07-18 07:58:22 -07:00
Bill Zhang
dda11d2a05 WeaviateHybridSearchRetriever option to enable scores. (#7861)
Description: This PR adds the option to retrieve scores and explanations
in the WeaviateHybridSearchRetriever. This feature improves the
usability of the retriever by allowing users to understand the scoring
logic behind the search results and further refine their search queries.

Issue: This PR is a solution to the issue #7855 
Dependencies: This PR does not introduce any new dependencies.

Tag maintainer: @rlancemartin, @eyurtsev

I have included a unit test for the added feature, ensuring that it
retrieves scores and explanations correctly. I have also included an
example notebook demonstrating its use.
2023-07-18 07:57:17 -07:00
Leonid Ganeline
527210972e docstrings output_parsers (#7859)
Added/updated the docstrings from `output_parsers`
 @baskaryan
2023-07-18 07:51:44 -07:00
Jonathan Pedoeem
c460c29a64 Adding Docs for PromptLayerCallbackHandler (#7860)
Here I am adding documentation for the `PromptLayerCallbackHandler`.
When we created the initial PR for the callback handler the docs were
causing issues, so we merged without the docs.
2023-07-18 07:51:16 -07:00
ljeagle
3902b85657 Add metadata and page_content filters of documents in AwaDB (#7862)
1. Add the metadata filter of documents.
2. Add the text page_content filter of documents
3. fix the bug of similarity_search_with_score

Improvement and fix bug of AwaDB
Fix the conflict https://github.com/hwchase17/langchain/pull/7840
@rlancemartin @eyurtsev  Thanks!

---------

Co-authored-by: vincent <awadb.vincent@gmail.com>
2023-07-18 07:50:17 -07:00
German Martin
f1eaa9b626 Lost in the middle: We have been ordering documents the WRONG way. (for long context) (#7520)
Motivation, it seems that when dealing with a long context and "big"
number of relevant documents we must avoid using out of the box score
ordering from vector stores.
See: https://arxiv.org/pdf/2306.01150.pdf

So, I added an additional parameter that allows you to reorder the
retrieved documents so we can work around this performance degradation.
The relevance respect the original search score but accommodates the
lest relevant document in the middle of the context.
Extract from the paper (one image speaks 1000 tokens):

![image](https://github.com/hwchase17/langchain/assets/1821407/fafe4843-6e18-4fa6-9416-50cc1d32e811)
This seems to be common to all diff arquitectures. SO I think we need a
good generic way to implement this reordering and run some test in our
already running retrievers.
It could be that my approach is not the best one from the architecture
point of view, happy to have a discussion about that.
For me this was the best place to introduce the change and start
retesting diff implementations.

@rlancemartin, @eyurtsev

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
2023-07-18 07:45:15 -07:00
Bagatur
6a32f93669 add ls link (#7847) 2023-07-18 07:39:26 -07:00
Leonid Ganeline
17956ff08e docstrings agents (#7866)
Added/Updated docstrings for `agents`
@baskaryan
2023-07-18 02:23:24 -07:00
William FH
c6f2d27789 Docs Nits (#7874)
Add links to reference docs
2023-07-18 01:50:14 -07:00
William FH
3179ee3a56 Evals docs (#7460)
Still don't have good "how to's", and the guides / examples section
could be further pruned and improved, but this PR adds a couple examples
for each of the common evaluator interfaces.

- [x] Example docs for each implemented evaluator
- [x] "how to make a custom evalutor" notebook for each low level APIs
(comparison, string, agent)
- [x] Move docs to modules area
- [x] Link to reference docs for more information
- [X] Still need to finish the evaluation index page
- ~[ ] Don't have good data generation section~
- ~[ ] Don't have good how to section for other common scenarios / FAQs
like regression testing, testing over similar inputs to measure
sensitivity, etc.~
2023-07-18 01:00:01 -07:00
William FH
d87564951e LS0010 (#7871)
Bump langsmith version. Has some additional UX improvements
2023-07-18 00:28:37 -07:00
William FH
e294ba475a Some mitigations for RCE in PAL chain (#7870)
Some docstring / small nits to #6003

---------

Co-authored-by: BoazWasserman <49598618+boazwasserman@users.noreply.github.com>
Co-authored-by: HippoTerrific <49598618+HippoTerrific@users.noreply.github.com>
Co-authored-by: Or Raz <orraz1994@gmail.com>
2023-07-17 22:58:47 -07:00
Nicolas
46330da2e7 docs: Mendable: Fixes pretty sources not working (#7863)
This new version fixes the"Verified Sources" display that got broken.
Instead of displaying the full URL, it shows the title of the page the
source is from.
2023-07-17 18:23:46 -07:00
Leonid Ganeline
f5ae8f1980 docstrings tools (#7848)
Added docstrings in `tools`.

 @baskaryan
2023-07-17 17:50:19 -07:00
Leonid Ganeline
74b701f42b docstrings retrievers (#7858)
Added/updated docstrings `retrievers`

@baskaryan
2023-07-17 17:47:17 -07:00
Jasper
5b4d53e8ef Add text_content kwarg to BrowserlessLoader (#7856)
Added keyword argument to toggle between getting the text content of a
site versus its HTML when using the `BrowserlessLoader`
2023-07-17 17:02:19 -07:00
William FH
2aa3cf4e5f update notebook (#7852) 2023-07-17 14:46:42 -07:00
Matt Robinson
3c489be773 feat: optional post-processing for Unstructured loaders (#7850)
### Summary

Adds a post-processing method for Unstructured loaders that allows users
to optionally modify or clean extracted elements.

### Testing

```python
from langchain.document_loaders import UnstructuredFileLoader
from unstructured.cleaners.core import clean_extra_whitespace

loader = UnstructuredFileLoader(
    "./example_data/layout-parser-paper.pdf",
    mode="elements",
    post_processors=[clean_extra_whitespace],
)

docs = loader.load()
docs[:5]
```


### Reviewrs
  - @rlancemartin
  - @eyurtsev
  - @hwchase17
2023-07-17 12:13:05 -07:00
Bagatur
2a315dbee9 fix nb (#7843) 2023-07-17 09:39:11 -07:00
Bagatur
3f1302a4ab bump 235 (#7836) 2023-07-17 09:37:20 -07:00
Mike Lambert
9cdea4e0e1 Update to Anthropic's claude-v2 (#7793) 2023-07-17 08:55:49 -07:00
Bagatur
98c48f303a fix (#7838) 2023-07-17 07:53:11 -07:00
Bagatur
111bd7ddbe specify comparators (#7805) 2023-07-17 07:30:48 -07:00
Dayuan Jiang
ee40d37098 add bm25 module (#7779)
- Description: Add a BM25 Retriever that do not need Elastic search
- Dependencies: rank_bm25(if it is not installed it will be install by
using pip, just like TFIDFRetriever do)
  - Tag maintainer: @rlancemartin, @eyurtsev
  - Twitter handle: DayuanJian21687

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-17 07:30:17 -07:00
Liu Ming
fa0a9e502a Add LLM for ChatGLM(2)-6B API (#7774)
Description:
Add LLM for ChatGLM-6B & ChatGLM2-6B API

Related Issue: 
Will the langchain support ChatGLM? #4766
Add support for selfhost models like ChatGLM or transformer models #1780

Dependencies: 
No extra library install required. 
It wraps api call to a ChatGLM(2)-6B server(start with api.py), so api
endpoint is required to run.

Tag maintainer:  @mlot 

Any comments on this PR would be appreciated.
---------

Co-authored-by: mlot <limpo2000@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-17 07:27:17 -07:00
sseide
25e3d3f283 Support Redis Sentinel database connections (#5196)
# Support Redis Sentinel database connections

This PR adds the support to connect not only to Redis standalone servers
but High Availability Replication sets too
(https://redis.io/docs/management/sentinel/)
Redis Replica Sets have on Master allowing to write data and 2+ replicas
with read-only access to the data. The additional Redis Sentinel
instances monitor all server and reconfigure the RW-Master on the fly if
it comes unavailable.

Therefore all connections must be made through the Sentinels the query
the current master for a read-write connection. This PR adds basic
support to also allow a redis connection url specifying a Sentinel as
Redis connection.

Redis documentation and Jupyter notebook with Redis examples are updated
to mention how to connect to a redis Replica Set with Sentinels

        - 

Remark - i did not found test cases for Redis server connections to add
new cases here. Therefor i tests the new utility class locally with
different kind of setups to make sure different connection urls are
working as expected. But no test case here as part of this PR.
2023-07-17 07:18:51 -07:00
Yifei Song
2e47412073 Add Xorbits agent (#7647)
- [Xorbits](https://doc.xorbits.io/en/latest/) is an open-source
computing framework that makes it easy to scale data science and machine
learning workloads in parallel. Xorbits can leverage multi cores or GPUs
to accelerate computation on a single machine, or scale out up to
thousands of machines to support processing terabytes of data.

- This PR added support for the Xorbits agent, which allows langchain to
interact with Xorbits Pandas dataframe and Xorbits Numpy array.
- Dependencies: This change requires the Xorbits library to be installed
in order to be used.
`pip install xorbits`
- Request for review: @hinthornw
- Twitter handle: https://twitter.com/Xorbitsio
2023-07-17 07:09:51 -07:00
Ankush Gola
ff3aada0b2 minor langsmith notebook fixes (#7814)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-16 21:27:03 -07:00
William FH
ca79044948 Export Tracer from callbacks (#7812)
Improve discoverability
2023-07-16 20:58:13 -07:00
William FH
beb38f4f4d Share client in evaluation callback (#7807)
Guarantee the evaluator traces go to same endpoint
2023-07-16 17:47:38 -07:00
William FH
1db13e8a85 Fix chat example output mapper (#7808)
Was only serializing when no key was provided
2023-07-16 17:47:05 -07:00
William FH
c58d35765d Add examples to docstrings (#7796)
and:
- remove dataset name from autogenerated project name
- print out project name to view
2023-07-16 12:05:56 -07:00
William FH
ed97af423c Accept LLM via constructor (#7794) 2023-07-16 08:46:36 -07:00
Ankush Gola
c4ece52dac update LangSmith notebook (#7767)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-15 21:05:09 -07:00
Kenny
0d058d4046 Add try except block to OpenAIWhisperParser (#7505) 2023-07-15 15:42:00 -07:00
William FH
4cb9f1eda8 Update langsmith version (#7759) 2023-07-15 12:01:41 -07:00
Lance Martin
1d06eee3b5 Fix ntbk link in docs (#7755)
Minor fix to running to
[docs](https://python.langchain.com/docs/use_cases/question_answering/local_retrieval_qa).
2023-07-15 09:11:18 -07:00
William FH
2e3d77c34e Fix eval loader when overriding arguments (#7734)
- Update the negative criterion descriptions to prevent bad predictions
- Add support for normalizing the string distance
- Fix potential json deserializing into float issues in the example
mapper
2023-07-15 08:30:32 -07:00
Bagatur
c871c04270 bump 234 (#7754) 2023-07-15 10:49:51 -04:00
Gordon Clark
96f3dff050 MediaWiki docloader improvements + unit tests (#5879)
Starting over from #5654 because I utterly borked the poetry.lock file.

Adds new paramerters for to the MWDumpLoader class:

* skip_redirecst (bool) Tells the loader to skip articles that redirect
to other articles. False by default.
* stop_on_error (bool) Tells the parser to skip any page that causes a
parse error. True by default.
* namespaces (List[int]) Tells the parser which namespaces to parse.
Contains namespaces from -2 to 15 by default.

Default values are chosen to preserve backwards compatibility.

Sample dump XML and full unit test coverage (with extended tests that
pass!) also included!

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-15 10:49:36 -04:00
Xavier
4c8106311f Add pip install langsmith for Quick Install part of README (#7694)
**Issue**
When I use conda to install langchain, a dependency error throwed -
"ModuleNotFoundError: No module named 'langsmith'"

**Updated**
Run `pip install langsmith` when install langchain with conda

Co-authored-by: xaver.xu <xavier.xu@batechworks.com>
2023-07-15 10:27:32 -04:00
Mohammad Mohtashim
b8b8a138df Simple Import fix in Tools Exception Docs (#7740)
Issue: #7720
 @hinthornw
2023-07-15 10:25:34 -04:00
Nicolas
43f900fd38 docs: Mendable Search Improvements (#7744)
- New pin-to-side (button). This functionality allows you to search the
docs while asking the AI for questions
- Fixed the search bar in Firefox that won't detect a mouse click
- Fixes and improvements overall in the model's performance
2023-07-15 10:19:21 -04:00
rjarun8
b7c409152a Document loader/debug (#7750)
Description: Added debugging output in DirectoryLoader to identify the
file being processed.
Issue: [Need a trace or debug feature in Lanchain DirectoryLoader
#7725](https://github.com/hwchase17/langchain/issues/7725)
Dependencies: No additional dependencies are required.
Tag maintainer: @rlancemartin, @eyurtsev
This PR enhances the DirectoryLoader with debugging output to help
diagnose issues when loading documents. This new feature does not add
any dependencies and has been tested on a local machine.
2023-07-15 10:18:27 -04:00
Lance Martin
b015647e31 Add GPT4All embeddings (#7743)
Support for [GPT4All
embeddings](https://docs.gpt4all.io/gpt4all_python_embedding.html)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-15 10:04:29 -04:00
Chang Sau Sheong
b6a7f40ad3 added support for Google Images search (#7751)
- Description: Added Google Image Search support for SerpAPIWrapper 
  - Issue: NA
  - Dependencies: None
  - Tag maintainer: @hinthornw
  - Twitter handle: @sausheong

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-15 10:04:18 -04:00
Kacper Łukawski
1ff5b67025 Implement async API for Qdrant vector store (#7704)
Inspired by #5550, I implemented full async API support in Qdrant. The
docs were extended to mention the existence of asynchronous operations
in Langchain. I also used that chance to restructure the tests of Qdrant
and provided a suite of tests for the async version. Async API requires
the GRPC protocol to be enabled. Thus, it doesn't work on local mode
yet, but we're considering including the support to be consistent.
2023-07-15 09:33:26 -04:00
Bearnardd
275b926cf7 add missing import (#7730)
Just a nit documentation fix

 @baskaryan
2023-07-14 20:03:23 -04:00
Bearnardd
9800c6051c add support for truncate arg for HuggingFaceTextGenInference class (#7728)
Fixes https://github.com/hwchase17/langchain/issues/7650

* add support for `truncate` argument of `HugginFaceTextGenInference`

@baskaryan
2023-07-14 16:23:56 -04:00
Lorenzo
77e6bbe6f0 fix typo in deeplake.ipynb (#7718)
- Fixing typos in deeplake documentation
- @baskaryan
2023-07-14 13:38:31 -04:00
Samuel Berthe
2be3515a66 SQLDatabase: adding security disclamer (#7710)
It might be obvious to most engineers, but I think everybody should be
cautious when using such a chain.

![image](https://github.com/hwchase17/langchain/assets/2951285/a1df6567-9d56-4c12-98ea-767401ae2ac8)
2023-07-14 13:38:16 -04:00
William FH
fcf98dc4c1 Check for Tiktoken (#7705) 2023-07-14 09:49:01 -07:00
Bagatur
bae93682f6 update docs (#7714) 2023-07-14 11:49:09 -04:00
Bagatur
b065da6933 Bagatur/docs nit (#7712) 2023-07-14 11:13:02 -04:00
Bagatur
87d81b6acc Redirect old text splitter page (#7708)
related to #7665
2023-07-14 11:12:18 -04:00
Aarav Borthakur
210296a71f Integrate Rockset as a document loader (#7681)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

Integrate [Rockset](https://rockset.com/docs/) as a document loader.

Issue: None
Dependencies: Nothing new (rockset's dependency was already added
[here](https://github.com/hwchase17/langchain/pull/6216))
Tag maintainer: @rlancemartin

I have added a test for the integration and an example notebook showing
its use. I ran `make lint` and everything looks good.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-14 07:58:13 -07:00
2479 changed files with 68829 additions and 28275 deletions

View File

@@ -15,7 +15,11 @@ You may use the button above, or follow these steps to open this repo in a Codes
For more info, check out the [GitHub documentation](https://docs.github.com/en/free-pro-team@latest/github/developing-online-with-codespaces/creating-a-codespace#creating-a-codespace).
## VS Code Dev Containers
[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/hwchase17/langchain)
[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain)
Note: If you click this link you will open the main repo and not your local cloned repo, you can use this link and replace with your username and cloned repo name:
https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/<yourusername>/<yourclonedreponame>
If you already have VS Code and Docker installed, you can use the button above to get started. This will cause VS Code to automatically install the Dev Containers extension if needed, clone the source code into a container volume, and spin up a dev container for use.
@@ -25,7 +29,7 @@ You can also follow these steps to open this repo in a container using the VS Co
2. Open a locally cloned copy of the code:
- Clone this repository to your local filesystem.
- Fork and Clone this repository to your local filesystem.
- Press <kbd>F1</kbd> and select the **Dev Containers: Open Folder in Container...** command.
- Select the cloned copy of this folder, wait for the container to start, and try things out!

View File

@@ -2,7 +2,7 @@ version: '3'
services:
langchain:
build:
dockerfile: dev.Dockerfile
dockerfile: libs/langchain/dev.Dockerfile
context: ..
volumes:
# Update this to wherever you want VS Code to mount the folder of your project

View File

@@ -69,6 +69,14 @@ This project uses [Poetry](https://python-poetry.org/) as a dependency manager.
3. Tell Poetry to use the virtualenv python environment (`poetry config virtualenvs.prefer-active-python true`)
4. Continue with the following steps.
There are two separate projects in this repository:
- `langchain`: core langchain code, abstractions, and use cases
- `langchain.experimental`: more experimental code
Each of these has their OWN development environment.
In order to run any of the commands below, please move into their respective directories.
For example, to contribute to `langchain` run `cd libs/langchain` before getting started with the below.
To install requirements:
```bash
@@ -248,6 +256,9 @@ When you run `poetry install`, the `langchain` package is installed as editable
## Documentation
While the code is split between `langchain` and `langchain.experimental`, the documentation is one holistic thing.
This covers how to get started contributing to documentation.
### Contribute Documentation
The docs directory contains Documentation and API Reference.

View File

@@ -7,6 +7,8 @@ Replace this comment with:
- Tag maintainer: for a quicker response, tag the relevant maintainer (see below),
- Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out!
Please make sure you're PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally.
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on network access,
2. an example notebook showing its use.

View File

@@ -52,11 +52,13 @@ runs:
- name: Check Poetry File
shell: bash
working-directory: ${{ inputs.working-directory }}
run: |
poetry check
- name: Check lock file
shell: bash
working-directory: ${{ inputs.working-directory }}
run: |
poetry lock --check

View File

@@ -1,15 +1,21 @@
name: lint
on:
push:
branches: [master]
pull_request:
workflow_call:
inputs:
working-directory:
required: true
type: string
description: "From which folder this pipeline executes"
env:
POETRY_VERSION: "1.4.2"
jobs:
build:
defaults:
run:
working-directory: ${{ inputs.working-directory }}
runs-on: ubuntu-latest
strategy:
matrix:
@@ -31,6 +37,10 @@ jobs:
- name: Install dependencies
run: |
poetry install
- name: Install langchain editable
if: ${{ inputs.working-directory != 'langchain' }}
run: |
pip install -e ../langchain
- name: Analysing the code with our lint
run: |
make lint

View File

@@ -1,13 +1,12 @@
name: release
on:
pull_request:
types:
- closed
branches:
- master
paths:
- 'pyproject.toml'
workflow_call:
inputs:
working-directory:
required: true
type: string
description: "From which folder this pipeline executes"
env:
POETRY_VERSION: "1.4.2"
@@ -18,6 +17,9 @@ jobs:
${{ github.event.pull_request.merged == true }}
&& ${{ contains(github.event.pull_request.labels.*.name, 'release') }}
runs-on: ubuntu-latest
defaults:
run:
working-directory: ${{ inputs.working-directory }}
steps:
- uses: actions/checkout@v3
- name: Install poetry
@@ -35,6 +37,7 @@ jobs:
echo version=$(poetry version --short) >> $GITHUB_OUTPUT
- name: Create Release
uses: ncipollo/release-action@v1
if: ${{ inputs.working-directory == 'libs/langchain' }}
with:
artifacts: "dist/*"
token: ${{ secrets.GITHUB_TOKEN }}

View File

@@ -1,16 +1,25 @@
name: test
on:
push:
branches: [master]
pull_request:
workflow_dispatch:
workflow_call:
inputs:
working-directory:
required: true
type: string
description: "From which folder this pipeline executes"
test_type:
type: string
description: "Test types to run"
default: '["core", "extended"]'
env:
POETRY_VERSION: "1.4.2"
jobs:
build:
defaults:
run:
working-directory: ${{ inputs.working-directory }}
runs-on: ubuntu-latest
strategy:
matrix:
@@ -19,9 +28,7 @@ jobs:
- "3.9"
- "3.10"
- "3.11"
test_type:
- "core"
- "extended"
test_type: ${{ fromJSON(inputs.test_type) }}
name: Python ${{ matrix.python-version }} ${{ matrix.test_type }}
steps:
- uses: actions/checkout@v3
@@ -29,6 +36,7 @@ jobs:
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ matrix.python-version }}
working-directory: ${{ inputs.working-directory }}
poetry-version: "1.4.2"
cache-key: ${{ matrix.test_type }}
install-command: |
@@ -39,6 +47,10 @@ jobs:
echo "Running extended tests, installing dependencies with poetry..."
poetry install -E extended_testing
fi
- name: Install langchain editable
if: ${{ inputs.working-directory != 'langchain' }}
run: |
pip install -e ../langchain
- name: Run ${{matrix.test_type}} tests
run: |
if [ "${{ matrix.test_type }}" == "core" ]; then

View File

@@ -20,3 +20,5 @@ jobs:
uses: actions/checkout@v3
- name: Codespell
uses: codespell-project/actions-codespell@v2
with:
skip: guide_imports.json

27
.github/workflows/langchain_ci.yml vendored Normal file
View File

@@ -0,0 +1,27 @@
---
name: libs/langchain CI
on:
push:
branches: [ master ]
pull_request:
paths:
- '.github/workflows/_lint.yml'
- '.github/workflows/_test.yml'
- '.github/workflows/langchain_ci.yml'
- 'libs/langchain/**'
workflow_dispatch: # Allows to trigger the workflow manually in GitHub UI
jobs:
lint:
uses:
./.github/workflows/_lint.yml
with:
working-directory: libs/langchain
secrets: inherit
test:
uses:
./.github/workflows/_test.yml
with:
working-directory: libs/langchain
secrets: inherit

View File

@@ -0,0 +1,29 @@
---
name: libs/langchain-experimental CI
on:
push:
branches: [ master ]
pull_request:
paths:
- '.github/workflows/_lint.yml'
- '.github/workflows/_test.yml'
- '.github/workflows/langchain_experimental_ci.yml'
- 'libs/langchain/**'
- 'libs/experimental/**'
workflow_dispatch: # Allows to trigger the workflow manually in GitHub UI
jobs:
lint:
uses:
./.github/workflows/_lint.yml
with:
working-directory: libs/experimental
secrets: inherit
test:
uses:
./.github/workflows/_test.yml
with:
working-directory: libs/experimental
test_type: '["core"]'
secrets: inherit

View File

@@ -0,0 +1,20 @@
---
name: libs/langchain-experimental Release
on:
pull_request:
types:
- closed
branches:
- master
paths:
- 'libs/experimental/pyproject.toml'
workflow_dispatch: # Allows to trigger the workflow manually in GitHub UI
jobs:
release:
uses:
./.github/workflows/_release.yml
with:
working-directory: libs/experimental
secrets: inherit

20
.github/workflows/langchain_release.yml vendored Normal file
View File

@@ -0,0 +1,20 @@
---
name: libs/langchain Release
on:
pull_request:
types:
- closed
branches:
- master
paths:
- 'libs/langchain/pyproject.toml'
workflow_dispatch: # Allows to trigger the workflow manually in GitHub UI
jobs:
release:
uses:
./.github/workflows/_release.yml
with:
working-directory: libs/langchain
secrets: inherit

1
.gitignore vendored
View File

@@ -162,6 +162,7 @@ docs/.docusaurus/
docs/.cache-loader/
docs/_dist
docs/api_reference/api_reference.rst
docs/api_reference/experimental_api_reference.rst
docs/api_reference/_build
docs/api_reference/*/
!docs/api_reference/_static/

View File

@@ -24,6 +24,6 @@ sphinx:
# Optionally declare the Python requirements required to build your docs
python:
install:
- requirements: docs/requirements.txt
- requirements: docs/api_reference/requirements.txt
- method: pip
path: .

61
MIGRATE.md Normal file
View File

@@ -0,0 +1,61 @@
# Migrating to `langchain_experimental`
We are moving any experimental components of LangChain, or components with vulnerability issues, into `langchain_experimental`.
This guide covers how to migrate.
## Installation
Previously:
`pip install -U langchain`
Now (only if you want to access things in experimental):
`pip install -U langchain langchain_experimental`
## Things in `langchain.experimental`
Previously:
`from langchain.experimental import ...`
Now:
`from langchain_experimental import ...`
## PALChain
Previously:
`from langchain.chains import PALChain`
Now:
`from langchain_experimental.pal_chain import PALChain`
## SQLDatabaseChain
Previously:
`from langchain.chains import SQLDatabaseChain`
Now:
`from langchain_experimental.sql import SQLDatabaseChain`
Alternatively, if you are just interested in using the query generation part of the SQL chain, you can check out [`create_sql_query_chain`](https://github.com/langchain-ai/langchain/blob/master/docs/extras/use_cases/tabular/sql_query.ipynb)
`from langchain.chains import create_sql_query_chain`
## `load_prompt` for Python files
Note: this only applies if you want to load Python files as prompts.
If you want to load json/yaml files, no change is needed.
Previously:
`from langchain.prompts import load_prompt`
Now:
`from langchain_experimental.prompts import load_prompt`

View File

@@ -1,18 +1,8 @@
.PHONY: all clean docs_build docs_clean docs_linkcheck api_docs_build api_docs_clean api_docs_linkcheck format lint test tests test_watch integration_tests docker_tests help extended_tests
.PHONY: all clean docs_build docs_clean docs_linkcheck api_docs_build api_docs_clean api_docs_linkcheck
# Default target executed when no arguments are given to make.
all: help
######################
# TESTING AND COVERAGE
######################
# Run unit tests and generate a coverage report.
coverage:
poetry run pytest --cov \
--cov-config=.coveragerc \
--cov-report xml \
--cov-report term-missing:skip-covered
######################
# DOCUMENTATION
@@ -41,46 +31,6 @@ api_docs_clean:
api_docs_linkcheck:
poetry run linkchecker docs/api_reference/_build/html/index.html
# Define a variable for the test file path.
TEST_FILE ?= tests/unit_tests/
test:
poetry run pytest --disable-socket --allow-unix-socket $(TEST_FILE)
tests:
poetry run pytest --disable-socket --allow-unix-socket $(TEST_FILE)
extended_tests:
poetry run pytest --disable-socket --allow-unix-socket --only-extended tests/unit_tests
test_watch:
poetry run ptw --now . -- tests/unit_tests
integration_tests:
poetry run pytest tests/integration_tests
docker_tests:
docker build -t my-langchain-image:test .
docker run --rm my-langchain-image:test
######################
# LINTING AND FORMATTING
######################
# Define a variable for Python and notebook files.
PYTHON_FILES=.
lint format: PYTHON_FILES=.
lint_diff format_diff: PYTHON_FILES=$(shell git diff --name-only --diff-filter=d master | grep -E '\.py$$|\.ipynb$$')
lint lint_diff:
poetry run mypy $(PYTHON_FILES)
poetry run black $(PYTHON_FILES) --check
poetry run ruff .
format format_diff:
poetry run black $(PYTHON_FILES)
poetry run ruff --select I --fix $(PYTHON_FILES)
spell_check:
poetry run codespell --toml pyproject.toml
@@ -97,12 +47,3 @@ help:
@echo 'docs_build - build the documentation'
@echo 'docs_clean - clean the documentation build artifacts'
@echo 'docs_linkcheck - run linkchecker on the documentation'
@echo 'format - run code formatters'
@echo 'lint - run linters'
@echo 'test - run unit tests'
@echo 'tests - run unit tests'
@echo 'test TEST_FILE=<test_file> - run all tests in file'
@echo 'extended_tests - run only extended unit tests'
@echo 'test_watch - run unit tests in watch mode'
@echo 'integration_tests - run integration tests'
@echo 'docker_tests - run unit tests in docker'

View File

@@ -3,8 +3,8 @@
⚡ Building applications with LLMs through composability ⚡
[![Release Notes](https://img.shields.io/github/release/hwchase17/langchain)](https://github.com/hwchase17/langchain/releases)
[![lint](https://github.com/hwchase17/langchain/actions/workflows/lint.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/lint.yml)
[![test](https://github.com/hwchase17/langchain/actions/workflows/test.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/test.yml)
[![CI](https://github.com/hwchase17/langchain/actions/workflows/langchain_ci.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/langchain_ci.yml)
[![Experimental CI](https://github.com/hwchase17/langchain/actions/workflows/langchain_experimental_ci.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/langchain_experimental_ci.yml)
[![Downloads](https://static.pepy.tech/badge/langchain/month)](https://pepy.tech/project/langchain)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI)](https://twitter.com/langchainai)
@@ -12,20 +12,28 @@
[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/hwchase17/langchain)
[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/hwchase17/langchain)
[![GitHub star chart](https://img.shields.io/github/stars/hwchase17/langchain?style=social)](https://star-history.com/#hwchase17/langchain)
[![Dependency Status](https://img.shields.io/librariesio/github/hwchase17/langchain)](https://libraries.io/github/hwchase17/langchain)
[![Dependency Status](https://img.shields.io/librariesio/github/langchain-ai/langchain)](https://libraries.io/github/langchain-ai/langchain)
[![Open Issues](https://img.shields.io/github/issues-raw/hwchase17/langchain)](https://github.com/hwchase17/langchain/issues)
Looking for the JS/TS version? Check out [LangChain.js](https://github.com/hwchase17/langchainjs).
**Production Support:** As you move your LangChains into production, we'd love to offer more comprehensive support.
Please fill out [this form](https://forms.gle/57d8AmXBYp8PP8tZA) and we'll set up a dedicated support Slack channel.
Please fill out [this form](https://6w1pwbss0py.typeform.com/to/rrbrdTH2) and we'll set up a dedicated support Slack channel.
## 🚨Breaking Changes for select chains (SQLDatabase) on 7/28
In an effort to make `langchain` leaner and safer, we are moving select chains to `langchain_experimental`.
This migration has already started, but we are remaining backwards compatible until 7/28.
On that date, we will remove functionality from `langchain`.
Read more about the motivation and the progress [here](https://github.com/hwchase17/langchain/discussions/8043).
Read how to migrate your code [here](MIGRATE.md).
## Quick Install
`pip install langchain`
or
`conda install langchain -c conda-forge`
`pip install langsmith && conda install langchain -c conda-forge`
## 🤔 What is this?

View File

@@ -13,5 +13,6 @@ cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
poetry run nbdoc_build
poetry run python generate_api_reference_links.py
yarn install
yarn start

View File

@@ -7,19 +7,67 @@
# -- Path setup --------------------------------------------------------------
import json
import os
import sys
from pathlib import Path
import toml
from docutils import nodes
from sphinx.util.docutils import SphinxDirective
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys
import toml
_DIR = Path(__file__).parent.absolute()
sys.path.insert(0, os.path.abspath("."))
sys.path.insert(0, os.path.abspath("../../libs/langchain"))
sys.path.insert(0, os.path.abspath("../../libs/experimental"))
with open("../../pyproject.toml") as f:
with (_DIR.parents[1] / "libs" / "langchain" / "pyproject.toml").open("r") as f:
data = toml.load(f)
with (_DIR / "guide_imports.json").open("r") as f:
imported_classes = json.load(f)
class ExampleLinksDirective(SphinxDirective):
"""Directive to generate a list of links to examples.
We have a script that extracts links to API reference docs
from our notebook examples. This directive uses that information
to backlink to the examples from the API reference docs."""
has_content = False
required_arguments = 1
def run(self):
"""Run the directive.
Called any time :example_links:`ClassName` is used
in the template *.rst files."""
class_or_func_name = self.arguments[0]
links = imported_classes.get(class_or_func_name, {})
list_node = nodes.bullet_list()
for doc_name, link in links.items():
item_node = nodes.list_item()
para_node = nodes.paragraph()
link_node = nodes.reference()
link_node["refuri"] = link
link_node.append(nodes.Text(doc_name))
para_node.append(link_node)
item_node.append(para_node)
list_node.append(item_node)
if list_node.children:
title_node = nodes.title()
title_node.append(nodes.Text(f"Examples using {class_or_func_name}"))
return [title_node, list_node]
return [list_node]
def setup(app):
app.add_directive("example_links", ExampleLinksDirective)
# -- Project information -----------------------------------------------------

View File

@@ -4,14 +4,16 @@ import re
from pathlib import Path
ROOT_DIR = Path(__file__).parents[2].absolute()
PKG_DIR = ROOT_DIR / "langchain"
PKG_DIR = ROOT_DIR / "libs" / "langchain" / "langchain"
EXP_DIR = ROOT_DIR / "libs" / "experimental" / "langchain_experimental"
WRITE_FILE = Path(__file__).parent / "api_reference.rst"
EXP_WRITE_FILE = Path(__file__).parent / "experimental_api_reference.rst"
def load_members() -> dict:
def load_members(dir: Path) -> dict:
members: dict = {}
for py in glob.glob(str(PKG_DIR) + "/**/*.py", recursive=True):
module = py[len(str(PKG_DIR)) + 1 :].replace(".py", "").replace("/", ".")
for py in glob.glob(str(dir) + "/**/*.py", recursive=True):
module = py[len(str(dir)) + 1 :].replace(".py", "").replace("/", ".")
top_level = module.split(".")[0]
if top_level not in members:
members[top_level] = {"classes": [], "functions": []}
@@ -26,12 +28,10 @@ def load_members() -> dict:
return members
def construct_doc(members: dict) -> str:
full_doc = """\
.. _api_reference:
def construct_doc(pkg: str, members: dict) -> str:
full_doc = f"""\
=============
API Reference
``{pkg}`` API Reference
=============
"""
@@ -40,16 +40,12 @@ API Reference
functions = _members["functions"]
if not (classes or functions):
continue
module_title = module.replace("_", " ").title()
if module_title == "Llms":
module_title = "LLMs"
section = f":mod:`langchain.{module}`: {module_title}"
section = f":mod:`{pkg}.{module}`"
full_doc += f"""\
{section}
{'=' * (len(section) + 1)}
.. automodule:: langchain.{module}
.. automodule:: {pkg}.{module}
:no-members:
:no-inherited-members:
@@ -60,7 +56,7 @@ API Reference
full_doc += f"""\
Classes
--------------
.. currentmodule:: langchain
.. currentmodule:: {pkg}
.. autosummary::
:toctree: {module}
@@ -74,10 +70,11 @@ Classes
full_doc += f"""\
Functions
--------------
.. currentmodule:: langchain
.. currentmodule:: {pkg}
.. autosummary::
:toctree: {module}
:template: function.rst
{fstring}
@@ -86,10 +83,14 @@ Functions
def main() -> None:
members = load_members()
full_doc = construct_doc(members)
lc_members = load_members(PKG_DIR)
lc_doc = ".. _api_reference:\n\n" + construct_doc("langchain", lc_members)
with open(WRITE_FILE, "w") as f:
f.write(full_doc)
f.write(lc_doc)
exp_members = load_members(EXP_DIR)
exp_doc = ".. _experimental_api_reference:\n\n" + construct_doc("langchain_experimental", exp_members)
with open(EXP_WRITE_FILE, "w") as f:
f.write(exp_doc)
if __name__ == "__main__":

File diff suppressed because one or more lines are too long

View File

@@ -1,3 +1,4 @@
-e libs/langchain
autodoc_pydantic==1.8.0
myst_parser
nbsphinx==0.8.9
@@ -9,6 +10,4 @@ sphinx-panels
toml
myst_nb
sphinx_copybutton
pydata-sphinx-theme==0.13.1
nbdoc
urllib3<2
pydata-sphinx-theme==0.13.1

View File

@@ -26,3 +26,5 @@
{%- endfor %}
{% endif %}
{% endblock %}
.. example_links:: {{ objname }}

View File

@@ -0,0 +1,8 @@
:mod:`{{module}}`.{{objname}}
{{ underline }}==============
.. currentmodule:: {{ module }}
.. autofunction:: {{ objname }}
.. example_links:: {{ objname }}

View File

@@ -45,6 +45,9 @@
<li class="nav-item">
<a class="sk-nav-link nav-link" href="{{ pathto('api_reference') }}">API</a>
</li>
<li class="nav-item">
<a class="sk-nav-link nav-link" href="{{ pathto('experimental_api_reference') }}">Experimental</a>
</li>
<li class="nav-item">
<a class="sk-nav-link nav-link" target="_blank" rel="noopener noreferrer" href="https://python.langchain.com/">Python Docs</a>
</li>

View File

@@ -745,6 +745,11 @@ span.descname {
background-color: transparent;
padding: 0;
font-family: monospace;
font-size: 1.2rem;
}
em.property {
font-weight: normal;
}
span.descclassname {

View File

@@ -1,10 +0,0 @@
---
sidebar_position: 0
---
# Integrations
Visit the [Integrations Hub](https://integrations.langchain.com) to further explore, upvote and request integrations across key LangChain components.
import DocCardList from "@theme/DocCardList";
<DocCardList />

View File

@@ -51,7 +51,7 @@ Walkthroughs and best-practices for common end-to-end use cases, like:
Learn best practices for developing with LangChain.
### [Ecosystem](/docs/ecosystem/)
LangChain is part of a rich ecosystem of tools that integrate with our framework and build on top of it. Check out our growing list of [integrations](/docs/ecosystem/integrations/) and [dependent repos](/docs/ecosystem/dependents.html).
LangChain is part of a rich ecosystem of tools that integrate with our framework and build on top of it. Check out our growing list of [integrations](/docs/integrations/) and [dependent repos](/docs/ecosystem/dependents).
### [Additional resources](/docs/additional_resources/)
Our community is full of prolific developers, creative builders, and fantastic teachers. Check out [YouTube tutorials](/docs/additional_resources/youtube.html) for great tutorials from folks in the community, and [Gallery](https://github.com/kyrolabs/awesome-langchain) for a list of awesome LangChain projects, compiled by the folks at [KyroLabs](https://kyrolabs.com).

View File

@@ -22,28 +22,74 @@ import OpenAISetup from "@snippets/get_started/quickstart/openai_setup.mdx"
## Building an application
Now we can start building our language model application. LangChain provides many modules that can be used to build language model applications. Modules can be used as stand-alones in simple applications and they can be combined for more complex use cases.
Now we can start building our language model application. LangChain provides many modules that can be used to build language model applications.
Modules can be used as stand-alones in simple applications and they can be combined for more complex use cases.
The core building block of LangChain applications is the LLMChain.
This combines three things:
- LLM: The language model is the core reasoning engine here. In order to work with LangChain, you need to understand the different types of language models and how to work with them.
- Prompt Templates: This provides instructions to the language model. This controls what the language model outputs, so understanding how to construct prompts and different prompting strategies is crucial.
- Output Parsers: These translate the raw response from the LLM to a more workable format, making it easy to use the output downstream.
In this getting started guide we will cover those three components by themselves, and then cover the LLMChain which combines all of them.
Understanding these concepts will set you up well for being able to use and customize LangChain applications.
Most LangChain applications allow you to configure the LLM and/or the prompt used, so knowing how to take advantage of this will be a big enabler.
## LLMs
#### Get predictions from a language model
The basic building block of LangChain is the LLM, which takes in text and generates more text.
There are two types of language models, which in LangChain are called:
As an example, suppose we're building an application that generates a company name based on a company description. In order to do this, we need to initialize an OpenAI model wrapper. In this case, since we want the outputs to be MORE random, we'll initialize our model with a HIGH temperature.
- LLMs: this is a language model which takes a string as input and returns a string
- ChatModels: this is a language model which takes a list of messages as input and returns a message
import LLM from "@snippets/get_started/quickstart/llm.mdx"
The input/output for LLMs is simple and easy to understand - a string.
But what about ChatModels? The input there is a list of `ChatMessage`s, and the output is a single `ChatMessage`.
A `ChatMessage` has two required components:
<LLM/>
- `content`: This is the content of the message.
- `role`: This is the role of the entity from which the `ChatMessage` is coming from.
## Chat models
LangChain provides several objects to easily distinguish between different roles:
Chat models are a variation on language models. While chat models use language models under the hood, the interface they expose is a bit different: rather than expose a "text in, text out" API, they expose an interface where "chat messages" are the inputs and outputs.
- `HumanMessage`: A `ChatMessage` coming from a human/user.
- `AIMessage`: A `ChatMessage` coming from an AI/assistant.
- `SystemMessage`: A `ChatMessage` coming from the system.
- `FunctionMessage`: A `ChatMessage` coming from a function call.
You can get chat completions by passing one or more messages to the chat model. The response will be a message. The types of messages currently supported in LangChain are `AIMessage`, `HumanMessage`, `SystemMessage`, and `ChatMessage` -- `ChatMessage` takes in an arbitrary role parameter. Most of the time, you'll just be dealing with `HumanMessage`, `AIMessage`, and `SystemMessage`.
If none of those roles sound right, there is also a `ChatMessage` class where you can specify the role manually.
For more information on how to use these different messages most effectively, see our prompting guide.
import ChatModel from "@snippets/get_started/quickstart/chat_model.mdx"
LangChain exposes a standard interface for both, but it's useful to understand this difference in order to construct prompts for a given language model.
The standard interface that LangChain exposes has two methods:
- `predict`: Takes in a string, returns a string
- `predict_messages`: Takes in a list of messages, returns a message.
Let's see how to work with these different types of models and these different types of inputs.
First, let's import an LLM and a ChatModel.
import ImportLLMs from "@snippets/get_started/quickstart/import_llms.mdx"
<ImportLLMs/>
The `OpenAI` and `ChatOpenAI` objects are basically just configuration objects.
You can initialize them with parameters like `temperature` and others, and pass them around.
Next, let's use the `predict` method to run over a string input.
import InputString from "@snippets/get_started/quickstart/input_string.mdx"
<InputString/>
Finally, let's use the `predict_messages` method to run over a list of messages.
import InputMessages from "@snippets/get_started/quickstart/input_messages.mdx"
<InputMessages/>
For both these methods, you can also pass in parameters as key word arguments.
For example, you could pass in `temperature=0` to adjust the temperature that is used from what the object was configured with.
Whatever values are passed in during run time will always override what the object was configured with.
<ChatModel/>
## Prompt templates
@@ -51,108 +97,66 @@ Most LLM applications do not pass user input directly into an LLM. Usually they
In the previous example, the text we passed to the model contained instructions to generate a company name. For our application, it'd be great if the user only had to provide the description of a company/product, without having to worry about giving the model instructions.
PromptTemplates help with exactly this!
They bundle up all the logic for going from user input into a fully formatted prompt.
This can start off very simple - for example, a prompt to produce the above string would just be:
import PromptTemplateLLM from "@snippets/get_started/quickstart/prompt_templates_llms.mdx"
import PromptTemplateChatModel from "@snippets/get_started/quickstart/prompt_templates_chat_models.mdx"
<Tabs>
<TabItem value="llms" label="LLMs" default>
With PromptTemplates this is easy! In this case our template would be very simple:
<PromptTemplateLLM/>
</TabItem>
<TabItem value="chat_models" label="Chat models">
Similar to LLMs, you can make use of templating by using a `MessagePromptTemplate`. You can build a `ChatPromptTemplate` from one or more `MessagePromptTemplate`s. You can use `ChatPromptTemplate`'s `format_messages` method to generate the formatted messages.
However, the advantages of using these over raw string formatting are several.
You can "partial" out variables - eg you can format only some of the variables at a time.
You can compose them together, easily combining different templates into a single prompt.
For explanations of these functionalities, see the [section on prompts](/docs/modules/model_io/prompts) for more detail.
Because this is generating a list of messages, it is slightly more complex than the normal prompt template which is generating only a string. Please see the detailed guides on prompts to understand more options available to you here.
PromptTemplates can also be used to produce a list of messages.
In this case, the prompt not only contains information about the content, but also each message (its role, its position in the list, etc)
Here, what happens most often is a ChatPromptTemplate is a list of ChatMessageTemplates.
Each ChatMessageTemplate contains instructions for how to format that ChatMessage - its role, and then also its content.
Let's take a look at this below:
<PromptTemplateChatModel/>
</TabItem>
</Tabs>
## Chains
ChatPromptTemplates can also include other things besides ChatMessageTemplates - see the [section on prompts](/docs/modules/model_io/prompts) for more detail.
Now that we've got a model and a prompt template, we'll want to combine the two. Chains give us a way to link (or chain) together multiple primitives, like models, prompts, and other chains.
## Output Parsers
import ChainLLM from "@snippets/get_started/quickstart/chains_llms.mdx"
import ChainChatModel from "@snippets/get_started/quickstart/chains_chat_models.mdx"
OutputParsers convert the raw output of an LLM into a format that can be used downstream.
There are few main type of OutputParsers, including:
<Tabs>
<TabItem value="llms" label="LLMs" default>
- Convert text from LLM -> structured information (eg JSON)
- Convert a ChatMessage into just a string
- Convert the extra information returned from a call besides the message (like OpenAI function invocation) into a string.
The simplest and most common type of chain is an LLMChain, which passes an input first to a PromptTemplate and then to an LLM. We can construct an LLM chain from our existing model and prompt template.
For full information on this, see the [section on output parsers](/docs/modules/model_io/output_parsers)
<ChainLLM/>
In this getting started guide, we will write our own output parser - one that converts a comma separated list into a list.
There we go, our first chain! Understanding how this simple chain works will set you up well for working with more complex chains.
import OutputParser from "@snippets/get_started/quickstart/output_parser.mdx"
</TabItem>
<TabItem value="chat_models" label="Chat models">
<OutputParser/>
The `LLMChain` can be used with chat models as well:
## LLMChain
<ChainChatModel/>
</TabItem>
</Tabs>
We can now combine all these into one chain.
This chain will take input variables, pass those to a prompt template to create a prompt, pass the prompt to an LLM, and then pass the output through an (optional) output parser.
This is a convenient way to bundle up a modular piece of logic.
Let's see it in action!
## Agents
import LLMChain from "@snippets/get_started/quickstart/llm_chain.mdx"
import AgentLLM from "@snippets/get_started/quickstart/agents_llms.mdx"
import AgentChatModel from "@snippets/get_started/quickstart/agents_chat_models.mdx"
<LLMChain/>
Our first chain ran a pre-determined sequence of steps. To handle complex workflows, we need to be able to dynamically choose actions based on inputs.
## Next Steps
Agents do just this: they use a language model to determine which actions to take and in what order. Agents are given access to tools, and they repeatedly choose a tool, run the tool, and observe the output until they come up with a final answer.
This is it!
We've now gone over how to create the core building block of LangChain applications - the LLMChains.
There is a lot more nuance in all these components (LLMs, prompts, output parsers) and a lot more different components to learn about as well.
To continue on your journey:
To load an agent, you need to choose a(n):
- LLM/Chat model: The language model powering the agent.
- Tool(s): A function that performs a specific duty. This can be things like: Google Search, Database lookup, Python REPL, other chains. For a list of predefined tools and their specifications, see the [Tools documentation](/docs/modules/agents/tools/).
- Agent name: A string that references a supported agent class. An agent class is largely parameterized by the prompt the language model uses to determine which action to take. Because this notebook focuses on the simplest, highest level API, this only covers using the standard supported agents. If you want to implement a custom agent, see [here](/docs/modules/agents/how_to/custom_agent.html). For a list of supported agents and their specifications, see [here](/docs/modules/agents/agent_types/).
For this example, we'll be using SerpAPI to query a search engine.
You'll need to install the SerpAPI Python package:
```bash
pip install google-search-results
```
And set the `SERPAPI_API_KEY` environment variable.
<Tabs>
<TabItem value="llms" label="LLMs" default>
<AgentLLM/>
</TabItem>
<TabItem value="chat_models" label="Chat models">
Agents can also be used with chat models, you can initialize one using `AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION` as the agent type.
<AgentChatModel/>
</TabItem>
</Tabs>
## Memory
The chains and agents we've looked at so far have been stateless, but for many applications it's necessary to reference past interactions. This is clearly the case with a chatbot for example, where you want it to understand new messages in the context of past messages.
The Memory module gives you a way to maintain application state. The base Memory interface is simple: it lets you update state given the latest run inputs and outputs and it lets you modify (or contextualize) the next input using the stored state.
There are a number of built-in memory systems. The simplest of these is a buffer memory which just prepends the last few inputs/outputs to the current input - we will use this in the example below.
import MemoryLLM from "@snippets/get_started/quickstart/memory_llms.mdx"
import MemoryChatModel from "@snippets/get_started/quickstart/memory_chat_models.mdx"
<Tabs>
<TabItem value="llms" label="LLMs" default>
<MemoryLLM/>
</TabItem>
<TabItem value="chat_models" label="Chat models">
You can use Memory with chains and agents initialized with chat models. The main difference between this and Memory for LLMs is that rather than trying to condense all previous messages into a string, we can keep them as their own unique memory object.
<MemoryChatModel/>
</TabItem>
</Tabs>
- [Dive deeper](/docs/modules/model_io) into LLMs, prompts, and output parsers
- Learn the other [key components](/docs/modules)
- Check out our [helpful guides](/docs/guides) for detailed walkthroughs on particular topics
- Explore [end-to-end use cases](/docs/use_cases)

View File

@@ -0,0 +1,24 @@
---
sidebar_position: 3
---
# Comparison Evaluators
Comparison evaluators in LangChain help measure two different chain or LLM outputs. These evaluators are helpful for comparative analyses, such as A/B testing between two language models, or comparing different versions of the same model. They can also be useful for things like generating preference scores for ai-assisted reinforcement learning.
These evaluators inherit from the `PairwiseStringEvaluator` class, providing a comparison interface for two strings - typically, the outputs from two different prompts or models, or two versions of the same model. In essence, a comparison evaluator performs an evaluation on a pair of strings and returns a dictionary containing the evaluation score and other relevant details.
To create a custom comparison evaluator, inherit from the `PairwiseStringEvaluator` class and overwrite the `_evaluate_string_pairs` method. If you require asynchronous evaluation, also overwrite the `_aevaluate_string_pairs` method.
Here's a summary of the key methods and properties of a comparison evaluator:
- `evaluate_string_pairs`: Evaluate the output string pairs. This function should be overwritten when creating custom evaluators.
- `aevaluate_string_pairs`: Asynchronously evaluate the output string pairs. This function should be overwritten for asynchronous evaluation.
- `requires_input`: This property indicates whether this evaluator requires an input string.
- `requires_reference`: This property specifies whether this evaluator requires a reference label.
Detailed information about creating custom evaluators and the available built-in comparison evaluators are provided in the following sections.
import DocCardList from "@theme/DocCardList";
<DocCardList />

View File

@@ -0,0 +1,12 @@
---
sidebar_position: 5
---
# Examples
🚧 _Docs under construction_ 🚧
Below are some examples for inspecting and checking different chains.
import DocCardList from "@theme/DocCardList";
<DocCardList />

View File

@@ -0,0 +1,31 @@
---
sidebar_position: 6
---
import DocCardList from "@theme/DocCardList";
# Evaluation
Building applications with language models involves many moving parts. One of the most critical components is ensuring that the outcomes produced by your models are reliable and useful across a broad array of inputs, and that they work well with your application's other software components. Ensuring reliability usually boils down to some combination of application design, testing & evaluation, and runtime checks.
The guides in this section review the APIs and functionality LangChain provides to help yous better evaluate your applications. Evaluation and testing are both critical when thinking about deploying LLM applications, since production environments require repeatable and useful outcomes.
LangChain offers various types of evaluators to help you measure performance and integrity on diverse data, and we hope to encourage the the community to create and share other useful evaluators so everyone can improve. These docs will introduce the evaluator types, how to use them, and provide some examples of their use in real-world scenarios.
Each evaluator type in LangChain comes with ready-to-use implementations and an extensible API that allows for customization according to your unique requirements. Here are some of the types of evaluators we offer:
- [String Evaluators](/docs/guides/evaluation/string/): These evaluators assess the predicted string for a given input, usually comparing it against a reference string.
- [Trajectory Evaluators](/docs/guides/evaluation/trajectory/): These are used to evaluate the entire trajectory of agent actions.
- [Comparison Evaluators](/docs/guides/evaluation/comparison/): These evaluators are designed to compare predictions from two runs on a common input.
These evaluators can be used across various scenarios and can be applied to different chain and LLM implementations in the LangChain library.
We also are working to share guides and cookbooks that demonstrate how to use these evaluators in real-world scenarios, such as:
- [Chain Comparisons](/docs/guides/evaluation/examples/comparisons): This example uses a comparison evaluator to predict the preferred output. It reviews ways to measure confidence intervals to select statistically significant differences in aggregate preference scores across different models or prompts.
## Reference Docs
For detailed information on the available evaluators, including how to instantiate, configure, and customize them, check out the [reference documentation](https://api.python.langchain.com/en/latest/api_reference.html#module-langchain.evaluation) directly.
<DocCardList />

View File

@@ -0,0 +1,27 @@
---
sidebar_position: 2
---
# String Evaluators
A string evaluator is a component within LangChain designed to assess the performance of a language model by comparing its generated outputs (predictions) to a reference string or an input. This comparison is a crucial step in the evaluation of language models, providing a measure of the accuracy or quality of the generated text.
In practice, string evaluators are typically used to evaluate a predicted string against a given input, such as a question or a prompt. Often, a reference label or context string is provided to define what a correct or ideal response would look like. These evaluators can be customized to tailor the evaluation process to fit your application's specific requirements.
To create a custom string evaluator, inherit from the `StringEvaluator` class and implement the `_evaluate_strings` method. If you require asynchronous support, also implement the `_aevaluate_strings` method.
Here's a summary of the key attributes and methods associated with a string evaluator:
- `evaluation_name`: Specifies the name of the evaluation.
- `requires_input`: Boolean attribute that indicates whether the evaluator requires an input string. If True, the evaluator will raise an error when the input isn't provided. If False, a warning will be logged if an input _is_ provided, indicating that it will not be considered in the evaluation.
- `requires_reference`: Boolean attribute specifying whether the evaluator requires a reference label. If True, the evaluator will raise an error when the reference isn't provided. If False, a warning will be logged if a reference _is_ provided, indicating that it will not be considered in the evaluation.
String evaluators also implement the following methods:
- `aevaluate_strings`: Asynchronously evaluates the output of the Chain or Language Model, with support for optional input and label.
- `evaluate_strings`: Synchronously evaluates the output of the Chain or Language Model, with support for optional input and label.
The following sections provide detailed information on available string evaluator implementations as well as how to create a custom string evaluator.
import DocCardList from "@theme/DocCardList";
<DocCardList />

View File

@@ -0,0 +1,28 @@
---
sidebar_position: 4
---
# Trajectory Evaluators
Trajectory Evaluators in LangChain provide a more holistic approach to evaluating an agent. These evaluators assess the full sequence of actions taken by an agent and their corresponding responses, which we refer to as the "trajectory". This allows you to better measure an agent's effectiveness and capabilities.
A Trajectory Evaluator implements the `AgentTrajectoryEvaluator` interface, which requires two main methods:
- `evaluate_agent_trajectory`: This method synchronously evaluates an agent's trajectory.
- `aevaluate_agent_trajectory`: This asynchronous counterpart allows evaluations to be run in parallel for efficiency.
Both methods accept three main parameters:
- `input`: The initial input given to the agent.
- `prediction`: The final predicted response from the agent.
- `agent_trajectory`: The intermediate steps taken by the agent, given as a list of tuples.
These methods return a dictionary. It is recommended that custom implementations return a `score` (a float indicating the effectiveness of the agent) and `reasoning` (a string explaining the reasoning behind the score).
You can capture an agent's trajectory by initializing the agent with the `return_intermediate_steps=True` parameter. This lets you collect all intermediate steps without relying on special callbacks.
For a deeper dive into the implementation and use of Trajectory Evaluators, refer to the sections below.
import DocCardList from "@theme/DocCardList";
<DocCardList />

View File

@@ -0,0 +1,9 @@
# LangChain Expression Language
import DocCardList from "@theme/DocCardList";
LangChain Expression Language is a declarative way to easily compose chains together.
Any chain constructed this way will automatically have full sync, async, and streaming support.
See guides below for how to interact with chains constructed this way as well as cookbook examples.
<DocCardList />

View File

@@ -0,0 +1,6 @@
# Preventing harmful outputs
One of the key concerns with using LLMs is that they may generate harmful or unethical text. This is an area of active research in the field. Here we present some built-in chains inspired by this research, which are intended to make the outputs of LLMs safer.
- [Moderation chain](/docs/use_cases/safety/moderation): Explicitly check if any output text is harmful and flag it.
- [Constitutional chain](/docs/use_cases/safety/constitutional_chain): Prompt the model with a set of principles which should guide it's behavior.

View File

@@ -3,46 +3,80 @@ sidebar_position: 4
---
# Agents
Some applications require a flexible chain of calls to LLMs and other tools based on user input. The **Agent** interface provides the flexibility for such applications. An agent has access to a suite of tools, and determines which ones to use depending on the user input. Agents can use multiple tools, and use the output of one tool as the input to the next.
The core idea of agents is to use an LLM to choose a sequence of actions to take.
In chains, a sequence of actions is hardcoded (in code).
In agents, a language model is used as a reasoning engine to determine which actions to take and in which order.
There are two main types of agents:
There are several key components here:
- **Action agents**: at each timestep, decide on the next action using the outputs of all previous actions
- **Plan-and-execute agents**: decide on the full sequence of actions up front, then execute them all without updating the plan
## Agent
Action agents are suitable for small tasks, while plan-and-execute agents are better for complex or long-running tasks that require maintaining long-term objectives and focus. Often the best approach is to combine the dynamism of an action agent with the planning abilities of a plan-and-execute agent by letting the plan-and-execute agent use action agents to execute plans.
This is the class responsible for deciding what step to take next.
This is powered by a language model and a prompt.
This prompt can include things like:
For a full list of agent types see [agent types](/docs/modules/agents/agent_types/). Additional abstractions involved in agents are:
- [**Tools**](/docs/modules/agents/tools/): the actions an agent can take. What tools you give an agent highly depend on what you want the agent to do
- [**Toolkits**](/docs/modules/agents/toolkits/): wrappers around collections of tools that can be used together a specific use case. For example, in order for an agent to
interact with a SQL database it will likely need one tool to execute queries and another to inspect tables
1. The personality of the agent (useful for having it respond in a certain way)
2. Background context for the agent (useful for giving it more context on the types of tasks it's being asked to do)
3. Prompting strategies to invoke better reasoning (the most famous/widely used being [ReAct](https://arxiv.org/abs/2210.03629))
## Action agents
LangChain provides a few different types of agents to get started.
Even then, you will likely want to customize those agents with parts (1) and (2).
For a full list of agent types see [agent types](/docs/modules/agents/agent_types/)
At a high-level an action agent:
1. Receives user input
2. Decides which tool, if any, to use and the tool input
3. Calls the tool and records the output (also known as an "observation")
4. Decides the next step using the history of tools, tool inputs, and observations
5. Repeats 3-4 until it determines it can respond directly to the user
## Tools
Action agents are wrapped in **agent executors**, which are responsible for calling the agent, getting back an action and action input, calling the tool that the action references with the generated input, getting the output of the tool, and then passing all that information back into the agent to get the next action it should take.
Tools are functions that an agent calls.
There are two important considerations here:
Although an agent can be constructed in many ways, it typically involves these components:
1. Giving the agent access to the right tools
2. Describing the tools in a way that is most helpful to the agent
- **Prompt template**: Responsible for taking the user input and previous steps and constructing a prompt
to send to the language model
- **Language model**: Takes the prompt with use input and action history and decides what to do next
- **Output parser**: Takes the output of the language model and parses it into the next action or a final answer
Without both, the agent you are trying to build will not work.
If you don't give the agent access to a correct set of tools, it will never be able to accomplish the objective.
If you don't describe the tools properly, the agent won't know how to properly use them.
## Plan-and-execute agents
LangChain provides a wide set of tools to get started, but also makes it easy to define your own (including custom descriptions).
For a full list of tools, see [here](/docs/modules/agents/tools/)
At a high-level a plan-and-execute agent:
1. Receives user input
2. Plans the full sequence of steps to take
3. Executes the steps in order, passing the outputs of past steps as inputs to future steps
## Toolkits
The most typical implementation is to have the planner be a language model, and the executor be an action agent. Read more [here](/docs/modules/agents/agent_types/plan_and_execute.html).
Often the set of tools an agent has access to is more important than a single tool.
For this LangChain provides the concept of toolkits - groups of tools needed to accomplish specific objectives.
There are generally around 3-5 tools in a toolkit.
LangChain provides a wide set of toolkits to get started.
For a full list of toolkits, see [here](/docs/modules/agents/toolkits/)
## AgentExecutor
The agent executor is the runtime for an agent.
This is what actually calls the agent and executes the actions it chooses.
Pseudocode for this runtime is below:
```python
next_action = agent.get_action(...)
while next_action != AgentFinish:
observation = run(next_action)
next_action = agent.get_action(..., next_action, observation)
return next_action
```
While this may seem simple, there are several complexities this runtime handles for you, including:
1. Handling cases where the agent selects a non-existent tool
2. Handling cases where the tool errors
3. Handling cases where the agent produces output that cannot be parsed into a tool invocation
4. Logging and observability at all levels (agent decisions, tool calls) either to stdout or [LangSmith](https://smith.langchain.com).
## Other types of agent runtimes
The `AgentExecutor` class is the main agent runtime supported by LangChain.
However, there are other, more experimental runtimes we also support.
These include:
- [Plan-and-execute Agent](/docs/modules/agents/agent_types/plan_and_execute.html)
- [Baby AGI](/docs/use_cases/autonomous_agents/baby_agi.html)
- [Auto GPT](/docs/use_cases/autonomous_agents/autogpt.html)
## Get started

View File

@@ -3,8 +3,8 @@ sidebar_position: 3
---
# Toolkits
:::info
Head to [Integrations](/docs/integrations/toolkits/) for documentation on built-in toolkit integrations.
:::
Toolkits are collections of tools that are designed to be used together for specific tasks and have convenience loading methods.
import DocCardList from "@theme/DocCardList";
<DocCardList />

View File

@@ -1,2 +0,0 @@
label: 'How-to'
position: 0

View File

@@ -3,6 +3,10 @@ sidebar_position: 2
---
# Tools
:::info
Head to [Integrations](/docs/integrations/tools/) for documentation on built-in tool integrations.
:::
Tools are interfaces that an agent can use to interact with the world.
## Get started

View File

@@ -1 +0,0 @@
label: 'Integrations'

View File

@@ -1,2 +0,0 @@
label: 'How-to'
position: 0

View File

@@ -3,6 +3,10 @@ sidebar_position: 5
---
# Callbacks
:::info
Head to [Integrations](/docs/integrations/callbacks/) for documentation on built-in callbacks integrations with 3rd-party tools.
:::
LangChain provides a callbacks system that allows you to hook into the various stages of your LLM application. This is useful for logging, monitoring, streaming, and other tasks.
import GetStarted from "@snippets/modules/callbacks/get_started.mdx"

View File

@@ -1 +0,0 @@
label: 'Integrations'

View File

@@ -1,7 +0,0 @@
# Dynamically selecting from multiple prompts
This notebook demonstrates how to use the `RouterChain` paradigm to create a chain that dynamically selects the prompt to use for a given input. Specifically we show how to use the `MultiPromptChain` to create a question-answering chain that selects the prompt which is most relevant for a given question, and then answers the question using that prompt.
import Example from "@snippets/modules/chains/additional/multi_prompt_router.mdx"
<Example/>

View File

@@ -1,6 +1,6 @@
# Sequential
<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! Instead, edit the notebook w/the location & name as this file. -->
The next step after calling a language model is make a series of calls to a language model. This is particularly useful when you want to take the output from one call and use it as the input to another.

View File

@@ -1,2 +0,0 @@
label: 'How-to'
position: 0

View File

@@ -3,6 +3,10 @@ sidebar_position: 0
---
# Document loaders
:::info
Head to [Integrations](/docs/integrations/document_loaders/) for documentation on built-in document loader integrations with 3rd-party tools.
:::
Use document loaders to load data from a source as `Document`'s. A `Document` is a piece of text
and associated metadata. For example, there are document loaders for loading a simple `.txt` file, for loading the text
contents of any web page, or even for loading a transcript of a YouTube video.

View File

@@ -3,6 +3,10 @@ sidebar_position: 1
---
# Document transformers
:::info
Head to [Integrations](/docs/integrations/document_transformers/) for documentation on built-in document transformer integrations with 3rd-party tools.
:::
Once you've loaded documents, you'll often want to transform them to better suit your application. The simplest example
is you may want to split a long document into smaller chunks that can fit into your model's context window. LangChain
has a number of built-in document transformers that make it easy to split, combine, filter, and otherwise manipulate documents.

View File

@@ -1,2 +0,0 @@
label: 'How-to'
position: 0

View File

@@ -3,6 +3,10 @@ sidebar_position: 4
---
# Retrievers
:::info
Head to [Integrations](/docs/integrations/retrievers/) for documentation on built-in retriever integrations with 3rd-party tools.
:::
A retriever is an interface that returns documents given an unstructured query. It is more general than a vector store.
A retriever does not need to be able to store documents, only to return (or retrieve) it. Vector stores can be used
as the backbone of a retriever, but there are other types of retrievers as well.

View File

@@ -3,6 +3,10 @@ sidebar_position: 2
---
# Text embedding models
:::info
Head to [Integrations](/docs/integrations/text_embedding/) for documentation on built-in integrations with text embedding model providers.
:::
The Embeddings class is a class designed for interfacing with text embedding models. There are lots of embedding model providers (OpenAI, Cohere, Hugging Face, etc) - this class is designed to provide a standard interface for all of them.
Embeddings create a vector representation of a piece of text. This is useful because it means we can think about text in the vector space, and do things like semantic search where we look for pieces of text that are most similar in the vector space.

View File

@@ -3,11 +3,17 @@ sidebar_position: 3
---
# Vector stores
:::info
Head to [Integrations](/docs/integrations/vectorstores/) for documentation on built-in integrations with 3rd-party vector stores.
:::
One of the most common ways to store and search over unstructured data is to embed it and store the resulting embedding
vectors, and then at query time to embed the unstructured query and retrieve the embedding vectors that are
'most similar' to the embedded query. A vector store takes care of storing embedded data and performing vector search
for you.
![vector store diagram](/img/vector_stores.jpg)
## Get started
This walkthrough showcases basic functionality related to VectorStores. A key part of working with vector stores is creating the vector to put in them, which is usually created via embeddings. Therefore, it is recommended that you familiarize yourself with the [text embedding model](/docs/modules/data_connection/text_embedding/) interfaces before diving into this.
@@ -15,3 +21,11 @@ This walkthrough showcases basic functionality related to VectorStores. A key pa
import GetStarted from "@snippets/modules/data_connection/vectorstores/get_started.mdx"
<GetStarted/>
## Asynchronous operations
Vector stores are usually run as a separate service that requires some IO operations, and therefore they might be called asynchronously. That gives performance benefits as you don't waste time waiting for responses from external services. That might also be important if you work with an asynchronous framework, such as [FastAPI](https://fastapi.tiangolo.com/).
import AsyncVectorStore from "@snippets/modules/data_connection/vectorstores/async.mdx"
<AsyncVectorStore/>

View File

@@ -17,4 +17,6 @@ Let chains choose which tools to use given high-level directives
#### [Memory](/docs/modules/memory/)
Persist application state between runs of a chain
#### [Callbacks](/docs/modules/callbacks/)
Log and stream intermediate steps of any chain
Log and stream intermediate steps of any chain
#### [Evaluation](/docs/modules/evaluation/)
Evaluate the performance of a chain.

View File

@@ -0,0 +1,17 @@
---
sidebar_position: 1
---
# Chat Messages
:::info
Head to [Integrations](/docs/integrations/memory/) for documentation on built-in memory integrations with 3rd-party databases and tools.
:::
One of the core utility classes underpinning most (if not all) memory modules is the `ChatMessageHistory` class.
This is a super lightweight wrapper which exposes convenience methods for saving Human messages, AI messages, and then fetching them all.
You may want to use this class directly if you are managing memory outside of a chain.
import GetStarted from "@snippets/modules/memory/chat_messages/get_started.mdx"
<GetStarted/>

View File

@@ -1,2 +0,0 @@
label: 'How-to'
position: 0

View File

@@ -1,30 +1,62 @@
---
sidebar_position: 3
---
# Memory
🚧 _Docs under construction_ 🚧
Most LLM applications have a conversational interface. An essential component of a conversation is being able to refer to information introduced earlier in the conversation.
At bare minimum, a conversational system should be able to access some window of past messages directly.
A more complex system will need to have a world model that it is constantly updating, which allows it to do things like maintain information about entities and their relationships.
By default, Chains and Agents are stateless,
meaning that they treat each incoming query independently (like the underlying LLMs and chat models themselves).
In some applications, like chatbots, it is essential
to remember previous interactions, both in the short and long-term.
The **Memory** class does exactly that.
We call this ability to store information about past interactions "memory".
LangChain provides a lot of utilities for adding memory to a system.
These utilities can be used by themselves or incorporated seamlessly into a chain.
LangChain provides memory components in two forms.
First, LangChain provides helper utilities for managing and manipulating previous chat messages.
These are designed to be modular and useful regardless of how they are used.
Secondly, LangChain provides easy ways to incorporate these utilities into chains.
A memory system needs to support two basic actions: reading and writing.
Recall that every chain defines some core execution logic that expects certain inputs.
Some of these inputs come directly from the user, but some of these inputs can come from memory.
A chain will interact with its memory system twice in a given run.
1. AFTER receiving the initial user inputs but BEFORE executing the core logic, a chain will READ from its memory system and augment the user inputs.
2. AFTER executing the core logic but BEFORE returning the answer, a chain will WRITE the inputs and outputs of the current run to memory, so that they can be referred to in future runs.
![memory-diagram](/img/memory_diagram.png)
## Building memory into a system
The two core design decisions in any memory system are:
- How state is stored
- How state is queried
### Storing: List of chat messages
Underlying any memory is a history of all chat interactions.
Even if these are not all used directly, they need to be stored in some form.
One of the key parts of the LangChain memory module is a series of integrations for storing these chat messages,
from in-memory lists to persistent databases.
- [Chat message storage](/docs/modules/memory/chat_messages/): How to work with Chat Messages, and the various integrations offered
### Querying: Data structures and algorithms on top of chat messages
Keeping a list of chat messages is fairly straight-forward.
What is less straight-forward are the data structures and algorithms built on top of chat messages that serve a view of those messages that is most useful.
A very simply memory system might just return the most recent messages each run. A slightly more complex memory system might return a succinct summary of the past K messages.
An even more sophisticated system might extract entities from stored messages and only return information about entities referenced in the current run.
Each application can have different requirements for how memory is queried. The memory module should make it easy to both get started with simple memory systems and write your own custom systems if needed.
- [Memory types](/docs/modules/memory/types/): The various data structures and algorithms that make up the memory types LangChain supports
## Get started
Memory involves keeping a concept of state around throughout a user's interactions with an language model. A user's interactions with a language model are captured in the concept of ChatMessages, so this boils down to ingesting, capturing, transforming and extracting knowledge from a sequence of chat messages. There are many different ways to do this, each of which exists as its own memory type.
In general, for each type of memory there are two ways to understanding using memory. These are the standalone functions which extract information from a sequence of messages, and then there is the way you can use this type of memory in a chain.
Memory can return multiple pieces of information (for example, the most recent N messages and a summary of all previous messages). The returned information can either be a string or a list of messages.
Let's take a look at what Memory actually looks like in LangChain.
Here we'll cover the basics of interacting with an arbitrary memory class.
import GetStarted from "@snippets/modules/memory/get_started.mdx"
<GetStarted/>
## Next steps
And that's it for getting started!
Please see the other sections for walkthroughs of more advanced topics,
like custom memory, multiple memories, and more.

View File

@@ -1 +0,0 @@
label: 'Integrations'

View File

@@ -4,6 +4,6 @@ This notebook shows how to use `ConversationBufferMemory`. This memory allows fo
We can first extract it as a string.
import Example from "@snippets/modules/memory/how_to/buffer.mdx"
import Example from "@snippets/modules/memory/types/buffer.mdx"
<Example/>

View File

@@ -4,6 +4,6 @@
Let's first explore the basic functionality of this type of memory.
import Example from "@snippets/modules/memory/how_to/buffer_window.mdx"
import Example from "@snippets/modules/memory/types/buffer_window.mdx"
<Example/>

View File

@@ -4,6 +4,6 @@ Entity Memory remembers given facts about specific entities in a conversation. I
Let's first walk through using this functionality.
import Example from "@snippets/modules/memory/how_to/entity_summary_memory.mdx"
import Example from "@snippets/modules/memory/types/entity_summary_memory.mdx"
<Example/>

View File

@@ -0,0 +1,8 @@
---
sidebar_position: 2
---
# Memory Types
There are many different types of memory.
Each have their own parameters, their own return types, and are useful in different scenarios.
Please see their individual page for more detail on each one.

View File

@@ -4,6 +4,6 @@ Conversation summary memory summarizes the conversation as it happens and stores
Let's first explore the basic functionality of this type of memory.
import Example from "@snippets/modules/memory/how_to/summary.mdx"
import Example from "@snippets/modules/memory/types/summary.mdx"
<Example/>

View File

@@ -6,6 +6,6 @@ This differs from most of the other Memory classes in that it doesn't explicitly
In this case, the "docs" are previous conversation snippets. This can be useful to refer to relevant pieces of information that the AI was told earlier in the conversation.
import Example from "@snippets/modules/memory/how_to/vectorstore_retriever_memory.mdx"
import Example from "@snippets/modules/memory/types/vectorstore_retriever_memory.mdx"
<Example/>

View File

@@ -1,2 +0,0 @@
label: 'How-to'
position: 0

View File

@@ -3,18 +3,16 @@ sidebar_position: 1
---
# Chat models
:::info
Head to [Integrations](/docs/integrations/chat/) for documentation on built-in integrations with chat model providers.
:::
Chat models are a variation on language models.
While chat models use language models under the hood, the interface they expose is a bit different.
Rather than expose a "text in, text out" API, they expose an interface where "chat messages" are the inputs and outputs.
Chat model APIs are fairly new, so we are still figuring out the correct abstractions.
The following sections of documentation are provided:
- **How-to guides**: Walkthroughs of core functionality, like streaming, creating chat prompts, etc.
- **Integrations**: How to use different chat model providers (OpenAI, Anthropic, etc).
## Get started
import GetStarted from "@snippets/modules/model_io/models/chat/get_started.mdx"

View File

@@ -1,2 +0,0 @@
label: 'How-to'
position: 0

View File

@@ -3,14 +3,12 @@ sidebar_position: 0
---
# LLMs
:::info
Head to [Integrations](/docs/integrations/llms/) for documentation on built-in integrations with LLM providers.
:::
Large Language Models (LLMs) are a core component of LangChain.
LangChain does not serve it's own LLMs, but rather provides a standard interface for interacting with many different LLMs.
For more detailed documentation check out our:
- **How-to guides**: Walkthroughs of core functionality, like streaming, async, etc.
- **Integrations**: How to use different LLM providers (OpenAI, Anthropic, etc.)
LangChain does not serve its own LLMs, but rather provides a standard interface for interacting with many different LLMs.
## Get started

View File

@@ -0,0 +1 @@
label: 'How to'

View File

@@ -2,7 +2,7 @@
sidebar_position: 2
---
# Conversational Retrieval QA
# Store and reference chat history
The ConversationalRetrievalQA chain builds on RetrievalQAChain to provide a chat history component.
It first combines the chat history (either explicitly passed in or retrieved from the provided memory) and the question into a standalone question, then looks up relevant documents from the retriever, and finally passes those documents and the question to a question answering chain to return a response.

View File

@@ -1,4 +1,4 @@
# Dynamically selecting from multiple retrievers
# Dynamically select from multiple retrievers
This notebook demonstrates how to use the `RouterChain` paradigm to create a chain that dynamically selects which Retrieval system to use. Specifically we show how to use the `MultiRetrievalQAChain` to create a question-answering chain that selects the retrieval QA chain which is most relevant for a given question, and then answers the question using it.

View File

@@ -1,4 +1,4 @@
# Document QA
# QA over in-memory documents
Here we walk through how to use LangChain for question answering over a list of documents. Under the hood we'll be using our [Document chains](/docs/modules/chains/document/).

View File

@@ -1,7 +1,7 @@
---
sidebar_position: 1
---
# Retrieval QA
# QA using a Retriever
This example showcases question answering over an index.

Some files were not shown because too many files have changed in this diff Show More