Compare commits

...

1168 Commits

Author SHA1 Message Date
Lance Martin
04fa5bd65f Merge branch 'master' into rlm/sql-pgvector-template 2023-11-13 15:31:16 -08:00
Lance Martin
e549397001 fmt 2023-11-13 15:30:48 -08:00
wemysschen
a591cdb67d add cookbook for RAG with baidu QIANFAN and elasticsearch (#13287)
**Description:** 
Add cookbook for RAG with baidu QIANFAN and elasticsearch.

Co-authored-by: wemysschen <root@icoding-cwx.bcc-szzj.baidu.com>
2023-11-13 14:45:24 -08:00
mertkayhan
9b4974871d IMPROVEMENT Increase flexibility of ElasticVectorSearch (#6863)
Hey @rlancemartin, @eyurtsev ,

I did some minimal changes to the `ElasticVectorSearch` client so that
it plays better with existing ES indices.

Main changes are as follows:

1. You can pass the dense vector field name into `_default_script_query`
2. You can pass a custom script query implementation and the respective
parameters to `similarity_search_with_score`
3. You can pass functions for building page content and metadata for the
resulting `Document`
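
A minimal sketch of what change 1 could look like, assuming a standalone helper rather than the actual `ElasticVectorSearch` method. The name `build_script_query` and its parameters are hypothetical; only the Elasticsearch `script_score` query shape and the `cosineSimilarity` script function are standard.

```python
from typing import Any, Dict, List, Optional


def build_script_query(
    query_vector: List[float],
    vector_field: str = "vector",
    filter: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a script_score query over a configurable dense vector field.

    Hypothetical helper: it only illustrates passing the field name in
    instead of hard-coding it.
    """
    # Match every document unless the caller supplies a filter query.
    inner_query = filter if filter is not None else {"match_all": {}}
    return {
        "script_score": {
            "query": inner_query,
            "script": {
                # Adding 1.0 keeps the score non-negative, as Elasticsearch requires.
                "source": f"cosineSimilarity(params.query_vector, '{vector_field}') + 1.0",
                "params": {"query_vector": query_vector},
            },
        }
    }
```

With this shape, an existing index that stores its vectors under, say, `embedding` can be queried by passing `vector_field="embedding"`.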

2023-11-13 14:36:03 -08:00
Lance Martin
39852dffd2 Cookbook for multi-modal RAG eval (#13272) 2023-11-13 14:26:02 -08:00
Erick Friis
50a5c919f0 IMPROVEMENT self-query template (#13305)
- [ ]
https://github.com/langchain-ai/langchain/pull/12694#discussion_r1391334719
-> keep date
- [x]
https://github.com/langchain-ai/langchain/pull/12694#discussion_r1391336586
2023-11-13 14:03:15 -08:00
Lance Martin
0686096728 fmt 2023-11-13 13:47:05 -08:00
Lance Martin
adfe14001a fmt 2023-11-13 13:28:09 -08:00
Yasin
b46f88d364 IMPROVEMENT add license file to subproject (#8403)
hi!
This is pretty straightforward: the sdist package does not contain the
license file (which is needed by e.g. conda) because the package is
built from the subdir and can't see the license.
I _copied_ the license, but since I'm unfamiliar with the project's
direction, I'm not sure that's correct.
thanks!

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-13 11:48:21 -08:00
Rui Ramos
ff19a62afc Fix Pinecone cosine relevance score (#8920)
Fixes: #8207

Description:
Pinecone returns scores (not distances) with cosine similarity. The
values according to the docs are [-1, 1], although I could never
reproduce negative values.

This PR ensures that the score returned from Pinecone is preserved,
rather than inverted, so the most relevant documents can be filtered (eg
when using similarity thresholds)
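
To illustrate why preserving the score matters for thresholds, here is a small sketch. The function name and the sample data are made up for illustration and are not the vector store's actual code.

```python
from typing import List, Tuple


def filter_by_score(
    results: List[Tuple[str, float]], threshold: float
) -> List[Tuple[str, float]]:
    """Keep results whose cosine similarity score meets the threshold.

    The score is used as returned (higher means more relevant); it is not
    converted with something like `1 - score`, which would flip the ranking.
    """
    return [(doc, score) for doc, score in results if score >= threshold]


# Hypothetical (document, score) pairs as a similarity search might return them.
hits = [("doc-a", 0.91), ("doc-b", 0.35), ("doc-c", 0.78)]
relevant = filter_by_score(hits, threshold=0.5)
```

If the scores were inverted first, the same threshold would keep the least relevant documents instead.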

I'll leave this as a draft PR as I couldn't run the tests (my pinecone
account might not be enough - some errors were being thrown around
namespaces) so hopefully someone who _can_ will pick this up.

Maintainers:
@rlancemartin, @eyurtsev

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-13 11:47:38 -08:00
Bagatur
2e42ed5de6 Self-query template (#12694)
Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-13 11:44:19 -08:00
Konstantin Spieß
1e43025bf5 Fix serialization issue in Matching Engine Vector Store (#13266)
- **Description:** Fixed a serialization issue in the add_texts method
of the Matching Engine Vector Store caused by a typo, leading to an
attempt to serialize the json module itself.
  - **Issue:** #12154 
  - **Dependencies:** ./.
  - **Tag maintainer:**
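
For context, this is the kind of failure that passing the `json` module to the serializer produces. The snippet is a standalone illustration, not the code from `add_texts`, and the helper names are hypothetical.

```python
import json


def serialize_metadata(metadata: dict) -> str:
    """Serialize the intended object (a plain dict)."""
    return json.dumps(metadata)


def try_serialize_module() -> str:
    """Show what happens when the module itself is passed by mistake."""
    try:
        json.dumps(json)  # a module object is not JSON serializable
    except TypeError as exc:
        return type(exc).__name__
    return "no error"
```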
2023-11-13 11:04:11 -08:00
William FH
9169d77cf6 Update error message in evaluation runner (#13296) 2023-11-13 11:03:20 -08:00
Leonie
32c493e3df Refine Weaviate docs and add RAG example (#13057)
- **Description:** Refine Weaviate tutorial and add an example for
Retrieval-Augmented Generation (RAG)
  - **Issue:** (not applicable),
  - **Dependencies:** none
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** @helloiamleonie

Co-authored-by: Leonie <leonie@Leonies-MBP-2.fritz.box>
2023-11-13 10:59:19 -08:00
takatost
f22f273f93 FIX: 'from_texts' method in Weaviate with non-existent kwargs param (#11604)
Due to the possibility of external inputs including UUIDs, there may be
additional values in `**kwargs`, while Weaviate's `__init__` method does
not support passing extra `**kwargs` parameters.
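
One possible way to guard against this, sketched with a stand-in class: keep only the keyword arguments the constructor actually declares. `StoreSketch` and `accepted_kwargs` are hypothetical names, and this is not necessarily how the fix was implemented.

```python
import inspect
from typing import Any, Dict


class StoreSketch:
    """Stand-in for a vector store whose __init__ takes no **kwargs."""

    def __init__(self, client: Any, index_name: str, text_key: str) -> None:
        self.client = client
        self.index_name = index_name
        self.text_key = text_key


def accepted_kwargs(cls: type, kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Drop keyword arguments that cls.__init__ does not declare."""
    params = inspect.signature(cls.__init__).parameters
    return {key: value for key, value in kwargs.items() if key in params}
```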

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-13 10:32:20 -08:00
Frank995
971d2b2e34 Add missing filter to max_marginal_relevance_search inner call to max_marginal_relevance_search_by_vector (#13260)
When calling max_marginal_relevance_search from PGVector the filter
param is not carried over to max_marginal_relevance_search_by_vector
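
The shape of the fix can be sketched with a toy class. `VectorStoreSketch` is a hypothetical stand-in (not PGVector) that records the filter it receives, and the embedding step is a placeholder.

```python
from typing import Any, Dict, List, Optional


class VectorStoreSketch:
    """Toy store that records which filter reaches the by-vector search."""

    def __init__(self) -> None:
        self.seen_filters: List[Optional[Dict[str, Any]]] = []

    def max_marginal_relevance_search_by_vector(
        self,
        embedding: List[float],
        k: int = 4,
        filter: Optional[Dict[str, Any]] = None,
        **kwargs: Any,
    ) -> List[str]:
        self.seen_filters.append(filter)
        return []

    def max_marginal_relevance_search(
        self,
        query: str,
        k: int = 4,
        filter: Optional[Dict[str, Any]] = None,
        **kwargs: Any,
    ) -> List[str]:
        embedding = [float(len(query))]  # placeholder for a real embedding call
        # Forward the filter explicitly so it is not silently dropped.
        return self.max_marginal_relevance_search_by_vector(
            embedding, k=k, filter=filter, **kwargs
        )
```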

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-13 10:31:34 -08:00
chevalmuscle
3ad78e48e2 Use endpoint_url if provided with boto3 session for dynamodb (#11622)
- **Description:** Uses `endpoint_url` if provided with a boto3 session.
When running dynamodb locally, credentials are required even if invalid.
With this change, it will be possible to pass a boto3 session with
credentials and specify an endpoint_url

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-13 10:31:16 -08:00
Erick Friis
18acc22f29 Ollama pass kwargs as options instead of top (#13280)
Noticed params are really in `options` instead while reviewing #12895
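
Roughly, the idea is that model parameters are nested under an `options` key of the request body rather than sitting next to `model` and `prompt`. The helper below is a hypothetical sketch, not the integration's code.

```python
from typing import Any, Dict


def build_payload(model: str, prompt: str, **kwargs: Any) -> Dict[str, Any]:
    """Nest extra keyword arguments under "options" instead of the top level."""
    return {"model": model, "prompt": prompt, "options": dict(kwargs)}
```

For example, `build_payload("llama2", "hi", temperature=0.2)` places `temperature` inside `options`.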
2023-11-13 10:28:47 -08:00
刘 方瑞
46af56dc4f Add MyScaleWithoutJSON which allows user to wrap columns into Document's Metadata (#13164)
- **Description:** Add MyScaleWithoutJSON which allows user to wrap
columns into Document's Metadata
  - **Tag maintainer:** @baskaryan
2023-11-13 10:10:36 -08:00
Michael Landis
2aa13f1e10 chore: bump momento dependency version and refactor search hit usage (#13111)
**Description**

Bumps the Momento dependency to the latest version and refactors the
usage of `SearchHit` in the Momento Vector Index (MVI) vector store
integration. This change is a one liner where we use the preferred
attribute `score` to read the query-document similarity instead of
`distance`. The latest versions of Momento clients will use this
attribute going forward.

**Dependencies**

Updated the Momento dependency to latest version.

**Tests**

💚 I re-ran the existing MVI integration tests
(`tests/integration_tests/vectorstores/test_momento_vector_index.py`)
and they pass.

**Review**
cc @baskaryan @eyurtsev
2023-11-13 09:12:21 -08:00
Junlin Zhou
4da2faba41 docs: align custom_tool document headers (#13252)
On the [Defining Custom
Tools](https://python.langchain.com/docs/modules/agents/tools/custom_tools)
page, there's a 'Subclassing the BaseTool class' paragraph under the
'Completely New Tools - String Input and Output' header. Also there's
another 'Subclassing the BaseTool' paragraph under no header, which I
think may belong to the 'Custom Structured Tools' header.

In addition, there are 'Using the tool decorator' and 'Using the
decorator' paragraphs, which I think belong under 'Completely New Tools -
String Input and Output' and 'Custom Structured Tools' respectively.

This PR moves those paragraphs to corresponding headers.
2023-11-13 09:03:56 -08:00
Ikko Eltociear Ashimine
700293cae9 Fix typo in timescalevector.ipynb (#13239)
enviornment -> environment
2023-11-13 09:03:07 -08:00
kYLe
cc55d2fcee Add OpenAI API v1 support for ChatAnyscale and fixed a bug with openai_api_key (#13237)
1. Add OpenAI API v1 support
2. Fixed a bug to call `get_secret_value` on a str value
(values["openai_api_key"])
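
A sketch of the kind of guard that avoids this bug, using a stand-in secret type. `SecretSketch` and `to_plain` are hypothetical names; the real code presumably uses pydantic's `SecretStr`.

```python
from typing import Union


class SecretSketch:
    """Minimal stand-in for a secret wrapper exposing get_secret_value()."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        return self._value


def to_plain(value: Union[str, SecretSketch]) -> str:
    """Return the raw key whether it arrives wrapped or as a plain str."""
    if hasattr(value, "get_secret_value"):
        return value.get_secret_value()
    # A plain str has no get_secret_value(), so calling it would raise.
    return value
```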
2023-11-13 09:01:54 -08:00
juan-calvo-datatonic
545b76b0fd Add rag google vertex ai search template (#13294)
- **Description:** This is a template demonstrating how to utilize
Google Vertex AI Search in conjunction with ChatVertexAI()
2023-11-13 08:45:36 -08:00
Govind.S.B
9024593468 added system prompt and template fields to ollama (#13022)
**Description**
The Ollama API now supports passing a system prompt and template directly
instead of modifying the model file, but the Ollama integration in
LangChain had not been updated for this change. This update adds these
two parameters (two more parameters are still pending; I was not sure
about their utility with respect to LangChain).
Refer :
8713ac23a8

**Issue** : None Applicable

**Dependencies** : None Changed

**Twitter handle** : https://twitter.com/violetto96

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-11-13 08:45:11 -08:00
langchain-infra
f55f67055f Add dockerfile template (#13240) 2023-11-13 10:33:01 -05:00
Shaurya Rohatgi
f70aa82c84 Update README.md - Added notebook for extraction_openai_tools (#13205)
added Parallel Function Calling for Structured Data Extraction notebook

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-13 00:12:46 -08:00
Guillem Orellana Trullols
0f31cd8b49 Remove _get_kwarg_value function (#13184)
The `_get_kwarg_value` function is redundant: Python's built-in
functionality does the exact same thing.

- **Description:** Removed `_get_kwarg_value`. Helps with code
readability.
  - **Twitter handle:** @Guillem_96
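
As an illustration only, assuming the removed helper looked up a keyword argument with a fallback (the source does not show its body), the built-in equivalent is `dict.get`. The function name and the default below are hypothetical.

```python
from typing import Any


def pick_model(default_model: str = "base-model", **kwargs: Any) -> str:
    """Read an optional keyword argument with a fallback via dict.get."""
    return kwargs.get("model", default_model)
```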
2023-11-13 00:09:54 -08:00
SuperDa Fu
e1c020dfe1 dalle add model parameter (#13201)
- **Description:** dalle_image_generator adding a new model parameter,
  - **Issue:** N/A,
  - **Dependencies:** 
  - **Tag maintainer:** @hwchase17
  - **Twitter handle:**

---------

Co-authored-by: dafu <xiangbingze@wenru.wang>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
2023-11-13 00:09:20 -08:00
Mario Angst
96b56a4d4f Typo fix to quickstart.mdx (#13178)
- **Description:** I fixed a very small typo in the quickstart docs
(BaeMessage -> BaseMessage)
2023-11-13 00:02:18 -08:00
Dennis de Greef
64e11592bb Improve CSV reader which can't call .strip() on NoneType (#13079)
Improve CSV reader which can't call .strip() on NoneType if there are
fewer cells in the row compared to the header

- **Description:**
I have a CSV file as follows

```
headerA,headerB,headerC
v1A,v1B,v1C,
v2A,v2B
v3A,v3B,v3C
```
In this case, row 2 is missing a value, which results in reading a None
type. The strip() method cannot be called on None, hence raising. In
this PR I am making the change to only call strip if the value is not
None.
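
The behaviour can be reproduced with the standard library alone. This sketch uses `csv.DictReader` on a simplified version of the sample (without the trailing comma on row 1); `read_rows` is a hypothetical helper, not the loader's code.

```python
import csv
import io
from typing import Dict, List, Optional

SAMPLE = "headerA,headerB,headerC\nv1A,v1B,v1C\nv2A,v2B\nv3A,v3B,v3C\n"


def read_rows(text: str) -> List[Dict[str, Optional[str]]]:
    """Strip each cell, skipping cells that DictReader filled with None."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for row in reader:
        # Short rows get None for the missing columns, so guard before strip().
        rows.append({k: (v.strip() if v is not None else v) for k, v in row.items()})
    return rows


rows = read_rows(SAMPLE)
```

Here the second data row is shorter than the header, so its `headerC` value is `None` and is left untouched instead of raising.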
2023-11-12 23:51:39 -08:00
glad4enkonm
339973db47 Update ollama.py (#12895)
Removed a duplicate option.
**Description:** An issue fix: the duplicate HTTP stop option was removed.
**Issue:** fixes #12892
**Dependencies:** none
**Tag maintainer:** @eyurtsev

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-12 23:43:59 -08:00
刘 方瑞
e89e830c55 Free knowledge base pod information update (#12813)
We updated the MyScale free knowledge base, where you can try your RAG
with 36 million paragraphs from Wikipedia and 2 million paragraphs from
ArXiv.

The pod has two tables:
```sql
CREATE TABLE default.ChatArXiv (
    `abstract` String, 
    `id` String, 
    `vector` Array(Float32), 
    `metadata` Object('JSON'), 
    `pubdate` DateTime,
    `title` String,
    `categories` Array(String),
    `authors` Array(String), 
    `comment` String,
    `primary_category` String,
    VECTOR INDEX vec_idx vector TYPE MSTG('metric_type=Cosine'), 
    CONSTRAINT vec_len CHECK length(vector) = 768) 
ENGINE = ReplacingMergeTree ORDER BY id;

CREATE TABLE wiki.Wikipedia (
    `id` String, 
    `title` String, 
    `text` String,
    `url` String,
    `wiki_id` UInt64,
    `views` Float32,
    `paragraph_id` UInt64,
    `langs` UInt32, 
    `emb` Array(Float32), 
    VECTOR INDEX emb_idx emb TYPE MSTG('metric_type=Cosine'), 
    CONSTRAINT emb_len CHECK length(emb) = 768) 
ENGINE = ReplacingMergeTree ORDER BY id;
```

You can connect to those two tables using the credentials below (the same
as the old ones):
URL: `msc-4a9e710a.us-east-1.aws.staging.myscale.cloud`
Port: `443`
Username: `chatdata`
Password: `myscale_rocks`

It's FREE and you can also use it with 
ChatData: https://github.com/myscale/ChatData
Retrieval-QA-Benchmark:
https://github.com/myscale/Retrieval-QA-Benchmark
... and also LangChain!

Request for review @baskaryan
2023-11-12 23:22:42 -08:00
Luis Valencia
c40973814d Update README.md (#8570)
- Description: updated readme.
  - Tag maintainer: @baskaryan
  - Twitter handle: @Levalencia

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2023-11-12 22:07:49 -08:00
Manuel Soria
f7f6aebc8d add code to generate embeddings 2023-11-12 18:41:41 -03:00
Manuel Soria
362e1c5233 x 2023-11-12 18:27:46 -03:00
Manuel Soria
e65b73157e adapting readme file 2023-11-12 18:27:36 -03:00
Manuel Soria
ee7c68e8b9 seprating prompts as they are too long 2023-11-12 18:15:42 -03:00
Manuel Soria
597a8c084d replacing chain 2023-11-12 17:54:19 -03:00
Manuel Soria
97d4e028f4 creating template folder 2023-11-12 17:43:41 -03:00
Isak Nyberg
8f81703d76 Add new models to openai callback (#13244)
**Description:** Adding the new models to the openai callback function,
info taken from [model
announcement](https://platform.openai.com/docs/models) and
[pricing](https://openai.com/pricing)

A short description for a short PR :)
2023-11-12 12:01:19 -08:00
Bagatur
ea6dd3a550 bump 335 (#13261) 2023-11-12 11:30:25 -08:00
William FH
a837b03e55 Update langsmith version 0.63 (#13208) 2023-11-12 11:29:25 -08:00
Harrison Chase
7f1d26160d update tools (#13243) 2023-11-12 10:22:54 -08:00
Nuno Campos
8d6faf5665 Make it easier to subclass runnable binding with custom init args (#13189) 2023-11-11 09:01:17 +00:00
Peter Vandenabeele
7f1964b264 Fix BeautifulSoupTransformer: no more duplicates and correct order of tags + tests (#12596) 2023-11-11 08:56:37 +00:00
Bagatur
937d7c41f3 update stack diagram (#13213) 2023-11-10 16:50:20 -08:00
Erick Friis
9c7afa8adb Upgrade cohere embedding model to v3 (#13219)
Just updates API docs, doesn't change default param from 2.0 (could be
breaking change)
2023-11-10 16:25:58 -08:00
Matvey Arye
180657ca7a Add template for conversational rag with timescale vector (#13041)
**Description:** This is like the rag-conversation template in many
ways. What's different is:
- support for a timescale vector store.
- support for time-based filters.
- support for metadata filters.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-10 16:12:32 -08:00
Andrew Zhou
1a1a1a883f fleet_context docs update (#13221)
- **Description:** Changed the fleet_context documentation to use
`context.download_embeddings()` from the latest release from our
package. More details here:
https://github.com/fleet-ai/context/tree/main#api
  - **Issue:** n/a
  - **Dependencies:** n/a
  - **Tag maintainer:** @baskaryan 
  - **Twitter handle:** @andrewthezhou
2023-11-10 14:53:57 -08:00
Erick Friis
8fdf15c023 Fix Document Loader Unit Test - Docusaurus (#13228) 2023-11-10 14:52:01 -08:00
Lee
72ad448daa feat: Docusaurus Loader (#9138)
Added a Docusaurus Loader

Issue: #6353

I had to implement this for working with the Ionic documentation, and
wanted to open this up as a draft to get some guidance on building this
out further. I wasn't sure if having it be a light extension of the
SitemapLoader was in the spirit of a proper feature for the library --
but I'm grateful for the opportunities Langchain has given me and I'd
love to build this out properly for the sake of the community.

Any feedback welcome!
2023-11-10 14:21:55 -08:00
VAS
8fa960641a Update Documentation: Corrected Typos and Improved Clarity (#11725)
Docs updates

---------

Co-authored-by: Advaya <126754021+bluevayes@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-10 14:14:44 -08:00
Leonid Ganeline
e165daa0ae new course on DeepLearning.ai (#12755)
Added a new course on
[DeepLearning.ai](https://learn.deeplearning.ai/functions-tools-agents-langchain)
Added the LangChain `Wikipedia` link. Probably, it can be placed in the
"More" menu.
2023-11-10 13:55:27 -08:00
Erick Friis
93ae589f1b Add mongo parent template to index (#13222) 2023-11-10 11:56:44 -08:00
Tomaz Bratanic
0dc4ab0be1 Neo4j chat message history (#13008) 2023-11-10 11:53:34 -08:00
Bagatur
bf8cf7e042 Bagatur/langserve blurb (#13217) 2023-11-10 14:05:43 -05:00
fyasla
d266b3ea4a issue #12165 mask API key in chat_models/azureml_endpoint module (#12836)
- **Description:** Makes the `AzureMLChatOnlineEndpoint` object from
langchain/chat_models/azureml_endpoint.py safe to print
without including any secrets in raw format in the string
representation.
  - **Issue:** #12165,
  - **Tag maintainer:** @eyurtsev
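
The general technique can be sketched as follows: wrap the key in a type whose string forms are masked. `MaskedSecret` and `EndpointSketch` are hypothetical stand-ins, not the classes touched by this PR.

```python
from dataclasses import dataclass


class MaskedSecret:
    """Holds a secret and never reveals it through str() or repr()."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        return self._value

    def __repr__(self) -> str:
        return "MaskedSecret('**********')"

    def __str__(self) -> str:
        return "**********"


@dataclass
class EndpointSketch:
    """Stand-in for a chat endpoint config; its repr uses the masked field."""

    endpoint_url: str
    api_key: MaskedSecret


endpoint = EndpointSketch("https://example.invalid/score", MaskedSecret("sk-secret"))
```

Printing `endpoint` shows the URL but only asterisks for the key, while `endpoint.api_key.get_secret_value()` still returns the raw value for the request.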

---------

Co-authored-by: Faysal Bougamale <faysal.bougamale@horiba.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-10 14:05:19 -05:00
Anush
52f34de9b7 feat: FastEmbed embedding provider (#13109)
## Description:
This PR intends to add
[Qdrant/FastEmbed](https://qdrant.github.io/fastembed/) as a local
embeddings provider, associated tests and documentation.

**Documentation preview:**
https://langchain-git-fork-anush008-master-langchain.vercel.app/docs/integrations/text_embedding/fastembed

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-11-10 13:51:52 -05:00
Eugene Yurtsev
b0e8cbe0b3 Add RunnableSequence documentation (#13094)
Add RunnableSequence documentation
2023-11-10 13:44:43 -05:00
Eugene Yurtsev
869df62736 Document RunnableWithFallbacks (#13088)
Add documentation to RunnableWithFallbacks
2023-11-10 13:16:21 -05:00
Eugene Yurtsev
8313c218da Add more runnable documentation (#13083)
- Adding documentation to the runnable.
- Documentation is not organized in the best way for the runnable; i.e.,
in terms of LCEL vs. other standard methods. Will follow up with more
edits.
2023-11-10 13:14:57 -05:00
Erick Friis
a26105de8e vectara rag mq (#13214)
Description: another Vectara template for MultiQuery RAG flow
Twitter handle: @ofermend

Fixes to #13106

---------

Co-authored-by: Ofer Mendelevitch <ofer@vectara.com>
Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
2023-11-10 10:08:45 -08:00
Bagatur
24386e0860 bump 334, exp 40 (#13211) 2023-11-10 09:43:29 -08:00
Lance Martin
d2e50b3108 Add Chroma multimodal cookbook (#12952)
Pending:
* https://github.com/chroma-core/chroma/pull/1294
* https://github.com/chroma-core/chroma/pull/1293

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-10 09:43:10 -08:00
The1Bill
55912868da Update toolkit.py to remove single quotes around table names (#12445)
**Description:** Removing the single quote wrapper around the table
names in the SQL agent toolkit.py file as it misleads the LLM into
querying against tables with single quotes around their names.
**Issue:** #7457 
**Dependencies:** None
**Tag maintainer:** @hwchase17 
**Twitter handle:** None
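
To make the difference concrete, here is a sketch of the two ways a table list could be rendered into a prompt. Both helpers are hypothetical and do not reproduce toolkit.py.

```python
from typing import Iterable


def describe_tables_quoted(table_names: Iterable[str]) -> str:
    """Old style: each name wrapped in single quotes."""
    return ", ".join(f"'{name}'" for name in table_names)


def describe_tables(table_names: Iterable[str]) -> str:
    """New style: bare names, so the model does not copy the quotes into SQL."""
    return ", ".join(table_names)
```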
2023-11-10 06:39:15 -08:00
Nuno Campos
362a446999 Changes to root listener (#12174)
- Implement config_specs to include session_id
- Remove Runnable method and update notebook
- Add more details to notebook, eg. show input schema and config schema
before and after adding message history

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-11-10 09:53:48 +00:00
Nuno Campos
b2b94424db Update return type for Runnable.__or__ (#12880)
2023-11-10 09:52:38 +00:00
Bagatur
dd7959f4ac template readme's in docs (#13152) 2023-11-09 23:36:21 -08:00
Bagatur
86b93b5810 Add serve to quickstart (#13174) 2023-11-09 23:10:26 -08:00
Bagatur
fbf7047468 Bagatur/update agent docs (#13167) 2023-11-09 21:14:30 -08:00
Harrison Chase
0a2b1c7471 improve duck duck go tool (#13165) 2023-11-09 20:49:39 -08:00
Bagatur
850336bcf1 Update model i/o docs (#13160) 2023-11-09 20:35:55 -08:00
Jacob Lee
cf271784fa Add basic critique revise template (#12688)
@baskaryan @hwchase17

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-09 17:33:29 -08:00
Cweili
ee3ceb0fb8 Document: Fix "Biadu" typo (#12985)
Fix document "Baidu Cloud ElasticSearch VectorSearch" `Biadu` typo.
2023-11-09 17:32:38 -08:00
Chenyu Zhao
defd4b4f11 Clean up Fireworks provider documentation (#13157) 2023-11-09 16:35:05 -08:00
Bagatur
d9e493e96c fix module sidebar (#13158) 2023-11-09 16:31:45 -08:00
wemysschen
e76ff63125 fix baiducloud_vector_search document typo (#12976)
**Issue:**
fix baiducloud_vector_search document typo

---------

Co-authored-by: wemysschen <root@icoding-cwx.bcc-szzj.baidu.com>
2023-11-09 16:27:04 -08:00
Holt Skinner
fceae456b9 fix: Updates to formatting in Google Drive Retriever docs (#13015)
- Minor updates to formatting to make it easier to read
2023-11-09 16:15:55 -08:00
Bagatur
c63eb9d797 LCEL nits (#13155) 2023-11-09 16:09:33 -08:00
Shinya Maeda
28cc60b347 Fix langchain.llms OpenAI completion doesn't work due to v1 client update (#13099)
This commit fixes the issue that langchain.llms OpenAI completion
stopped working since the V1 openai client update.

- **Description:** This PR fixes the issue [AttributeError: module
'openai' has no attribute
'Completion'](https://github.com/langchain-ai/langchain/issues/12967)
similar to
8e0cb2eb84
and https://github.com/langchain-ai/langchain/pull/12969,
  - **Issue:** https://github.com/langchain-ai/langchain/issues/12967,
  - **Dependencies:** `openai` v1.x.x client,
  - **Tag maintainer:** @baskaryan,
  - **Twitter handle:** @dosuken123 

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-09 15:12:19 -08:00
Bagatur
555ce600ef Bagatur/docs serve context (#13150) 2023-11-09 15:05:18 -08:00
Bagatur
ff43cd6701 OpenAI remove httpx typing (#13154)
Addresses #13124
2023-11-09 14:32:09 -08:00
Erick Friis
8ad3b255dc Pirate Speak Configurable Template (#13153) 2023-11-09 22:13:45 +00:00
Bagatur
eb51150557 update oai tool agent doc (#13147) 2023-11-09 12:37:30 -08:00
Bagatur
b298f550fe update modules sidebar (#13141) 2023-11-09 11:57:09 -08:00
Bagatur
84e65533e9 Docs: combine LCEL index and why (#13142) 2023-11-09 11:16:45 -08:00
Bagatur
1311450646 fix langsmith links (#13144) 2023-11-09 11:12:50 -08:00
Bagatur
8b2a82b5ce Bagatur/docs smith context (#13139) 2023-11-09 10:22:49 -08:00
Erick Friis
58da6e0d47 Multimodal rag traces (#13140) 2023-11-09 09:54:00 -08:00
Bagatur
150d58304d update oai cookbooks (#13135) 2023-11-09 08:04:51 -08:00
Bagatur
f04cc4b7e1 bump 333 (#13131) 2023-11-09 07:33:15 -08:00
billytrend-cohere
b346d4a455 Add message to documents (#12552)
This adds the response message as a document to the rag retriever so
users can choose to use this. Also drops document limit.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-09 07:30:48 -08:00
Harrison Chase
5f38770161 Support oai tool call (#13110)
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-11-09 07:29:29 -08:00
Stefano Lottini
c52725bdc5 (Astra DB/Cassandra) Minor clarification about dependencies in the demo notebook (#13118)
This PR helps developers trying the Astra DB / Cassandra vector store
quickstart notebook by making it clear what other dependencies are
required.
2023-11-09 09:19:15 -05:00
Holt Skinner
0fc8fd12bd feat: Vertex AI Search - Add Snippet Retrieval for Non-Advanced Website Data Stores (#13020)
https://cloud.google.com/generative-ai-app-builder/docs/snippets#snippets

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-11-08 21:52:50 -05:00
Erick Friis
3dbaaf59b2 Tool Retrieval Template (#13104)
Adds a template like
https://python.langchain.com/docs/modules/agents/how_to/custom_agent_with_tool_retrieval

Uses OpenAI functions, LCEL, and FAISS
2023-11-08 18:33:31 -08:00
Jacob Lee
76283e9625 Adds embeddings filter option to return scores in state (#12489)
CC @baskaryan @assafelovic
2023-11-08 17:50:06 -08:00
jakerachleff
18601bd4c8 Get project from langchain sdk (#13100)
## Description
We need to centralize the API we use to get the project name for our
tracers. This PR makes it so we always get this from a shared function
in the langsmith sdk.

## Dependencies
Upgraded langsmith from 0.52 to 0.62 to include the new API
`get_tracer_project`
2023-11-08 17:10:12 -08:00
Bagatur
72e12f6bcf update more azure docs (#13093) 2023-11-08 14:11:16 -08:00
Bagatur
1703f132c6 update azure embedding docs (#13091) 2023-11-08 13:39:31 -08:00
Bagatur
9fdfac22c2 bump 332 (#13089) 2023-11-08 13:23:16 -08:00
Bagatur
1f85ec34d5 bump 331rc3 exp 39 (#13086) 2023-11-08 13:00:13 -08:00
Anton Troynikov
9f077270c8 Don't pass EF to chroma (#13085)
- **Description:** 

Recently Chroma rolled out a breaking change on the way we handle
embedding functions, in order to support multi-modal collections.

This broke the way LangChain's `Chroma` objects get created, because we
were passing the EF down into the Chroma collection:
https://docs.trychroma.com/migration#migration-to-0416---november-7-2023

However, internally, we are never actually using embeddings on the
chroma collection - LangChain's `Chroma` object calls it instead. Thus
we just don't pass an `embedding_function` to Chroma itself, which fixes
the issue.
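
The pattern described above can be sketched roughly as follows. This is a minimal illustration with hypothetical stand-ins (`FakeCollection`, `WrapperStore`, `toy_embed`), not the real chromadb or LangChain classes: the wrapper computes embeddings itself and hands precomputed vectors to the collection, so the collection needs no embedding function of its own.

```python
from typing import Callable, List, Tuple


class FakeCollection:
    """Hypothetical stand-in for a collection that stores precomputed vectors."""

    def __init__(self) -> None:
        self.rows: List[Tuple[str, List[float]]] = []

    def add(self, texts: List[str], embeddings: List[List[float]]) -> None:
        # The collection never embeds anything; it only stores what it is given.
        self.rows.extend(zip(texts, embeddings))


class WrapperStore:
    """Hypothetical wrapper that owns the embedding function."""

    def __init__(self, embed: Callable[[str], List[float]]) -> None:
        self._embed = embed
        # No embedding function is passed down to the collection.
        self._collection = FakeCollection()

    def add_texts(self, texts: List[str]) -> None:
        vectors = [self._embed(t) for t in texts]
        self._collection.add(texts, vectors)


def toy_embed(text: str) -> List[float]:
    # Toy embedding for illustration only: text length and vowel count.
    return [float(len(text)), float(sum(c in "aeiou" for c in text))]


store = WrapperStore(toy_embed)
store.add_texts(["hello", "sky"])
```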
2023-11-08 12:55:35 -08:00
Erick Friis
f15f8e01cf Azure OpenAI Embeddings (#13039)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-08 12:37:17 -08:00
David Peterson
37561d8986 Add Proper Import Error (#13042)
- **Description:** The issue was not listing the proper import error for
amazon textract loader.
- **Issue:** Time wasted trying to figure out what to install...
(langchain docs don't list the dependency either)
  - **Dependencies:** N/A
  - **Tag maintainer:** @sbusso 
  - **Twitter handle:** @h9ste

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-11-08 10:29:08 -08:00
Eugene Yurtsev
06c503f672 Add RunnableRetry Documentation (#13074) 2023-11-08 18:20:18 +00:00
Bagatur
55aeff6777 oai assistant multiple actions (#13068) 2023-11-08 08:25:37 -08:00
Erick Friis
a9b70baef9 cli updates, 0.0.16 (#13034)
- confirm flags, serve detection
- 0.0.16
- always gen code
- pip bool
2023-11-08 07:47:30 -08:00
Bagatur
1f27104626 Fleet context (#13038)
cc @adrwz
2023-11-07 18:57:09 -08:00
Bagatur
d26fd6f0d1 redirect langsmith walkthrough (#13040) 2023-11-07 18:24:13 -08:00
Erick Friis
6f45532620 Upgrade docs postcss (#13031) 2023-11-07 15:50:25 -08:00
Erick Friis
54ad3cc2b8 template versions again (#13030)
- scipy was locked due to py version
- same guardrails-output-parser
- rag-redis
2023-11-07 15:15:18 -08:00
Erick Friis
506f81563f Update Deps in Experimental (#13029) 2023-11-07 15:15:09 -08:00
Erick Friis
db4b97d590 Relock Templates (#13028) 2023-11-07 15:01:49 -08:00
Stefano Lottini
4f4b020582 Add "Astra DB" vector store integration (#12966)
# Astra DB Vector store integration

- **Description:** This PR adds a `VectorStore` implementation for
DataStax Astra DB using its HTTP API
  - **Issue:** (no related issue)
- **Dependencies:** A new required dependency is `astrapy` (`>=0.5.3`)
which was added to pyproject.toml, optional, as per guidelines
- **Tag maintainer:** I recently mentioned to @baskaryan this
integration was coming
  - **Twitter handle:** `@rsprrs` if you want to mention me

This PR introduces the `AstraDB` vector store class, extensive
integration test coverage, a reworking of the documentation which
conflates Cassandra and Astra DB on a single "provider" page and a new,
completely reworked vector-store example notebook (common to the
Cassandra store, since parts of the flow are shared by the two APIs). I
also took care in ensuring docs (and redirects therein) are behaving
correctly.

All style, linting, typechecks and tests pass as far as the `AstraDB`
integration is concerned.

I could build the documentation and check it all right (but ran into
trouble with the `api_docs_build` makefile target which I could not
verify: `Error: Unable to import module
'plan_and_execute.agent_executor' with error: No module named
'langchain_experimental'` was the first of many similar errors)

Thank you for a review!
Stefano

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-07 14:45:33 -08:00
Tomaz Bratanic
13bd83bd61 Add neo4j vector memory template (#12993) 2023-11-07 13:00:49 -08:00
Bagatur
5ac2fc5bb2 update stack diagram (#13021) 2023-11-07 12:59:24 -08:00
Yang, Bo
600caff03c Add Memorize tool (#11722)
- **Description:** Add `Memorize` tool
  - **Tag maintainer:** @hwchase17

This PR added a new tool `Memorize` so that an agent can use it to
fine-tune itself. This tool requires `TrainableLLM` introduced in #11721

DEMO:
6a9003d5db

![image](https://github.com/langchain-ai/langchain/assets/601530/d6f0cb45-54df-4dcf-b143-f8aefb1e76e3)
2023-11-07 12:42:10 -08:00
Bagatur
cf481c9418 bump exp 38 (#13016) 2023-11-07 11:49:23 -08:00
Bagatur
57e19989f6 Bagatur/oai assistant (#13010) 2023-11-07 11:44:53 -08:00
Erick Friis
74134dd7e1 cli pyproject updating (#12945)
`langchain app add` and `langchain app remove` will now keep the
dependencies list updated.

---------

Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-11-07 11:06:08 -08:00
Tomaz Bratanic
d9abcf1aae Neo4j conversation cypher template (#12927)
Adding custom graph memory to Cypher chain

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-07 11:05:28 -08:00
Lance Martin
2287a311cf Multi modal RAG + QA Cookbooks (#12946)
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Vinzenz Klass <76391770+VinzenzKlass@users.noreply.github.com>
Co-authored-by: Praveen Venkateswaran <praveenv@uci.edu>
Co-authored-by: Praveen Venkateswaran <praveen.venkateswaran@ibm.com>
Co-authored-by: Kacper Łukawski <kacperlukawski@users.noreply.github.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-11-07 09:10:24 -08:00
Bagatur
6175dc30aa bump 331rc2 (#13006) 2023-11-07 08:52:17 -08:00
Jasan
ff87f4b4f9 Fix for rag-supabase readme (#12869)
- **Description:** Correct naming for package in README
- **Issue:** README wasn't aligned with pyproject.toml, resulting in not
being able to install the rag-supabase package.
  - **Tag maintainer:** @gregnr
2023-11-06 19:38:22 -08:00
Harrison Chase
99ffeb239f add ingest for mongo (#12897) 2023-11-06 19:28:22 -08:00
Ofer Mendelevitch
ce21308f29 Vectara RAG template (#12975)
- **Description:** RAG template using Vectara
  - **Twitter handle:** @ofermend
2023-11-06 19:24:00 -08:00
Erick Friis
0c81cd923e oai v1 embeddings (#12969)
Initial PR to get OpenAIEmbeddings working with the new sdk

fyi @rlancemartin 

Fixes #12943

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-06 18:52:33 -08:00
Bagatur
fdbb45d79e bump 331rc1 (#12965) 2023-11-06 15:36:43 -08:00
Bagatur
3bb8030a6e fix max_tokens (#12964) 2023-11-06 15:36:05 -08:00
Bagatur
a9002a82b8 bump 331rc0 (#12963) 2023-11-06 15:19:33 -08:00
Harrison Chase
c27400efeb Support multimodal messages (#11320)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-06 15:14:18 -08:00
Bagatur
388f248391 add oai v1 cookbook (#12961) 2023-11-06 14:28:32 -08:00
Bagatur
4f7dff9d66 Record system fingerprint chat openai (#12960) 2023-11-06 14:25:53 -08:00
Bagatur
8e0cb2eb84 ChatOpenAI and AzureChatOpenAI openai>=1 compatible (#12948) 2023-11-06 13:24:18 -08:00
Kacper Łukawski
52d0055a91 Add support of Cohere Embed v3 (#12940)
Cohere released the new embedding API (Embed v3:
https://txt.cohere.com/introducing-embed-v3/) that treats document and
query embeddings differently. This PR updated the `CohereEmbeddings` to
use them appropriately. It also works with the old models.
2023-11-06 15:06:58 -05:00
Praveen Venkateswaran
8e0dcb37d2 Add SecretStr for Symbl.ai Nebula API (#12896)
Description: This PR masks API key secrets for the Nebula model from
Symbl.ai
Issue: #12165 
Maintainer: @eyurtsev

---------

Co-authored-by: Praveen Venkateswaran <praveen.venkateswaran@ibm.com>
2023-11-06 14:13:59 -05:00
Vinzenz Klass
59d0bd2150 feat: acquire advisory lock before creating extension in pgvector (#12935)
- **Description:** Acquire advisory lock before attempting to create
extension on postgres server, preventing errors in concurrent
executions.
  - **Issue:** #12933
  - **Dependencies:** None
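
A minimal sketch of the idea, assuming a transaction-scoped advisory lock taken with PostgreSQL's `pg_advisory_xact_lock` before `CREATE EXTENSION IF NOT EXISTS vector`. The lock key, the helper names, and the `RecordingCursor` class are hypothetical and exist only to keep the example runnable without a database; the statements actually used in the PR may differ.

```python
from typing import List

# Arbitrary application-chosen lock key (an assumption for this sketch).
EXTENSION_LOCK_KEY = 1234567890


def create_extension_statements(lock_key: int = EXTENSION_LOCK_KEY) -> List[str]:
    """Return the SQL statements to run, in order, inside one transaction."""
    return [
        "BEGIN",
        # Waits until no other session holds this key; released when the
        # transaction ends, so concurrent callers are serialized.
        f"SELECT pg_advisory_xact_lock({lock_key})",
        "CREATE EXTENSION IF NOT EXISTS vector",
        "COMMIT",
    ]


class RecordingCursor:
    """Hypothetical cursor that records statements instead of executing them."""

    def __init__(self) -> None:
        self.executed: List[str] = []

    def execute(self, sql: str) -> None:
        self.executed.append(sql)


def create_vector_extension(cursor: RecordingCursor) -> None:
    for sql in create_extension_statements():
        cursor.execute(sql)
```

With a real driver, the same statements would be sent through its cursor in the same order; the lock must be acquired before the `CREATE EXTENSION` statement and within the same transaction.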

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-11-06 14:00:39 -05:00
Eugene Yurtsev
b376854b26 Fix for anyscale chat model api key (#12938)
* ChatAnyscale was missing coercion to SecretStr for anyscale api key
* The model inherits from ChatOpenAI so it should not force the openai
api key to be secret str until openai model has the same changes

https://github.com/langchain-ai/langchain/issues/12841
2023-11-06 13:28:02 -05:00
Bagatur
58889149c2 fix guides link (#12941) 2023-11-06 08:13:02 -08:00
matthieudelaro
52503a367f Remove useless line of code from sql.ipynb (#12906)
This PR removes a single line of code from a notebook in the
documentation. This line used to define a variable, which is never used
in the code.
For further context, for reviewers, here is the online documentation:
https://python.langchain.com/docs/use_cases/qa_structured/sql#case-3-sql-agents
2023-11-06 07:59:12 -08:00
hmasdev
622bf12c2e fix regex pattern of structured output parser (#12929)
- **Description:** fix the regex pattern of
[StructuredChatOutputParser](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/agents/structured_chat/output_parser.py#L18)
and add unit tests for the code change.
- **Issue:** #12158 #12922
- **Dependencies:** None
- **Tag maintainer:** 
- **Twitter handle:** @hmdev3
- **NOTE:** This PR conflicts with #7495. After #7495 is merged, I will
update this PR.
2023-11-06 07:53:14 -08:00
wemysschen
8c02f4fbd8 add baidu cloud vectorsearch document (#12928)
**Description:** 
Add BaiduCloud VectorSearch documentation for the BESVectorSearch
implementation in langchain vectorstores

---------

Co-authored-by: wemysschen <root@icoding-cwx.bcc-szzj.baidu.com>
2023-11-06 07:52:50 -08:00
wemysschen
8d7144e6a6 fix baiducloud directory loader import file loader (#12924)
**Issue:** 
fix how the baiducloud BOS directory loader imports its file loader

---------

Co-authored-by: wemysschen <root@icoding-cwx.bcc-szzj.baidu.com>
2023-11-06 07:52:31 -08:00
Alex Howard
5bb2ea51a5 docs: clean up vestigial markdown (#12907)
- **Description:** Remove text "LangChain currently does not support"
which appears to be vestigial leftovers from a previous change.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Tag maintainer:** @baskaryan, @eyurtsev
  - **Twitter handle:** thezanke
2023-11-06 07:51:56 -08:00
Praveen Venkateswaran
1eb7d3a862 docs: update hf pipeline docs (#12908)
- **Description:** Noticed that the Hugging Face Pipeline documentation
was a bit out of date.
Updated with information about passing in a pipeline directly
(consistent with docstring) and a recent contribution of mine on adding
support for multi-gpu specifications with Accelerate in
21eeba075c
2023-11-06 07:51:31 -08:00
Christoffer Bo Petersen
37da6e546b Fix typo in e2b_data_analysis.ipynb (#12930)
Just a small typo fix
2023-11-06 07:37:30 -08:00
Kacper Łukawski
621419f71e Fix normalizing the cosine distance in Qdrant (#12934)
Qdrant was incorrectly calculating the cosine similarity and returning
`0.0` for the best match, instead of `1.0`. Internally Qdrant returns a
cosine score from `-1.0` (worst match) to `1.0` (best match), and the
current formula reflects it.
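
As an illustration of the normalization described above, here is a minimal sketch that assumes a linear rescaling from the `[-1.0, 1.0]` cosine range to a `[0.0, 1.0]` relevance score. The function name is hypothetical and the exact formula in the PR is not reproduced here.

```python
def normalize_cosine_score(score: float) -> float:
    """Map a cosine score in [-1.0, 1.0] to a relevance score in [0.0, 1.0]."""
    return (score + 1.0) / 2.0
```

Under this mapping the best match (`1.0`) stays at `1.0` and the worst match (`-1.0`) becomes `0.0`.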
2023-11-06 07:36:59 -08:00
Hech
8fe6bcc662 Fix return metadata when searching for DingoDB (#12937) 2023-11-06 07:35:36 -08:00
Jakub Novák
ada3d2cbd1 Add possibility to pass on_artifacts for a specific conversation (#12687)
Adds the possibility to pass on_artifacts to a conversation. This can be
done as follows:

```python
result = agent.run(
    input=message.text,
    metadata={
        "on_artifact": CALLBACK_FUNCTION
    },
)
```
2023-11-06 07:29:47 -08:00
Bagatur
0378662e1d fix langsmith link (#12939) 2023-11-06 07:17:05 -08:00
Harrison Chase
1a92d2245d Harrison/docs smith serve (#12898)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-06 07:07:25 -08:00
Bagatur
53f453f01a bump 331 (#12932) 2023-11-06 05:58:12 -08:00
Priyadutt
a4d9e986fb Update csv.ipynb description (#12878)
The removed line is not needed, since no alternative solutions are
described above it.

2023-11-06 03:32:04 -08:00
Erick Friis
5000c7308e cli template gitignores (#12914)
- ap gitignore
- package
2023-11-05 22:34:45 -08:00
Harrison Chase
aba407f774 use keys not items (#12918) 2023-11-05 22:08:29 -08:00
Harrison Chase
60d025b83b mongo parent document retrieval (#12887) 2023-11-04 10:16:02 -07:00
Michael Hunger
e43b4079c8 template: use dashes instead of underscores for neo4j-cypher package and path in readme (#12827)
Minimal readme template update

underscores didn't work, dashes do
2023-11-03 15:54:48 -07:00
wemysschen
e14aa37d59 fix bes vector store search (#12828)
**Issue:** 
fix search body in baidu cloud vectorsearch

---------

Co-authored-by: wemysschen <root@icoding-cwx.bcc-szzj.baidu.com>
2023-11-03 15:39:19 -07:00
standby24x7
f04e4df7f9 coockbook: Fix typo in wikibase_agent.ipynb (#12839)
This patch fixes a spelling typo in message
within wikibase_agent.ipynb.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>

2023-11-03 14:57:37 -07:00
Kacper Łukawski
66c41c0dbf Add template for self-query-qdrant (#12795)
This PR adds a self-querying template using Qdrant as a vector store.
The template uses an artificial dataset and was implemented in a way
that simplifies passing different components and choosing LLM and
embedding providers.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-03 13:37:29 -07:00
Daniel Chalef
f41f4c5e37 zep/rag conversation zep template (#12762)
LangServe template for a RAG Conversation App using Zep.

 @baskaryan, @eyurtsev

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-03 13:34:44 -07:00
Lance Martin
ea1ab391d4 Open Clip multimodal embeddings (#12754) 2023-11-03 13:33:36 -07:00
Bagatur
ebee616822 bump 330 (#12853) 2023-11-03 13:26:41 -07:00
Tomaz Bratanic
0dbdb8498a Neo4j Advanced RAG template (#12794)
Todo:

- [x] Docs
2023-11-03 13:22:55 -07:00
Harrison Chase
83cee2cec4 Template Readmes and Standardization (#12819)
Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-03 13:15:29 -07:00
Erick Friis
6c237716c4 Update readmes with new cli install (#12847)
Old command still works. Just simplifying.

Merge after releasing CLI 0.0.15
2023-11-03 12:10:32 -07:00
Erick Friis
7db49d3842 Confirm sys.path includes current dir for app serve (#12851)
- Make sure sys.path is set properly for langchain app serve
- bump
2023-11-03 11:37:20 -07:00
Erick Friis
1bc35f61cb CLI 0.0.14, Uvicorn update and no more [serve] (#12845)
Calls uvicorn directly from cli:
Reload works if you define app by import string instead of object.
(was doing subprocess in order to get reloading)

Version bump to 0.0.14

Remove the need for [serve] for simplicity.

Readmes are updated in #12847 to avoid cluttering this PR
2023-11-03 11:05:52 -07:00
Brace Sproul
76bcac5bb3 Remove admin prefix/suffix from docs for anthropic (#12849) 2023-11-03 10:54:16 -07:00
Harrison Chase
523e5803bb update mongo template (#12838) 2023-11-03 10:31:53 -07:00
William FH
18005c6384 Disable trace_on_chain_group auto-tracing (#12807)
Previously we treated trace_on_chain_group as a command to always start
tracing. This is unintuitive (makes the function do 2 things), and makes
it harder to toggle tracing
2023-11-03 10:05:09 -07:00
Erick Friis
0da75b9ebd Autopopulate module name in cli init (#12814) 2023-11-02 23:45:38 -07:00
William FH
98aff29fbd Add Dataset Page to printout (#12816) 2023-11-02 20:36:56 -07:00
Joseph Martinez
f573a4d0b3 Update quickstart.mdx (#12386)
**Description**
Removed confusing sentence. 
Not clear what "both" was referring to. The two required components
mentioned previously? The two methods listed below?

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-02 18:38:21 -07:00
Leonid Ganeline
e112b2f2e6 updated integrations/providers/google (#12226)
Added missed integrations. Updated formats.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-02 18:35:31 -07:00
Manuel Rech
2e2b9c76d9 Keep also original query - multi_query.py (#12696)
When you use a MultiQuery it might be useful to use the original query
as well as the newly generated ones to maximise the chances of retrieving
the correct document. I haven't created an issue, as it seems a very small
and easy thing.
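
A minimal sketch of the behaviour, using hypothetical names (`queries_to_run`, `include_original`) rather than the retriever's actual interface: the generated queries are kept, and the original query is appended when requested and not already present.

```python
from typing import Callable, List


def queries_to_run(
    original: str,
    generate: Callable[[str], List[str]],
    include_original: bool = True,
) -> List[str]:
    """Return the generated queries, optionally followed by the original one."""
    queries = list(generate(original))
    if include_original and original not in queries:
        queries.append(original)
    return queries
```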

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-02 18:15:02 -07:00
Michael Landis
4fe9bf70b6 feat: add a rag template for momento vector index (#12757)
# Description
Add a RAG template showcasing Momento Vector Index as a vector store.
Includes a project directory and README.

# **Twitter handle** 

Tag the company @momentohq for a mention and @mlonml for the
contribution.
2023-11-02 17:59:15 -07:00
刘 方瑞
26c4ec1eaf myscale notebook url change (#12810)
2023-11-02 17:56:26 -07:00
Lance Martin
2683c2fc53 Update template index (#12809) 2023-11-02 17:51:40 -07:00
apeng-singlestore
5c0e9ac578 Add template for rag-singlestoredb (#12805)
This change adds a new template for simple RAG using the SingleStoreDB
vectorstore.

Twitter: @alexjpeng

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-02 17:51:00 -07:00
Bagatur
658a3a8607 FEAT: Merge TileDB vecstore (#12811) 2023-11-02 17:40:32 -07:00
Akio Nishimura
c04647bb4e Correct number of elements in config list in batch() and abatch() of BaseLLM (#12713)
- **Description:** Correct number of elements in config list in
`batch()` and `abatch()` of `BaseLLM` in case `max_concurrency` is not
None.
- **Issue:** #12643
- **Twitter handle:** @akionux
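
The fix can be pictured with the sketch below. It assumes inputs are processed in chunks of `max_concurrency` and that each chunk must receive only the configs for its own inputs; `chunk_with_configs` is a hypothetical helper, not the `BaseLLM` code.

```python
from typing import Any, Dict, List, Tuple


def chunk_with_configs(
    inputs: List[str],
    configs: List[Dict[str, Any]],
    max_concurrency: int,
) -> List[Tuple[List[str], List[Dict[str, Any]]]]:
    """Split inputs and their configs into aligned chunks of the same size."""
    if len(inputs) != len(configs):
        raise ValueError("inputs and configs must have the same length")
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    return [
        # Slice both lists with the same bounds so every chunk stays aligned.
        (inputs[i : i + max_concurrency], configs[i : i + max_concurrency])
        for i in range(0, len(inputs), max_concurrency)
    ]
```

Slicing both lists with the same bounds keeps the number of configs equal to the number of inputs in every chunk, including the last, shorter one.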

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-02 17:28:48 -07:00
James Braza
88b506b321 Adds missing urllib.parse for IDE warning of PubMedAPIWrapper (#12808)
Resolves an IDE (PyCharm 2023.2.3 PE) warning around
`urllib.parse.quote`, also enabling CTRL-click
2023-11-02 17:27:25 -07:00
Bagatur
a2bb0dd445 TileDB update import unit tests 2023-11-02 17:24:22 -07:00
Nikos Papailiou
2fdaa1e5fd Add TileDB vectorstore implementation (#12624)
- **Description:** Add [TileDB](https://tiledb.com) vectorstore
implementation. TileDB offers ANN search capabilities using the
[TileDB-Vector-Search](https://github.com/TileDB-Inc/TileDB-Vector-Search)
module. It provides serverless execution of ANN queries and storage of
vector indexes both on local disk and cloud object stores (i.e. AWS S3).
More details in:
- [Why TileDB as a Vector
Database](https://tiledb.com/blog/why-tiledb-as-a-vector-database)
- [TileDB 101: Vector
Search](https://tiledb.com/blog/tiledb-101-vector-search)
- **Twitter handle:** @tiledb
2023-11-02 17:21:03 -07:00
盐粒 Yanli
1b233798a0 feat: Supprt pgvecto.rs as a VectorStore (#12718)
Support [pgvecto.rs](https://github.com/tensorchord/pgvecto.rs) as a new
VectorStore type.

This introduces a new dependency
[pgvecto_rs](https://pypi.org/project/pgvecto_rs/) and upgrades
SQLAlchemy to ^2.

Related to https://github.com/tensorchord/pgvecto.rs/issues/11

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-02 17:16:04 -07:00
Daniel Chalef
0cbdba6a9b zep: VectorStore: Use Native MMR (#12690)
- refactor to use Zep's native MMR; update example
@baskaryan @eyurtsev
2023-11-02 16:45:42 -07:00
Daniel Chalef
cc3d3920e3 Zep: Summary Search and Example (#12686)
Zep now has the ability to search over chat history summaries. This PR
adds support for doing so. More here: https://blog.getzep.com/zep-v0-17/

@baskaryan @eyurtsev
2023-11-02 16:31:11 -07:00
Bagatur
526313002c add import tests to all modules (#12806) 2023-11-02 15:32:55 -07:00
Harrison Chase
6609a6033f fix vectorstore imports (#12804)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-11-02 15:32:31 -07:00
Nuno Campos
f66a9d2adf Automatically add configurable key to config_schema if config_specs i… (#12798)
…s present

2023-11-02 21:46:15 +00:00
Praveen Venkateswaran
21eeba075c enable the device_map parameter in huggingface pipeline (#12731)
### Enabling `device_map` in HuggingFacePipeline 

For multi-gpu settings with large models, the
[accelerate](https://huggingface.co/docs/accelerate/usage_guides/big_modeling#using--accelerate)
library provides the `device_map` parameter to automatically distribute
the model across GPUs / disk.

The [Transformers
pipeline](3520e37e86/src/transformers/pipelines/__init__.py (L543))
enables users to specify `device` (or) `device_map`, and handles cases
(with warnings) when both are specified.

However, Langchain's HuggingFacePipeline only supports specifying
`device` when calling transformers which limits large models and
multi-gpu use-cases.
Additionally, the [default
value](8bd3ce59cd/libs/langchain/langchain/llms/huggingface_pipeline.py (L72))
of `device` is initialized to `-1`, which is incompatible with the
transformers pipeline when `device_map` is specified.

This PR adds `device_map` as a parameter and resolves the
incompatibility of `device = -1` when `device_map` is also
specified.
An additional test has been added for this feature. 
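
One way to picture the resolution logic is the sketch below. It assumes that a negative `device` is the CPU default and should be dropped when `device_map` is given; `resolve_device_kwargs` is a hypothetical helper, not the code from this PR.

```python
from typing import Any, Dict, Optional


def resolve_device_kwargs(
    device: Optional[int] = -1,
    device_map: Optional[str] = None,
) -> Dict[str, Any]:
    """Decide which device arguments to forward to the pipeline."""
    if device_map is None:
        # No device map requested: forward the device unchanged.
        return {"device": device}
    if device is None or device < 0:
        # The default (negative) device conflicts with device_map, so drop it.
        return {"device_map": device_map}
    # Both were set explicitly: forward both and let the pipeline decide.
    return {"device": device, "device_map": device_map}
```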

Additionally, some existing tests no longer work since 
1. `max_new_tokens` has to be specified under `pipeline_kwargs` and not
`model_kwargs`
2. The GPT2 tokenizer raises a `ValueError: Pipeline with tokenizer
without pad_token cannot do batching`, since the `tokenizer.pad_token`
is `None` ([related
issue](https://github.com/huggingface/transformers/issues/19853) on the
transformers repo).

This PR handles fixing these tests as well.

Co-authored-by: Praveen Venkateswaran <praveen.venkateswaran@ibm.com>
2023-11-02 14:29:06 -07:00
Mark Bell
3276aa3e17 __getattr__ should rase AttributeError not ImportError on missing attributes (#12801)
[The python
spec](https://docs.python.org/3/reference/datamodel.html#object.__getattr__)
requires that `__getattr__` throw `AttributeError` for missing
attributes but there are several places throwing `ImportError` in the
current code base. This causes a specific problem with `hasattr` since
it calls `__getattr__` then looks only for `AttributeError` exceptions.
At present, calling `hasattr` on any of these modules will raise an
unexpected exception that most code will not handle as `hasattr`
throwing exceptions is not expected.

In our case this is triggered by an exception tracker (Airbrake) that
attempts to collect the version of all installed modules with code that
looks like: `if hasattr(mod, "__version__"):`. With `HEAD` this is
causing our exception tracker to fail on all exceptions.

I only changed instances of unknown attributes raising `ImportError` and
left instances of known attributes raising `ImportError`. It feels a
little weird but doesn't seem to break anything.
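
The `hasattr` behaviour described above can be reproduced with a small sketch. The `make_module` helper is hypothetical; it builds throwaway modules whose module-level `__getattr__` raises a chosen exception type.

```python
import types


def make_module(name: str, error_type: type) -> types.ModuleType:
    """Build a module whose __getattr__ raises error_type for unknown names."""
    mod = types.ModuleType(name)

    def __getattr__(attr: str):
        raise error_type(f"module {name!r} has no attribute {attr!r}")

    # A module-level __getattr__ is looked up in the module's namespace
    # when normal attribute lookup fails.
    mod.__getattr__ = __getattr__
    return mod


good = make_module("good_mod", AttributeError)
bad = make_module("bad_mod", ImportError)

# hasattr swallows AttributeError and reports the attribute as missing.
print(hasattr(good, "__version__"))

# hasattr does not swallow ImportError, so the exception escapes.
try:
    hasattr(bad, "__version__")
except ImportError as exc:
    print(f"hasattr raised: {exc}")
```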
2023-11-02 17:08:54 -04:00
Daniel Chalef
d966e4d13a zep: Update Zep docs and messaging (#12764)
Update Zep documentation with messaging, more details.

 @baskaryan, @eyurtsev
2023-11-02 13:39:17 -07:00
Illia
71d1a48b66 Use data from all Google search results in SerpApi.com wrapper (#12770)
- **Description:** Use all Google search results data in SerpApi.com
wrapper instead of the first one only
  - **Tag maintainer:** @hwchase17 

_P.S. `libs/langchain/tests/integration_tests/utilities/test_serpapi.py`
are not executed during the `make test`._
2023-11-02 13:31:27 -07:00
ba230t
9214d8e6ed Fixed a typo in templates/docs/CONTRIBUTING.md (delimeters =>delimiters) (#12774)
- **Description:** Just fixed a minor typo in
templates/docs/CONTRIBUTING.md.
  - **Issue:** No linked issues.

Very small contribution!
2023-11-02 13:31:04 -07:00
Armin Stepanjan
185ddc573e Fix broken links to use cases (#12777)
This PR replaces broken links to end-to-end use cases
([/docs/use_cases](https://python.langchain.com/docs/use_cases)) with a
non-broken version
([/docs/use_cases/qa_structured/sql](https://python.langchain.com/docs/use_cases/qa_structured/sql)),
consistently with the "Use cases" navigation button at the top of the
page.

---------

Co-authored-by: Matvey Arye <mat@timescale.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-02 13:20:54 -07:00
니콜라스
25ee10ed4f Docs: 'memory' -> 'history' typo. (#12779)
The 'MessagesPlaceholder' expects 'history' but 'RunnablePassthrough' is
assigning 'memory'.
2023-11-02 13:09:39 -07:00
yudai yamamoto
1f7e811156 Fixed broken link in Quickstart page (#12516)
- **Description:** 
Corrected a specific link within the documentation.
  
  - **Issue:**
  #12490 

  - **Dependencies:**
  - **Tag maintainer:**
  - **Twitter handle:**

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-02 13:00:53 -07:00
Ikko Eltociear Ashimine
9b02f7d59c Update llamacpp.ipynb (#12791)
HuggingFace -> Hugging Face
2023-11-02 12:52:12 -07:00
Tomaz Bratanic
2a9f40ed28 Add input types to cypher templates (#12800) 2023-11-02 12:46:02 -07:00
Nuno Campos
c4fdf78d03 Fix AddableDict raising exception when used with non-addable values (#12785)
2023-11-02 18:56:29 +00:00
Erick Friis
49e283a0cd CLI 0.0.13, Configurable Template Demo (#12796) 2023-11-02 11:42:57 -07:00
Nuno Campos
d1c6ad7769 Fix on_llm_new_token(chunk=) for some chat models (#12784)
It was passing in message instead of generation

2023-11-02 16:33:44 +00:00
Erick Friis
070823f294 CLI 0.0.12 (#12787) 2023-11-02 08:29:27 -07:00
Bagatur
979501c0ca bump 329 (#12778) 2023-11-02 06:02:43 -07:00
Matvey Arye
9369d6aca0 Fixes to the docs for timescale vector template (#12756) 2023-11-01 18:48:23 -07:00
Lance Martin
33810126bd Update chat prompt structure in LLaMA SQL cookbook (#12364)
Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-01 16:37:03 -07:00
ElliotKetchup
58b90f30b0 Update llama.cpp integration (#11864)
<!-- 
- **Description:** removed redundant link, replaced it with Meta's LLaMA
repo, added resources for models' hardware requirements,
  - **Issue:** None,
  - **Dependencies:** None,
  - **Tag maintainer:** None,
  - **Twitter handle:** @ElliotAlladaye
 -->
2023-11-01 16:32:02 -07:00
Manuel Soria
a228f340f1 Semantic search within postgreSQL using pgvector (#12365)
Cookbook showing how to incorporate RAG search within a PostgreSQL
database using pgvector.

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-01 16:21:34 -07:00
Erick Friis
da821320d3 Fixes 'Nonetype' not iterable for ObsidianLoader (#12751)
Implements #12726 from @Di3mex
2023-11-01 16:07:09 -07:00
Juan Bustos
67b6f4dc71 Update google_vertex_ai_palm.ipynb (#12715)
Fixed a typo

2023-11-01 16:05:44 -07:00
Eugene Yurtsev
b1caae62fd APIChain add restrictions to domains (CVE-2023-32786) (#12747)
* Restrict the chain to specific domains by default
* This is a breaking change, but it will fail loudly upon object
instantiation -- so there should be no silent errors for users
* Resolves CVE-2023-32786
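
A minimal sketch of the idea described above, assuming a plain allowlist of scheme + host pairs. This is not the APIChain implementation; `is_allowed_url` and `RestrictedCaller` are hypothetical names used only for illustration.

```python
from typing import Sequence
from urllib.parse import urlparse


def is_allowed_url(url: str, allowed_domains: Sequence[str]) -> bool:
    """Return True only if the URL's scheme and host match an allowed domain."""
    target = urlparse(url)
    for domain in allowed_domains:
        allowed = urlparse(domain)
        if target.scheme == allowed.scheme and target.netloc == allowed.netloc:
            return True
    return False


class RestrictedCaller:
    """Hypothetical wrapper that fails loudly at construction time."""

    def __init__(self, allowed_domains: Sequence[str]) -> None:
        # Refuse to build an unrestricted object, so misconfiguration
        # surfaces immediately instead of silently at request time.
        if not allowed_domains:
            raise ValueError("allowed_domains must be a non-empty list")
        self.allowed_domains = list(allowed_domains)

    def check(self, url: str) -> None:
        """Raise if the URL points outside the allowlist."""
        if not is_allowed_url(url, self.allowed_domains):
            raise ValueError(f"{url} is not in the allowed domains")
```

Raising in `__init__` mirrors the "fail loudly upon object instantiation" note: a caller that never configured domains finds out before any request is made.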
2023-11-01 18:50:34 -04:00
Erick Friis
4421ba46d7 Demo Server, Fix Timescale (#12746)
- improve demo server
- missing deps
2023-11-01 15:29:34 -07:00
Eugene Yurtsev
0e1aedb9f4 Use jinja2 sandboxing by default (#12733)
* This is an opt-in feature, so users should be aware of risks if using
jinja2.
* Regardless, we'll add sandboxing by default to jinja2 templates -- this
  sandboxing is on a best-effort basis.
* Best strategy is still to make sure that jinja2 templates are only
loaded from trusted sources.
2023-11-01 14:54:01 -07:00
Erick Friis
ab5309f6f2 template updates (#12736)
- langchain license
- add timescale vector dep to that template
2023-11-01 13:53:26 -07:00
Lance Martin
6406c53089 Update template index w/ Timescale (#12729) 2023-11-01 12:04:54 -07:00
Erick Friis
14340ee7cd use http.client instead of urllib3 (#12660)
dep problems with requests

cloudflare debugging not worth it with urllib
2023-11-01 11:15:05 -07:00
Bagatur
eee5181b7a bump 328, exp 37 (#12722) 2023-11-01 10:27:39 -07:00
Erick Friis
3405dbbc64 dash not underscore (#12716)
template names are auto-populating with the wrong convention (with
underscores)
2023-11-01 09:48:37 -07:00
123-fake-st
8bd3ce59cd PyPDFLoader use url in metadata source if file is a web path (#12092)
**Description:** Update `langchain.document_loaders.pdf.PyPDFLoader` to
store url in metadata (instead of a temporary file path) if user
provides a web path to a pdf

- **Issue:** Related to #7034; the reporter on that issue submitted a PR
updating `PyMuPDFParser` for this behavior, but it has unresolved merge
issues as of 20 Oct 2023 #7077
- In addition to `PyPDFLoader` and `PyMuPDFParser`, these other classes
in `langchain.document_loaders.pdf` exhibit similar behavior and could
benefit from an update: `PyPDFium2Loader`, `PDFMinerLoader`,
`PDFMinerPDFasHTMLLoader`, `PDFPlumberLoader` (I'm happy to contribute
to some/all of that, including assisting with `PyMuPDFParser`, if my
work is agreeable)
- The root cause is that the underlying pdf parser classes, e.g.
`langchain.document_loaders.parsers.pdf.PyPDFParser`, never receive
information about the url; the parsers receive a
`langchain.document_loaders.blob_loaders.blob`, which contains the pdf
contents and local file path, but not the url
- This update passes the web path directly to the parser since it's
minimally invasive and doesn't require further changes to maintain
existing behavior for local files... bigger picture, I'd consider
extending `blob` so that extra information like this can be
communicated, but that has much bigger implications on the codebase
which I think warrants maintainer input

  - **Dependencies:** None

```python
# old behavior
>>> from langchain.document_loaders import PyPDFLoader
>>> loader = PyPDFLoader('https://arxiv.org/pdf/1706.03762.pdf')
>>> docs = loader.load()
>>> docs[0].metadata
{'source': '/var/folders/w2/zx77z1cs01s1thx5dhshkd58h3jtrv/T/tmpfgrorsi5/tmp.pdf', 'page': 0}

# new behavior
>>> from langchain.document_loaders import PyPDFLoader
>>> loader = PyPDFLoader('https://arxiv.org/pdf/1706.03762.pdf')
>>> docs = loader.load()
>>> docs[0].metadata
{'source': 'https://arxiv.org/pdf/1706.03762.pdf', 'page': 0}
```
2023-11-01 11:27:00 -04:00
Dave Kwon
b1954aab13 feat: Add page metadata on PDFMinerLoader (#12277)
- **Description:** PR for the suggestion in #12273.
Like the other PDF loaders, this loads a PDF page by page and attaches
page metadata.
  - **Issue:** #12273 
  - **Twitter handle:** @blue0_0hope

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-11-01 11:25:37 -04:00
Duda Nogueira
7148f3e1fe Weaviate - Fix schema existence check (#12711)
This will allow you to create the schema beforehand. The check was failing
and preventing importing into existing classes.

2023-11-01 08:22:15 -07:00
Sayandip
8dbbcf0b6c Adding a template for Solo Performance Prompting Agent (#12627)
**Description:** This template creates an agent that transforms a single
LLM into a cognitive synergist by engaging in multi-turn
self-collaboration with multiple personas.
**Tag maintainer:** @hwchase17

---------

Co-authored-by: Sayandip Sarkar <sayandip.sarkar@skypointcloud.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2023-11-01 08:10:07 -07:00
Aidos Kanapyanov
ae63c186af Mask API key for Anyscale LLM (#12406)
Description: Add masking of API Key for Anyscale LLM when printed.
Issue: #12165 
Dependencies: None
Tag maintainer: @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-11-01 10:22:26 -04:00
Predrag Gruevski
5ae51a8a85 Fix typo highlighted by ruff autoformatter. (#12691)
H/t @MichaReiser for spotting it:
https://github.com/langchain-ai/langchain/pull/12585/files#r1378253045
2023-10-31 22:16:06 -04:00
Predrag Gruevski
724b92231d Remove black caching config from CI lint workflow. (#12594)
To merge after #12585 is merged.
2023-10-31 21:39:05 -04:00
Predrag Gruevski
0ea837404a Only publish to test PyPI from the _test_release.yml workflow. (#12668)
PyPI trusted publishing wants to know which workflow is expected to do
the publish. We always want to publish from the same workflow, so we're
making `_test_release.yml` the only workflow that publishes to Test
PyPI.
2023-10-31 21:36:38 -04:00
Predrag Gruevski
321cd44f13 Use separate jobs for building and publishing test releases. (#12671)
This follows the principle of least privilege. Our `poetry build` step
doesn't need, and shouldn't get, access to our GitHub OIDC capability.

This is the same structure as I used in the already-merged PR for
refactoring the regular PyPI release workflow: #12578.
2023-10-31 21:36:26 -04:00
Erick Friis
44c8b159b9 properly increment version in cli (#12685)
Went from 0.0.9 -> 0.0.11 without releasing. Back to 10, then release.
2023-10-31 17:27:43 -07:00
Erick Friis
b825dddf95 fix elastic rag template in playground (#12682)
- a few instructions in the readme (load_documents -> ingest.py)
- added docker run command for local elastic
- adds input type definition to render playground properly
2023-10-31 17:18:35 -07:00
Lance Martin
f0eba1ac63 Add RAG input types (#12684)
Co-authored-by: Erick Friis <erick@langchain.dev>
2023-10-31 17:13:44 -07:00
Erick Friis
392cfbee24 link to templates (#12680) 2023-10-31 16:19:22 -07:00
Leonid Ganeline
ddcec005bc fix for YahooFinanceNewsTool (#12665)
Added YahooFinanceNewsTool to `__init__.py`, where it had been missed.
2023-10-31 14:58:09 -07:00
Predrag Gruevski
09711ad5a1 Both lint and format templates with ruff v0.1.3. (#12676)
- Both lint and format code in `templates`.
- Upgrade to ruff v0.1.3.
2023-10-31 14:52:00 -07:00
Predrag Gruevski
01a3c9b94e Use an in-project virtualenv in the CLI package. (#12678)
Keeping it in sync with how our other packages are configured.
2023-10-31 14:51:24 -07:00
Predrag Gruevski
f7f35a9102 Use black to lint notebooks and docs for now. (#12679)
Due to #12677 having lots of errors for the time being.
2023-10-31 14:51:05 -07:00
Jacob Lee
bd668fcea1 Adds version CLI command (#12619)
Will be automatically bumped with `poetry version patch`.

@efriis @hwchase17

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-10-31 14:50:04 -07:00
Frank
bf5805bb32 Add quip loader (#12259)
- **Description:** implement [quip](https://quip.com) loader
  - **Issue:** https://github.com/langchain-ai/langchain/issues/10352
  - **Dependencies:** No
  -  pass make format, make lint, make test

---------

Co-authored-by: Hao Fan <h_fan@apple.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-31 14:11:24 -07:00
Roman Vasilyev
c9a6940d58 PGVector fix (#12592)
latest release broken, this fixes it

---------

Co-authored-by: Roman Vasilyev <rvasilyev@mozilla.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-31 17:01:15 -04:00
Lance Martin
9e17d1a225 Update Vertex template (#12644)
Co-authored-by: Erick Friis <erick@langchain.dev>
2023-10-31 14:00:22 -07:00
Predrag Gruevski
aa3f4a9bc8 Remove the CLI package's pydantic compatibility tests. (#12675)
They aren't necessary, since the CLI package doesn't have a direct
dependency on pydantic.
2023-10-31 16:57:38 -04:00
Predrag Gruevski
e8b99364b3 Use ruff for both linting and formatting in langchain-cli. (#12672)
Prior to this PR, `ruff` was used only for linting and not for
formatting, despite the names of the commands. This PR makes it be used
for both linting code and autoformatting it.
2023-10-31 13:52:25 -07:00
Harrison Chase
9a10b2b047 fix plate chain (#12673) 2023-10-31 13:45:09 -07:00
Margaret Qian
acfc485808 Update MosaicML Embedding Input Key (#12657)
This input key was missed in the last update PR:
https://github.com/langchain-ai/langchain/pull/7391

The input/output formats are intended to be like this:

```
{"inputs": [<prompt>]} 

{"outputs": [<output_text>]}
```
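
A small sketch of the request/response shapes shown above, assuming JSON over the wire. `build_request` and `parse_response` are hypothetical helpers, not part of the MosaicML integration.

```python
import json
from typing import Any, List


def build_request(prompts: List[str]) -> str:
    """Serialize prompts under the "inputs" key."""
    return json.dumps({"inputs": prompts})


def parse_response(body: str) -> List[Any]:
    """Read results back from the "outputs" key."""
    return json.loads(body)["outputs"]
```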
2023-10-31 14:43:30 -04:00
Erika Cardenas
d26ac5f999 Update README for Hybrid Search Weaviate (#12661)
- **Description:** Updated the README for Hybrid Search Weaviate
2023-10-31 11:02:34 -07:00
Predrag Gruevski
c871cc5055 Remove print() statements which seemed leftover from debugging. (#12648)
Added in #12159 presumably during debugging. Right now they cause a bit of visual noise.
2023-10-31 13:45:48 -04:00
Erick Friis
2a7e0a27cb update lc version (#12655)
also updated py version in `csv-agent` and `rag-codellama-fireworks`
because they have stricter python requirements
2023-10-31 10:19:15 -07:00
Predrag Gruevski
360cff81a3 Overwrite existing distributions when uploading to test PyPI. (#12658) 2023-10-31 10:02:50 -07:00
Lance Martin
da94c750c5 Add RAG template for Timescale Vector (#12651)

---------

Co-authored-by: Matvey Arye <mat@timescale.com>
2023-10-31 09:56:29 -07:00
Noam Gat
14e8c74736 LM Format Enforcer Integration + Sample Notebook (#12625)
## Description

This PR adds support for
[lm-format-enforcer](https://github.com/noamgat/lm-format-enforcer) to
LangChain.

![image](https://raw.githubusercontent.com/noamgat/lm-format-enforcer/main/docs/Intro.webp)

The library is similar to jsonformer / RELLM, which are supported in
LangChain, but has several advantages, such as
- Batching and Beam search support
- More complete JSON Schema support
- LLM has control over whitespace, improving quality
- Better runtime performance due to only calling the LLM's generate()
function once per generate() call.

The integration is loosely based on the jsonformer integration in terms
of project structure.

## Dependencies

No compile-time dependency was added, but if `lm-format-enforcer` is not
installed, a runtime error will occur when the integration is used.

## Tests

Due to the integration modifying the internal parameters of the
underlying huggingface transformer LLM, it is not possible to test
without building a real LM, which requires internet access. So, similar
to the jsonformer and RELLM integrations, the testing is via the
notebook.

## Twitter Handle

[@noamgat](https://twitter.com/noamgat)


Looking forward to hearing feedback!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-31 09:49:01 -07:00
Stefano Lottini
a4e4b5a86f Relax python version and remove need for explicit setup step (#12637)
This PR addresses what seems like an unnecessary Python version
restriction in the pyproject.toml specs within both Cassandra (/Astra DB)
templates. With "^3.11" I got some version incompatibilities with the
latest "langchain add [...]" commands, so these are now relaxed in line
with the other templates I could inspect.

Incidentally, in the "entomology" template, the need for an explicit
"setup" step for the user to carry on has been removed, replaced by a
check-and-execute-if-necessary instruction on app startup.

Thank you for your attention!
2023-10-31 09:42:27 -07:00
Predrag Gruevski
5308b836c7 Upgrade to actions/checkout@v4 in the docs lint job. (#12581) 2023-10-31 12:41:18 -04:00
Predrag Gruevski
94f018f1ba Support release-testing packages with dashes in their names. (#12654) 2023-10-31 12:40:34 -04:00
Erick Friis
912ace18e9 fix template py verisons (#12650) 2023-10-31 09:20:29 -07:00
Brian McBrayer
b74468f399 Fix small typo on Founcational -> Router notebook (#12634)
- **Description:** Fix small typo on Founcational -> Router notebook
2023-10-31 09:16:29 -07:00
Predrag Gruevski
72fa5a463d Show ruff output inline in GitHub PRs. (#12647) 2023-10-31 12:16:01 -04:00
William FH
17c2e3b87e Rename Template (#12649)
To chatbot feedback. Update import
2023-10-31 09:15:30 -07:00
Erick Friis
7f6e751a3d template updates (#12646) 2023-10-31 09:13:58 -07:00
Leonid Kuligin
a53cac4508 added template to use Vertex Vector Search for q&a (#12622)
added template to use Vertex Vector Search for q&a
2023-10-31 08:49:24 -07:00
Lance Martin
944cb552bb Minor updates to READMEs (#12642) 2023-10-31 08:34:46 -07:00
William FH
88f0f1e73b Conversational Feedback (#12590)
Context in the README.

Show how to score chat responses based on a followup from the user and
then log that as feedback in LangSmith
2023-10-31 08:34:17 -07:00
Predrag Gruevski
f94e24dfd7 Install and use ruff format instead of black for code formatting. (#12585)
Best to review one commit at a time, since two of the commits are 100%
autogenerated changes from running `ruff format`:
- Install and use `ruff format` instead of black for code formatting.
- Output of `ruff format .` in the `langchain` package.
- Use `ruff format` in experimental package.
- Format changes in experimental package by `ruff format`.
- Manual formatting fixes to make `ruff .` pass.
2023-10-31 10:53:12 -04:00
William FH
bfd719f9d8 bind_functions convenience method (#12518)
I always take 20-30 seconds to re-discover where the
`convert_to_openai_function` wrapper lives in our codebase. Chat
langchain [has no
clue](https://smith.langchain.com/public/3989d687-18c7-4108-958e-96e88803da86/r)
what to do either. There's the older `create_openai_fn_chain` , but we
haven't been recommending it in LCEL. The example we show in the
[cookbook](https://python.langchain.com/docs/expression_language/how_to/binding#attaching-openai-functions)
is really verbose.


General function calling should be as simple as possible to do, so this
seems a bit more ergonomic to me (feel free to disagree). Another option
would be to coerce directly in the class's init (or when
calling invoke), if provided. I'm not 100% set against that. That
approach may be too easy but not simple. This PR feels like a decent
compromise between simple and easy.

```python
from enum import Enum
from typing import Optional

from pydantic import BaseModel, Field


class Category(str, Enum):
    """The category of the issue."""

    bug = "bug"
    nit = "nit"
    improvement = "improvement"
    other = "other"


class IssueClassification(BaseModel):
    """Classify an issue."""

    category: Category
    other_description: Optional[str] = Field(
        description="If classified as 'other', the suggested other category"
    )
    

from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI().bind_functions([IssueClassification])
llm.invoke("This PR adds a convenience wrapper to the bind argument")

# AIMessage(content='', additional_kwargs={'function_call': {'name': 'IssueClassification', 'arguments': '{\n  "category": "improvement"\n}'}})
```
2023-10-31 07:15:37 -07:00
Nuno Campos
3143324984 Improve Runnable type inference for input_schemas (#12630)
- Prefer lambda type annotations over inferred dict schema
- For sequences that start with RunnableAssign infer seq input type as
"input type of 2nd item in sequence - output type of runnable assign"
2023-10-31 13:22:54 +00:00
Nuno Campos
2f563cee20 Add Runnable.with_listeners() (#12549)
- This binds start/end/error listeners to a runnable, which will be
called with the Run object
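
A rough sketch of the listener-binding idea, assuming a plain callable instead of a Runnable and a dict standing in for the Run object. `attach_listeners` and `RunRecord` are hypothetical names, not the LangChain API.

```python
from typing import Any, Callable, Dict, Optional

RunRecord = Dict[str, Any]  # stand-in for the Run object
Listener = Optional[Callable[[RunRecord], None]]


def attach_listeners(
    func: Callable[[Any], Any],
    on_start: Listener = None,
    on_end: Listener = None,
    on_error: Listener = None,
) -> Callable[[Any], Any]:
    """Wrap func so the listeners fire on start, on success, and on error."""

    def wrapped(value: Any) -> Any:
        run: RunRecord = {"input": value, "output": None, "error": None}
        if on_start is not None:
            on_start(run)
        try:
            run["output"] = func(value)
        except Exception as exc:
            run["error"] = exc
            if on_error is not None:
                on_error(run)
            raise  # listeners observe the failure; they do not swallow it
        if on_end is not None:
            on_end(run)
        return run["output"]

    return wrapped
```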
2023-10-31 11:04:51 +00:00
Bagatur
bcc62d63be bump 327 (#12623) 2023-10-31 02:18:08 -07:00
Erick Friis
a1fae1fddd Readme rewrite (#12615)
Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-10-31 00:06:02 -07:00
Ankur Singh
00766c9f31 Improves the description of the installation command (#12354)
- **Description:**

 Before: 
`
To install modules needed for the common LLM providers, run:
`

After:
`
To install modules needed for the common LLM providers, run the
following command. Please bear in mind that this command is exclusively
compatible with the `bash` shell:
`


> This is required for the user so that the user will know if this
command is compatible with `zsh` or not.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-30 18:56:48 -07:00
Yujie Qian
1dbb77d7db VoyageEmbeddings (#12608)
- **Description:** Integrate VoyageEmbeddings into LangChain, with tests
and docs
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Tag maintainer:** N/A
  - **Twitter handle:** @Voyage_AI_

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-30 18:37:43 -07:00
chocolate4
92bf40a921 Add a new vector store hippo for langchain #11763 (#12412)
#11763

---------

Co-authored-by: TranswarpHippo <hippo.0.assistant@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-30 18:35:23 -07:00
Karthik Raja A
342d6c7ab6 Multi on client toolkit (#12392)
- Add MultiOn close function, update key value, and add async
functionality
- Solved the "key value TabId not found" error (updated to use the latest
key value)
  
@hwchase17
2023-10-30 18:34:56 -07:00
Prabin Nepal
b109cb031b SecretStr for fireworks api (#12475)
- **Description:** This pull request removes secrets present in raw
format,
- **Issue:** Fireworks api key was exposed when printing out the
langchain object
[#12165](https://github.com/langchain-ai/langchain/issues/12165)
 - **Maintainer:** @eyurtsev
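
A minimal sketch of why wrapping the key helps, loosely modeled on the `SecretStr` idea named in the title. `MaskedSecret` and `FakeClient` are hypothetical stand-ins, not pydantic or LangChain classes.

```python
from dataclasses import dataclass


class MaskedSecret:
    """Holds a secret and hides it from str() and repr()."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        """Explicit accessor for the one place that really needs the key."""
        return self._value

    def __repr__(self) -> str:
        return "MaskedSecret('**********')"

    def __str__(self) -> str:
        return "**********"


@dataclass
class FakeClient:
    """Hypothetical client object; printing it must not leak the key."""

    model: str
    api_key: MaskedSecret
```

Printing `FakeClient("m", MaskedSecret("sk-123"))` shows only the masked form, while `api_key.get_secret_value()` still returns the real key for outgoing requests.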

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-30 18:17:53 -07:00
Harrison Chase
f35a65124a improve agent templates (#12528)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-30 18:15:13 -07:00
Harrison Chase
75bb28afd8 Harrison/pii chatbot (#12523)
the pii detection in the template is pretty basic, will need to be
customized per use case

the chain it "protects" can be swapped out for any chain
2023-10-30 18:13:12 -07:00
Harrison Chase
a32c236c64 bump cli to 009 (#12611) 2023-10-30 18:12:08 -07:00
Erika Cardenas
b97b9eda21 Hybrid Search Weaviate Template (#12606)
- **Description:** This template covers hybrid search in Weaviate
  - **Dependencies:** No
  - **Twitter handle:** @ecardenas300

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-10-30 18:10:48 -07:00
Martin Schade
0c7f1d8b21 Textract linearizer (#12446)
**Description:** Textract PDF Loader generating linearized output,
meaning it will replicate the structure of the source document as closely
as possible based on the features passed into the call (e.g. LAYOUT,
FORMS, TABLES). With LAYOUT reading order for multi-column documents or
identification of lists and figures is supported and with TABLES it will
generate the table structure as well. FORMS will indicate "key: value"
with columms.
  - **Issue:** fixes #12068
- **Dependencies:** amazon-textract-textractor is added, which provides
the linearization
  - **Tag maintainer:** @3coins 

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-30 18:02:10 -07:00
Harrison Chase
a7d5e0ce8a add guardrails profanity (#12609) 2023-10-30 17:01:23 -07:00
Erick Friis
e933212a3d run poetry build in working dir (#12610)
Was failing because was trying to build from root:
https://github.com/langchain-ai/langchain/actions/runs/6700033981/job/18205251365
2023-10-30 16:58:34 -07:00
Erick Friis
f39246bd7e cli should pull instead of delete+clone (#12607)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-10-30 16:44:09 -07:00
Harrison Chase
8b5e879171 add a template for the package readme (#12499)
Co-authored-by: Erick Friis <erick@langchain.dev>
2023-10-30 16:39:39 -07:00
Bagatur
9bedda50f2 Bagatur/lakefs loader2 (#12524)
Co-authored-by: Jonathan Rosenberg <96974219+Jonathan-Rosenberg@users.noreply.github.com>
2023-10-30 16:30:27 -07:00
Brian McBrayer
3243dcc83e Fix very small typo (#12603)
- **Description:** this is the world's smallest typo change of a typo I
saw while reading the docs
2023-10-30 16:30:18 -07:00
Ackermann Yuriy
99b69fe607 Fixed missing optional tags. Added default key value for Ollama (#12599)
Added missing Optional typings. Added default values for Ollama optional
keys.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-30 16:30:10 -07:00
Lance Martin
f6f3ca12e7 Codebase RAG fireworks (#12597) 2023-10-30 16:21:56 -07:00
Harrison Chase
481bf6fae6 hosting note (#12589) 2023-10-30 15:31:31 -07:00
David Duong
b5c17ff188 Force List[Tuple[str,str]] to chat history widget (#12530)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-30 15:19:32 -07:00
David Duong
d39b4b61b6 Batch apply poetry lock --no-update for all templates (#12531)
Ran the following bash script for all templates

```bash
#!/bin/bash

set -e
current_dir="$(pwd)"
for directory in */; do
    if [ -d "$directory" ]; then
        (cd "$directory" && poetry lock --no-update)
    fi
done

cd "$current_dir"
```

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-30 15:18:53 -07:00
Kenzie Mihardja
e914283cf9 add docs to min_chunk_size (#12537)
Minor addition to documentation to elaborate on min_chunk_size.

Co-authored-by: Kenzie Mihardja <kenzie@docugami.com>
2023-10-30 15:13:52 -07:00
Bagatur
016813d189 factor out to_secret (#12593) 2023-10-30 15:10:25 -07:00
hsuyuming
630ae24b28 implement get_num_tokens to use google's count_tokens function (#10565)
can get the correct token count instead of using gpt-2 model

**Description:** 
Implement get_num_tokens within VertexLLM to use google's count_tokens
function.
(https://cloud.google.com/vertex-ai/docs/generative-ai/get-token-count).
So we don't need to download the gpt-2 model from Hugging Face, and when
we run the mapreduce chain we get the correct token count.

**Tag maintainer:** 
@lkuligin 
**Twitter handle:** 
My twitter: @abehsu1992626

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-30 15:10:05 -07:00
Pham Vu Thai Minh
33e77a1007 Async support for FAISS (#11333)
Following this tutorial about using OpenAI Embeddings with FAISS

https://python.langchain.com/docs/integrations/vectorstores/faiss

```python
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.document_loaders import TextLoader

loader = TextLoader("../../../extras/modules/state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

embeddings = OpenAIEmbeddings()
```

This works fine

```python
db = FAISS.from_documents(docs, embeddings)
query = "What did the president say about Ketanji Brown Jackson"
docs = db.similarity_search(query)
```

But the async version does not

```python
db = await FAISS.afrom_documents(docs, embeddings)  # NotImplementedError
query = "What did the president say about Ketanji Brown Jackson"

docs = await db.asimilarity_search(query) # this will use await asyncio.get_event_loop().run_in_executor under the hood and will not call OpenAIEmbeddings.aembed_query but call OpenAIEmbeddings.embed_query
```

So this PR adds async/await support for FAISS
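
To illustrate the difference described above, here is a sketch using only `asyncio`. The two embedding functions are hypothetical stand-ins for `embed_query` / `aembed_query`; none of this is the FAISS or OpenAIEmbeddings code.

```python
import asyncio
from typing import List


def embed_query(text: str) -> List[float]:
    """Hypothetical blocking embedding call (stand-in for a network request)."""
    return [float(len(text))]


async def aembed_query(text: str) -> List[float]:
    """Hypothetical native async embedding call."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return [float(len(text))]


async def search_via_executor(query: str) -> List[float]:
    # Fallback path: the sync function runs in the default thread pool,
    # so the async embedding method is never called.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, embed_query, query)


async def search_native(query: str) -> List[float]:
    # Native path: awaits the async embedding method directly.
    return await aembed_query(query)
```

Both paths return the same result; the executor path just occupies a worker thread per call instead of awaiting the I/O on the event loop.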

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-30 15:08:53 -07:00
Lance Martin
26f0ca222d RAG template for MongoDB Atlas Vector Search (#12526) 2023-10-30 14:31:34 -07:00
Jeff Zhuo
13b89815a3 Issue: fix the issue #11648 init minimax llm (#12554)
Fixes https://github.com/langchain-ai/langchain/issues/11648: Minimax
LLM failed to initialize

The idea for this fix comes from
https://github.com/langchain-ai/langchain/issues/10917#issuecomment-1765606725:

do not use underscores in the Python model class

---------

Co-authored-by: zhuojianming@cmcm.com <zhuojianming@cmcm.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-30 14:30:17 -07:00
Florian Valeye
bfb27324cb [Matching Engine] Update the Matching Engine to include the distance and filters (#12555)
Hello 👋,

This Pull Request adds more capability to the
[MatchingEngine](https://api.python.langchain.com/en/latest/vectorstores/langchain.vectorstores.matching_engine.MatchingEngine.html)
vectorstore of GCP. It includes the
`similarity_search_by_vector_with_relevance_scores` function and also
[filters](https://cloud.google.com/vertex-ai/docs/vector-search/filtering)
to `filter` the namespaces when retrieving the results.

- **Description:** Add
[filter](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.MatchingEngineIndexEndpoint#google_cloud_aiplatform_MatchingEngineIndexEndpoint_find_neighbors)
in `similarity_search` and add
`similarity_search_by_vector_with_relevance_scores` method
  - **Dependencies:** None
  - **Tag maintainer:** Unknown

Thank you!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-30 14:12:59 -07:00
Predrag Gruevski
3c5c384f1a Test-publish to test PyPI and separate jobs to limit permissions. (#12578)
Before making a new `langchain` release, we want to test that everything
works as expected. This PR lets us publish `langchain` to test PyPI,
then install it from there and run checks to ensure everything works
normally before publishing it "for real".

It also takes the opportunity to refactor the build process, splitting
up the build, release-creation, and PyPI upload steps into separate jobs
that do not share their elevated permissions with each other.
2023-10-30 17:10:14 -04:00
Harrison Chase
1d51363e49 change project template (#12493) 2023-10-30 14:06:30 -07:00
Holt Skinner
e53b9ccd70 feat: Add Google Cloud Text-to-Speech Tool (#12572)
- Add Tool for [Google Cloud
Text-to-Speech](https://cloud.google.com/text-to-speech)
- Follows similar structure to [Eleven Labs
Text2Speech](https://python.langchain.com/docs/integrations/tools/eleven_labs_tts)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-30 14:05:39 -07:00
Bagatur
1f2c672d4a add routing by embedding doc (#12580) 2023-10-30 13:03:16 -07:00
William FH
199630ff93 Replace You with DDG in xml agent (#12504)
You requires an email to get an API key, which IMO is too much friction.
DuckDuckGo is free and easy to install.
2023-10-30 12:51:00 -07:00
Adilkhan Sarsen
6e702b9c36 Deep memory support in LangChain (#12268)
- Description: adds support for Activeloop's DeepMemory feature, which
boosts recall by up to 25%. Adds a Jupyter notebook showcasing the feature
and makes index params explicit.
- Twitter handle: we would really appreciate it if this could be announced
on Twitter.

---------

Co-authored-by: adolkhan <adilkhan.sarsen@alumni.nu.edu.kz>
2023-10-30 12:16:14 -07:00
Lance Martin
c57945e0a8 Formatting on ntbks (#12576) 2023-10-30 11:32:31 -07:00
Lance Martin
08103e6d48 Minor template cleaning (#12573) 2023-10-30 11:27:44 -07:00
billytrend-cohere
b1e3843931 Add client_name="langchain" to Cohere usage (#11328)
Hey, we're looking to invest more in adding Cohere integrations to
LangChain, so we would love to get a better idea of how it's used.
Hopefully this PR is acceptable. This week I'm also going to be looking
into adding our new [retrieval augmented generation
product](https://txt.cohere.com/chat-with-rag/) to LangChain.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-30 11:20:55 -07:00
Bagatur
37aec1e050 bump 326 (#12569) 2023-10-30 10:11:17 -07:00
Eugene Yurtsev
1b1a2d5740 Image Caption accepts bytes for images (#12561)
Accept bytes for images in image caption

---------

Co-authored-by: webcoderz <19884161+webcoderz@users.noreply.github.com>
2023-10-30 12:29:54 -04:00
Nuno Campos
7897483819 Allow astream_log to be used inside atrace_as_chain_group (#12558)
2023-10-30 15:55:16 +00:00
Tomaz Bratanic
8e88ba16a8 Update neo4j template readmes (#12540) 2023-10-30 07:57:53 -07:00
Bagatur
b2138508cb google translate nb formatting (#12534) 2023-10-29 21:27:04 -07:00
Holt Skinner
e05bb938de Merge pull request #12433
* feat: Add Google Cloud Translation document transformer

* Merge branch 'langchain-ai:master' into google-translate

* Add documentation for Google Translate Document Transformer

* Fix line length error

* Merge branch 'master' into google-translate

* Merge branch 'google-translate' of https://github.com/holtskinner/lan…

* Addressed code review comments

* Merge branch 'master' into google-translate

* Merge branch 'google-translate' of https://github.com/holtskinner/lan…

* Removed extra variable

* Merge branch 'google-translate' of https://github.com/holtskinner/lan…

* Merge branch 'master' into google-translate

* Merge branch 'google-translate' of https://github.com/holtskinner/lan…

* Removed extra import
2023-10-29 21:22:36 -04:00
Samad Koita
d1fdcd4fcb Masking of API Key for GooseAI LLM (#12496)
Description: Add masking of API Key for GooseAI LLM when printed.
Issue: https://github.com/langchain-ai/langchain/issues/12165
Dependencies: None
Tag maintainer: @eyurtsev

---------

Co-authored-by: Samad Koita <>
2023-10-29 21:21:33 -04:00
Andrew Zhou
64c4a698a8 More comprehensive readthedocs document loader (#12382)
## **Description:**
When building our own readthedocs.io scraper, we noticed a couple of
interesting things:

1. Text lines with a lot of nested `<span>` tags would give unclean text
with a bunch of newlines. For example, for [Langchain's
documentation](https://api.python.langchain.com/en/latest/document_loaders/langchain.document_loaders.readthedocs.ReadTheDocsLoader.html#langchain.document_loaders.readthedocs.ReadTheDocsLoader),
a single line is represented in a complicated nested HTML structure, and
the naive `soup.get_text()` call currently being made will create a
newline for each nested HTML element. Therefore, the document loader
would give a messy, newline-separated blob of text. This would be true
in a lot of cases.

<img width="945" alt="Screenshot 2023-10-26 at 6 15 39 PM"
src="https://github.com/langchain-ai/langchain/assets/44193474/eca85d1f-d2bf-4487-a18a-e1e732fadf19">
<img width="1031" alt="Screenshot 2023-10-26 at 6 16 00 PM"
src="https://github.com/langchain-ai/langchain/assets/44193474/035938a0-9892-4f6a-83cd-0d7b409b00a3">

Additionally, content from iframes, code from scripts, CSS from styles,
etc. will be scraped if the element is a descendant of the selector (which
happens more often than you'd think). For example, [this
page](https://pydeck.gl/gallery/contour_layer.html#) will scrape 1.5
million characters of content that looks like this:

<img width="1372" alt="Screenshot 2023-10-26 at 6 32 55 PM"
src="https://github.com/langchain-ai/langchain/assets/44193474/dbd89e39-9478-4a18-9e84-f0eb91954eac">

Therefore, I wrote a recursive `_get_clean_text(soup)` class function that
(a) skips all irrelevant elements and (b) only adds newlines when
necessary.

2. Index pages (like [this
one](https://api.python.langchain.com/en/latest/api_reference.html))
would be loaded, chunked, and eventually embedded. This is a problem
not only because the user ends up embedding irrelevant information, but
also because index pages are very likely to show up in retrieved content,
making retrieval less effective (in our tests). Therefore, I added a
bool parameter `exclude_index_pages`, defaulting to False (which is the
current behavior, although I'd petition to default this to True), that
skips all pages where links take up 50% or more of the page. In manual
testing, this seemed to be the best threshold.
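
The two ideas above (skipping non-content elements while adding newlines only at block boundaries, and flagging link-heavy index pages) can be sketched with the standard library alone. This is an illustrative sketch, not the loader's actual code: it uses `html.parser` instead of BeautifulSoup, and the names `CleanTextParser`, `clean_text`, `is_index_page`, and both tag sets are assumptions made for the example.

```python
from html.parser import HTMLParser

# Assumed list of tags whose contents are never readable page text.
SKIPPED_TAGS = {"script", "style", "iframe", "noscript"}
# Assumed list of block-level tags that should start a new output line.
BLOCK_TAGS = {"p", "div", "li", "br", "h1", "h2", "h3", "pre", "tr"}


class CleanTextParser(HTMLParser):
    """Collects visible text and counts how much of it sits inside links."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = []
        self.skip_depth = 0   # > 0 while inside a skipped element
        self.link_depth = 0   # > 0 while inside an <a> element
        self.link_chars = 0
        self.total_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1
        elif tag == "a":
            self.link_depth += 1
        if tag in BLOCK_TAGS and not self.skip_depth:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth:
            self.skip_depth -= 1
        elif tag == "a" and self.link_depth:
            self.link_depth -= 1
        if tag in BLOCK_TAGS and not self.skip_depth:
            self.parts.append("\n")

    def handle_data(self, data):
        if self.skip_depth:
            return  # ignore script/style/iframe contents entirely
        self.parts.append(data)
        visible = len(data.strip())
        self.total_chars += visible
        if self.link_depth:
            self.link_chars += visible


def _parse(html: str) -> CleanTextParser:
    parser = CleanTextParser()
    parser.feed(html)
    parser.close()
    return parser


def clean_text(html: str) -> str:
    """Return visible text with inline tags flattened onto one line."""
    text = "".join(_parse(html).parts)
    lines = (" ".join(line.split()) for line in text.splitlines())
    return "\n".join(line for line in lines if line)


def is_index_page(html: str, threshold: float = 0.5) -> bool:
    """True when link text makes up at least `threshold` of the visible text."""
    parser = _parse(html)
    if parser.total_chars == 0:
        return False
    return parser.link_chars / parser.total_chars >= threshold
```

Inline elements such as `<span>` never emit a newline here, which is the property the description asks for; only the assumed block-level tags do.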



## Other Information:
  - **Issue:** n/a
  - **Dependencies:** n/a
  - **Tag maintainer:** n/a
  - **Twitter handle:** @andrewthezhou

---------

Co-authored-by: Andrew Zhou <andrew@heykona.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-29 16:26:53 -07:00
Peter Vandenabeele
3468c038ba Add unit tests for document_transformers/beautiful_soup_transformer.py (#12520)
- **Description:**
* Add unit tests for document_transformers/beautiful_soup_transformer.py
* Basic functionality is tested (extract tags, remove tags, drop lines)
    * add a FIXME comment about the order of tags that is not preserved
      (and a passing test, but with the expected tags now out-of-order)
  - **Issue:** None
  - **Dependencies:** None
  - **Tag maintainer:** @rlancemartin 
  - **Twitter handle:** `peter_v`

Please make sure your PR is passing linting and testing before
submitting.

=> OK: I ran `make format`, `make test` (passing after install of
beautifulsoup4) and `make lint`.
2023-10-29 16:24:47 -07:00
Bagatur
d31d705407 update contributing (#12532) 2023-10-29 16:22:18 -07:00
Bagatur
0b4b9e61fc Bagatur/fix doc ci (#12529) 2023-10-29 16:15:18 -07:00
Bagatur
2424fff3f1 notebook fmt (#12498) 2023-10-29 15:50:09 -07:00
Harrison Chase
56cc5b847c Harrison/add descriptions (#12522) 2023-10-29 15:11:37 -07:00
Anirudh Gautam
b257e6a4e8 Mask API key for AI21 LLM (#12418)
- **Description:** Added masking of the API Key for AI21 LLM when
printed and improved the docstring for AI21 LLM.
- Updated the AI21 LLM to utilize SecretStr from pydantic to securely
manage API key.
- Made improvements in the docstring of AI21 LLM. It now mentions that
the API key can also be passed as a named parameter to the constructor.
    - Added unit tests.
  - **Issue:** #12165 
  - **Tag maintainer:** @eyurtsev
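
The masking behavior described above can be illustrated without any dependency. In the sketch below, `MaskedSecret` is a hypothetical stand-in that imitates the documented behavior of pydantic's `SecretStr` (a masked `str()`/`repr()` and an explicit `get_secret_value()` accessor); it is not the pydantic class, and `FakeLLMConfig` is an invented name, not the AI21 wrapper.

```python
from dataclasses import dataclass


class MaskedSecret:
    """Hypothetical stand-in: hides its value unless explicitly asked for it."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        # The only way to read the raw key back out.
        return self._value

    def __str__(self) -> str:
        return "**********"

    def __repr__(self) -> str:
        return "MaskedSecret('**********')"


@dataclass
class FakeLLMConfig:
    """Invented config object; printing it must not leak the key."""

    model: str
    api_key: MaskedSecret


config = FakeLLMConfig(model="example-model", api_key=MaskedSecret("my-real-key"))
printed = repr(config)          # what a log line or print() would show
raw = config.api_key.get_secret_value()  # explicit, deliberate access
```

The design point is that accidental exposure (printing, logging, tracebacks) goes through `__str__`/`__repr__`, while code that actually needs the key has to call the accessor on purpose.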

---------

Co-authored-by: Anirudh Gautam <anirudh@Anirudhs-Mac-mini.local>
2023-10-29 14:53:41 -07:00
Nico Baier
35d726dc15 docs(prompt_templates): fix typo in prompt template (#12497)
- **Description:** Fixes a small typo in the [Prompt template
document](https://python.langchain.com/docs/modules/model_io/prompts/prompt_templates/)
  - **Dependencies:** none
2023-10-29 14:52:37 -07:00
silvhua
9dead1034c _dalle_image_url returns list of urls if n>1 (#11800)
- **Description:** Updated the `_dalle_image_url` method to return a
list of URLs if self.n>1,
  - **Issue:** #10691,
  - **Dependencies:** unsure,
  - **Tag maintainer:** @eyurtsev,
  - **Twitter handle:** @silvhua
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-29 14:23:23 -07:00
Bagatur
1815ea2fdb OpenAI runnable constructor (#12455) 2023-10-29 13:40:30 -07:00
William FH
a830b809f3 Patch forward ref bug (#12508)
Currently this raises an error:
```
from langchain.schema.runnable import RunnableLambda

bound = RunnableLambda(lambda x: x).with_config({"callbacks": []})

# ConfigError: field "callbacks" not yet prepared so type is still a ForwardRef, you might need to call RunnableConfig.update_forward_refs().
```

Rather than deal with cyclic imports and extra load time, etc., I think
it makes sense to just have a separate Callbacks definition here that is
a relaxed typehint.
2023-10-29 00:53:01 -07:00
William FH
36204c2baf Evaluation Callback Multi Response (#12505)
1. Allow run evaluators to return {"results": [list of evaluation
results]} in the evaluator callback.
2. Allows run evaluators to pick the target run ID to provide feedback
to

(1) means you could do something like a function call that populates a
full rubric in one go (not sure how reliable that is in general though)
rather than splitting off into separate LLM calls, which is cheaper and
less code to write.
(2) means you can provide feedback to runs on subsequent calls. The
immediate use case is adding an evaluator to a chat bot and assigning
feedback to previous conversation turns.

There is a corresponding change in the SDK.
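
A rough sketch of how a callback might normalize the two evaluator output shapes described above. Only the `{"results": [...]}` wrapper comes from the description; the function name `collect_feedback` and the result fields `key`, `score`, and `target_run_id` are assumptions made for illustration, not the actual LangChain or SDK schema.

```python
from typing import Any, Dict, List


def collect_feedback(
    evaluator_output: Dict[str, Any], current_run_id: str
) -> List[Dict[str, Any]]:
    """Flatten single- or multi-result evaluator output into feedback records."""
    # Multi-response evaluators wrap their results in {"results": [...]};
    # anything else is treated as one result dict.
    results = evaluator_output.get("results")
    if results is None:
        results = [evaluator_output]

    feedback = []
    for result in results:
        feedback.append(
            {
                # An evaluator may redirect feedback to another run
                # (e.g. an earlier conversation turn); default to this run.
                "run_id": result.get("target_run_id") or current_run_id,
                "key": result["key"],
                "score": result.get("score"),
            }
        )
    return feedback


rubric = {
    "results": [
        {"key": "helpfulness", "score": 1},
        {"key": "previous_turn_quality", "score": 0, "target_run_id": "run-1"},
    ]
}
records = collect_feedback(rubric, current_run_id="run-2")
```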
2023-10-28 23:18:29 -07:00
Harrison Chase
9e0ae56287 various templates improvements (#12500) 2023-10-28 22:13:22 -07:00
Harrison Chase
d85d4d7822 add cookbook for selecting llms based on context length (#12486) 2023-10-28 21:50:14 -07:00
Harrison Chase
0660c06cf1 add gha for cli (#12492) 2023-10-28 21:49:28 -07:00
0xC9
79cf01366e Update tool.py (#12472)
In the GoogleSerperResults class, the name field is defined as
'google_serrper_results_json'. This looks like a typo, and perhaps
should be 'google_serper_results_json'.

2023-10-28 21:49:01 -07:00
Harrison Chase
61f5ea4b5e Sphinxbio nls/add plate chain template (#12502)
Co-authored-by: Nicholas Larus-Stone <7347808+nlarusstone@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-28 21:48:17 -07:00
Harrison Chase
221134d239 Harrison/quick start (#12491)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-28 16:26:52 -07:00
Bagatur
e130680d74 Bagatur/self query doc update (#12461) 2023-10-28 14:37:14 -07:00
Piyush Jain
689853902e Added a rag template for Kendra (#12470)
## Description
Adds a rag template for Amazon Kendra with Bedrock.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-10-28 08:58:28 -07:00
Harrison Chase
eb903e211c bump to 36 (#12487) 2023-10-28 08:51:23 -07:00
Tyler Hutcherson
4209457bdc Redis langserve template (#12443)
Add Redis langserve template! Eventually will add semantic caching to
this too. But I was struggling to get that to work for some reason with
the LCEL implementation here.

- **Description:** Introduces the Redis LangServe template. A simple
RAG-based app built on top of Redis that allows you to chat with a company's
public financial data (Edgar 10k filings)
  - **Issue:** None
- **Dependencies:** The template contains the poetry project
requirements to run this template
  - **Tag maintainer:** @baskaryan @Spartee 
  - **Twitter handle:** @tchutch94

**Note**: this requires the commit here that deletes the
`_aget_relevant_documents()` method from the Redis retriever class that
wasn't implemented. That was breaking the langserve app.

---------

Co-authored-by: Sam Partee <sam.partee@redis.com>
2023-10-28 08:31:12 -07:00
Erick Friis
9adaa78c65 cli improvements (#12465)
Features
- add multiple repos by their branch/repo
- generate `pip install` commands and `add_route()` code
![Screenshot 2023-10-27 at 4 49 52
PM](https://github.com/langchain-ai/langchain/assets/9557659/3aec4cbb-3f67-4f04-8370-5b54ea983b2a)

Optimizations:
- group installs by repo/branch to avoid duplicate cloning
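
The grouping optimization can be sketched as follows. This is an assumed shape, not the CLI's real code: the function name `group_by_repo` and the `(repo, branch, subdirectory)` tuples are hypothetical, and the repository names in the example are placeholders.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Dependency = Tuple[str, str, str]  # (repo, branch, subdirectory), all hypothetical


def group_by_repo(deps: List[Dependency]) -> Dict[Tuple[str, str], List[str]]:
    """Group requested packages so each (repo, branch) pair is cloned once."""
    grouped: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for repo, branch, subdir in deps:
        grouped[(repo, branch)].append(subdir)
    return dict(grouped)


requested = [
    ("example-org/templates", "master", "templates/pkg-a"),
    ("example-org/templates", "master", "templates/pkg-b"),
    ("example-org/other", "dev", "pkg-c"),
]
plan = group_by_repo(requested)
# One clone per key in `plan`, then install every subdirectory listed under it.
```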
2023-10-28 08:25:31 -07:00
Piyush Jain
5545de0466 Updated the Bedrock rag template (#12462)
Updates the bedrock rag template.
- Removes pinecone and replaces with FAISS as the vector store
- Fixes the environment variables, setting defaults
- Adds a `main.py` test file for quick sanity testing
- Updates README.md with correct instructions
2023-10-27 17:02:28 -07:00
Lance Martin
5c2243ee91 Update llama.cpp and Ollama templates (#12466) 2023-10-27 16:54:54 -07:00
Lance Martin
f10c17c6a4 Update SQL templates (#12464) 2023-10-27 16:34:37 -07:00
Lance Martin
a476147189 Add Weaviate RAG template (#12460) 2023-10-27 15:19:34 -07:00
Adam Law
df4960a6d8 add reranking to azuresearch (#12454)
- **Description:** Returns the reranking score when using semantic
search
- **Issue:** #12317

---------

Co-authored-by: Adam Law <adamlaw@microsoft.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-27 14:14:09 -07:00
dependabot[bot]
389459af8f Bump @babel/traverse from 7.22.8 to 7.23.2 in /docs (#12453)
Bumps
[@babel/traverse](https://github.com/babel/babel/tree/HEAD/packages/babel-traverse)
from 7.22.8 to 7.23.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/babel/babel/releases"><code>@​babel/traverse</code>'s
releases</a>.</em></p>
<blockquote>
<h2>v7.23.2 (2023-10-11)</h2>
<p><strong>NOTE</strong>: This release also re-publishes
<code>@babel/core</code>, even if it does not appear in the linked
release commit.</p>
<p>Thanks <a
href="https://github.com/jimmydief"><code>@​jimmydief</code></a> for
your first PR!</p>
<h4>🐛 Bug Fix</h4>
<ul>
<li><code>babel-traverse</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/16033">#16033</a>
Only evaluate own String/Number/Math methods (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-preset-typescript</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/16022">#16022</a>
Rewrite <code>.tsx</code> extension when using
<code>rewriteImportExtensions</code> (<a
href="https://github.com/jimmydief"><code>@​jimmydief</code></a>)</li>
</ul>
</li>
<li><code>babel-helpers</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/16017">#16017</a>
Fix: fallback to typeof when toString is applied to incompatible object
(<a href="https://github.com/JLHwung"><code>@​JLHwung</code></a>)</li>
</ul>
</li>
<li><code>babel-helpers</code>,
<code>babel-plugin-transform-modules-commonjs</code>,
<code>babel-runtime-corejs2</code>, <code>babel-runtime-corejs3</code>,
<code>babel-runtime</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/16025">#16025</a>
Avoid override mistake in namespace imports (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
</ul>
</li>
</ul>
<h4>Committers: 5</h4>
<ul>
<li>Babel Bot (<a
href="https://github.com/babel-bot"><code>@​babel-bot</code></a>)</li>
<li>Huáng Jùnliàng (<a
href="https://github.com/JLHwung"><code>@​JLHwung</code></a>)</li>
<li>James Diefenderfer (<a
href="https://github.com/jimmydief"><code>@​jimmydief</code></a>)</li>
<li>Nicolò Ribaudo (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
<li><a
href="https://github.com/liuxingbaoyu"><code>@​liuxingbaoyu</code></a></li>
</ul>
<h2>v7.23.1 (2023-09-25)</h2>
<p>Re-publishing <code>@babel/helpers</code> due to a publishing error
in 7.23.0.</p>
<h2>v7.23.0 (2023-09-25)</h2>
<p>Thanks <a
href="https://github.com/lorenzoferre"><code>@​lorenzoferre</code></a>
and <a
href="https://github.com/RajShukla1"><code>@​RajShukla1</code></a> for
your first PRs!</p>
<h4>🚀 New Feature</h4>
<ul>
<li><code>babel-plugin-proposal-import-wasm-source</code>,
<code>babel-plugin-syntax-import-source</code>,
<code>babel-plugin-transform-dynamic-import</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15870">#15870</a>
Support transforming <code>import source</code> for wasm (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-helper-module-transforms</code>,
<code>babel-helpers</code>,
<code>babel-plugin-proposal-import-defer</code>,
<code>babel-plugin-syntax-import-defer</code>,
<code>babel-plugin-transform-modules-commonjs</code>,
<code>babel-runtime-corejs2</code>, <code>babel-runtime-corejs3</code>,
<code>babel-runtime</code>, <code>babel-standalone</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15878">#15878</a>
Implement <code>import defer</code> proposal transform support (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-generator</code>, <code>babel-parser</code>,
<code>babel-types</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15845">#15845</a>
Implement <code>import defer</code> parsing support (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
<li><a
href="https://redirect.github.com/babel/babel/pull/15829">#15829</a> Add
parsing support for the &quot;source phase imports&quot; proposal (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-generator</code>,
<code>babel-helper-module-transforms</code>, <code>babel-parser</code>,
<code>babel-plugin-transform-dynamic-import</code>,
<code>babel-plugin-transform-modules-amd</code>,
<code>babel-plugin-transform-modules-commonjs</code>,
<code>babel-plugin-transform-modules-systemjs</code>,
<code>babel-traverse</code>, <code>babel-types</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15682">#15682</a> Add
<code>createImportExpressions</code> parser option (<a
href="https://github.com/JLHwung"><code>@​JLHwung</code></a>)</li>
</ul>
</li>
<li><code>babel-standalone</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15671">#15671</a>
Pass through nonce to the transformed script element (<a
href="https://github.com/JLHwung"><code>@​JLHwung</code></a>)</li>
</ul>
</li>
<li><code>babel-helper-function-name</code>,
<code>babel-helper-member-expression-to-functions</code>,
<code>babel-helpers</code>, <code>babel-parser</code>,
<code>babel-plugin-proposal-destructuring-private</code>,
<code>babel-plugin-proposal-optional-chaining-assign</code>,
<code>babel-plugin-syntax-optional-chaining-assign</code>,
<code>babel-plugin-transform-destructuring</code>,
<code>babel-plugin-transform-optional-chaining</code>,
<code>babel-runtime-corejs2</code>, <code>babel-runtime-corejs3</code>,
<code>babel-runtime</code>, <code>babel-standalone</code>,
<code>babel-types</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15751">#15751</a> Add
support for optional chain in assignments (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-helpers</code>,
<code>babel-plugin-proposal-decorators</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15895">#15895</a>
Implement the &quot;decorator metadata&quot; proposal (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-traverse</code>, <code>babel-types</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15893">#15893</a> Add
<code>t.buildUndefinedNode</code> (<a
href="https://github.com/liuxingbaoyu"><code>@​liuxingbaoyu</code></a>)</li>
</ul>
</li>
<li><code>babel-preset-typescript</code></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/babel/babel/blob/main/CHANGELOG.md"><code>@​babel/traverse</code>'s
changelog</a>.</em></p>
<blockquote>
<h2>v7.23.2 (2023-10-11)</h2>
<h4>🐛 Bug Fix</h4>
<ul>
<li><code>babel-traverse</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/16033">#16033</a>
Only evaluate own String/Number/Math methods (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-preset-typescript</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/16022">#16022</a>
Rewrite <code>.tsx</code> extension when using
<code>rewriteImportExtensions</code> (<a
href="https://github.com/jimmydief"><code>@​jimmydief</code></a>)</li>
</ul>
</li>
<li><code>babel-helpers</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/16017">#16017</a>
Fix: fallback to typeof when toString is applied to incompatible object
(<a href="https://github.com/JLHwung"><code>@​JLHwung</code></a>)</li>
</ul>
</li>
<li><code>babel-helpers</code>,
<code>babel-plugin-transform-modules-commonjs</code>,
<code>babel-runtime-corejs2</code>, <code>babel-runtime-corejs3</code>,
<code>babel-runtime</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/16025">#16025</a>
Avoid override mistake in namespace imports (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
</ul>
</li>
</ul>
<h2>v7.23.0 (2023-09-25)</h2>
<h4>🚀 New Feature</h4>
<ul>
<li><code>babel-plugin-proposal-import-wasm-source</code>,
<code>babel-plugin-syntax-import-source</code>,
<code>babel-plugin-transform-dynamic-import</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15870">#15870</a>
Support transforming <code>import source</code> for wasm (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-helper-module-transforms</code>,
<code>babel-helpers</code>,
<code>babel-plugin-proposal-import-defer</code>,
<code>babel-plugin-syntax-import-defer</code>,
<code>babel-plugin-transform-modules-commonjs</code>,
<code>babel-runtime-corejs2</code>, <code>babel-runtime-corejs3</code>,
<code>babel-runtime</code>, <code>babel-standalone</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15878">#15878</a>
Implement <code>import defer</code> proposal transform support (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-generator</code>, <code>babel-parser</code>,
<code>babel-types</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15845">#15845</a>
Implement <code>import defer</code> parsing support (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
<li><a
href="https://redirect.github.com/babel/babel/pull/15829">#15829</a> Add
parsing support for the &quot;source phase imports&quot; proposal (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-generator</code>,
<code>babel-helper-module-transforms</code>, <code>babel-parser</code>,
<code>babel-plugin-transform-dynamic-import</code>,
<code>babel-plugin-transform-modules-amd</code>,
<code>babel-plugin-transform-modules-commonjs</code>,
<code>babel-plugin-transform-modules-systemjs</code>,
<code>babel-traverse</code>, <code>babel-types</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15682">#15682</a> Add
<code>createImportExpressions</code> parser option (<a
href="https://github.com/JLHwung"><code>@​JLHwung</code></a>)</li>
</ul>
</li>
<li><code>babel-standalone</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15671">#15671</a>
Pass through nonce to the transformed script element (<a
href="https://github.com/JLHwung"><code>@​JLHwung</code></a>)</li>
</ul>
</li>
<li><code>babel-helper-function-name</code>,
<code>babel-helper-member-expression-to-functions</code>,
<code>babel-helpers</code>, <code>babel-parser</code>,
<code>babel-plugin-proposal-destructuring-private</code>,
<code>babel-plugin-proposal-optional-chaining-assign</code>,
<code>babel-plugin-syntax-optional-chaining-assign</code>,
<code>babel-plugin-transform-destructuring</code>,
<code>babel-plugin-transform-optional-chaining</code>,
<code>babel-runtime-corejs2</code>, <code>babel-runtime-corejs3</code>,
<code>babel-runtime</code>, <code>babel-standalone</code>,
<code>babel-types</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15751">#15751</a> Add
support for optional chain in assignments (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-helpers</code>,
<code>babel-plugin-proposal-decorators</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15895">#15895</a>
Implement the &quot;decorator metadata&quot; proposal (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-traverse</code>, <code>babel-types</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15893">#15893</a> Add
<code>t.buildUndefinedNode</code> (<a
href="https://github.com/liuxingbaoyu"><code>@​liuxingbaoyu</code></a>)</li>
</ul>
</li>
<li><code>babel-preset-typescript</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15913">#15913</a> Add
<code>rewriteImportExtensions</code> option to TS preset (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
</ul>
</li>
<li><code>babel-parser</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15896">#15896</a>
Allow TS tuples to have both labeled and unlabeled elements (<a
href="https://github.com/yukukotani"><code>@​yukukotani</code></a>)</li>
</ul>
</li>
</ul>
<h4>🐛 Bug Fix</h4>
<ul>
<li><code>babel-plugin-transform-block-scoping</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15962">#15962</a>
fix: <code>transform-block-scoping</code> captures the variables of the
method in the loop (<a
href="https://github.com/liuxingbaoyu"><code>@​liuxingbaoyu</code></a>)</li>
</ul>
</li>
</ul>
<h4>💅 Polish</h4>
<ul>
<li><code>babel-traverse</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15797">#15797</a>
Expand evaluation of global built-ins in <code>@babel/traverse</code>
(<a
href="https://github.com/lorenzoferre"><code>@​lorenzoferre</code></a>)</li>
</ul>
</li>
<li><code>babel-plugin-proposal-explicit-resource-management</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15985">#15985</a>
Improve source maps for blocks with <code>using</code> declarations (<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
</ul>
</li>
</ul>
<h4>🔬 Output optimization</h4>
<ul>
<li><code>babel-core</code>,
<code>babel-helper-module-transforms</code>,
<code>babel-plugin-transform-async-to-generator</code>,
<code>babel-plugin-transform-classes</code>,
<code>babel-plugin-transform-dynamic-import</code>,
<code>babel-plugin-transform-function-name</code>,
<code>babel-plugin-transform-modules-amd</code>,
<code>babel-plugin-transform-modules-commonjs</code>,
<code>babel-plugin-transform-modules-umd</code>,
<code>babel-plugin-transform-parameters</code>,
<code>babel-plugin-transform-react-constant-elements</code>,
<code>babel-plugin-transform-react-inline-elements</code>,
<code>babel-plugin-transform-runtime</code>,
<code>babel-plugin-transform-typescript</code>,
<code>babel-preset-env</code>
<ul>
<li><a
href="https://redirect.github.com/babel/babel/pull/15984">#15984</a>
Inline <code>exports.XXX =</code> update in simple variable declarations
(<a
href="https://github.com/nicolo-ribaudo"><code>@​nicolo-ribaudo</code></a>)</li>
</ul>
</li>
</ul>
<h2>v7.22.20 (2023-09-16)</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="b4b9942a6c"><code>b4b9942</code></a>
v7.23.2</li>
<li><a
href="b13376b346"><code>b13376b</code></a>
Only evaluate own String/Number/Math methods (<a
href="https://github.com/babel/babel/tree/HEAD/packages/babel-traverse/issues/16033">#16033</a>)</li>
<li><a
href="ca58ec15cb"><code>ca58ec1</code></a>
v7.23.0</li>
<li><a
href="0f333dafcf"><code>0f333da</code></a>
Add <code>createImportExpressions</code> parser option (<a
href="https://github.com/babel/babel/tree/HEAD/packages/babel-traverse/issues/15682">#15682</a>)</li>
<li><a
href="3744545649"><code>3744545</code></a>
Fix linting</li>
<li><a
href="c7e6806e21"><code>c7e6806</code></a>
Add <code>t.buildUndefinedNode</code> (<a
href="https://github.com/babel/babel/tree/HEAD/packages/babel-traverse/issues/15893">#15893</a>)</li>
<li><a
href="38ee8b4dd6"><code>38ee8b4</code></a>
Expand evaluation of global built-ins in <code>@babel/traverse</code>
(<a
href="https://github.com/babel/babel/tree/HEAD/packages/babel-traverse/issues/15797">#15797</a>)</li>
<li><a
href="9f3dfd9021"><code>9f3dfd9</code></a>
v7.22.20</li>
<li><a
href="3ed28b29c1"><code>3ed28b2</code></a>
Fully support <code>||</code> and <code>&amp;&amp;</code> in
<code>pluginToggleBooleanFlag</code> (<a
href="https://github.com/babel/babel/tree/HEAD/packages/babel-traverse/issues/15961">#15961</a>)</li>
<li><a
href="77b0d73599"><code>77b0d73</code></a>
v7.22.19</li>
<li>Additional commits viewable in <a
href="https://github.com/babel/babel/commits/v7.23.2/packages/babel-traverse">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=@babel/traverse&package-manager=npm_and_yarn&previous-version=7.22.8&new-version=7.23.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-27 14:13:58 -07:00
Eugene Yurtsev
60d009f75a Add security note to API chain (#12452)
Add security note
2023-10-27 17:09:42 -04:00
Matvey Arye
11505f95d3 Improve handling of empty queries for timescale vector (#12393)
**Description:** Improve handling of empty queries in timescale-vector.
For timescale-vector it is more efficient to get a None embedding when
the embedding has no semantic meaning. It allows timescale-vector to
perform more optimizations. Thus, when the query is empty, use a None
embedding.

 Also pass down constructor arguments to the timescale vector client.
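
The empty-query handling described above can be sketched as follows. This is a minimal illustration, not the code from this PR: `embed_query_or_none` and `fake_embed` are hypothetical names, and treating a whitespace-only query as empty is an assumption.

```python
from typing import Callable, List, Optional


def embed_query_or_none(
    query: str, embed: Callable[[str], List[float]]
) -> Optional[List[float]]:
    """Return None for an empty query, otherwise the query's embedding."""
    # Assumption: a whitespace-only query carries no semantic meaning either.
    if not query.strip():
        return None
    return embed(query)


def fake_embed(text: str) -> List[float]:
    """Hypothetical stand-in for a real embedding model."""
    return [float(len(text))]


empty_result = embed_query_or_none("", fake_embed)
normal_result = embed_query_or_none("hi", fake_embed)
```

Returning `None` instead of a placeholder vector lets the caller skip the similarity ranking entirely when there is nothing to rank against.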

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-27 13:55:16 -07:00
Erick Friis
38cee5fae0 cli updates 2 (#12447)
- extras group
- readme
- another readme

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-27 13:37:03 -07:00
Lance Martin
3afa68e30e Update AWS Bedrock README.md (#12451) 2023-10-27 13:21:54 -07:00
Lance Martin
5c564e62e1 AWS Bedrock RAG template (#12450) 2023-10-27 13:15:54 -07:00
William FH
5d40e36c75 Trace if run tree set (#12444)
This code path is hit in the following case:
- Start in langchain code and manually provide a tracer
- Handoff to the traceable
- Hand back to langchain code.

This happens when evaluating `@traceable` functions, unfortunately.
2023-10-27 12:29:18 -07:00
Bagatur
c2a0a6b6df make doc utils public (#12394) 2023-10-27 12:08:08 -07:00
Henter
d6888a90d0 Fix the missing temperature parameter for Baichuan-AI chat_model (#12420)
**Description:** the missing `temperature` parameter for Baichuan-AI
chat_model

Baichuan-AI api doc: https://platform.baichuan-ai.com/docs/api
2023-10-27 12:07:21 -07:00
Erick Friis
6908634428 cli updates oct27 (#12436) 2023-10-27 12:06:46 -07:00
Uxywannasleep
3fd9f2752f Fix Typo in clickhouse.ipynb file (#12429) 2023-10-27 11:55:15 -07:00
HwangJohn
d38c8369b3 added rrf argument in ApproxRetrievalStrategy class __init__() (#11987)
- **Description:** To support hybrid search with RRF (Reciprocal Rank
Fusion) in Elasticsearch, an `rrf` argument was added for adjusting
`rank_constant` and `window_size`, which control how multiple result sets
with different relevance indicators are combined into a single result set
(ref:
https://www.elastic.co/kr/blog/whats-new-elastic-enterprise-search-8-9-0),
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** No dependencies changed,
  - **Tag maintainer:** @baskaryan,
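
For readers unfamiliar with RRF, here is a minimal sketch of the fusion step itself. It is not the Elasticsearch implementation or the code in this PR: `reciprocal_rank_fusion` is a hypothetical helper, and the default values for `rank_constant` and `window_size` are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]],
    rank_constant: int = 60,
    window_size: int = 100,
) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Only the top `window_size` hits of each result set take part.
        for rank, doc_id in enumerate(ranking[:window_size], start=1):
            scores[doc_id] += 1.0 / (rank_constant + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["a", "b", "c"]  # e.g. a lexical result set
vector_hits = ["c", "a", "d"]   # e.g. a kNN result set
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

A larger `rank_constant` flattens the difference between high and low ranks, while `window_size` bounds how many hits from each result set are considered.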

Nice to meet you,
I'm new to contributing and this is my first PR.

I only changed the langchain/vectorstores/elasticsearch.py file.
I ran `make format` and `make lint`
and got this message:
```shell
make lint_diff  
./scripts/check_pydantic.sh .
./scripts/check_imports.sh
poetry run ruff .
[ "langchain/vectorstores/elasticsearch.py" = "" ] || poetry run black langchain/vectorstores/elasticsearch.py --check
All done!  🍰 
1 file would be left unchanged.
[ "langchain/vectorstores/elasticsearch.py" = "" ] || poetry run mypy langchain/vectorstores/elasticsearch.py
langchain/__init__.py: error: Source file found twice under different module names: "mvp.nlp.langchain.libs.langchain.langchain" and "langchain"
Found 1 error in 1 file (errors prevented further checking)
make: *** [lint_diff] Error 2
```

Thank you

---------

Co-authored-by: 황중원 <jwhwang@amorepacific.com>
2023-10-27 11:53:19 -07:00
Roman Vasilyev
2c58dca5f0 optional reusable connection (#12051)
My Postgres instance runs out of connections after continuous PGVector
usage because PGVector constantly creates new connections. Adding an
optional, reusable, pre-established connection appears to solve the issue.
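
The idea of an optional reusable connection can be sketched as below. This is an illustration only, using the standard-library `sqlite3` module in place of Postgres; `ExampleStore` and its `connection` parameter are hypothetical and do not reflect the PGVector API.

```python
import sqlite3
from typing import Optional


class ExampleStore:
    """Hypothetical store that can reuse a caller-supplied connection."""

    def __init__(self, connection: Optional[sqlite3.Connection] = None) -> None:
        # Open a new connection only when the caller did not supply one.
        self._owns_connection = connection is None
        self.connection = (
            sqlite3.connect(":memory:") if connection is None else connection
        )

    def close(self) -> None:
        # Never close a connection that belongs to the caller.
        if self._owns_connection:
            self.connection.close()


shared = sqlite3.connect(":memory:")
shared.execute("CREATE TABLE docs (body TEXT)")

writer = ExampleStore(connection=shared)
reader = ExampleStore(connection=shared)
writer.connection.execute("INSERT INTO docs VALUES ('hello')")
rows = reader.connection.execute("SELECT body FROM docs").fetchall()
```

Both stores share a single connection, so creating more store objects does not increase the number of open connections.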

---------

Co-authored-by: Roman Vasilyev <rvasilyev@mozilla.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-27 11:52:42 -07:00
Ennio Pastore
48fde2004f Update long_context_reorder.py (#12422)
The function comment was confusing and inaccurate
2023-10-27 11:52:28 -07:00
Bagatur
a8c68d4ffa Type LLMChain.llm as runnable (#12385) 2023-10-27 11:52:01 -07:00
Prakul
224ec0cfd3 Mongo db $vector search doc update (#12404)
**Description:** 
Updates the documentation for MongoDB Atlas Vector Search
2023-10-27 11:50:29 -07:00
Bagatur
d12b88557a Bagatur/bump 325 (#12440) 2023-10-27 11:49:09 -07:00
Eugene Yurtsev
cadfce295f Deprecate PythonRepl tools and Pandas/Xorbits/Spark DataFrame/Python/CSV agents (#12427)
See discussion here:
https://github.com/langchain-ai/langchain/discussions/11680

The code is available for usage from langchain_experimental. The reason
for the deprecation is that the agents are relying on a Python REPL. The
code can only be run safely with appropriate sandboxing.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-27 14:16:42 -04:00
Lance Martin
68e12d34a9 Add invoke example to LLaMA2 function template notebook (#12437) 2023-10-27 10:58:24 -07:00
Harrison Chase
0ca539eb85 Clean up deprecated agents and update __init__ in experimental (#12231)
Update init paths in experimental
2023-10-27 13:52:50 -04:00
Lance Martin
05bbf943f2 LLaMA2 with JSON schema support template (#12435) 2023-10-27 10:34:00 -07:00
Holt Skinner
134f085824 feat: Add Google Speech to Text API Document Loader (#12298)
- Add Document Loader for Google Speech to Text
  - Similar Structure to [Assembly AI Document Loader][1]

[1]:
https://python.langchain.com/docs/integrations/document_loaders/assemblyai
2023-10-27 09:34:26 -07:00
David Duong
52c194ec3a Fix templates typos (#12428) 2023-10-27 09:32:57 -07:00
Massimiliano Pronesti
c8195769f2 fix(openai-callback): completion count logic (#12383)
The changes introduced in #12267 and #12190 broke the cost computation
of the `completion` tokens for fine-tuned models because of the early
return. This PR aims at fixing this.
@baskaryan.
2023-10-27 09:08:54 -07:00
Stefan Langenbach
b22da81af8 Mask API key for Aleph Alpha LLM (#12377)
- **Description:** Add masking of API Key for Aleph Alpha LLM when
printed.
- **Issue**: #12165
- **Dependencies:** None
- **Tag maintainer:** @eyurtsev

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-27 11:32:43 -04:00
Lance Martin
d6acb3ed7e Clean-up template READMEs (#12403)
Normalize, and update notebooks.
2023-10-26 22:23:03 -07:00
William FH
4254028c52 Str Evaluator Mapper (#12401) 2023-10-26 21:38:47 -07:00
William FH
fcad1d2965 Add space (#12395) 2023-10-26 20:32:23 -07:00
William FH
922d7910ef Wfh/json schema evaluation (#12389)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2023-10-26 20:32:05 -07:00
Erick Friis
afcc12d99e Templates CI (#12313)
Adds a `langchain-location` param to lint, so we can properly locate it.

Regular langchain and experimental lint steps are passing, so default
value seems to be working.
2023-10-26 20:29:36 -07:00
Christian Kasim Loan
a35445c65f johnsnowlabs embeddings support (#11271)
- **Description:** Introducing the
[JohnSnowLabsEmbeddings](https://www.johnsnowlabs.com/)
  - **Dependencies:** johnsnowlabs
  - **Tag maintainer:** @C-K-Loan
- **Twitter handle:** https://twitter.com/JohnSnowLabs
https://twitter.com/ChristianKasimL

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-26 20:22:50 -07:00
SteveLiao
c08b622b2d Add HTML Title and Page Language into metadata for AsyncHtmlLoader (#11326)
**Description:** 
Revise `libs/langchain/langchain/document_loaders/async_html.py` to
store the HTML Title and Page Language in the `metadata` of
`AsyncHtmlLoader`.
2023-10-26 20:22:31 -07:00
Erick Friis
4b16601d33 Format Templates (#12396) 2023-10-26 19:44:30 -07:00
Shorthills AI
25c98dbba9 Fixed some grammatical and Exception types issues (#12015)
Fixed some grammatical issues and Exception types.

@baskaryan , @eyurtsev

---------

Co-authored-by: Sanskar Tanwar <142409040+SanskarTanwarShorthillsAI@users.noreply.github.com>
Co-authored-by: UpneetShorthillsAI <144228282+UpneetShorthillsAI@users.noreply.github.com>
Co-authored-by: HarshGuptaShorthillsAI <144897987+HarshGuptaShorthillsAI@users.noreply.github.com>
Co-authored-by: AdityaKalraShorthillsAI <143726711+AdityaKalraShorthillsAI@users.noreply.github.com>
Co-authored-by: SakshiShorthillsAI <144228183+SakshiShorthillsAI@users.noreply.github.com>
2023-10-26 21:12:38 -04:00
William FH
923696b664 Wfh/json edit dist (#12361)
Compare predicted json to reference. First canonicalize (sort keys, rm
whitespace separators), then return normalized string edit distance.

Not a silver bullet but maybe an easy way to capture structure
differences in a less flakey way
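
The two steps above can be sketched as follows. This is a minimal standard-library illustration, not the evaluator added in this PR: the function names are hypothetical, and normalizing by the length of the longer canonical string is an assumption.

```python
import json
from typing import Any


def canonicalize(obj: Any) -> str:
    """Serialize with sorted keys and no whitespace in the separators."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def json_edit_distance(prediction: str, reference: str) -> float:
    """Normalized edit distance between two JSON documents, in [0, 1]."""
    pred = canonicalize(json.loads(prediction))
    ref = canonicalize(json.loads(reference))
    longest = max(len(pred), len(ref))
    return levenshtein(pred, ref) / longest if longest else 0.0


same = json_edit_distance('{"b": 2, "a": 1}', '{"a":1,"b":2}')
different = json_edit_distance('{"a": 1}', '{"a": 2}')
```

Because both inputs are canonicalized first, key order and spacing do not affect the score; only differences in keys and values do.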

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2023-10-26 18:10:28 -07:00
Harrison Chase
56ee56736b add template for hyde (#12390) 2023-10-26 17:38:35 -07:00
Erick Friis
4db8d82c55 CLI CI 2 (#12387)
Will run all CI because of _test change, but future PRs against CLI will
only trigger the new CLI one

Has a bunch of file changes related to formatting/linting.

No mypy yet - coming soon
2023-10-26 17:01:31 -07:00
Tyler Hutcherson
231d553824 Update broken redis tests (#12371)
Update broken redis tests -- tiny PR :) 
- **Description:** Fixes Redis tests on master (look like it was broken
by https://github.com/langchain-ai/langchain/pull/11257)
  - **Issue:** None,
  - **Dependencies:** No
  - **Tag maintainer:** @baskaryan @Spartee 
  - **Twitter handle:** N/A

Co-authored-by: Sam Partee <sam.partee@redis.com>
2023-10-26 16:13:14 -07:00
Lance Martin
b8af5b0a8e Minor updates to ReRank template (#12388) 2023-10-26 16:05:17 -07:00
Bagatur
7cadf00570 better lint triggering (#12376) 2023-10-26 15:31:20 -07:00
Erick Friis
03e79e62c2 cli fix (#12380) 2023-10-26 15:29:49 -07:00
Lance Martin
237026c060 Cohere re-rank template (#12378) 2023-10-26 15:29:10 -07:00
Bagatur
76230d2c08 fireworks scheduled integration tests (#12373) 2023-10-26 14:24:42 -07:00
Josh Phillips
01c5cd365b Fix SupabaseVectorStore write operation timeout (#12318)
**Description**
This small change will make chunk_size a configurable parameter for
loading documents into a Supabase database.
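
A configurable chunk size typically means splitting the rows into batches before each write. The sketch below shows the batching step only: `iter_chunks` is a hypothetical helper and its default of 500 is an assumption, not a value taken from this PR.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_chunks(rows: Sequence[T], chunk_size: int = 500) -> Iterator[List[T]]:
    """Yield consecutive slices of `rows` holding at most `chunk_size` items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(rows), chunk_size):
        yield list(rows[start : start + chunk_size])


batch_sizes = [len(batch) for batch in iter_chunks(list(range(7)), chunk_size=3)]
```

Smaller batches keep each write request short, which is one way to stay under a server-side timeout.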

**Issue**
https://github.com/langchain-ai/langchain/issues/11422

**Dependencies**
No changes

**Twitter**
@ j1philli

**Reminder**
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.

---------

Co-authored-by: Greg Richardson <greg.nmr@gmail.com>
2023-10-26 14:19:17 -07:00
Bagatur
b10cefb160 lint fix: rm init (#12374) 2023-10-26 14:16:25 -07:00
William FH
f65067b1da Mention other function calling/grammar support (#12369)
In our extraction doc
2023-10-26 13:59:28 -07:00
Chris Lucas
e88fdbba29 Fix langsmith walkthrough doc dataset (#12027) 2023-10-26 13:57:15 -07:00
Jacob Lee
7e5e5e87d8 Adds linter in templates (#12321)
Did not actually run/fix errors yet @efriis
2023-10-26 13:55:07 -07:00
Harrison Chase
b43996e553 Harrison/improve cli (#12368) 2023-10-26 13:53:59 -07:00
Harrison Chase
9ce38726a2 fix some stuff (#12292)
Co-authored-by: Erick Friis <erick@langchain.dev>
2023-10-26 13:30:36 -07:00
Cynthia Yang
6ce276e099 Support Fireworks batching (#8) (#12052)
Description

* Add _generate and _agenerate to support Fireworks batching.
* Add stop words test cases
* Opt out retry mechanism

Issue - Not applicable
Dependencies - None
Tag maintainer - @baskaryan
2023-10-26 16:01:08 -04:00
Bagatur
3fbb2f3e52 update chains how to (#12362) 2023-10-26 12:21:03 -07:00
Tyler Hutcherson
2f0c9d8269 Fix redis vectorfield schema defaults (#12223)
- **Description:** refactors the redis vector field schema to properly
handle default values, includes a new unit test suite.
  - **Issue:** N/A
  - **Dependencies:** nothing new.
  - **Tag maintainer:** @baskaryan @Spartee 
  - **Twitter handle:** this is a tiny fix/improvement :) 

This issue was causing problems for some clients/customers when building
a vector index on Redis on smaller db instances (due to faulty default
values in the index configuration). It would raise an error like:

```redis.exceptions.ResponseError: Vector index initial capacity 20000 exceeded server limit (852 with the given parameters)```

This PR will address this moving forward.
2023-10-26 12:17:58 -07:00
Jakub Novák
9544d64ad8 E2B tool - Improve description with uploaded files info (#12355) 2023-10-26 11:44:24 -07:00
Bagatur
dad16af711 langserve doc (#12357) 2023-10-26 11:40:57 -07:00
Lance Martin
0af6e64ad9 Update multi query template README, ntbk (#12356) 2023-10-26 11:24:44 -07:00
Bagatur
f3449ccd20 Docs: Add lcel to combine_docs chains (#12310) 2023-10-26 11:05:36 -07:00
Lance Martin
bc6f6e968e Add template for Pinecone + Multi-Query (#12353) 2023-10-26 10:12:23 -07:00
Bagatur
c6a733802b bump 324 and 35 (#12352) 2023-10-26 10:10:26 -07:00
Nuno Campos
683e97766d Fix json key output parser in partial (streaming) mode (#12332)
2023-10-26 17:45:04 +01:00
Nikhil Jha
dff24285ea Comprehend Moderation 0.2 (#11730)
This PR replaces the previous `Intent` check with the new `Prompt
Safety` check. The logic and steps to enable chain moderation via the
Amazon Comprehend service, allowing you to detect and redact PII, Toxic,
and Prompt Safety information in the LLM prompt or answer remains
unchanged.
This implementation updates the code and configuration types with
respect to `Prompt Safety`.


### Usage sample

```python
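import boto3
from langchain.chains import LLMChain
from langchain.llms.fake import FakeListLLM
from langchain.prompts import PromptTemplate
from langchain_experimental.comprehend_moderation import (
    AmazonComprehendModerationChain,
    BaseModerationConfig,
    ModerationPiiConfig,
    ModerationPromptSafetyConfig,
    ModerationToxicityConfig,
)

# Assumes AWS credentials and a default region are already configured.
comprehend_client = boto3.client("comprehend")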

pii_config = ModerationPiiConfig(
    labels=["SSN"],
    redact=True,
    mask_character="X"
)

toxicity_config = ModerationToxicityConfig(
    threshold=0.5
)

prompt_safety_config = ModerationPromptSafetyConfig(
    threshold=0.5
)

moderation_config = BaseModerationConfig(
    filters=[pii_config, toxicity_config, prompt_safety_config]
)

comp_moderation_with_config = AmazonComprehendModerationChain(
    moderation_config=moderation_config, #specify the configuration
    client=comprehend_client,            #optionally pass the Boto3 Client
    verbose=True
)

template = """Question: {question}

Answer:"""

prompt = PromptTemplate(template=template, input_variables=["question"])

responses = [
    "Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like 323-22-9980. John Doe's phone number is (999)253-9876.", 
    "Final Answer: This is a really shitty way of constructing a birdhouse. This is fucking insane to think that any birds would actually create their motherfucking nests here."
]
llm = FakeListLLM(responses=responses)

llm_chain = LLMChain(prompt=prompt, llm=llm)

chain = ( 
    prompt 
    | comp_moderation_with_config 
    | {llm_chain.input_keys[0]: lambda x: x['output'] }  
    | llm_chain 
    | { "input": lambda x: x['text'] } 
    | comp_moderation_with_config 
)

try:
    response = chain.invoke({"question": "A sample SSN number looks like this 123-456-7890. Can you give me some more samples?"})
except Exception as e:
    print(str(e))
else:
    print(response['output'])

```

### Output

```text
> Entering new AmazonComprehendModerationChain chain...
Running AmazonComprehendModerationChain...
Running pii Validation...
Running toxicity Validation...
Running prompt safety Validation...

> Finished chain.


> Entering new AmazonComprehendModerationChain chain...
Running AmazonComprehendModerationChain...
Running pii Validation...
Running toxicity Validation...
Running prompt safety Validation...

> Finished chain.
Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like XXXXXXXXXXXX John Doe's phone number is (999)253-9876.
```

---------

Co-authored-by: Jha <nikjha@amazon.com>
Co-authored-by: Anjan Biswas <anjanavb@amazon.com>
Co-authored-by: Anjan Biswas <84933469+anjanvb@users.noreply.github.com>
2023-10-26 09:42:18 -07:00
Blake (Yung Cher Ho)
b9410f2b6f Takeoff pro support (#12070)
**Description:**
This PR adds support for the [Pro version of Titan Takeoff
Server](https://docs.titanml.co/docs/category/pro-features). Users of
the Pro version will have to import the TitanTakeoffPro model, which is
different from TitanTakeoff.

**Issue:**
Also minor fixes to docs for Titan Takeoff (Community version)

**Dependencies:**
No additional dependencies

 **Twitter handle:** @becoming_blake

@baskaryan @hwchase17
2023-10-26 09:39:32 -07:00
Leonid Kuligin
4e47fe1dce fixed error message and a check for processor name (#12200)
- **Description:** a small fix on error description / a check for
processor name
  - **Issue:** the issue #11407
2023-10-26 09:38:25 -07:00
Nir Kopler
9298aff783 Finetuned openai azure models cost calculation (#12267)
**Description:**
Add cost calculation for fine-tuned **Azure** models, with relevant unit
tests. See
https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/fine-tuning?tabs=turbo&pivots=programming-language-studio
for more information.
This PR is a follow-up to:
https://github.com/langchain-ai/langchain/pull/12190

Twitter handle: @nirkopler
2023-10-26 09:38:10 -07:00
Ken
3c168d4d2a Update code_understanding.ipynb (#12309)
- **Description:** Super simple fix for colab link on
code_understanding.ipynb,
  - **Issue:** not applicable
  - **Dependencies:** none,
  - **Tag maintainer:** ,
  - **Twitter handle:** @kengoodridge
2023-10-26 09:35:38 -07:00
Season Saw
4e4b8805d6 Fix a typo in the summarization use case. (#12316)
- **Description:** Fix a tiny typo in the summarization use case Jupyter
notebook.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Tag maintainer:** @hwchase17
  - **Twitter handle:** @seasonsaw
2023-10-26 09:35:11 -07:00
gnakw
20fe515f20 Fix the exception from langchain.utilities import ArceeWrapper (#12342)
- **Description:** Fix the exception from langchain.utilities import
ArceeWrapper
2023-10-26 09:19:43 -07:00
ZC Wong
374f4cd2bf fix typo (#12338)
fixed a typo in docs/docs/integrations/toolkits/github.ipynb
2023-10-26 09:18:47 -07:00
Qihui Xie
6720458c7d add allowed_operators property in QdrantTranslator (#12328)
- **Description:** 
This PR adds the `allowed_operators` property to `QdrantTranslator` to fix
the `TypeError: can only join an iterable` bug. This property is
required by `get_query_constructor_prompt` in
`query_constructor\base.py`:
```
allowed_operators=" | ".join(allowed_operators),
```
  - **Issue:** 
#12061
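
The failure mode can be reproduced in isolation. In this minimal sketch, `render_operators` is a hypothetical stand-in for the `join` call quoted above; it shows that joining `None` raises the reported `TypeError`, while a list of operator names joins cleanly.

```python
from typing import Optional, Sequence


def render_operators(allowed_operators: Optional[Sequence[str]]) -> str:
    """Join operator names the way the prompt builder does."""
    return " | ".join(allowed_operators)


try:
    # A translator without the property effectively passes None here.
    render_operators(None)
except TypeError as exc:
    error_message = str(exc)

rendered = render_operators(["and", "or", "not"])
```

Exposing the property on the translator means the prompt builder always receives an iterable of operator names.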

---------

Co-authored-by: XIE Qihui <qihui.xie@bopufund.com>
2023-10-26 09:18:29 -07:00
Bagatur
f5a57fc1ef fix self query constructor (#12349) 2023-10-26 09:18:15 -07:00
Laurent AJDNIK
f05c29180d Fix typos in quickstart.mdx (#12333)
- **Description:** Fixes a few typos in quickstart.mdx
2023-10-26 09:14:49 -07:00
Kishan Kumar Rai
cae6f611d3 Fix Typo in CONTRIBUTING.md (#12320)
I have corrected the typos, grammar, and formatting issues.
2023-10-26 08:56:28 -07:00
Vasek Mlejnsky
cdd75b687e e2b tool - fix initialization and improve tool description (#12345) 2023-10-26 08:47:50 -07:00
Harrison Chase
8ec7aade9f add docs for templates (#12346) 2023-10-26 08:28:01 -07:00
Jacob Lee
28c39503eb Allow index name customization via env var in rag-conversation (#12315) 2023-10-25 22:11:13 -07:00
Leonid Ganeline
869a49a0ab removed CardLists for LLMs and ChatModels (#12307)
Problem statement: 
In the `integrations/llms` and `integrations/chat` pages, we have a
sidebar with ToC, and we also have a ToC at the end of the page.
The ToC at the end of the page is not necessary, and it is confusing
when we mix the index page styles; moreover, it requires manual work.
So, I removed ToC at the end of the page (it was discussed with and
approved by @baskaryan)
2023-10-25 19:13:44 -07:00
Erick Friis
ebf998acb6 Templates (#12294)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Jacob Lee <jacoblee93@gmail.com>
2023-10-25 18:47:42 -07:00
Erick Friis
43257a295c CLI Git Improvements (#12311)
- delete repo sources like pip
- git dep fixes
- error messaging
2023-10-25 18:30:02 -07:00
William FH
1d568e1add Better wrap traceable (#12303)
If user function is wrapped as a traceable function, this will help hand
off the trace between the two.

Also update handling fields to reflect optional values
2023-10-25 16:34:23 -07:00
Eugene Yurtsev
5a71b81609 Relax type annotation for custom input/output types (#12300)
This is needed to be able to do stuff like:

```python
runnable.with_types(input_type=List[str])
```
2023-10-25 19:00:22 -04:00
William FH
988f6d9912 Rm langchain server (#12305) 2023-10-25 15:26:46 -07:00
wemysschen
3f16acc538 Add baidu cloud vector search in vectorstore and fix some unit test in vectorstores (#11605)
**Description:** 
Add baidu cloud vector search in vectorstore

---------

Co-authored-by: root <root@icoding-cwx.bcc-szzj.baidu.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-25 13:44:19 -07:00
mrbean
b7e559c7e1 use snippet search optionally (#12236)
Add an additional flag which allows for hitting our new endpoint.
2023-10-25 13:37:28 -07:00
felixocker
cce132d146 fix sparql queries for relations in schema description (#9136)
- **Description**: Fix for the SPARQL QA chain: fixed SPARQL queries for
retrieving information about relations in the graph to create a textual
description of the schema for the language model. This should resolve
#8907
- **Issue**: #8907
- **Dependencies**: None
- **Tag maintainer**: @baskaryan, @hwchase17
2023-10-25 13:36:57 -07:00
Donato Azevedo
d9f1bcf366 Strips leading/trailing whitespace before parsing xml (#12297)
**Description:** When LLMs output leading or trailing whitespace around XML
(when using XMLOutputParser), the parser would raise a `ValueError: Could
not parse output: ...`. However, leading or trailing whitespace is
"ignorable" in the sense of the XML standard.

**Issue:** I did not find an issue related.

**Dependencies:** None

**Tag maintainer:**

**Twitter handle:** donatoaz

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

Done, updated unit test and ran `make docker_test`.
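
The whitespace fix described in this PR can be illustrated with the standard library. This is a sketch, not the XMLOutputParser code: `parse_xml_output` is a hypothetical helper. With `xml.etree.ElementTree`, an XML declaration preceded by whitespace is typically rejected with a `ParseError`, so the text is stripped before parsing.

```python
import xml.etree.ElementTree as ET


def parse_xml_output(text: str) -> ET.Element:
    """Strip surrounding whitespace, then parse the XML document."""
    return ET.fromstring(text.strip())


# Model output with whitespace before the declaration and after the root.
raw = "\n  <?xml version='1.0'?><answer><city>Paris</city></answer>  \n"
root = parse_xml_output(raw)
city = root.find("city").text
```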
2023-10-25 13:34:58 -07:00
Rohan Sharma
3da1a65fa0 Update README.md (#12286) 2023-10-25 12:59:30 -07:00
Bagatur
ab3c124ffb Add dev guide to docs(#12291)
copy CONTRIBUTING.md to docs
2023-10-25 12:28:43 -07:00
Bagatur
aa212c3d0e rm .html from local doc links (#12293) 2023-10-25 12:09:41 -07:00
Silva
04d58018e1 Update vectorstore.mdx[Make an improvement] (#12252)
correct some grammatical errors
2023-10-25 12:00:53 -07:00
Bagatur
3d74d5e24d chat loader doc titles (#12289) 2023-10-25 11:47:50 -07:00
Erick Friis
47070b8314 CLI (#12284)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-10-25 11:06:58 -07:00
Shwu Ku
07c2649753 response parser for ArceeRetriever (#12270)
- **Description:** Response parser for arcee retriever, 
- **Issue:** follow-up pr on #11578 and
[discussion](https://github.com/arcee-ai/arcee-python/issues/15#issuecomment-1759874053),
  - **Dependencies:** NA

This PR implements a parser for the response from ArceeRetriever to
convert it to a langchain `Document`. This closes the loop of generation
and retrieval for Arcee DALMs in langchain.

The reference for the response parser is
[api-docs:retrieve](https://api.arcee.ai/docs#/v2/retrieve_model)

Attaching screenshot of working implementation:
<img width="1984" alt="Screenshot 2023-10-25 at 7 42 34 PM"
src="https://github.com/langchain-ai/langchain/assets/65639964/026987b9-34b2-4e4b-b87d-69fcd0c6641a">
\*api key deleted

---
Successful tests, lints, etc.
```shell
Re-run pytest with --snapshot-update to delete unused snapshots.
==================================================================================================================== slowest 5 durations =====================================================================================================================
1.56s call     tests/unit_tests/schema/runnable/test_runnable.py::test_retrying
0.63s call     tests/unit_tests/schema/runnable/test_runnable.py::test_map_astream
0.33s call     tests/unit_tests/schema/runnable/test_runnable.py::test_map_stream_iterator_input
0.30s call     tests/unit_tests/schema/runnable/test_runnable.py::test_map_astream_iterator_input
0.20s call     tests/unit_tests/indexes/test_indexing.py::test_cleanup_with_different_batchsize
======================================================================================================= 1265 passed, 270 skipped, 32 warnings in 6.55s =======================================================================================================
[ "." = "" ] || poetry run black .
All done!  🍰 
1871 files left unchanged.
[ "." = "" ] || poetry run ruff --select I --fix .
./scripts/check_pydantic.sh .
./scripts/check_imports.sh
poetry run ruff .
[ "." = "" ] || poetry run black . --check
All done!  🍰 
1871 files would be left unchanged.
[ "." = "" ] || poetry run mypy .
Success: no issues found in 1868 source files
poetry run codespell --toml pyproject.toml
poetry run codespell --toml pyproject.toml -w
```

Co-authored-by: Shubham Kushwaha <shwu@Shubhams-MacBook-Pro.local>
2023-10-25 10:55:13 -07:00
Johanna Appel
c26ec7789f CohereEmbeddings: Add max_retries and request_timeout (#12275)
Add max_retries and request_timeout to CohereEmbeddings, akin to how it
works in OpenAIEmbeddings.

Since the Cohere client already implements these parameters, we can
simply pass them down.

Uses parameters from these two cohere client objects:

https://github.com/cohere-ai/cohere-python/blob/main/cohere/client.py

https://github.com/cohere-ai/cohere-python/blob/main/cohere/client_async.py
2023-10-25 10:37:25 -07:00
Nuno Campos
7108084947 Remove CLI (#12283)
2023-10-25 10:33:52 -07:00
Nuno Campos
b5b2d07681 Pop max concurrency when recursing (#12281)
2023-10-25 18:03:58 +01:00
Bagatur
69f4e402e4 bump 323 (#12278) 2023-10-25 09:06:12 -07:00
David Duong
c25b174db5 Add serialisation props to Fireworks and ChatFireworks (#12255) 2023-10-25 11:41:33 +01:00
Richard Adams
fd5f549a9e demonstrate use of RetrievalQAWithSourcesChain.from_chain (#12235)
**Description:** 
Documents further usage of RetrievalQAWithSourcesChain in an existing
test. I'd not found much documented usage of RetrievalQAWithSourcesChain
and how to get the sources out. This additional code will hopefully be
useful to other potential users of this retriever.

 **Issue:** No raised issue
 
**Dependencies:** No new dependencies needed to run the test (it already
needs `open-ai`, `faiss-cpu` and `unstructured`).

Note - `make lint` showed 8 linting errors  in unrelated files

---------

Co-authored-by: richarda23 <richard.c.adams@infinityworks.com>
2023-10-24 21:33:34 -07:00
James Braza
53f35c5f5c Adding STRUCTURED_FORMAT_SIMPLE_INSTRUCTIONS missing backticks (#12238)
This PR fixes the fact that `STRUCTURED_FORMAT_SIMPLE_INSTRUCTIONS` was
missing backticks at the end
2023-10-24 21:30:25 -07:00
Adam Ji
9fc28d50c3 fix: typo in pgvector.ipynb (#12243)
fix: typo in docs/docs/integrations/vectorstores/pgvector.ipynb
2023-10-24 21:26:44 -07:00
William FH
276c6ba115 Check for ls project in run tree context (#12242)
If I go traceable -> runnable when the project is manually specified,
the runnable won't be logged. This makes sure the session/project is
threaded through appropriately.
2023-10-24 17:18:59 -07:00
Vasek Mlejnsky
1f8094938f Integrate E2B's data analysis/code interpreter (#12011)
This PR adds [E2B's](https://e2b.dev/) data analysis/code interpreter
sandbox as a tool.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Jakub Novak <jakub@e2b.dev>
2023-10-24 16:04:02 -07:00
Bagatur
d2cb95c39d Docs: add lcel to sequential chain (#12234) 2023-10-24 15:15:35 -07:00
Holt Skinner
e7e670805c docs: Google Cloud Documentation Cleanup (#12224)
- Move Document AI provider to the Google provider page
- Change Vertex AI Matching Engine to Vector Search
- Change references from GCP to Google Cloud
- Add Gmail chat loader to Google provider page
- Change Serper page title to "Serper - Google Search API" since it is
not a Google product.
2023-10-24 14:54:43 -07:00
Bagatur
286a29a49e bump 322 and 34 (#12228) 2023-10-24 13:52:17 -07:00
Bagatur
2008a6438c add experimental test release gha (#12229) 2023-10-24 13:49:16 -07:00
Eugene Yurtsev
583dc49477 Add type to Generation and sub-classes, handle root validator (#12220)
* Add a type literal for the generation and sub-classes for serialization purposes.
* Fix the root validator of ChatGeneration to raise ValueError instead of KeyError or AttributeError if initialized improperly.
* This change is done for langserve to make sure that llm related callbacks can be serialized/deserialized properly.
2023-10-24 16:21:00 -04:00
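The two ideas in "Add type to Generation and sub-classes, handle root validator (#12220)" can be illustrated with a minimal sketch: a literal `type` tag that survives serialization, and a loader that reports bad input as `ValueError` rather than `KeyError`. This is not LangChain's code; it uses plain dataclasses instead of pydantic, and `GenerationSketch`, `ChatGenerationSketch`, `_REGISTRY`, and `load` are hypothetical names.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Literal


@dataclass
class GenerationSketch:
    """Hypothetical stand-in for a generation result."""
    text: str
    # The literal tag is written out on serialization and read back on load.
    type: Literal["Generation"] = "Generation"


@dataclass
class ChatGenerationSketch(GenerationSketch):
    """Hypothetical subclass that only changes the tag."""
    type: Literal["ChatGeneration"] = "ChatGeneration"  # type: ignore[assignment]


# Maps each tag to the class that should be rebuilt from it.
_REGISTRY = {
    "Generation": GenerationSketch,
    "ChatGeneration": ChatGenerationSketch,
}


def load(payload: Dict[str, Any]) -> GenerationSketch:
    """Rebuild an object from a dict, raising ValueError on malformed input."""
    try:
        cls = _REGISTRY[payload["type"]]
        text = payload["text"]
    except KeyError as exc:
        # Report a missing or unknown key as a validation error.
        raise ValueError(f"malformed payload: {payload!r}") from exc
    return cls(text=text)


serialized = asdict(ChatGenerationSketch(text="hi"))
restored = load(serialized)
```

Without the tag, the serialized forms of the two classes would be indistinguishable, which is why a discriminator of this kind helps when objects cross a process boundary.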
Eugene Yurtsev
81052ee18e Fix code block in runnable doc (#12221)
Fix code block syntax in runnable doc-string
2023-10-24 16:11:58 -04:00
Mikelarg
46e28b9613 Added GigaChat chat model support (#12201)
- **Description:** Added integration with
[GigaChat](https://developers.sber.ru/portal/products/gigachat) language
model.
- **Twitter handle:** @dvoshansky
2023-10-24 12:53:51 -07:00
Dayuan Jiang
9c2c9c5274 fix typo in langchain/cookbook/stepback-qa.ipynb (#12204) 2023-10-24 12:51:51 -07:00
Bagatur
87af2360df mv old integration docs (#12217) 2023-10-24 12:38:16 -07:00
Bagatur
6e3f39963f Docs: consolidate top nav (#12219) 2023-10-24 12:28:08 -07:00
Anurag Wagh
d5c2ce7c2e [fix] create redis vector index before adding docs, add prefix to doc… (#11257)
Fix Description: 
In the `add_texts` method of the Redis vector integration, two issues
led to this bug:
1. The vector index was not being created, leading to a no such_index error.
2. The `doc:index` prefix was also missing from the Redis keys.

resolves #11197 
Maintainer: @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-24 10:51:25 -07:00
Eugene Yurtsev
079d1f3b8e Expose handle_event and ahandle_events as public API (#12181)
Expose functionality to handle generic events.
2023-10-24 13:42:28 -04:00
William FH
67c4fd0ad0 Update deprecation (#12178)
in runner_utils
2023-10-24 10:37:28 -07:00
Nir Kopler
d3744175bf Finetuned OpenAI models cost calculation #11715 (#12190)
**Description:**
Add cost calculation for fine-tuned models (new and legacy). This is
required after OpenAI added new models for fine-tuning and separated the
I/O costs for fine-tuned models.
Also updated the relevant unit tests.
See https://platform.openai.com/docs/guides/fine-tuning for more
information.
issue: https://github.com/langchain-ai/langchain/issues/11715

  - **Issue:** 11715
  - **Twitter handle:** @nirkopler
2023-10-24 10:22:05 -07:00
Spyros
a2840a2b42 fix vertexai codey models (#12173)
**Description:**

This PR fixes issue #12156 by checking for Codey models appropriately
before result parsing.


Maintainer: @hwchase17 , @agola11
2023-10-24 10:20:05 -07:00
Leonid Ganeline
386ea48432 updated integrations/providers/microsoft (#12177)
Added several missing tools, utilities, and toolkits to the `Microsoft` page.
2023-10-24 10:19:06 -07:00
Hech
d76f026d72 Fix flexible dimension and doc for DingoDB (#12187) 2023-10-24 10:16:19 -07:00
Erick Friis
95ae40ff90 Fix Anthropic Functions ainvoke (#12215)
Removes the custom `NotImplementedError` in experimental anthropic
functions, allowing it to fall back on the default `ainvoke` implementation.
2023-10-24 10:07:01 -07:00
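Why removing an override restores async support, as in "Fix Anthropic Functions ainvoke (#12215)", can be shown with a small sketch. It assumes the default async path runs the synchronous `invoke` in a worker thread; the class names `BaseRunnableSketch` and `UppercaseSketch` are hypothetical and are not LangChain classes.

```python
import asyncio
from typing import Any


class BaseRunnableSketch:
    """Hypothetical base class with a default async implementation."""

    def invoke(self, value: Any) -> Any:
        raise NotImplementedError

    async def ainvoke(self, value: Any) -> Any:
        # Default async path: run the synchronous invoke in a worker thread.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self.invoke, value)


class UppercaseSketch(BaseRunnableSketch):
    """Defines only invoke, so the inherited ainvoke is used."""

    def invoke(self, value: Any) -> Any:
        return str(value).upper()


result = asyncio.run(UppercaseSketch().ainvoke("hello"))
```

A subclass that overrides `ainvoke` only to raise `NotImplementedError` hides this default, so deleting the override is enough to make the async call work.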
Iskren Ivov Chernev
d5d7ba582a Improvements to llm/deepinfra (#10846)
- replace `requests` package with `langchain.requests`
- add `_acall` support
- add `_stream` and `_astream`
- freshen up the documentation a bit
- update vendor doc
2023-10-24 09:54:23 -07:00
sudranga
f09f82541b Expose configuration options in GraphCypherQAChain (#12159)
Allows passing arguments into the LLM chains used by the
GraphCypherQAChain. This addresses a user request to include
memory in the Cypher-generating chain. The prompt variables are kept
as-is for backward compatibility, but it would be a good idea to deprecate
them in favor of the **kwargs variables. Added a test case.

In general, I think it would be good for any chain to automatically pass
a read-only memory (of its input) to its subchains while allowing for
an override. But this would be a different change.
2023-10-24 09:52:55 -07:00
Leonid Ganeline
11f13aed53 docstrings update (#12093)
Added missing docstrings. Added missing `Args:`, `Returns:`, and `Raises:` sections.
2023-10-24 09:34:10 -07:00
Johnny Oshika
ba20c14e28 Fix typo in stuff_prompt's system_template (#12063)
- **Description:** 

Add missing apostrophe in `user's` in stuff_prompt's system_template.
The first sentence in the system template went from:

> Use the following pieces of context to answer the users question.

to

> Use the following pieces of context to answer the user's question.

- **Issue:** 
- **Dependencies:** none
- **Tag maintainer:** @baskaryan
- **Twitter handle:** ojohnnyo
2023-10-24 09:21:28 -07:00
Bagatur
deb8168329 fix note callout (#12214) 2023-10-24 09:17:18 -07:00
Bagatur
8ba97cb408 separate compile integration tests (#12171)
Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>
2023-10-24 08:55:19 -07:00
Bagatur
44dae6936b Docs: Add LCEL to chains/foundational/llm (#12213) 2023-10-24 08:53:55 -07:00
Bagatur
922193475a Docs: Add LCEL to chains/foundational/transform (#12212) 2023-10-24 08:52:47 -07:00
Bagatur
55f0f8dae8 Docs: add LCEL to chains/foundational/router (#12211) 2023-10-24 08:51:12 -07:00
Holt Skinner
69d9eae5cd feat: Add Client Info to available Google Cloud Clients (#12168)
- This is used internally to gather aggregate usage metrics for the
LangChain integrations

- Note: This cannot be added to some of the Vertex AI integrations at
this time because the SDK doesn't allow overriding the
[`ClientInfo`](https://googleapis.dev/python/google-api-core/latest/client_info.html#module-google.api_core.client_info)

- Added to:
  - BigQuery
  - Google Cloud Storage
  - Document AI
  - Vertex AI Model Garden
  - Document AI Warehouse
  - Vertex AI Search
  - Vertex AI Matching Engine (Cloud Storage Client)
 
@baskaryan, @eyurtsev, @hwchase17

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-24 08:49:11 -07:00
Lukas Wolf
69f5f82804 Update extraction.py (#12207)
Description: Pass tags as argument to create_extraction_chain
Issue: create_extraction_chain does not pass tags to chain yet 

@baskaryan
2023-10-24 08:25:14 -07:00
Nuno Campos
34ffb94770 Remove GetLocal, PutLocal (#12133)
Do you agree?
2023-10-24 10:16:46 +01:00
Eric Hartford
8c150ad7f6 Add COBOL parser and splitter (#11674)
- **Description:** Add COBOL parser and splitter
  - **Issue:** n/a
  - **Dependencies:** n/a
  - **Tag maintainer:** @baskaryan 
  - **Twitter handle:** erhartford

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-23 15:44:31 -04:00
Ikko Eltociear Ashimine
bb137fd6e7 Fix typo in jsonformer_experimental.ipynb (#12099)
HuggingFace -> Hugging Face
2023-10-23 15:35:54 -04:00
Eugene Yurtsev
ace2234391 Update security.md (#11942)
Update security.md
2023-10-23 15:35:33 -04:00
John Mai
ebf749c40c Baichuan & Hunyuan set default api_base (#12059)
### Description
Baichuan & Hunyuan set default api_base env
2023-10-23 15:33:35 -04:00
Priyanshu Prajapati
283a3ecc9c Create CODE_OF_CONDUCT.md (#12105)
The repository had no CODE_OF_CONDUCT.md file, which repositories with
large communities generally include.

- **Description:** Added a `code_of_conduct.md` file to the repository
to establish community standards and guidelines for contributors.
- **Issue:** N/A
- **Dependencies:** N/A
- **Tag maintainer:** N/A

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-23 15:15:24 -04:00
Shilong Dai
99afc1b4f8 Fixed hardcoded "vector" and replaced with vector_query_field variable (#12126)
- **Description:** In the max_marginal_relevance_search function of the
ElasticsearchStore vector store, the name of the field corresponding to
the vector embedding of the document is hard coded in the delete
statement that drops the field from the document metadata. This results
in an exception if the vector embedding field is customized. This PR
changes the hard-coded "vector" into the vector_query_field variable.
  - **Issue:** None
  - **Dependencies:** None
  - **Tag maintainer:** @hwchase17

Co-authored-by: Shilong Dai <sdai@viperfish.net>
2023-10-23 15:08:55 -04:00
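The fix described in "Fixed hardcoded "vector" and replaced with vector_query_field variable (#12126)" amounts to dropping the embedding field by its configured name rather than by a literal. A minimal sketch, assuming a search hit is a plain dict; `strip_vector_field` is a hypothetical helper, not the function changed in the commit:

```python
from typing import Any, Dict


def strip_vector_field(
    source: Dict[str, Any], vector_query_field: str = "vector"
) -> Dict[str, Any]:
    """Return a copy of a hit's source without the embedding field."""
    metadata = dict(source)
    # Use the configured field name; pop with a default avoids a KeyError
    # when the field is absent.
    metadata.pop(vector_query_field, None)
    return metadata


hit = {"text": "hello", "my_embedding": [0.1, 0.2], "author": "a"}
cleaned = strip_vector_field(hit, vector_query_field="my_embedding")
```

With a hard-coded `del source["vector"]`, the same hit would raise `KeyError`, because its embedding is stored under `my_embedding`.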
Vikram Shitole
0d44746430 10634: Added the capability to inject boto3 client in SagemakerEndpointEmbeddings (#12146)
**Description:** Allow injecting a boto3 client into
SagemakerEndpointEmbeddings for cross-account access scenarios, and update
the documentation in the sample notebook accordingly.

**Issue:** SagemakerEndpointEmbeddings cross account capability #10634
#10184

Dependencies: None
Tag maintainer:
Twitter handle:lethargicoder

Co-authored-by: Vikram(VS) <vssht@amazon.com>
2023-10-23 15:08:26 -04:00
Deepanshu
ff79a99825 Fix Typo in CONTRIBUTING.md file (#12145)
Fix typo & add suitable pronoun in CONTRIBUTING.md file


Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-23 14:53:03 -04:00
aubin_mzt
66f8cb015d Add connection args for pgvector vector store (#11930)
- **Description:** The sqlalchemy create_engine() call did not take into
account connect_args, which are mandatory for managed PGSQL instances on
cloud providers (ssl_context, for example).
Also re-enabled create_vector_extension at post_init so that the pgvector
class can be used seamlessly.
- **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17.

---------

Co-authored-by: Sami Bargaoui <bargaoui.sam@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-23 14:43:44 -04:00
NuODaniel
4d6243fa87 fix: doc string of default params in chat_models, llm qianfan (#12153)
- **Description:** a fix of the doc string in Qianfan
  - **Issue:** no
  - **Dependencies:** no
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** no
2023-10-23 14:03:18 -04:00
Predrag Gruevski
f82bdf4613 Update deprecated langchain imports with suggested new paths. (#12164)
Let's help our users find the proper import to use instead of the
deprecated top-level ones.
2023-10-23 13:52:08 -04:00
Bagatur
963ff93476 bump 321 (#12161) 2023-10-23 12:49:38 -04:00
Nuno Campos
d0505c0d47 Update default recursion_limit, update docs (#12134)
2023-10-23 16:29:17 +01:00
William FH
4f23aa677a Fix Pickle Error (#12141)
If non-pickleable objects (like locks) get passed to the tracing
callback, they'll fail in the deepcopy. Fall back to a shallow copy in
these instances.
2023-10-23 08:22:47 -07:00
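The fallback described in "Fix Pickle Error (#12141)" can be sketched as follows. Only the deepcopy-then-shallow-copy idea comes from the commit message; the helper name `copy_for_tracing` is hypothetical.

```python
import copy
import threading
from typing import Any


def copy_for_tracing(obj: Any) -> Any:
    """Deep-copy obj, falling back to a shallow copy if that fails."""
    try:
        return copy.deepcopy(obj)
    except Exception:
        # Objects such as locks cannot be deep-copied, so copy only
        # the outer container and share the inner objects.
        return copy.copy(obj)


inputs = {"question": "hi", "lock": threading.Lock()}
copied = copy_for_tracing(inputs)
```

The trade-off is that a shallow copy shares its nested objects with the original, so later mutations of those objects are visible through both.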
Predrag Gruevski
95a1b598fe Update to actions/checkout@v4. (#11951)
We don't use any of the new functionality at the moment. Just making
sure we don't fall behind on versions and fail to benefit from new
patches. This is an easy upgrade, and it's always harder to upgrade
across multiple major versions at once.
2023-10-23 10:01:33 -04:00
William FH
7c4f340cc0 Include Parent Run ID (#12139)
If you set local callbacks
2023-10-22 17:19:11 -07:00
Sanyam Jain
3df0f03928 Improved readability of Docs (#12136)
  - **Description:** Improved grammar and readability of the docs.
 
@hwchase17
2023-10-22 17:16:30 -07:00
omahs
f3cc9bba5b Fix typos (#12128)
Fix typos
2023-10-22 17:16:03 -07:00
Nuno Campos
1afdb40b48 Add optional config arg to RunnablePassthrough func arg (#12131)
2023-10-22 19:57:16 +01:00
Nuno Campos
325fdde8b4 Fix bug where types were lost when calling with_config or bind (#12137)
2023-10-22 19:26:13 +01:00
Nuno Campos
2719e49718 Add how-to guide on runnable generators (#12135)
2023-10-22 19:02:17 +01:00
Nuno Campos
02dce74b97 Fix type hint for older py versions (#12132)
2023-10-22 18:01:09 +01:00
Nuno Campos
d0ce374731 Allow specifying custom input/output schemas for runnables with .with_types() (#12083)
2023-10-22 17:26:48 +01:00
Harrison Chase
6fcba975d0 add rag fusion notebook (#12121) 2023-10-21 15:37:11 -07:00
Harrison Chase
dd0374560a fix up notebook (#12119) 2023-10-21 14:06:16 -07:00
Harrison Chase
ee69116761 move csv agent to langchain experimental (#12113) 2023-10-21 10:26:02 -07:00
Harrison Chase
03bf6ef473 add missing init files (#12114) 2023-10-21 10:25:50 -07:00
Harrison Chase
acb82cf25e add step back notebook (#11953) 2023-10-21 10:05:52 -07:00
Harrison Chase
9d9198de0b rewrite (#12111) 2023-10-21 09:31:10 -07:00
Bagatur
ef8b180d6d bump 320 (#12108) 2023-10-21 11:52:52 -04:00
Rotem Weiss
c4f8fefe74 Update Tavily API key link (#12109)
fix broken link to generate tavily api key
2023-10-21 11:44:57 -04:00
Rotem Weiss
78d186fb44 Add Tavily Search API as a Tool (#12103)
Adding Tavily Search API as a tool. I will be the maintainer, and
assaf_elovic is the Twitter handle.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-21 11:23:21 -04:00
Bagatur
85302a9ec1 Add CI check that integration tests compile (#12090) 2023-10-21 10:52:18 -04:00
verlocks
5dbe456aae Bug fix tongyi.py to be compatible with DashScope API (#11956)
The current ChatTongyi is not compatible with the DashScope API, which
causes an error when passing the API key to the chat model directly.
- **Description:** Update tongyi.py to be compatible with DashScope API.
Specifically, update parameter name "dashscope_api_key" to "api_key".
  - **Issue:** None.
- **Dependencies:** Nothing new, Tongyi would require DashScope as
before.
2023-10-20 18:46:41 -04:00
Abhay Kaushik
39f65fb1c9 Fix typos in whatsapp.ipynb and telegram.ipynb (#12075)
- **Description:** 
    - Replace Telegram with Whatsapp in whatsapp.ipynb
    - Add # to mark the telegram as heading in telegram.ipynb
 
  - **Issue:** None
  - **Dependencies:** None
2023-10-20 18:45:33 -04:00
Tomaz Bratanic
82f4c0589c Add neo4j graph environment variables (#12080) 2023-10-20 14:43:01 -07:00
Mohammad Mohtashim
d5400f6502 Google Scholar Search Tool using serpapi (#11513)
- **Description:** Implementing the Google Scholar Tool as requested in
PR #11505. The tool will be using the [serpapi python
package](https://serpapi.com/integrations/python#search-google-scholar).
The main idea of the tool will be to return the results from a Google
Scholar search given a query as an input to the tool.

- **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17
2023-10-20 17:35:55 -04:00
Ofer Mendelevitch
e542bf1b6b Minor update to doc/text in IPYNB example (#12089)
- **Description:** changed sign-up link in IPYNB example
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** @ofermend
2023-10-20 17:17:36 -04:00
Shreyas S
2e8637da2f Minor typo fix (#11804)
remove redundant a
langchain > LangChain
2023-10-20 17:11:53 -04:00
Shinya Maeda
89bc73c6c3 Fix superfluous Auto-fixing parser documents (#12062)
- **Description:** Fix superfluous [Auto-fixing
parser](https://python.langchain.com/docs/modules/model_io/output_parsers/output_fixing_parser)
docs. Also switching to `langchain.pydantic_v1` from the direct
reference to `pydantic`,
  - **Issue:** N/A,
  - **Dependencies:** N/A,
  - **Twitter handle:** @dosuken123
2023-10-20 16:07:03 -04:00
Holt Skinner
f5be2d525a fix: Add _serving_config property to GoogleVertexAISearchRetriever (#12084)
- Fixes error:

```
ValueError: "GoogleVertexAISearchRetriever" object has no field "_serving_config"
```

Introduced in #11736

@baskaryan, @eyurtsev, @hwchase17 if you could review and merge quickly,
that would be appreciated :)
2023-10-20 15:16:42 -04:00
Nuno Campos
5fee61a207 Support runnable factories in .configurable_alts() (#12065)
2023-10-20 15:22:09 +01:00
Lance Martin
b01a443ee5 Update figures in multi-modal Cookbooks (#12060) 2023-10-19 19:51:36 -07:00
Jacob Lee
34ec2da701 Fix typo in google vertex ai palm notebook documentation (#12056) 2023-10-19 21:46:35 -04:00
Bagatur
56c279015e clear nb img output (#12055) 2023-10-19 15:28:54 -07:00
Bagatur
54a8d70eb5 Bagatur/mv singlestore doc (#12053) 2023-10-19 15:06:26 -07:00
Leonid Ganeline
52b103dd13 update interface notebook (#12042)
Added a use case for parallelizing over batches. Simplified text.
2023-10-19 17:06:14 -04:00
Bagatur
8cabb4ee8e add cookbook table (#12043) 2023-10-19 14:05:24 -07:00
Zhitao Xu
a4c3a44712 Fix documentation typo in Clickhouse Class (#12047)
- **Description:** The return info in the documentation for
similarity_search_by_vector and similarity_search_with_relevance_scores
is wrong
2023-10-19 17:00:22 -04:00
William FH
25418b9b4d Always add run ID (#12046)
in eval callback handler.

Useful if you're using a custom run evaluator and don't want to thread
things through.
2023-10-19 12:38:07 -07:00
Eugene Yurtsev
44d7763580 Add zapier deprecation warning (#12045)
Add zapier deprecation
2023-10-19 15:27:56 -04:00
John Mai
4188f046ec Add Tencent Hunyuan chat model (#12022)
### Description:
The Tencent Hunyuan model, developed by Tencent, is a large language
model characterized by robust Chinese text generation capabilities,
adeptness in logical reasoning within complex contexts, and reliable task
execution proficiency. For more information, see
[https://cloud.tencent.com/document/product/1729](https://cloud.tencent.com/document/product/1729)
2023-10-19 15:10:12 -04:00
Eugene Yurtsev
68599d98c2 More security notes (#12040)
Add more security notes
2023-10-19 14:49:09 -04:00
Bagatur
0006075b08 bump 319 (#12041) 2023-10-19 11:45:27 -07:00
John Mai
8eb40b5fe2 baichuan_secret_key use pydantic.types.SecretStr & Add Baichuan tests (#12031)
### Description
- `baichuan_secret_key` use pydantic.types.SecretStr
- Add Baichuan tests
2023-10-19 14:37:41 -04:00
Nuno Campos
85bac75729 nc/runnable-dynamic-schemas-from-config (#12038)
2023-10-19 19:34:35 +01:00
Nuno Campos
85eaa4ccee Revert "nc/runnable-dynamic-schemas-from-config" (#12037)
This reverts commit a46eef64a7.

2023-10-19 19:27:02 +01:00
Nuno Campos
a46eef64a7 nc/runnable-dynamic-schemas-from-config 2023-10-19 19:17:48 +01:00
Nuno Campos
d392e030be Add default value (#12032)
2023-10-19 18:30:05 +01:00
Kenneth Choe
62efe1ffb9 support add_embeddings for elasticsearch (#11002)
- **Description:** Provide a way to use different text for embedding.
- For example, if you are ingesting Stack Overflow Q&As for RAG, you
would want to embed the questions and return the answer(s) for the hits.
With this change, the consumer of langchain can implement that easily.
- I noticed that a similar function was added to faiss.py in #1912 for
performance reasons, but the same function can be used to
achieve this. So instead of changing the Document class to have
embedding_content, I mimicked the implementation of faiss.py.
- The test should provide some guidance on how to use it. It would be
more intuitive to pass texts and embedding_texts as separate
arguments, but I chose a `zip`-ed object for consistency with the
faiss.py implementation.
      - I plan to make a similar pull request for OpenSearch.
  - **Issue:** N/A
  - **Dependencies:** None other than the existing ones.
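
The idea of embedding one text while returning another can be sketched without Elasticsearch. This is a minimal illustration, not the `ElasticsearchStore` implementation: `toy_embed`, `ToyStore`, and the sample data are hypothetical, and only the `zip`-ed `(text, embedding)` shape is taken from the description above.

```python
from typing import Iterable, List, Tuple


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for an embedding model: vowel counts."""
    lowered = text.lower()
    return [float(lowered.count(ch)) for ch in "aeiou"]


class ToyStore:
    """Hypothetical in-memory store that accepts precomputed embeddings."""

    def __init__(self) -> None:
        self._rows: List[Tuple[str, List[float]]] = []

    def add_embeddings(
        self, text_embeddings: Iterable[Tuple[str, List[float]]]
    ) -> None:
        # Each pair is (text to return on a hit, vector to search against).
        self._rows.extend(text_embeddings)

    def search(self, query: str) -> str:
        q = toy_embed(query)
        # Return the stored text whose vector is closest (squared L2) to the query.
        best = min(
            self._rows,
            key=lambda row: sum((a - b) ** 2 for a, b in zip(row[1], q)),
        )
        return best[0]


questions = ["How do I reverse a list?", "What is a tuple?"]
answers = ["Use reversed(xs) or xs[::-1].", "An immutable sequence type."]

store = ToyStore()
# Embed the questions, but store the answers as the returned text.
store.add_embeddings(zip(answers, [toy_embed(q) for q in questions]))
```

Searching with a question then returns the paired answer, because the stored vector came from the question while the stored text is the answer.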

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-19 09:43:51 -07:00
Bagatur
76d3afaef0 bump 318 (#12030) 2023-10-19 09:33:39 -07:00
Dmitry Tyumentsev
5dd2161c4b add _acall method to YandexGPT (#12029)
- **Description:** Add async support for YandexGPT LLM model

Co-authored-by: Dmitry Tyumentsev <dmitry.tyumentsev@raftds.com>
2023-10-19 09:15:26 -07:00
Palau
720ecacb1c Add notebook for kay.ai press release data (#11575)
- **Description:** Adding a notebook for Press Release data from Kay.ai,
as discussed offline
  - **Tag maintainer:** @baskaryan @hwchase17 
- **Twitter handle:** https://twitter.com/kaydotai
https://twitter.com/vishalrohra_

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-19 08:06:56 -07:00
Peter Krenesky
8425f33363 Pydantic v2 support for OpenAPI Specs (#11936)
- **Description:** Adding Pydantic v2 support for OpenAPI Specs 

- **Issue:**
- OpenAPI spec support was disabled because `openapi-schema-pydantic`
doesn't support Pydantic v2:
     #9205
     
     - Caused errors in `get_openapi_chain`
   
    - This may be the cause of #9520.

- **Tag maintainer:** @eyurtsev
- **Twitter handle:** kreneskyp


The root cause was that `openapi-schema-pydantic` hasn't been updated in
some time but
[openapi-pydantic](https://github.com/mike-oakley/openapi-pydantic)
forked and updated the project.
2023-10-19 11:06:11 -04:00
volodymyr-memsql
4adabd33ac Add example of retriever usage with SingleStoreDB vector store (#12021)
Added a notebook with examples of the creation of a retriever from the
SingleStoreDB vector store, and further usage.

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
2023-10-19 09:48:35 -04:00
Joe McElroy
c9f1768cb9 Elasticsearch Query Retriever: Use match + fuzziness for LIKE (#12023)
Updated the elasticsearch self query retriever to use the match clause
for LIKE operator instead of the non-analyzed fuzzy search clause.

Other small updates include:
- fixing the stack inference integration test where the index's default
pipeline didn't use the inference pipeline created
- adding a user-agent to the old implementation to track usage
- improved the documentation for ElasticsearchStore filters
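
To make the difference between the two clauses concrete, the sketch below builds both query bodies as plain dictionaries. The helper names and the field name are hypothetical, and the exact clause emitted by the self-query translator may differ; only the `fuzzy` and `match` clause shapes are standard Elasticsearch query DSL.

```python
from typing import Any, Dict


def like_as_fuzzy(field: str, value: str) -> Dict[str, Any]:
    """Term-level `fuzzy` clause: the value is not analyzed before matching."""
    return {"fuzzy": {field: {"value": value, "fuzziness": "AUTO"}}}


def like_as_match(field: str, value: str) -> Dict[str, Any]:
    """Full-text `match` clause: the value is analyzed, then matched fuzzily."""
    return {"match": {field: {"query": value, "fuzziness": "AUTO"}}}


# Hypothetical field name, used only for illustration.
old_query = like_as_fuzzy("metadata.title", "Elasticsearch Retriever")
new_query = like_as_match("metadata.title", "Elasticsearch Retriever")
```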
2023-10-19 09:47:21 -04:00
maks-operlejn-ds
84d250f781 Docs: QA Privacy Nit (#12025)
Resize image in docs for QA Privacy
2023-10-19 09:43:47 -04:00
Nuno Campos
7db6aabf65 Update chat model output type (#11833)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-19 00:55:15 -07:00
Simon Dai
ed62984cb2 update Weaviate to support multi tenancy (#11842)
- **Description:** update Weaviate to support multi tenancy
  - **Issue:** 9956
  - **Dependencies:** 
  - **Tag maintainer:** hwchase17
  - **Twitter handle:** dsx1986_
2023-10-19 00:49:30 -07:00
hiigao
f818ec49b8 Encapsulate alicloud pai-eas access method for chatmodels and llms (#11852)
### Description: 
This pull request provides access to an EAS LLM service by
implementing the `PaiEasEndpoint` and `PaiEasChatEndpoint` classes in the
`langchain.llms` and `langchain.chat_models` modules. Based on this PR,
langchain users can build a chain that calls a remote EAS LLM service and
gets the LLM inference results.

### About EAS Service
EAS is a product of the Alibaba Cloud Machine Learning Platform for AI
(AliCloud PAI for short). EAS provides model inference deployment
services for users. We built an LLM inference service on EAS with a
general LLM Docker image. End users can therefore quickly set up remote
LLM instances that load the majority of Hugging Face LLM models and serve
as a backend for most LLM apps.

### Dependencies
This PR doesn't involve any new dependencies.

---------

Co-authored-by: 子洪 <gaoyihong.gyh@alibaba-inc.com>
2023-10-19 00:20:18 -07:00
Shinya Maeda
1da6d92369 fix: superfluous List Parser doc (#12014) 2023-10-19 00:14:38 -07:00
John Mai
a6b483dcbc Supported RetryOutputParser & RetryWithErrorOutputParser max_retries (#11903)
Description: Supported RetryOutputParser & RetryWithErrorOutputParser
max_retries
- max_retries: Maximum number of times to retry parsing.
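
A bounded retry loop of this kind can be sketched as follows. This is an illustration of the `max_retries` idea only, not the parsers' actual code: `parse_with_retries`, `parse`, and `fix` are hypothetical names, and catching `ValueError` is an assumption made for the example.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def parse_with_retries(
    completion: str,
    parse: Callable[[str], T],
    fix: Callable[[str, Exception], str],
    max_retries: int = 1,
) -> T:
    """Parse the completion; on failure, ask `fix` for a new one and retry."""
    retries = 0
    while True:
        try:
            return parse(completion)
        except ValueError as err:
            if retries >= max_retries:
                # Retry budget exhausted: surface the last parsing error.
                raise
            retries += 1
            completion = fix(completion, err)
```

For example, with `parse=int` and a `fix` that strips a trailing `!`, the input `"42!"` parses on the first retry, while `max_retries=0` re-raises the original error.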

Issue: None
Dependencies: None
Tag maintainer: @baskaryan 
Twitter handle:
2023-10-18 23:57:16 -07:00
Hugues Chocart
008c7df80d [LLMonitorCallbackHandler] Refactor + add llmonitor-py dependency (#11948)
We now require users to have the pip package `llmonitor` installed. This
gives us cleaner code and avoids duplication between our library and our
code in Langchain.
2023-10-18 23:54:10 -07:00
Sian Cao
77fc2f7644 fix: impl missing embeddings method (#10823)
FAISS does not implement the embeddings method and uses embed_query to
embed texts, which is wrong for some embedding models.
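
Why this matters can be shown with a toy model that treats queries and documents differently. The class below is hypothetical and only mirrors the `embed_query` / `embed_documents` split of the embeddings interface; real asymmetric models differ in less obvious ways.

```python
from typing import List


class ToyAsymmetricEmbeddings:
    """Hypothetical model that embeds queries and documents differently."""

    def embed_query(self, text: str) -> List[float]:
        # First component marks the vector as a query embedding.
        return [1.0, float(len(text))]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # First component marks each vector as a document embedding.
        return [[0.0, float(len(t))] for t in texts]


emb = ToyAsymmetricEmbeddings()
as_document = emb.embed_documents(["hello"])[0]
as_query = emb.embed_query("hello")
```

Here `as_document` and `as_query` differ for the same text, so indexing documents with `embed_query` would store the wrong vectors.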

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-18 23:51:28 -07:00
Holt Skinner
2661dc94f3 feat: Google Vertex AI Search Retriever - Add support for Website Data Stores (#11736)
- Only works for Data stores with Advanced Website Indexing
-
https://cloud.google.com/generative-ai-app-builder/docs/about-advanced-features
- Minor restructuring - Follow up to #10513
- Remove outdated docs (readded in
https://github.com/langchain-ai/langchain/pull/11620)
  - Move legacy class into new py file to clean up the directory
- Shouldn't cause backwards compatibility issues as the import works the
same way for users
2023-10-18 23:41:48 -07:00
Shorthills AI
4b6fdd7bf0 Update modal.py (#11588)
feat: Raise KeyError when 'prompt' key is missing in JSON response

This commit updates the error handling in the code to raise a KeyError
when the 'prompt' key is not found in the JSON response. This change
makes the code more explicit about the nature of the error, helping to
improve clarity and debugging.
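
A minimal sketch of this kind of explicit check, assuming the response has already been decoded into a dictionary; `extract_prompt` and the error message are hypothetical and not taken from `modal.py`.

```python
from typing import Any, Dict


def extract_prompt(response_json: Dict[str, Any]) -> str:
    """Return the 'prompt' field, failing loudly if it is absent."""
    if "prompt" not in response_json:
        raise KeyError("Response JSON is missing the 'prompt' key")
    return response_json["prompt"]
```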

@baskaryan, @eyurtsev.
2023-10-18 23:40:37 -07:00
Surav Shrestha
2038c7fd5d fix typo in multi_language.ipynb (#12009)
exprience -> experience
2023-10-18 23:33:25 -07:00
William FH
dfb4baa3f9 Fix Fireworks Callbacks (#12003)
I may be missing something but it seems like we inappropriately overrode
the 'stream()' method, losing callbacks in the process. I don't think
(?) it gave us anything in this case to customize it here?

See new trace:

https://smith.langchain.com/public/fbb82825-3a16-446b-8207-35622358db3b/r

and confirmed it streams.

Also fixes the stopwords issues from #12000
2023-10-18 23:33:09 -07:00
Lance Martin
12f8e87a0e LLaMA2 SQL cookbook clean (#12007) 2023-10-18 21:16:58 -07:00
Harrison Chase
bdecc5bade Harrison/lcel configuration (#11997) 2023-10-18 16:01:38 -07:00
Lance Martin
26d0858a60 Update LLaMA2 SQL notebook (#11995) 2023-10-18 15:01:37 -07:00
Wang Wei
e26559f512 Add ERNIE-Bot-4 model support for ErnieBotChat. (#11969)
- **Description:** According to the document
https://cloud.baidu.com/doc/WENXINWORKSHOP/s/clntwmv7t, add ERNIE-Bot-4
model support for ErnieBotChat.
- **Dependencies:** Before using the ERNIE-Bot-4, you should have the
model's access authority.
2023-10-18 14:55:29 -07:00
Alfrick Opidi
71b0f51003 Update clarifai.mdx (#11964)
Corrected broken link
2023-10-18 13:05:59 -07:00
Alfrick Opidi
5ba7a7d2bc Update clarifai.ipynb (#11963)
`documents=docs` is not required when running a vector search on an
existing Clarifai application
2023-10-18 13:05:43 -07:00
Bagatur
642d2e4b67 caps not title for cookbooks descriptions (#11993) 2023-10-18 12:56:18 -07:00
Bagatur
fd7ab539c8 add cookbook readme (#11992) 2023-10-18 12:36:34 -07:00
Eugene Yurtsev
f4bec9686d Add more security notes (#11990)
Add more security notes
2023-10-18 15:00:56 -04:00
Eugene Yurtsev
3d81c76160 Add security notes to agent toolkits (#11989)
Add more security notes to agent toolkits.
2023-10-18 14:36:29 -04:00
Leonid Ganeline
b81a4c1d94 docstrings added (#11988)
Added docstrings. Some docstring formatting.
2023-10-18 13:05:49 -04:00
Bagatur
35c7c1f050 bump 317 (#11986) 2023-10-18 09:25:18 -07:00
Bagatur
122af2effe fix chroma from_texts bug (#11984) 2023-10-18 09:24:04 -07:00
Erick Friis
c149954cc5 Hub Runnable (#11946)
Adds `langchain.runnables.hub.HubRunnable` for pulling configurable
objects from the hub
2023-10-18 09:21:45 -07:00
Owen
9e24626e87 chore: remove duplicated export variables (#11962)
- **Description:** remove duplicated `__all__` variables
2023-10-18 12:08:50 -04:00
Nuno Campos
6bd9c1d2b3 Make prompt validation opt-in (#11973)
By default replace input_variables with the correct value
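
The commit does not say how the correct value is computed. One plausible way, shown here purely as an assumption for f-string-style templates, is to read the field names out of the template itself; `infer_input_variables` is a hypothetical helper.

```python
from string import Formatter
from typing import List


def infer_input_variables(template: str) -> List[str]:
    """Collect the field names used in an f-string-style template."""
    names = {name for _, name, _, _ in Formatter().parse(template) if name}
    return sorted(names)


variables = infer_input_variables("Tell me a {adjective} joke about {topic}")
```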

2023-10-18 16:28:47 +01:00
Nuno Campos
9bc7e1851a Ensure dict() does not raise not implemented error, which should instead be raised in our custom method save() (#11970)
.dict() is a Pydantic method that must not raise exceptions, as it is
used, e.g., in `__eq__`
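
The reasoning can be illustrated with a plain-Python stand-in. `ToyModel` is hypothetical and does not use Pydantic; it only mimics an equality check that goes through `dict()`, with the "not supported" error moved into `save()`.

```python
from typing import Any, Dict


class ToyModel:
    """Plain-Python stand-in for a model whose equality is based on dict()."""

    def __init__(self, name: str) -> None:
        self.name = name

    def dict(self) -> Dict[str, Any]:
        # Must never raise: __eq__ below depends on it.
        return {"name": self.name}

    def save(self, path: str) -> None:
        # The "not supported" error belongs in the custom method instead.
        raise NotImplementedError("ToyModel does not support saving")

    def __eq__(self, other: object) -> bool:
        return isinstance(other, ToyModel) and self.dict() == other.dict()
```

If `dict()` raised `NotImplementedError` here, a simple comparison such as `ToyModel("a") == ToyModel("a")` would fail as well.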

2023-10-18 16:28:33 +01:00
Nuno Campos
653cf56e0e Lint 2023-10-18 16:02:00 +01:00
Predrag Gruevski
debcf053eb Fix invalid escape sequence warnings by using raw strings for regexes. (#11943)
This code also generates warnings when our users' apps hit it, which is
annoying and doesn't look great. Let's fix it.
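
A small example of the fix, using a made-up pattern rather than one from the codebase:

```python
import re

# Without the r prefix, "\d" is an invalid string escape and Python emits a
# warning when the module is compiled. The raw string passes the backslash
# through to the regex engine unchanged.
DIGITS = re.compile(r"\d+")

found = DIGITS.findall("PR 11943 follows PR 11944")
```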
2023-10-18 10:55:17 -04:00
Nuno Campos
e4ae690244 Sort order 2023-10-18 15:42:13 +01:00
Bagatur
8e1b1db90d bearly api key docs (#11981) 2023-10-18 07:26:10 -07:00
Nuno Campos
b753bf3323 Make prompt validation opt-in
By default replace input_variables with the correct value
2023-10-18 10:46:22 +01:00
Nuno Campos
202acce0c9 Ensure dict() does not raise not implemented error, which should instead be raised in our custom method save() 2023-10-18 09:44:41 +01:00
Predrag Gruevski
392df7b2e3 Type hints on varargs and kwargs that take anything should be Any. (#11950)
Type hinting `*args` as `List[Any]` means that each positional argument
should be a list. Type hinting `**kwargs` as `Dict[str, Any]` means that
each keyword argument should be a dict of strings.

This is almost never what we actually wanted, and doesn't seem to be
what we want in any of the cases I'm replacing here.
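
For illustration, a function that accepts anything is annotated per argument; `collect` is a hypothetical example, not code from this change.

```python
from typing import Any, Dict, Tuple


def collect(*args: Any, **kwargs: Any) -> Tuple[Tuple[Any, ...], Dict[str, Any]]:
    """The annotation describes each argument; the containers are implied."""
    # Inside the function, args is a Tuple[Any, ...] and kwargs a Dict[str, Any].
    return args, kwargs


positional, keyword = collect(1, "two", flag=True)
```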
2023-10-17 21:31:44 -04:00
volodymyr-memsql
7f17ce3742 SingleStoreDBChatMessageHistory: Add jupiter notebook with usage example (#11941)
The Docs folder changed its structure, and the notebook example for
SingleStoreDBChatMessageHistory has not been copied to the new place due
to a merge conflict. Adding the example to the correct place.

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
2023-10-17 21:31:19 -04:00
Eugene Yurtsev
908c7bf33e Add documentation to tools (#11938)
Add security notes to tools

---------

Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>
2023-10-17 21:27:59 -04:00
Eugene Yurtsev
43dc669332 Update playwright documentation (#11949)
Add security note to playwright tool
2023-10-17 21:22:26 -04:00
Daniel Chalef
2beb767ae5 zep: Memory Retriever MMR Support & Docs Updates (#11954)
- Update Zep Memory and Retriever docstrings
- Zep Memory Retriever: Add support for native MMR
- Add MMR example to existing ZepRetriever Notebook

@baskaryan
2023-10-17 16:35:11 -07:00
William FH
a27fa9bf10 Use traceable context (#11896)
Example

```
from langchain.schema.runnable import RunnableLambda
from langsmith import traceable

chain = RunnableLambda(lambda x: x)

@traceable(run_type = "chain")
def my_traceable(a):
    chain.invoke(a)
my_traceable(5)
```

Would have a nested result.

This would NOT work for interleaving chains and traceables. E.g., things
like this would still not work well

```
from langchain.schema.runnable import RunnableLambda
from langsmith import traceable

@traceable()
def other_traceable(a):
    return a

def foo(x):
    return other_traceable(x)
    
chain = RunnableLambda(foo)

@traceable(run_type = "chain")
def my_traceable(a):
    chain.invoke(a)
my_traceable(5)
```
2023-10-17 15:10:20 -07:00
Predrag Gruevski
dcd0392423 Upgrade to newer black (23.10) and ruff (first 0.1.x!) versions. (#11944)
Minor lint dependency version upgrade to pick up latest functionality.

Ruff's new v0.1 version comes with lots of nice features, like
fix-safety guarantees and a preview mode for not-yet-stable features:
https://astral.sh/blog/ruff-v0.1.0
2023-10-17 17:24:51 -04:00
Trayan Azarov
1fd21ed21c Chroma batching (#11203)
- **Description:** Chroma >= 0.4.10 added batch size validation for
add/upsert. The maximum batch size depends on the SQLite limits of the
target system and varies. This change adds batch splitting for
Chroma >= 0.4.10, as the aforementioned validation is starting to
surface in the Chroma community (users using LC)
 - **Issue:** N/A
 - **Dependencies:** N/A
 - **Tag maintainer:** @eyurtsev
 - **Twitter handle:** t_azarov
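
Batch splitting itself is simple to sketch. The helper below is a generic illustration, not the code added to the Chroma integration; `split_batches` is a hypothetical name and the batch size of 2 is arbitrary (in practice the limit would come from the Chroma client).

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def split_batches(items: Sequence[T], max_batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of at most max_batch_size items."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    for start in range(0, len(items), max_batch_size):
        yield list(items[start : start + max_batch_size])


batches = list(split_batches(["a", "b", "c", "d", "e"], max_batch_size=2))
```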
2023-10-17 13:59:42 -07:00
Guy Korland
9373b9c004 Add Graph interface (#11012)
  - **Description:** Add a Graph interface
  - **Tag maintainer:** @baskaryan @hwchase17 
  - **Twitter handle:** @g_korland
2023-10-17 13:54:05 -07:00
DanielZzz
b647505280 feat: support ChatModels Qianfan QianfanChatEndpoint function_call (#11107)
- **Description:** 
* feature for `QianfanChatEndpoint` function_call ability, add
integration_test for it
    * add `model`, `endpoint` supported in calling params
    * add raw response in ChatModel Message
- **Issue:** 
    * #10867 
    * #11105 
    * #10215
- **Dependencies:** no
- **Tag maintainer:** @baskaryan 
- **Twitter handle:** no
2023-10-17 13:33:55 -07:00
M Bharat lal
67300567d3 GCSFileLoader retrieve blob custom metadata and append to document metadata (#11066)
- **Description:** GCSFileLoader retrieve blob's custom metadata and
append to document's metadata
- **Issue:** #9975,
- **Tag maintainer:** @baskaryan please review

Co-authored-by: b0l00ib <bharat.lal@walmart.com>
2023-10-17 12:17:59 -07:00
staoxiao
23c261ba57 Update bge_huggingface.ipynb (#8960)
- Description: Considering the similarity computation method of
[BGE](https://github.com/FlagOpen/FlagEmbedding) model is cosine
similarity, set normalize_embeddings to be True.
- Tag maintainer: @baskaryan
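
The link between normalization and cosine similarity can be checked numerically: once vectors have unit length, a plain dot product equals their cosine similarity. The vectors below are made up for the check and are not BGE outputs.

```python
import math
from typing import List


def normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine(a: List[float], b: List[float]) -> float:
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


u, v = [3.0, 4.0], [4.0, 3.0]
dot_of_normalized = dot(normalize(u), normalize(v))
```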

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-10-17 11:58:29 -07:00
billytrend-cohere
f4742dce50 Add Cohere retrieval augmented generation to retrievers (#11483)
Add Cohere retrieval augmented generation to retrievers

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-17 11:51:04 -07:00
刘 方瑞
0a24ac7388 Revised notebook and add delete to MyScale vector store (#11848)
- **Description:** 
  - Add `.delete` to myscale vector store. 
  - Revised vector store notebooks
- **Tag maintainer:** @baskaryan 
- **Twitter handle:** @myscaledb @mpsk_liu
2023-10-17 11:42:21 -07:00
John Mai
3fb5e4d185 Add Baichuan chat model (#11923)
Description: A large language model developed by Baichuan Intelligent
Technology, https://www.baichuan-ai.com/home
Issue: None
Dependencies: None
Tag maintainer:
Twitter handle:
2023-10-17 11:30:57 -07:00
Eugene Yurtsev
9ecb7240a4 Add security note to recursive url loader (#11934)
Add security note to recursive loader
2023-10-17 13:41:43 -04:00
maks-operlejn-ds
42dcc502c7 Anonymizer small fixes (#11915) 2023-10-17 10:27:29 -07:00
Eugene Yurtsev
90e9ec6962 Sitemap specify default filter url (#11925)
Specify default filter URL in sitemap loader and add a security note

---------

Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>
2023-10-17 13:19:27 -04:00
Bagatur
ba0d729961 bump 316 (#11928) 2023-10-17 09:47:57 -07:00
Eugene Yurtsev
83162649bb Add runnables to api reference (#11520)
Need to look at preview whether this works.
2023-10-17 11:46:08 -04:00
Eugene Yurtsev
12d7eaa0c2 Add security notices to toolkits (#11900)
This adds security notices to toolkits init, and to several toolkits.
We'll need to continue documenting the rest of the toolkits.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-17 11:45:09 -04:00
Eugene Yurtsev
5f4a697ce3 Add deprecation warnings (#11899)
Add deprecation warnings

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-17 10:59:38 -04:00
Nuno Campos
8b79cf9566 Add lock for using global config enum weak map (#11920)
2023-10-17 15:50:35 +01:00
Nuno Campos
2a8ded6c8c Export merge_configs function (#11916)
2023-10-17 15:36:11 +01:00
Nuno Campos
57a02929d5 Add validation for configurable keys passed to .with_config() (#11910)
- Fix some typing issues found while doing that

2023-10-17 15:34:49 +01:00
Nuno Campos
42cd2ef329 Ensure that configurable fields with enums support deduplication (#11909)
2023-10-17 15:30:38 +01:00
Nuno Campos
778e7c526e Add comment 2023-10-17 15:29:39 +01:00
Nuno Campos
19319e1746 Allow configs with None values 2023-10-17 15:23:58 +01:00
Nuno Campos
b0d5882fe1 Export merge_configs function 2023-10-17 13:22:07 +01:00
Nuno Campos
12596b9a9b Add validation for configurable keys passed to .with_config()
- Fix some typing issues found while doing that
2023-10-17 08:50:31 +01:00
Nuno Campos
754aca794f remove print 2023-10-17 08:46:07 +01:00
Nuno Campos
cf448a6314 Ensure that configurable fields with enums support deduplication 2023-10-17 08:25:21 +01:00
Leonid Ganeline
31f264169d evaluation criteria (#11681)
the updated value was:
` Criteria.MISOGYNY: "Is the submission misogynistic? If so, respond Y."
`
The " If so, respond Y." should not be here. This sub-string is not
present in any other criteria and should not be present here.
I also added a synonym for "misogynistic", as is done in many other
criteria.
2023-10-16 21:05:08 -07:00
Lance Martin
eca8a5e5b8 Flesh out semi-structured cookbook (#11904) 2023-10-16 20:50:15 -07:00
Dmitry Tyumentsev
e8c1850369 Add YandexGPT LLM and Chat model (#11703)
**Description:** Introducing an ability to work with the
[YandexGPT](https://cloud.yandex.com/en/services/yandexgpt) language
model.
2023-10-16 20:30:07 -07:00
eryk-dsai
c4341463e8 Include information on the tools for creating gbnf grammar files in the llama-cpp notebook (#11764)
Hi,

I recently experimented with grammar-based sampling and discovered two
methods for speeding up the creation of gbnf grammar files:
1. [Online grammar generator
app](https://github.com/ggerganov/llama.cpp/discussions/2494) introduced
[here](https://github.com/ggerganov/llama.cpp/discussions/2494)
2.
[Script](https://github.com/ggerganov/llama.cpp/blob/master/examples/json-schema-to-grammar.py)
for parsing json schema to gbnf grammar

I believe it is a good idea to include the information that leads to
them in the `llama-cpp` notebook.

***

Codespell check fails but due to the unrelated script

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-16 20:28:32 -07:00
Bagatur
c15701eebf Revert "Add baichuan model" (#11901)
cc @cloudscool, apologies your PR wasn't actually passing CI
2023-10-16 20:01:12 -07:00
cloudscool
c1d811c4bc Add baichuan model 2023-10-16 19:27:35 -07:00
John Mai
0169d45ba8 Supported OutputFixingParser max_retries (#11754)
Description: Supported OutputFixingParser max_retries
 - max_retries: Maximum number of times to retry parsing.

Issue: None
Dependencies: None
Tag maintainer: @baskaryan
Twitter handle: @JohnMai95

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-16 19:25:47 -07:00
Leonid Ganeline
c87b5c209d docs safety update (#11789)
The current ToC on the index page and on the navbar don't match. Page
titles and titles in the ToC don't match.
Changes:
- made ToCs equal
- made titles equal
- updated some page formattings.
2023-10-16 19:14:21 -07:00
Surav Shrestha
321506fcd1 fix typos in cookbook/sales_agent_with_context.ipynb (#11790)
I have fixed some typos in file
`cookbook/sales_agent_with_context.ipynb`. I kindly request the repo
maintainers to review and merge it. Thanks!
2023-10-16 19:10:40 -07:00
Surav Shrestha
be04695554 fix typos in cookbook/Semi_structured_multi_modal_RAG_LLaMA2.ipynb (#11791)
I have fixed some typos in file
`cookbook/Semi_structured_multi_modal_RAG_LLaMA2.ipynb`. I kindly
request the repo maintainers to review and merge it. Thanks!
2023-10-16 19:09:20 -07:00
Surav Shrestha
e69218504b fix typos in cookbook/self_query_hotel_search.ipynb (#11792)
I have fixed some typos in file
`cookbook/self_query_hotel_search.ipynb`. I kindly request the repo
maintainers to review and merge it. Thanks!
2023-10-16 19:09:05 -07:00
Surav Shrestha
7f0145315a fix typos in cookbook/Semi_structured_and_multi_modal_RAG.ipynb (#11794)
I have fixed some typos in file
`cookbook/Semi_structured_and_multi_modal_RAG.ipynb`. I kindly request
the repo maintainers to review and merge it. Thanks!
2023-10-16 19:07:21 -07:00
Surav Shrestha
ab145d85ec fix typos in docs/docs/expression_language/cookbook/prompt_llm_parser.ipynb (#11796)
trasform -> transform
2023-10-16 19:07:03 -07:00
volodymyr-memsql
ff8e6981ff SingleStoreDBChatMessageHistory: Add singlestoredb support for ChatMessageHistory (#11705)
**Description**

- Added the `SingleStoreDBChatMessageHistory` class that inherits
`BaseChatMessageHistory` and allows the use of a SingleStoreDB database
as storage for chat message history.
- Added integration test to check that everything works (requires
`singlestoredb` to be installed)
- Added notebook with usage example
- Removed custom retriever for SingleStoreDB vector store (as it is
useless)

---------

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
2023-10-16 21:59:45 -04:00
Mohammad Mohtashim
634ccb8ccd test_stream_log_retriever Unit Test + Tool names fix (#11808)
## Description



| Tool                        | Original Tool Name        |
|-----------------------------|---------------------------|
| open-meteo-api              | Open Meteo API            |
| news-api                    | News API                  |
| tmdb-api                    | TMDB API                  |
| podcast-api                 | Podcast API               |
| golden_query                | Golden Query              |
| dall-e-image-generator      | Dall-E Image Generator    |
| twilio                      | Text Message              |
| searx_search_results        | Searx Search Results      |
| dataforseo                  | DataForSeo Results JSON   |

When using these tools through `load_tools`, I encountered the following
validation error:

```console
openai.error.InvalidRequestError: 'TMDB API' does not match '^[a-zA-Z0-9_-]{1,64}$' - 'functions.0.name'
```

In order to avoid this error, I replaced spaces with hyphens in the tool
names:

| Tool                        | Corrected Tool Name       |
|-----------------------------|---------------------------|
| open-meteo-api              | Open-Meteo-API            |
| news-api                    | News-API                  |
| tmdb-api                    | TMDB-API                  |
| podcast-api                 | Podcast-API               |
| golden_query                | Golden-Query              |
| dall-e-image-generator      | Dall-E-Image-Generator    |
| twilio                      | Text-Message              |
| searx_search_results        | Searx-Search-Results      |
| dataforseo                  | DataForSeo-Results-JSON   |

This correction resolved the validation error.
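
The check and the fix can be reproduced in a few lines. The pattern is copied from the error message above; the helper names are hypothetical.

```python
import re

# Pattern copied from the validation error message.
FUNCTION_NAME = re.compile(r"^[a-zA-Z0-9_-]{1,64}$")


def is_valid_function_name(name: str) -> bool:
    """Check a tool name against the function-name pattern."""
    return FUNCTION_NAME.match(name) is not None


def hyphenate(name: str) -> str:
    """Replace spaces with hyphens, as done for the tool names above."""
    return name.replace(" ", "-")
```

With these helpers, `"TMDB API"` fails validation and `hyphenate("TMDB API")` passes.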

Additionally, a unit test,
`tests/unit_tests/schema/runnable/test_runnable.py::test_stream_log_retriever`,
was failing at random. Upon further investigation, I confirmed that the
failure was not related to the above-mentioned changes. The `stream_log`
variable was generating the order of logs in two ways at random. The
reason for this behavior is unclear, but in the assertion, I included
both possible orders to account for this variability.
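
An assertion that tolerates a fixed set of orderings might look like the sketch below. The log entries are invented placeholders, not the actual `stream_log` output, and `matches_any_order` is a hypothetical helper.

```python
from typing import List, Sequence


def matches_any_order(actual: List[str], accepted: Sequence[List[str]]) -> bool:
    """True if actual equals one of the explicitly accepted orderings."""
    return any(actual == candidate for candidate in accepted)


# Placeholder log entries; the two accepted orderings are listed explicitly.
accepted_orders = [
    ["start", "retriever", "end"],
    ["start", "end", "retriever"],
]
```

Listing the accepted orderings explicitly keeps the test strict about everything except the one known source of nondeterminism.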
2023-10-16 18:46:19 -07:00
VAS
a1120e2685 Fixed a typo in bittensor.ipynb (#11821)
Fixed a typo : 

benifits -> benefits

2023-10-16 18:43:29 -07:00
VAS
2a6d4acc9d Fixed a typo in anyscale.ipynb (#11822)
Fixed a typo : 

"asyncrhonized" > "asynchronized"

2023-10-16 18:43:15 -07:00
Predrag Gruevski
7c0f1bf23f Upgrade experimental package dependencies and use Poetry 1.6.1. (#11339)
Part of upgrading our CI to use Poetry 1.6.1.
2023-10-16 21:13:31 -04:00
Eugene Yurtsev
c2c0814a94 Add security notice to file management tool (#11878)
Add security notice to file management tool

---------

Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>
2023-10-16 21:12:13 -04:00
zhaoshengbo
cb7e12f6ba Adapt to the latest version of Alibaba Cloud OpenSearch vector store API (#11849)
Hello Folks,

Alibaba Cloud OpenSearch has released a new version of the vector
storage engine, which has significantly improved performance compared to
the previous version. At the same time, the SDK has also undergone
changes, requiring adjustments to the Alibaba OpenSearch vector store
code to

This PR includes:

- Adapt to the latest version of Alibaba Cloud OpenSearch API.
- More comprehensive unit testing.
- Improve documentation.

I have read your contributing guidelines. And I have passed the tests
below

- [x] make format
- [x]  make lint
- [x]  make coverage
- [x]  make test

---------

Co-authored-by: zhaoshengbo <shengbo.zsb@alibaba-inc.com>
2023-10-16 18:07:24 -07:00
Javier Aranda Santos
96e3e06d50 Fix HuggingFace notebook link (#11863)
- **Description:** While reading the docs
(https://python.langchain.com/docs/integrations/providers/huggingface),
I noticed the notebook linked in
https://python.langchain.com/docs/use_cases/evaluation/huggingface_datasets.html
was returning a 404. I searched the docs to see whether the notebook was
still available, and this PR updates the link in the docs.
  - **Issue:** I haven't opened an issue for this change.
  - **Dependencies:** -
  - **Tag maintainer:** -,
  - **Twitter handle:** -

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-16 18:03:47 -07:00
standby24x7
40d188948e Fix spelling typos in learned_prompt_optimization.ipynb (#11862)
This patch fixes some spelling typos in
learned_prompt_optimization.ipynb.
It only changes messages; no logic is changed.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
2023-10-16 18:01:48 -07:00
Lee
e669f9d731 Fix: Sitemap Document Loader Tests and Documentation (#11866)
**Description:**
While working on the Docusaurus site loader #9138, I noticed some
outdated docs and tests for the Sitemap Loader.

**Issue:** 
This is tangentially related to #6691 in reference to doc links. I plan
on digging into a few of these issues when I next find time.
2023-10-16 17:42:10 -07:00
DJZevenbergen
8bb8c56f74 Fix missing word (#11868)
- **Description:** added one missing word to a doc, 
  - **Dependencies:** N/A
2023-10-16 17:10:31 -07:00
Nuno Campos
9fdf1059a4 Fix issues in runnable docs examples (#11883) 2023-10-16 17:08:28 -07:00
Jean-Louis Queguiner
8b697ff0ee feat(llm): add together.xyz as an LLM provider (#11892)
- **Description:** added together.xyz as an LLM provider, 
  - **Issues:** fix some linting issues
  - twitter handle @jilijeanlouis 

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-16 17:08:04 -07:00
Leonid Kuligin
d269dd2e2f added a multiturn search based on Vertex AI Search (#11885)
- **Description:** Added a retriever based on multi-turn Vertex AI
Search
  - **Twitter handle:** lkuligin
2023-10-16 17:05:12 -07:00
Leonid Kuligin
38ed55245f added Vertex examples as attributes (#11890)
- **Description:** added examples to Vertex chat models as optional
class attributes, so that a model with examples can be used inside a
chain
  - **Twitter handle:** lkuligin
2023-10-16 16:55:45 -07:00
eryk-dsai
5019f59724 fix: more robust check whether the HF model is quantized (#11891)
Removes the check of `model.is_quantized` and adds a more robust way of
checking for 4bit and 8bit quantization in the `huggingface_pipeline.py`
script. I had to make the original change on the outdated version of
`transformers`, because the models had this property before. Seems
redundant now.
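
As a rough illustration of the kind of check described above (a sketch, not the code from this PR): it assumes the loaded model exposes the boolean attributes `is_loaded_in_4bit` and `is_loaded_in_8bit`, and the helper name `is_quantized` is hypothetical.

```python
from types import SimpleNamespace


def is_quantized(model: object) -> bool:
    """Return True if the model reports 4-bit or 8-bit loading."""
    # getattr with a default avoids AttributeError on models that lack the flags.
    return bool(
        getattr(model, "is_loaded_in_4bit", False)
        or getattr(model, "is_loaded_in_8bit", False)
    )


# Stand-in objects instead of real models, to keep the sketch self-contained.
plain = SimpleNamespace()
four_bit = SimpleNamespace(is_loaded_in_4bit=True)
```

Reading the flags with a default means a model class that never defined them is simply treated as not quantized.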

Fixes: https://github.com/langchain-ai/langchain/issues/11809 and
https://github.com/langchain-ai/langchain/issues/11759
2023-10-16 16:54:20 -07:00
Bagatur
efa9ef75c0 add LCEL to retriever doc (#11888) 2023-10-16 16:44:25 -07:00
Bagatur
d62369f478 Add LCEL to chain doc (#11895) 2023-10-16 16:44:12 -07:00
Harrison Chase
52bf03d786 add how to configure documentation (#11889) 2023-10-16 16:01:47 -07:00
Eugene Yurtsev
3be76ee2fa Add security.md (#11881)
Add security markdown file
2023-10-16 17:41:21 -04:00
Leonid Ganeline
ea0982eede update CONTRIBUTING.md (#11872)
Adding description of the `View deployment` button on the PR page. This
nice feature was not documented.

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
2023-10-16 14:21:36 -07:00
Lance Martin
18a4fdded6 Add deps and minor cleaning to cookbooks (#11886) 2023-10-16 13:37:51 -07:00
Bagatur
e3664272f0 Add LCEL to output parser doc (#11880) 2023-10-16 12:35:18 -07:00
Bagatur
049a0357e7 Add LCEL to prompt doc (#11875) 2023-10-16 11:34:31 -07:00
Eugene Yurtsev
210a48cfb5 Add security considerations (#11869)
Add security considerations to existing graph tools.
2023-10-16 12:23:48 -04:00
Lance Martin
201b7ce9af Update SQL cookbook (#11870) 2023-10-16 09:12:03 -07:00
Bagatur
25b1d65305 bump 315 (#11850) 2023-10-16 00:50:54 -07:00
Bagatur
ece22b6b6a Add LCEL to LLM intro (#11835) 2023-10-15 14:59:45 -07:00
Bagatur
ffa1b3a758 Add LCEL to chat model intro (#11834) 2023-10-15 14:59:36 -07:00
Nuno Campos
4321d192ea Use a less specific return type for | on Runnables (#11762)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-15 21:15:06 +01:00
Bagatur
6c5bb1b2e1 RM snippets (#11798) 2023-10-15 12:20:58 -07:00
Lance Martin
ccd1400423 Update multi-modal notebooks (#11827) 2023-10-15 09:00:07 -07:00
Lance Martin
8bf16d5275 LLaMA2 SQL Chat cookbook (#11685) 2023-10-15 08:54:09 -07:00
Harrison Chase
a506302772 bearly tool (#11812) 2023-10-14 16:03:58 -07:00
Harrison Chase
4a2f0c51a1 use get_llm_cache and set_llm_cache (#11741)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-14 09:29:30 -07:00
Harrison Chase
f3ad22e64a pipe default key (#11788) 2023-10-14 08:39:23 +01:00
Bagatur
6e78dacd78 customize rtd build (#11797)
customize readthedocs config so that we can parallelize the api docs
build
2023-10-13 19:50:22 -07:00
Eugene Yurtsev
0d37b4c27d Add python,pandas,xorbits,spark agents to experimental (#11774)
For context, see
https://github.com/langchain-ai/langchain/discussions/11680
2023-10-13 17:36:44 -04:00
Bagatur
d6e34ca2ee fix recent docs integrations file loc (#11782) 2023-10-13 13:58:26 -07:00
Michael Feil
233a904f2e GradientLLM Docs update and model_id renaming. (#10963)
Related to #10800 

- Fixes errors in the docstring of GradientLLM / Gradient.ai LLM
- Renames `model_id` to `model` and adapts all tests accordingly. The
reason is to stay in sync with `GradientEmbeddings` and other LLMs.
- Improves the tests so they check the headers in the sent request.
- Makes the aiosession a private attribute in the docs, since
`pip install gradientai` will replace aiosession in the future.
- Adds an example of how to fine-tune on the Prompt Template, as suggested
in #10800
2023-10-13 13:57:58 -07:00
David
6876b02c87 Move EverlyAI python notebook to the right location (#11779)
Hi,

After submitting https://github.com/langchain-ai/langchain/pull/11357,
we realized that the notebooks are moved to a new location. Sending a
new PR to update the doc.

---------

Co-authored-by: everly-studio <127131037+everly-studio@users.noreply.github.com>
2023-10-13 13:34:27 -07:00
Bagatur
1559ba4bfc fix upstash test import (#11781) 2023-10-13 13:31:36 -07:00
Leonid Kuligin
9f0a718198 added candidate_count for Vertex models (#11729)
- **Description:** added support for `candidate_count` parameter on
Vertex
2023-10-13 13:31:20 -07:00
David
9d200e6cbe Create ChatEverlyAI (#11357)
- Description: Adds the ChatEverlyAI class with llama-2 7b on
[EverlyAI Hosted Endpoints](https://everlyai.xyz/)
- It inherits from ChatOpenAI and requires openai (probably unnecessary
but it made for a quick and easy implementation)

---------

Co-authored-by: everly-studio <127131037+everly-studio@users.noreply.github.com>
2023-10-13 12:25:11 -07:00
Hristo G
7fb25b4154 Add graceful fallback for ES vectorstore when content field is missing (#11726)
- **Description:**
- If the Elasticsearch field used for LangChain's `Document.page_content`
is missing because the specific document is somehow malformed, fail
gracefully.

  - **Tag maintainer:** 
    - @joemcelroy
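
A minimal sketch of what such a fallback could look like, assuming a search hit is a dict with a `_source` mapping as in an Elasticsearch response. The helper name `page_content_from_hit`, the default field name, and the empty-string fallback are all assumptions for illustration, not details taken from this PR.

```python
from typing import Any, Dict


def page_content_from_hit(hit: Dict[str, Any], content_field: str = "text") -> str:
    """Read the content field from a hit, falling back to an empty string."""
    # A malformed hit may lack `_source` entirely or lack the content field.
    source = hit.get("_source") or {}
    return source.get(content_field, "")
```

With this shape, a malformed hit yields an empty document body instead of raising a `KeyError`.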
2023-10-13 12:03:32 -07:00
Bagatur
f06fcde0d7 rm duplicate zilliz import (#11777) 2023-10-13 12:01:22 -07:00
Bagatur
a3330c4258 bump 314 (#11773) 2023-10-13 11:09:54 -07:00
Erick Friis
1861cc7100 General anthropic functions, steps towards experimental integration tests (#11727)
To match change in js here
https://github.com/langchain-ai/langchainjs/pull/2892

Some integration tests need a bit more work in experimental:
![Screenshot 2023-10-12 at 12 02 49
PM](https://github.com/langchain-ai/langchain/assets/9557659/262d7d22-c405-40e9-afef-669e8d585307)

Pretty sure the sqldatabase ones are an actual regression or change in
interface because it's returning a placeholder.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-13 09:48:24 -07:00
Lance Martin
98c8516ef1 Semi-structured and Multi-modal RAG cookbooks (#11582) 2023-10-13 08:45:54 -07:00
Nuno Campos
17c69678ab Revert "New add Baichuan Model" (#11761)
Reverts langchain-ai/langchain#11714

This has linting and formatting issues, plus it's added to chat models
folder but doesn't subclass Chat Model base class
2023-10-13 08:23:15 -07:00
cloudscool
56653c53aa New add Baichuan Model (#11714)
Motivation and Context
At present, the Baichuan Large Language Model is relatively popular and
performs efficiently. Given its wide market recognition, this PR adds
the model to broaden the set of large language models that LangChain can
access, making it easier for interested users to build applications
with it.

System Info
langchain: 0.0.295
python:3.8.3
IDE:vs code

Description
Add the following files:

1. Add baichuan_baichuaninc_endpoint.py to
libs/langchain/langchain/chat_models
2. Modify the __init__.py file located at
libs/langchain/langchain/chat_models/__init__.py:
    a. Add "from langchain.chat_models.baichuan_baichuaninc_endpoint import
BaichuanChatEndpoint"
    b. Add "BaichuanChatEndpoint" to the file's `__all__` list

Your contribution
I am willing to help implement this feature and submit a PR, but I would
appreciate guidance from the maintainers or community to ensure the
changes are made correctly and in line with the project's standards and
practices.
2023-10-12 23:04:28 -07:00
Shreyas S
694d768174 Minor fix (#11748)
changed > to over
2023-10-12 22:36:31 -07:00
Bagatur
8e6fa5f1d7 mv self-query docs to integrations (#11744) 2023-10-12 22:36:07 -07:00
Yang, Bo
9e1e0f54d2 Add TrainableLLM (#11721)
- **Description:** Add `TrainableLLM` for LLMs that support fine-tuning
  - **Tag maintainer:** @hwchase17

This PR adds training methods to `GradientLLM`

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-12 17:38:33 -07:00
Burak Yılmaz
63e516c2b0 Upstash redis integration (#10871)
- **Description:** Introduced Upstash provider with following wrappers:
UpstashRedisCache, UpstashRedisEntityStore,
UpstashRedisChatMessageHistory, UpstashRedisStore
  - **Issue:** -,
  - **Dependencies:** upstash-redis python package is needed,
  - **Tag maintainer:** @baskaryan 
  - **Twitter handle:** @BurakY744

---------

Co-authored-by: Burak Yılmaz <burakyilmaz@Buraks-MacBook-Pro.local>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-12 17:36:51 -07:00
Bagatur
a9db2b0b92 fix tongyi import (#11745) 2023-10-12 17:24:06 -07:00
Aaron Pham
6c61315067 fix(openllm): update with newer remote client implementation (#11740)
cc @baskaryan

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-10-12 17:01:18 -07:00
Richy Wang
11cdfe44af Implement Alibaba Tongyi chat model apis. (#10922)
Hi there,
This PR aims to implement a chat model for the Alibaba Tongyi LLM. It
contains the work below:
1. Implement the ChatTongyi chat model in langchain.chat_models.tongyi. Note
that this is different from the Tongyi LLM model in another PR,
https://github.com/langchain-ai/langchain/pull/10878.
Specifically, it implements the _generate() and _stream() functions in
ChatTongyi.
2. Add some examples in chat/tongyi.ipynb.
3. Add an integration test in chat_models/test_tongyi.py.

Note that async completion for the Text API is not yet supported.
Dependencies: dashscope. It must be installed manually because it is not
needed by everyone.
2023-10-12 16:59:37 -07:00
Adam Demjen
008348ce71 Add ElasticsearchChatMessageHistory (#10932)
**Description**

This PR adds the `ElasticsearchChatMessageHistory` implementation that
stores chat message history in the configured
[Elasticsearch](https://www.elastic.co/elasticsearch/) deployment.

```python
from langchain.memory.chat_message_histories import ElasticsearchChatMessageHistory

history = ElasticsearchChatMessageHistory(
    es_url="https://my-elasticsearch-deployment-url:9200", index="chat-history-index", session_id="123"
)

history.add_ai_message("This is me, the AI")
history.add_user_message("This is me, the human")
```

**Dependencies**
- [elasticsearch client](https://elasticsearch-py.readthedocs.io/)
required

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-12 16:51:38 -07:00
Bagatur
d3a5090e12 mv semadb docs (#11743) 2023-10-12 16:31:09 -07:00
Bagatur
acdbdbddb1 clean up doc (#11742)
committed old doc in wrong place
2023-10-12 16:26:55 -07:00
Jonathan Soma
48cf978391 Allow placeholders in OpenAPI endpoints #2938 (#2940)
Use regex matches when checking endpoints instead of exact matches.
`{varname}` becomes `.*`
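
The substitution described above can be sketched as follows. This is an illustration under assumptions, not the code from the PR: the helper names `endpoint_pattern` and `endpoint_matches` are hypothetical, and literal parts of the template are escaped so that only the placeholders behave as wildcards.

```python
import re


def endpoint_pattern(template: str) -> "re.Pattern[str]":
    """Compile an endpoint template such as '/users/{id}' into a regex."""
    # Split on `{varname}` placeholders, escape the literal pieces,
    # and join them back together with `.*` in place of each placeholder.
    parts = re.split(r"\{[^{}]*\}", template)
    return re.compile(".*".join(re.escape(part) for part in parts))


def endpoint_matches(template: str, url: str) -> bool:
    """Return True if the whole URL matches the template."""
    return endpoint_pattern(template).fullmatch(url) is not None
```

Note that `.*` also matches `/`, so a single placeholder can span several path segments; a stricter variant could use `[^/]+` instead.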

Fixes #2938

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-12 16:20:32 -07:00
Mateusz Kozak
e42a576cb2 update Qdrant documentation (#3105)
fix `from_documents` method usage for Qdrant in documentation as
previous example doesn't work

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-12 16:20:18 -07:00
Predrag Gruevski
9e32120cbb Deprecate direct access to globals like debug and verbose. (#11311)
Instead of accessing `langchain.debug`, `langchain.verbose`, or
`langchain.llm_cache`, please use the new getter/setter functions in
`langchain.globals`:
- `langchain.globals.set_debug()` and `langchain.globals.get_debug()`
- `langchain.globals.set_verbose()` and
`langchain.globals.get_verbose()`
- `langchain.globals.set_llm_cache()` and
`langchain.globals.get_llm_cache()`

Using the old globals directly will now raise a warning.
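
The general pattern (private state behind getter/setter functions, with a warning on the legacy access path) can be sketched in plain Python as below. This is not LangChain's implementation: the `_LegacyGlobals` class and the `legacy` object are hypothetical stand-ins for the old attribute-style access.

```python
import warnings

_debug = False  # module-private state; callers go through the functions below


def set_debug(value: bool) -> None:
    """Set the global debug flag."""
    global _debug
    _debug = value


def get_debug() -> bool:
    """Read the global debug flag."""
    return _debug


class _LegacyGlobals:
    """Stand-in for the old attribute-style access, which now warns."""

    @property
    def debug(self) -> bool:
        warnings.warn(
            "Reading `debug` directly is deprecated; use get_debug() instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return get_debug()


legacy = _LegacyGlobals()
```

The legacy path keeps working and returns the same value, so existing callers can migrate gradually while seeing the warning.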

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-10-12 15:48:04 -07:00
Bagatur
01b7b46908 reorder eval docs (#11738)
cc @leo-gan
2023-10-12 15:46:55 -07:00
Richard Adams
35965df20d Rspace doc loader (#11511)
**Description:**

Add a document loader for the RSpace Electronic Lab Notebook
(www.researchspace.com), so that scientific documents and research notes
can be easily pulled into Langchain pipelines.

**Issue**

This is a new contribution, rather than an issue fix.

 **Dependencies:** 
  
There are no new required dependencies.
In order to use the loader, clients will need to install rspace_client
SDK using `pip install rspace_client`

---------

Co-authored-by: richarda23 <richard.c.adams@infinityworks.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-12 15:05:38 -07:00
Ryan Zotti
9d1867c77f Update docs to specify Indexing-API-compatible vectorstores (#11581)
**Description:** Update Indexing API docs to specify vectorstores that
are compatible with the Indexing API. I add a unit test to remind
developers to update the documentation whenever they add or change a
vectorstore in a way that affects compatibility. For the unit test I
repurposed existing code from
[here](https://github.com/langchain-ai/langchain/blob/v0.0.311/libs/langchain/langchain/indexes/_api.py#L245-L257).

This is my first PR to an open source project. This is a trivially
simple PR whose main purpose is to make me more comfortable submitting
Langchain PRs. If this PR goes through I plan to submit PRs with more
substantive changes in the near future.

**Issue:** Resolves
[10482](https://github.com/langchain-ai/langchain/discussions/10482).

**Dependencies:** No new dependencies.

**Twitter handle:** None.
2023-10-12 15:17:44 -04:00
Richard Wang
6402c33299 Let Notion document loader support utf-8 and make it default. (#10613)
Use utf-8 encoding by default
2023-10-12 15:13:41 -04:00
Tomaz Bratanic
3759a34229 Add graph construction to neo4j docs (#11716)
Add graph construction section to Neo4j provider docs
2023-10-12 11:37:42 -07:00
Bagatur
bd74eba152 add azure openai sched tests (#11723) 2023-10-12 10:48:45 -07:00
Nuno Campos
b54727fbad Nc/why lcel (#11717)
2023-10-12 17:52:20 +01:00
Bagatur
9c0584be74 bump 313 (#11718) 2023-10-12 09:48:54 -07:00
Johnny Deuss
bb2ed4615c Fix typos (#11663) 2023-10-12 11:44:03 -04:00
sudranga
361f8e1bc6 Add MMR functionality to elasticsearch retriever (#11633)
Allows MMR functionality only for the case where we have access to the
embedding function. Also allows users to request fields from the
Elasticsearch store. These are added to the document metadata.
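
For readers unfamiliar with MMR (maximal marginal relevance), here is a small pure-Python sketch of the greedy selection it performs. It is a generic illustration, not the retriever code from this PR; the function names and the `lambda_mult` default are assumptions. It also shows why embeddings are required: both the relevance and the redundancy terms are computed from vectors.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
    k: int = 2,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Greedily pick k candidate indices, trading relevance against redundancy."""
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(query, candidates[i])
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

A higher `lambda_mult` favors relevance to the query; a lower one favors diversity among the selected results.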

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-12 08:42:32 -07:00
Dmitry Tyumentsev
ead9d5b55c Add yandex stt parser (#11435)
Description: Adds the ability to load the transcription of an audio
file as a document using [Yandex
SpeechKit](https://cloud.yandex.com/en-ru/services/speechkit)
Issue: None
Dependencies: yandex-speechkit
Tag maintainer: @rlancemartin, @eyurtsev
2023-10-12 08:42:03 -07:00
Janos Tolgyesi
15687a28d5 Use correct tokenizer for Bedrock/Anthropic LLMs (#11561)
**Description**

This PR implements the usage of the correct tokenizer in Bedrock LLMs,
if using anthropic models.

**Issue:** #11560

**Dependencies:** optional dependency on `anthropic` python library.

**Twitter handle:** jtolgyesi


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-12 08:41:52 -07:00
kYLe
467b082c34 Modify Anyscale integration to work with Anyscale Endpoint (#11569)
**Description:** Modify Anyscale integration to work with [Anyscale
Endpoint](https://docs.endpoints.anyscale.com/)
and it supports invoke, async invoke, stream and async invoke features

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-12 08:41:25 -07:00
plpycoin
51193309ea Update readthedocs.py (#11110)
Only parse .html files;
.svg, .png, and favicon.ico files crash the processing phase
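
A filter of this kind could look roughly like the sketch below. The helper name `html_only` is hypothetical and the case-insensitive suffix check is an assumption, not something confirmed by the PR.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def html_only(paths: Iterable[str]) -> List[str]:
    """Keep only paths whose file extension is .html (case-insensitive)."""
    return [p for p in paths if PurePosixPath(p).suffix.lower() == ".html"]
```

Filtering by suffix before parsing keeps images and icons away from the HTML parser altogether.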

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-12 11:32:06 -04:00
Shreyas S
70a793ca9d Update zep_memory.ipynb (#11713)
fixed minor typos;
the your > your
on > upon
2023-10-12 10:41:19 -04:00
Surav Shrestha
e61b528c0e Fix typos in docs/docs/use_cases/question_answering/code_understandin… (#11710)
herarchy -> hierarchy
2023-10-12 10:17:23 -04:00
Surav Shrestha
f386ac3bef Fix typos in docs/docs/use_cases/tagging.ipynb (#11712)
funtion -> function
2023-10-12 10:17:10 -04:00
Surav Shrestha
ac73154005 Fix typos in docs/docs/use_cases/question_answering/conversational_re… (#11709)
neccessary -> necessary
2023-10-12 10:16:52 -04:00
Surav Shrestha
af9ce3c224 Fix typos in docs/docs/use_cases/chatbots.ipynb (#11707)
implemet -> implement
2023-10-12 10:16:34 -04:00
Surav Shrestha
77fcaa410a Fix typos in docs/docs/use_cases/extraction.ipynb (#11708)
This PR corrects a number of typos. I kindly request the repo
maintainers to review this PR and merge it.
2023-10-12 10:16:17 -04:00
Nuno Campos
ca9de26f2b Add callback function to RunnablePassthrough (#11564)
2023-10-12 15:10:16 +01:00
Nuno Campos
7f4734c0dd Add deploy command to repos generated by cli template (#11711)
2023-10-12 15:09:21 +01:00
Nuno Campos
1c0857b53e Fix default impl of aparse_result (#11702)
Should delegate to parse_result, not to aparse, as parse_result is a
method that some output parsers override
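
To illustrate why the delegation target matters, here is a simplified sketch. The classes are stand-ins, not LangChain's real output parser classes, and running the sync method in an executor is an assumption about how an async default might be written.

```python
import asyncio
from typing import Any, List


class BaseParser:
    """Simplified stand-in for an output parser base class."""

    def parse(self, text: str) -> Any:
        return text

    def parse_result(self, result: List[str]) -> Any:
        # Default behavior: parse only the first generation.
        return self.parse(result[0])

    async def aparse_result(self, result: List[str]) -> Any:
        # Delegate to parse_result so that subclass overrides are honored.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self.parse_result, result)


class JoiningParser(BaseParser):
    """Overrides parse_result only, as some output parsers do."""

    def parse_result(self, result: List[str]) -> Any:
        return " | ".join(result)
```

If `aparse_result` called an async version of `parse` instead, `JoiningParser` would silently lose its override on the async path.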

2023-10-12 14:13:59 +01:00
nuric
44da27c07b Add SemaDB VST wrapper (#11484)
- **Description**: Adding vectorstore wrapper for
[SemaDB](https://rapidapi.com/semafind-semadb/api/semadb).
- **Issue**: None
- **Dependencies**: None
- **Twitter handle**: semafind

Checks performed:
- [x] `make format`
- [x] `make lint`
- [x] `make test`
- [x] `make spell_check`
- [x] `make docs_build`

Documentation added:

- SemaDB vectorstore wrapper tutorial
2023-10-11 19:09:38 -07:00
hsuyuming
0b743f005b Feature/enhance huggingfacepipeline to handle different return type (#11394)
**Description:** Prevent HuggingFacePipeline from truncating the response when
the user sets `return_full_text` to False in the Hugging Face pipeline.

**Dependencies:** None
**Tag maintainer:** Maybe @sam-h-bean?

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-11 19:09:03 -07:00
Leonid Kuligin
2aba9ab47e Retriever based on GCP DocAI Warehouse (#11400)
- **Description:** implements a retriever on top of DocAI Warehouse (to
interact with existing enterprise documents)
  https://cloud.google.com/document-ai-warehouse?hl=en
  - **Issue:** new functionality
 
@baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-11 19:08:53 -07:00
mvhensbergen
629d9b78fa Make example work during pydantic transition (#11498)
**Description:**

Make the example extraction code on
https://python.langchain.com/docs/use_cases/extraction work again by
importing the langchain.pydantic_v1 lib instead of the v2.

**Issue:**

Solves issue https://github.com/langchain-ai/langchain/issues/11468

Co-authored-by: Martin van Hensbergen <martin@mvhensbergen.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-11 18:44:47 -07:00
Erick Friis
a477ddda45 Langsmith in readme update (#11497) 2023-10-11 18:43:52 -07:00
Leonid Kuligin
9e81ab47be Added a better error description if processor name is wrong. (#11488)
  - **Description:** added a better error description for this error
  - **Issue:** #11407 
  
  @baskaryan
2023-10-11 18:43:40 -07:00
Robert Yi
e75766b759 fix: incorrect arguments in clickhouse docstring (#11693)
fix docstring for clickhouse
2023-10-11 21:41:21 -04:00
Eugene Yurtsev
17b5090c18 Add type to Agent actions (#11682)
Add `type` to agent actions.
2023-10-11 21:33:24 -04:00
April
c14a8df2ee wrap confluence attachment processing with a try-except block (#11503)
Prevents document loading from erroring out when an attachment is not
found at the url.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-11 18:13:42 -07:00
Bagatur
17439daa6a add plan execute cookbook (#11690) 2023-10-11 18:03:13 -07:00
eajechiloae
4ba2c8ba75 Fix ClearML callback (#11472)
Handle different field names in dicts/dataframes, fixing the ClearML
callback.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-11 17:09:02 -07:00
ElliotKetchup
7ae8b7f065 Llama doc: add 'language' to the response message (#11543)
- **Description:** add 'language' to the response message in the Llama
doc,
  - **Issue:** None,
  - **Dependencies:** None,
  - **Tag maintainer:** None,
  - **Twitter handle:** None

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-11 17:06:04 -07:00
Lawrence Wu
93bb19f69a Fix chains/loading.py error messages (#11688)
- **Description:** make the error messages consistent in
chains/loading.py
  - **Dependencies:** None
2023-10-11 17:05:42 -07:00
Harrison Chase
18ebce2032 fix tool async (#11689) 2023-10-11 16:40:23 -07:00
sudranga
9beb03e771 11474 (#11519)
No relevant documents may be found for a given question. In some use
cases, we could directly respond with a fixed message instead of doing
an LLM call with an empty context. This PR exposes this as an option:
response_if_no_docs_found.
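
The control flow this option enables can be sketched as follows. Only the parameter name `response_if_no_docs_found` comes from the PR; the `answer` function, the prompt format, and the callable `llm` are hypothetical stand-ins for the chain.

```python
from typing import Callable, List, Optional


def answer(
    question: str,
    docs: List[str],
    llm: Callable[[str], str],
    response_if_no_docs_found: Optional[str] = None,
) -> str:
    """Answer a question from retrieved docs, skipping the LLM if none exist."""
    # Short-circuit: no documents and a fixed response configured.
    if not docs and response_if_no_docs_found is not None:
        return response_if_no_docs_found
    context = "\n".join(docs)
    return llm(f"Context:\n{context}\n\nQuestion: {question}")
```

Leaving the option as `None` preserves the previous behavior, where the LLM is called even with an empty context.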

---------

Co-authored-by: Sudharsan Rangarajan <sudranga@nile-global.com>
2023-10-11 16:30:15 -07:00
Shinya Maeda
1f7edcd08b doc: Fix documentation about n-gram overlap (#11549)
Fix the documentation in
https://python.langchain.com/docs/modules/model_io/prompts/example_selectors/ngram_overlap.
It currently declares unrelated variables; for example, the `examples`
local variable is declared twice and the first one is overwritten
immediately.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** @dosuken123
2023-10-11 16:26:56 -07:00
Joaquin Menendez
ef99b06362 feature: add metadata information into the embedding file before uplo… (#11553)
- **Description:** In this modified version of the function, if the
metadatas parameter is not None, the function includes the corresponding
metadata in the JSON object for each text. This allows the metadata to
be stored alongside the text's embedding in the vector store.
  - **Issue:** #10924
  - **Dependencies:** None
  - **Tag maintainer:** @hwchase17
@agola11
  - **Twitter handle:** @MelliJoaco
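
A sketch of the behavior described above, under assumptions: the function name `build_records` and the JSON field names `id`, `embedding`, and `metadata` are hypothetical and not taken from the PR.

```python
import json
from typing import Any, Dict, List, Optional


def build_records(
    ids: List[str],
    embeddings: List[List[float]],
    metadatas: Optional[List[Dict[str, Any]]] = None,
) -> List[str]:
    """Serialize one JSON object per text, attaching metadata when provided."""
    lines = []
    for i, (doc_id, embedding) in enumerate(zip(ids, embeddings)):
        record: Dict[str, Any] = {"id": doc_id, "embedding": embedding}
        if metadatas is not None:
            # Store the metadata alongside the embedding for the same text.
            record["metadata"] = metadatas[i]
        lines.append(json.dumps(record))
    return lines
```

When `metadatas` is omitted, the records keep their original shape, so existing uploads are unaffected.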

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-11 16:05:13 -07:00
maks-operlejn-ds
3c83779661 Qa with anonymization (#11658)
Added demo for QA system with anonymization. It will be part of
LangChain's privacy webinar.

@hwchase17 @baskaryan @nfcampos 

Twitter handle: @MaksOpp

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-11 15:38:08 -07:00
Marcin Wątroba
51a3a86022 #11655 Add SQLAlchemyMd5Cache implementation (#11660)
- **Description:** Add SQLAlchemyMd5Cache implementation, 
  - **Issue:** #11655,
  - **Dependencies:** no deps,
  - **Tag maintainer:** @markowanga

---------

Co-authored-by: Marcin Wątroba <marcin.watroba@pwr.edu.pl>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-11 15:28:09 -07:00
Suresh Kumar Ponnusamy
70f7558db2 langchain-experimental: Add allow_list support in experimental/data_anonymizer (#11597)
- **Description:** Add allow_list support in langchain experimental
data-anonymizer package
  - **Issue:** no
  - **Dependencies:** no
  - **Tag maintainer:** @hwchase17
  - **Twitter handle:**
2023-10-11 14:50:41 -07:00
wemysschen
2363c02cf3 Bos loader (#11525)
**Description:**
Add BaiduCloud BOS document loader.

---------

Co-authored-by: chenweixu01 <chenweixu01@baidu.com>
Co-authored-by: root <root@icoding-cwx.bcc-szzj.baidu.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-11 14:43:48 -07:00
Kwanghoon Choi
fbb82608cd Fixed a bug in reporting Python code validation (#11522)
- **Description:** fixed a bug in pal-chain when it reports Python
    code validation errors. When `node.func` has no `id` attribute, the
    original code still tried to print `node.func.id` while raising the
    ValueError.
- **Issue:** n/a,
- **Dependencies:** no dependencies,
- **Tag maintainer:** @hazzel-cn, @eyurtsev
- **Twitter handle:** @lazyswamp
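
The failure mode can be reproduced with the standard `ast` module: only `ast.Name` nodes carry an `id`, while a call such as `os.system(...)` has an `ast.Attribute` as its `func`. The sketch below shows one safe way to describe the callee in an error message; the helper name `describe_call` is hypothetical and this is not the code from the PR.

```python
import ast


def describe_call(node: ast.Call) -> str:
    """Return a printable name for the callee of a call node."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id  # plain call, e.g. print(...)
    if isinstance(func, ast.Attribute):
        return func.attr  # attribute call, e.g. os.system(...)
    # Anything else (lambda, subscript, nested call): dump the node instead.
    return ast.dump(func)
```

Branching on the node type avoids raising an `AttributeError` in the middle of building the validation error message.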

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-11 14:34:28 -07:00
Harrison Chase
9f39c23a13 add input type for convo retrieval chain (#11679) 2023-10-11 17:13:48 -04:00
zhaozhiming
d5e762d328 fix: Change the docs of JSONAgentOutputParser (#11594)
I am merely making some minor adjustments to the function documentation.
I hope to provide a small assistance to LangChain.
- **Description:** Change the docs of JSONAgentOutputParser. It will be
`JSON` better,
  - **Issue:** no,
  - **Dependencies:** no,
  - **Tag maintainer:** @hwchase17,
  - **Twitter handle:** Not worth mentioning.
2023-10-11 14:05:53 -07:00
Shreyas S
3cd0827785 Update kay.ipynb (#11676)
Fixed title display
2023-10-11 14:02:11 -07:00
Vinay Kakade
dd0cd98861 Add support for ChatOpenAI models in Infino callback handler (#11608)
**Description:** This PR adds support for ChatOpenAI models in the
Infino callback handler. In particular, this PR implements
`on_chat_model_start` callback, so that ChatOpenAI models are supported.
With this change, Infino callback handler can be used to track latency,
errors, and prompt tokens for ChatOpenAI models too (in addition to the
support for OpenAI and other non-chat models it has today). The existing
example notebook is updated to show how to use this integration as well.
cc/ @naman-modi @savannahar68

**Issue:** https://github.com/langchain-ai/langchain/issues/11607 

**Dependencies:** None

**Tag maintainer:** @hwchase17 

**Twitter handle:** [@vkakade](https://twitter.com/vkakade)
2023-10-11 14:00:54 -07:00
Israel Ekpo
d0603c86b6 Add Support for Azure Cosmos DB MongoDB vCore Vector Store #11627 (#11632)
This PR adds support for the Azure Cosmos DB MongoDB vCore Vector Store

https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/

https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/vector-search

Summary:
- **Description:** added vector store integration for Azure Cosmos DB
MongoDB vCore Vector Store,
  - **Issue:** the issue # it fixes #11627,
  - **Dependencies:** pymongo dependency,
  - **Tag maintainer:** @hwchase17,
  - **Twitter handle:** @izzyacademy

---------

Co-authored-by: Israel Ekpo <israel.ekpo@gmail.com>
Co-authored-by: Israel Ekpo <44282278+izzyacademy@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-11 13:56:46 -07:00
Erick Friis
28ee6a7c12 Track ChatFireworks time to first_token (#11672) 2023-10-11 13:37:03 -07:00
Erick Friis
2c1e735403 Fix runnable docs link (#11675) 2023-10-11 13:11:23 -07:00
Eugene Yurtsev
539941281d Fix output types for BaseChatModel (#11670)
* Should use non chunked messages for Invoke/Batch
* After this PR, stream output type is not represented, do we want to
use the union?

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-10-11 16:02:03 -04:00
Ikko Eltociear Ashimine
7d0dda7e41 Fix typo in baidu_qianfan_endpoint.ipynb (#11667)
enviroment -> environment
2023-10-11 16:01:18 -04:00
Bagatur
cf86447623 Start cookbook and move stuff from use cases (#11636) 2023-10-11 12:27:13 -07:00
Eugene Yurtsev
99adcdb1c9 Add dedicated type attribute to be used solely for serialization purposes (#11585)
Adds standard `type` field for all messages that will be
serialized/validated by pydantic.

* The presence of `type` makes it easier for developers consuming
schemas to write client code to serialize/deserialize.
* In LangServe `type` will be used for both validation and will appear
in the generated openapi specs
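
To illustrate why a dedicated `type` field helps client code, here is a minimal sketch using only the standard library. The `HumanMsg` and `AIMsg` classes and the `MESSAGE_TYPES` registry are hypothetical stand-ins, not LangChain's message classes:

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class HumanMsg:
    content: str
    type: str = "human"  # tag used only for (de)serialization


@dataclass
class AIMsg:
    content: str
    type: str = "ai"


# Hypothetical registry mapping each tag to its class.
MESSAGE_TYPES = {"human": HumanMsg, "ai": AIMsg}


def dumps(message) -> str:
    """Serialize a message; the `type` tag travels with the payload."""
    return json.dumps(asdict(message))


def loads(payload: str):
    """Rebuild the right class by looking at the `type` tag."""
    data = json.loads(payload)
    cls = MESSAGE_TYPES[data["type"]]
    return cls(**data)
```

Without the tag, a consumer would have to guess the class from the shape of the payload.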
2023-10-11 15:06:42 -04:00
eryk-dsai
06d5971be9 Fix issue #10985 - Skip model.to(device) if it is instantiated with bitsandbytes config (#11009)
Prevents the error raised when trying to move a model that Accelerate
has already placed on the GPU to the same or another device. For now, a
model loaded with Accelerate/PEFT cannot be loaded onto the CPU.

Addresses:
[#10985](https://github.com/langchain-ai/langchain/issues/10985)
2023-10-11 09:28:27 -07:00
Nuno Campos
64969bc8ae Add patch_config(configurable=) arg, make with_config(configurable=) merge it with existing (#11662)
2023-10-11 14:45:31 +01:00
Harrison Chase
ce0019b646 make utils conditional (#11646) 2023-10-11 06:11:32 +01:00
Harrison Chase
8f06085b24 make tools conditional (#11647) 2023-10-11 06:11:05 +01:00
Bassem Yacoube
5451b724fc Adds support for llama2 and fixes MPT-7b url (#11465)
- **Description:** This is an update to OctoAI LLM provider that adds
support for llama2 endpoints hosted on OctoAI and updates MPT-7b url
with the current one.
@baskaryan
Thanks!

---------

Co-authored-by: ML Wiz <bassemgeorgi@gmail.com>
2023-10-10 20:34:35 -07:00
Todd Kerpelman
0bff399af1 Make metadata from the url_selenium loader match that of the web_base loader (#11617)
**Description:** I noticed the metadata returned by the url_selenium
loader was missing several values included by the web_base loader. (The
former returned `{source: ...}`, the latter returned `{source: ...,
title: ..., description: ..., language: ...}`.) This change fixes it so
both loaders return all 4 key value pairs.

Files have been properly formatted and all tests are passing. Note,
however, that I am not much of a python expert, so that whole "Adding
the imports inside the code so that tests pass" thing seems weird to me.
Please LMK if I did anything wrong.
2023-10-10 20:32:45 -07:00
Tarun Thotakura
c9d4d53545 Fixed the assignment of custom_llm_provider argument (#11628)
- **Description:** Assigning the custom_llm_provider to the default
params function so that it will be passed to the litellm
- **Issue:** Although the custom_llm_provider argument is defined, it is
never assigned anywhere in the code and so is not passed to litellm. Any
litellm call that requires custom_llm_provider therefore fails. litellm
mainly uses this parameter when running inference through a custom API
server.
https://docs.litellm.ai/docs/providers/custom_openai_proxy
  - **Dependencies:** No dependencies are required

@krrishdholakia , @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-10 20:29:24 -07:00
Leonid Ganeline
db67ccb0bb docstrings cleanup (#11640)
Added missed docstrings. Some reformatting.
2023-10-10 19:56:47 -07:00
Bagatur
78b4c7d5a0 collapse sidebar peer items (#11639) 2023-10-10 19:56:21 -07:00
Bagatur
6dd7362a54 start cookbook (#11638) 2023-10-10 17:37:23 -07:00
Yang, Bo
3a82bd7bdb Use raise from statement so that users can find detailed error message (#11461)
- **Description:** Use `raise from` statement so that users can find
detailed error message
  - **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17
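
As a brief illustration of the `raise ... from` pattern named above (the `ConfigError` class and `parse_port` function are hypothetical, not code from this change):

```python
class ConfigError(Exception):
    """Hypothetical high-level error, for illustration only."""


def parse_port(raw: str) -> int:
    try:
        return int(raw)
    except ValueError as exc:
        # `from exc` stores the original error on __cause__, so the traceback
        # shows both the low-level failure and the higher-level context.
        raise ConfigError(f"invalid port: {raw!r}") from exc
```

The original exception stays reachable through `__cause__`, which is what lets users see the detailed message.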
2023-10-10 17:25:23 -07:00
Nuno Campos
9a0ed75a95 Add configurable fields with options (#11601)
2023-10-10 22:17:22 +01:00
Bagatur
0ca8d4449c add ls guide redirect (#11623) 2023-10-10 12:58:04 -07:00
Bagatur
eedfddac2d Restructure docs (#11620) 2023-10-10 12:55:19 -07:00
Bagatur
7232e082de bump 312 (#11621) 2023-10-10 12:34:49 -07:00
Eugene Yurtsev
58220cda72 Remove LLM Bash and related bash utilities (#11619)
Deprecate LLMBash and related bash utilities
2023-10-10 14:54:09 -04:00
ElliotKetchup
683f4a93b9 Update azureml_chat_endpoint code exemple (#11602)
- **Description:** The azureml_chat_endpoint code example now takes the
endpoint_url and endpoint_api_key parameters into consideration,
  - **Issue:** None,
  - **Dependencies:** None,
  - **Tag maintainer:** None,
  - **Twitter handle:** @ElliotAlladaye
2023-10-10 10:27:28 -07:00
Yong woo Song
fca34eb122 Fix: invalid link to chat model in openai platform docs (#11609)
Some links in the OpenAI platform
[docs](https://python.langchain.com/docs/integrations/platforms/openai)
were invalid, so I replaced them with valid ones:
- `/docs/integrations/chat_models/openai` ->
`/docs/integrations/chat/openai`
- `/docs/integrations/chat_models/azure_openai` ->
`/docs/integrations/chat/azure_chat_openai`

Thanks! ☺️
2023-10-10 10:22:39 -07:00
Shubham Kushwaha
49de862076 Arcee.ai LLM & Retriever integration (#11579)
- **Description:** This PR introduces a new LLM and Retriever API to
https://arcee.ai for the python client
  - **Issue:** implements the integrations as requested in #11578 ,
  - **Dependencies:** no dependencies are required,
  - **Tag maintainer:** @hwchase17
  - **Twitter handle:** shwooobham 


** `make format`, `make lint` and `make test` runs locally.**
```shell
=========== 1245 passed, 277 skipped, 20 warnings in 16.26s ===========
./scripts/check_pydantic.sh .
./scripts/check_imports.sh
poetry run ruff .
[ "." = "" ] || poetry run black . --check
All done!  🍰 
1818 files would be left unchanged.
[ "." = "" ] || poetry run mypy .
Success: no issues found in 1815 source files
[ "." = "" ] || poetry run black .
All done!  🍰 
1818 files left unchanged.
[ "." = "" ] || poetry run ruff --select I --fix .
poetry run codespell --toml pyproject.toml
poetry run codespell --toml pyproject.toml -w
```


**Contributions**
1. Arcee (langchain/llms), ArceeRetriever (langchain/retrievers),
ArceeWrapper (langchain/utilities)
2. docs for Arcee (llms/arcee.py) and
ArceeRetriever(retrievers/arcee.py)
3.

cc: @jacobsolawetz @ben-epstein

---------

Co-authored-by: Shubham <shubham@sORo.local>
2023-10-10 10:20:45 -07:00
Eugene Yurtsev
b6a2507794 Docs to use LLMSymbolicMath and LLMBash + utilities from experimental (#11614)
Update docs in lieu of:

https://github.com/langchain-ai/langchain/discussions/11352
2023-10-10 13:11:46 -04:00
Eugene Yurtsev
b56ca0c2a4 Deprecate LLMSymbolicMath from langchain core (#11615)
Deprecate LLMSymbolicMath from langchain core package.
2023-10-10 12:33:51 -04:00
Leonid Ganeline
59adeaddb3 docs: update dependents (#11502)
A regular update of dependents.
2023-10-10 09:31:23 -07:00
Eugene Yurtsev
c9bce5bbfb Add version to langchain_experimental (#11613)
Add version to langchain experimental
2023-10-10 11:17:41 -04:00
Predrag Gruevski
22abeb9f6c Disable loading jinja2 PromptTemplate from file. (#10252)
jinja2 templates are not sandboxed and are at risk for arbitrary code
execution. To mitigate this risk:
- We no longer support loading jinja2-formatted prompt template files.
- `PromptTemplate` with jinja2 may still be constructed manually, but
the class carries a security warning reminding the user to not pass
untrusted input into it.

Resolves #4394.
2023-10-10 11:15:42 -04:00
Bagatur
b642d00f9f rm slack from community.md (#11610) 2023-10-10 07:55:26 -07:00
Nuno Campos
c7c03d4709 Fix mutation bugs in callback manager configure (#11603)
2023-10-10 14:50:18 +01:00
cccs-eric
e2a9072b80 Fix CohereRerank configuration (#11583)
**Description:** CohereRerank is missing `cohere_api_key` as a field and
since extras are forbidden, it is not possible to pass-in the key. The
only way is to use an env variable named `COHERE_API_KEY`.

For example, if trying to create a compressor like this:
```python
cohere_api_key = "......Cohere api key......"
compressor = CohereRerank(cohere_api_key=cohere_api_key)
```
you will get the following error:
```
  File "/langchain/.venv/lib/python3.10/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for CohereRerank
cohere_api_key
  extra fields not permitted (type=value_error.extra)
```
2023-10-09 23:26:34 -07:00
Anar
55fef4b64b implemented add files method in LLMRails (#11518)
This PR provides add files method with LLMRails. Implemented here are:

docs/extras/integrations/vectorstores/llm-rails.ipynb

---------

Co-authored-by: Anar Aliyev <aaliyev@mgmt.cloudnet.services>
2023-10-09 16:29:43 -07:00
unifyh
fd7f129f10 Docs: Fix broken line breaks in snippets (#11523)
**Description:**
This PR fixes some code snippets that have raw `\n`'s instead of actual
line breaks.

**Issue:**
Currently some snippets look like this:

![image](https://github.com/langchain-ai/langchain/assets/18213435/355b4911-38e9-4ba4-8570-f928557b6c13)

Affected pages:
-
https://python.langchain.com/docs/integrations/providers/predictionguard#example-usage
-
https://python.langchain.com/docs/modules/agents/how_to/custom_llm_agent#set-up-environment
-
https://python.langchain.com/docs/modules/chains/foundational/llm_chain#get-started
-
https://python.langchain.com/docs/integrations/providers/shaleprotocol#how-to

**Tag maintainer:**
@hwchase17
2023-10-09 15:40:27 -07:00
Stephen Hankinson
316dddc7cd fix wording of query_sql_database_tool_description (#11530)
- **Description:** Fixes minor typo for the
query_sql_database_tool_description in the db toolkit
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Tag maintainer:** @nfcampos 
  - **Twitter handle:** N/A
2023-10-09 15:32:45 -07:00
Ash Vardanian
1acfe86353 Accelerating Math Utils with SimSIMD (#11566)
LangChain relies on NumPy to compute cosine distances, which becomes a
bottleneck with the growing dimensionality and number of embeddings. To
avoid this bottleneck, in our libraries at
[Unum](https://github.com/unum-cloud), we have created a specialized
package - [SimSIMD](https://github.com/ashvardanian/simsimd), that knows
how to use newer hardware capabilities. Compared to SciPy and NumPy, it
reaches 3x-200x performance for various data types. Since publication,
several LangChain users have asked me if I can integrate it into
LangChain to accelerate their workflows, so here I am 🤗

## Benchmarking

To conduct benchmarks locally, run this in your Jupyter:

```py
import numpy as np
import scipy as sp
import simsimd as simd
import timeit as tt

def cosine_similarity_np(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    X_norm = np.linalg.norm(X, axis=1)
    Y_norm = np.linalg.norm(Y, axis=1)
    with np.errstate(divide="ignore", invalid="ignore"):
        similarity = np.dot(X, Y.T) / np.outer(X_norm, Y_norm)
    similarity[np.isnan(similarity) | np.isinf(similarity)] = 0.0
    return similarity

def cosine_similarity_sp(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    return 1 - sp.spatial.distance.cdist(X, Y, metric='cosine')

def cosine_similarity_simd(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    return 1 - simd.cdist(X, Y, metric='cosine')

X = np.random.randn(1, 1536).astype(np.float32)
Y = np.random.randn(1, 1536).astype(np.float32)
repeat = 1000

print("NumPy: {:,.0f} ops/s, SciPy: {:,.0f} ops/s, SimSIMD: {:,.0f} ops/s".format(
    repeat / tt.timeit(lambda: cosine_similarity_np(X, Y), number=repeat),
    repeat / tt.timeit(lambda: cosine_similarity_sp(X, Y), number=repeat),
    repeat / tt.timeit(lambda: cosine_similarity_simd(X, Y), number=repeat),
))
```

## Results

I ran this on an M2 Pro Macbook for various data types and different
number of rows in `X` and reformatted the results as a table for
readability:

| Data Type, Rows in `X` | NumPy | SciPy | SimSIMD |
| :--- | ---: | ---: | ---: |
| `f32, 1` | 59,114 ops/s | 80,330 ops/s | 475,351 ops/s |
| `f16, 1` | 32,880 ops/s | 82,420 ops/s | 650,177 ops/s |
| `i8, 1` | 47,916 ops/s | 115,084 ops/s | 866,958 ops/s |
| `f32, 10` | 40,135 ops/s | 24,305 ops/s | 185,373 ops/s |
| `f16, 10` | 7,041 ops/s | 17,596 ops/s | 192,058 ops/s |
| `i8, 10` | 21,989 ops/s | 25,064 ops/s | 619,131 ops/s |
| `f32, 100` | 3,536 ops/s | 3,094 ops/s | 24,206 ops/s |
| `f16, 100` | 900 ops/s | 2,014 ops/s | 23,364 ops/s |
| `i8, 100` | 5,510 ops/s | 3,214 ops/s | 143,922 ops/s |

It's important to note that SimSIMD will underperform if both matrices
are huge.
That, however, seems to be an uncommon usage pattern for LangChain
users.
You can find a much more detailed performance report for different
hardware models here:

- [Apple M2
Pro](https://ashvardanian.com/posts/simsimd-faster-scipy/#appendix-1-performance-on-apple-m2-pro).
- [4th Gen Intel Xeon
Platinum](https://ashvardanian.com/posts/simsimd-faster-scipy/#appendix-2-performance-on-4th-gen-intel-xeon-platinum-8480).
- [AWS Graviton
3](https://ashvardanian.com/posts/simsimd-faster-scipy/#appendix-3-performance-on-aws-graviton-3).
  
## Additional Notes

1. Previous version used `X = np.array(X)`, to repackage lists of lists.
It's an anti-pattern, as it will use double-precision floating-point
numbers, which are slow on both CPUs and GPUs. I have replaced it with
`X = np.array(X, dtype=np.float32)`, but a more selective approach
should be discussed.
2. In numerical computations, it's recommended to explicitly define
tolerance levels, which were previously avoided in
`np.allclose(expected, actual)` calls. For now, I've set absolute
tolerance to distance computation errors as 0.01: `np.allclose(expected,
actual, atol=1e-2)`.
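
The second note can be illustrated without NumPy. The sketch below is a dependency-free stand-in, assuming `math.isclose` with `abs_tol` plays the role of `np.allclose(..., atol=1e-2)`; it is not the code used in the PR:

```python
import math


def cosine_similarity(x: list[float], y: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


expected = 1.0
actual = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
# State the absolute tolerance explicitly instead of relying on defaults.
assert math.isclose(expected, actual, abs_tol=1e-2)
```

Returning 0.0 for a zero-norm vector mirrors the way the benchmark's NumPy version replaces NaN and infinite values with 0.0.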

---

  - **Dependencies:** adds `simsimd` dependency
  - **Tag maintainer:** @hwchase17
  - **Twitter handle:** @ashvardanian

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-09 14:56:55 -07:00
benchello
5de64e6d60 Add option to specify metadata columns in CSV loader (#11576)
#### Description
This PR adds the option to specify additional metadata columns in the
CSVLoader beyond just `Source`.

The current CSV loader includes all columns in `page_content`. To put
some columns in `page_content` and others in `metadata`, we currently
have to do something like this:
```
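import pandas as pd
from langchain.schema import Document

# Load every CSV row as a dict keyed by column name.
csv = pd.read_csv("path_to_csv").to_dict("records")

# Keep one column as content and move the others into metadata.
documents = [
    Document(
        page_content=doc["content"],
        metadata={
            "last_modified_by": doc["last_modified_by"],
            "point_of_contact": doc["point_of_contact"],
        },
    )
    for doc in csv
]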
```
#### Usage
Example Usage:
```
from langchain.document_loaders import CSVLoader

csv_test = CSVLoader(
    file_path="path_to_csv",
    metadata_columns=["last_modified_by", "point_of_contact"],
)
```
Example CSV:
```
content, last_modified_by, point_of_contact
"hello world", "Person A", "Person B"
```

Example Result:
```
Document {
  page_content: "hello world"
  metadata: {
    row: '0',
    source: 'path_to_csv',
    last_modified_by: 'Person A',
    point_of_contact: 'Person B',
  }
}
```
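
The column split described in this PR can be sketched with the standard `csv` module. This is an illustration only: `load_rows` is a hypothetical helper that returns plain dicts, and the `key: value` layout of the content is an assumption rather than the loader's confirmed output format:

```python
import csv
import io


def load_rows(text: str, source: str, metadata_columns: list[str]) -> list[dict]:
    """Split each CSV row into page content and metadata (illustrative only)."""
    docs = []
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    for i, row in enumerate(reader):
        # Columns named in metadata_columns go to metadata; the rest form the content.
        content = "\n".join(
            f"{key}: {value}" for key, value in row.items() if key not in metadata_columns
        )
        metadata = {"source": source, "row": i}
        metadata.update({key: row[key] for key in metadata_columns})
        docs.append({"page_content": content, "metadata": metadata})
    return docs
```

`skipinitialspace=True` is needed for CSV text like the example above, where each comma is followed by a space before the quoted value.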

---------

Co-authored-by: Ben Chello <bchello@dropbox.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-09 14:56:45 -07:00
Stephen Hankinson
447a523662 fix comments in output format (#11536)
- **Description:** Fixes the comments in the ConvoOutputParser. Because
the \\\\ is escaping a single \\, they render something like:
`"action_input": string \ The input to the action` in the prompt.
Changing this to \\\\\\\\ lets it escape two slashes so that it renders
a proper comment: `"action_input": string \\ The input to the action`
  - **Issue:** N/A
  - **Dependencies:** 
  - **Tag maintainer:** @hwchase17
  - **Twitter handle:**
2023-10-09 14:55:44 -07:00
Michael Landis
8e45f720a8 feat: add momento vector index as a vector store provider (#11567)
**Description**:

- Added Momento Vector Index (MVI) as a vector store provider. This
includes an implementation with docstrings, integration tests, a
notebook, and documentation on the docs pages.
- Updated the Momento dependency in pyproject.toml and the lock file to
enable access to MVI.
- Refactored the Momento cache and chat history session store to prefer
using "MOMENTO_API_KEY" over "MOMENTO_AUTH_TOKEN" for consistency with
MVI. This change is backwards compatible with the previous "auth_token"
variable usage. Updated the code and tests accordingly.
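
The preference order in the last bullet could look roughly like the sketch below. `resolve_api_key` is a hypothetical helper, not the function used in the Momento integration:

```python
import os
from typing import Mapping, Optional


def resolve_api_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Prefer MOMENTO_API_KEY, falling back to the older MOMENTO_AUTH_TOKEN."""
    return env.get("MOMENTO_API_KEY") or env.get("MOMENTO_AUTH_TOKEN")
```

Because the old variable is still consulted when the new one is absent, existing deployments keep working.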

**Dependencies**:

- Updated Momento dependency in pyproject.toml.

**Testing**:

- Run the integration tests with a Momento API key. Get one at the
[Momento Console](https://console.gomomento.com) for free. MVI is
available in AWS us-west-2 with a superuser key.
- `MOMENTO_API_KEY=<your key> poetry run pytest
tests/integration_tests/vectorstores/test_momento_vector_index.py`

**Tag maintainer:**

@eyurtsev

**Twitter handle**:

Please mention @momentohq for this addition to langchain. With the
integration of Momento Vector Index, Momento caching, and session store,
Momento provides serverless support for the core langchain data needs.

Also mention @mlonml for the integration.
2023-10-09 14:02:59 -07:00
Eugene Yurtsev
ca2eed36b7 LangChain cli fix a few bugs (#11573)
Code was assuming that `git` and `poetry` exist. In addition, it was not
ignoring pycache files that get generated during run time
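
One way to avoid both assumptions is sketched below with the standard library. The helper names are hypothetical and this is not the CLI's actual code:

```python
import shutil


def missing_tools(tools: list[str]) -> list[str]:
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def is_ignored(path: str) -> bool:
    """True for bytecode caches generated at run time."""
    parts = path.replace("\\", "/").split("/")
    return "__pycache__" in parts or path.endswith(".pyc")
```

A caller could check `missing_tools(["git", "poetry"])` up front and report a clear message instead of failing midway.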
2023-10-09 13:30:16 -07:00
MSFTeegarden
923e9f9596 Add Azure Redis example (#11570)
**Description**
This PR adds an additional Example to the Redis integration
documentation. [The
example](https://learn.microsoft.com/azure/azure-cache-for-redis/cache-tutorial-vector-similarity)
is a step-by-step walkthrough of using Azure Cache for Redis and Azure
OpenAI for vector similarity search, using LangChain extensively
throughout.

**Issue**
Nothing specific, just adding an additional example.

**Dependencies**
None.

**Tag Maintainer**
Tagging @hwchase17 :)
2023-10-09 13:27:03 -07:00
Hugues Chocart
258ae1ba5f [LLMonitor Callback Handler]: Add error handling (#11563)
Wraps every callback handler method in error handlers to avoid breaking
users' programs when an error occurs inside the handler.

Thanks @valdo99 for the suggestion 🙂
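
The wrapping idea can be sketched as a decorator. `safe_callback` and `DemoHandler` are hypothetical names used for illustration, not the LLMonitor handler's implementation:

```python
import functools
import logging

logger = logging.getLogger(__name__)


def safe_callback(method):
    """Log and swallow exceptions so a handler bug cannot crash the caller."""

    @functools.wraps(method)
    def wrapper(*args, **kwargs):
        try:
            return method(*args, **kwargs)
        except Exception as exc:  # deliberately broad: this is a guard
            logger.warning("callback %s failed: %s", method.__name__, exc)
            return None

    return wrapper


class DemoHandler:
    """Hypothetical handler, not the real LLMonitor class."""

    @safe_callback
    def on_llm_start(self, prompt: str) -> int:
        return len(prompt.split())

    @safe_callback
    def on_llm_error(self, error: object) -> None:
        raise RuntimeError("reporting backend unreachable")
```

The trade-off is that handler failures become log lines rather than exceptions, so they are easier to miss.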
2023-10-09 13:26:35 -07:00
Eugene Yurtsev
2aabfafe1e Module documentation for langchain runnables (#11550)
Add in code documentation for langchain runnables module.
2023-10-09 16:02:29 -04:00
Eugene Yurtsev
d8fa94e6fa RunnablePassthrough: In code documentation (#11552)
Add in code documentation for a runnable passthrough
2023-10-09 16:02:16 -04:00
Eugene Yurtsev
b42f218cfc RunnableLambda: Add in code docs (#11521)
Add in code docs for Runnable Lambda
2023-10-09 14:37:46 -04:00
maks-operlejn-ds
f64522fbaf Reset deanonymizer mapping (#11559)
@hwchase17 @baskaryan
2023-10-09 11:11:05 -07:00
maks-operlejn-ds
b14b65d62a Support all presidio entities (#11558)
https://microsoft.github.io/presidio/supported_entities/

@baskaryan @hwchase17
2023-10-09 11:10:46 -07:00
maks-operlejn-ds
4d62def9ff Better deanonymizer matching strategy (#11557)
@baskaryan, @hwchase17
2023-10-09 11:10:29 -07:00
Ash Vardanian
a992b9670d Fix: Missing DuckDuckGo package version (#11535)
[The `duckduckgo-search` v3.9.2 was removed from
PyPi](https://pypi.org/project/duckduckgo-search/#history). That breaks
the build.

  - **Description:** refreshes the Poetry dependency to v3.9.3
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** @ashvardanian
2023-10-09 10:55:46 -07:00
Bagatur
0a754fa286 redirect langsmith guides (#11562) 2023-10-09 09:58:03 -07:00
Nuno Campos
2f2a5fd582 Update Dockerfile.base (#11556) 2023-10-09 16:43:04 +01:00
Bagatur
8932ed3f07 bump 311 (#11555) 2023-10-09 08:17:07 -07:00
Bagatur
e7a0def1bc QoL improvements to query constructor (#11504)
updating query constructor and self query retriever to
- make it easier to pass in examples
- validate attributes used in query
- remove invalid parts of query
- make it easier to get + edit prompt
- make query constructor a runnable
- make self query retriever use as runnable
2023-10-09 08:10:52 -07:00
Taikono-Himazin
eec53fa294 Added autodetect_encoding option to csvLoader (#11327) 2023-10-09 08:06:43 -07:00
Holt Skinner
09c66fe04f feat: Update Google Document AI Parser (#11413)
- **Description:** Code Refactoring, Documentation Improvements for
Google Document AI PDF Parser
  - Adds Online (synchronous) processing option.
  - Adds default field mask to limit payload size.
  - Skips Human review by default.
- **Issue:** Fixes #10589

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-10-09 08:04:25 -07:00
Nuno Campos
628cc4cce8 Rename RunnableMap to RunnableParallel (#11487)
- keep alias for RunnableMap
- update docs to use RunnableParallel and RunnablePassthrough.assign

2023-10-09 11:22:03 +01:00
Eugene Yurtsev
6a10e8ef31 Add documentation to Runnable (#11516) 2023-10-08 08:09:04 +01:00
William FH
eb572f41a6 Add LangSmith Run Chat Loader (#11458) 2023-10-06 17:02:18 -07:00
David Duong
484947c492 Fetch up-to-date attributes for env-pulled kwargs during serialisation of OpenAI classes (#11499) 2023-10-06 22:43:29 +01:00
Leonid Ganeline
c3d2b01adf docs: integrations/retrievers cleanup (#11388)
fixed several notebooks:
- headers
- formats

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-10-06 13:40:46 -07:00
Bagatur
5470e730d2 raise openapi import error (#11495) 2023-10-06 12:57:24 -07:00
Erick Friis
29f5f70415 Rename some last hwchase17/langchain links (#11494) 2023-10-06 12:34:30 -07:00
Fabrice Pont
872836c541 feat: add markdown list parser (#11411)
**Description:** add `MarkdownListOutputParser` as a new
`ListOutputParser`
 **Issue:** #11410
2023-10-06 12:25:45 -07:00
Erick Friis
8f50b616c5 Remove optional from vectara source (#11493)
fyi @ofermend

---------

Co-authored-by: Ofer Mendelevitch <ofer@vectara.com>
Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
2023-10-06 12:12:44 -07:00
Maciej Dzieżyc
bcd308c368 Fix Open in Colab link for ClearML docs 2 (#11491)
Description: Fixed the Open in Colab link for ClearML docs
Issue: https://github.com/allegroai/clearml/issues/1125
Twitter handle: DziezycMaciej
2023-10-06 12:01:47 -07:00
Bagatur
88ab69c288 mv docs extras (#11399) 2023-10-06 10:09:41 -07:00
Bagatur
53887242a1 bump 310 (#11486) 2023-10-06 09:49:10 -07:00
Bagatur
1bf8ef1a4f rm brave (#11482) 2023-10-06 07:44:19 -07:00
Jesús Vélez Santiago
a1c7532298 Add async sql record manager and async indexing API (#10726)
- **Description:** Add support for a SQLRecordManager in async
environments. It includes the creation of `RecorManagerAsync` abstract
class.
- **Issue:** None
- **Dependencies:** Optional `aiosqlite`.
- **Tag maintainer:** @nfcampos 
- **Twitter handle:** @jvelezmagic

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-06 09:38:44 -04:00
Qihui Xie
57ade13b2b fix llm_inputs duplication problem in intermediate_steps in SQLDatabaseChain (#10279)
Use `.copy()` to fix the bug that the first `llm_inputs` element is
overwritten by the second `llm_inputs` element in `intermediate_steps`.

***Problem description:***
In [line 127](c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L127C17-L127C17)),
the `llm_inputs` of the sql generation step is appended as the first
element of `intermediate_steps`:
```
            intermediate_steps.append(llm_inputs)  # input: sql generation
```

However, `llm_inputs` is a mutable dict, and it is updated in [line
179](https://github.com/langchain-ai/langchain/blob/master/libs/experimental/langchain_experimental/sql/base.py#L179)
for the final answer step:
```
                llm_inputs["input"] = input_text
```
Then, the updated `llm_inputs` is appended as another element of
`intermediate_steps` in [line
180](c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L180)):
```
                intermediate_steps.append(llm_inputs)  # input: final answer
```

As a result, the final `intermediate_steps` returned in [line
189](c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L189C43-L189C43))
actually contains two identical `llm_inputs` elements: the `llm_inputs`
for the sql generation step has been overwritten by the one for the final
answer step. Users cannot get the actual `llm_inputs` for the sql
generation step from `intermediate_steps`.

Simply calling `.copy()` when appending `llm_inputs` to
`intermediate_steps` can solve this problem.
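
The aliasing problem can be reproduced in a few lines of plain Python. The function names and dict contents below are made up for illustration and are not taken from `SQLDatabaseChain`:

```python
def run_without_copy() -> list[dict]:
    steps = []
    llm_inputs = {"input": "question"}
    steps.append(llm_inputs)           # stores a reference, not a snapshot
    llm_inputs["input"] = "final answer prompt"
    steps.append(llm_inputs)           # same object appended a second time
    return steps


def run_with_copy() -> list[dict]:
    steps = []
    llm_inputs = {"input": "question"}
    steps.append(llm_inputs.copy())    # snapshot of the first step
    llm_inputs["input"] = "final answer prompt"
    steps.append(llm_inputs.copy())
    return steps
```

Note that `dict.copy()` is shallow, which is enough here only because the values being replaced are top-level entries.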
2023-10-05 21:32:08 -07:00
Florian
d78f418c0d Extract abstracts from Pubmed articles, even if they have no extra label (#10245)
### Description
This pull request involves modifications to the extraction method for
abstracts/summaries within the PubMed utility. A condition has been
added to verify the presence of unlabeled abstracts. Now an abstract
will be extracted even if it does not have a subtitle. In addition, the
extraction of the abstract was extended to books.

### Issue
The PubMed utility occasionally returns an empty result when extracting
abstracts from articles, despite the presence of an abstract for the
paper on PubMed. This issue arises due to the varying structure of
articles; some articles follow a "subtitle/label: text" format, while
others do not include subtitles in their abstracts. An example of the
latter case can be found at:
https://pubmed.ncbi.nlm.nih.gov/37666905/
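
The two abstract layouts can be handled uniformly, as in the sketch below. It uses `xml.etree.ElementTree` and assumes the abstract parts are `AbstractText` elements with an optional `Label` attribute; both the element names and the `extract_abstract` helper are assumptions for illustration and may differ from what the PubMed utility actually parses:

```python
import xml.etree.ElementTree as ET


def extract_abstract(article_xml: str) -> str:
    """Join all AbstractText parts, prefixing a label only when one exists."""
    root = ET.fromstring(article_xml)
    parts = []
    for node in root.iter("AbstractText"):
        text = "".join(node.itertext()).strip()
        label = node.get("Label")
        parts.append(f"{label}: {text}" if label else text)
    return "\n".join(parts)


# Made-up sample documents covering both layouts.
labeled = (
    "<Article><Abstract>"
    "<AbstractText Label='METHODS'>We measured X.</AbstractText>"
    "</Abstract></Article>"
)
unlabeled = (
    "<Article><Abstract>"
    "<AbstractText>We measured X.</AbstractText>"
    "</Abstract></Article>"
)
```

Treating the label as optional means an unlabeled abstract is returned as plain text instead of being dropped.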

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-05 18:56:46 -07:00
Viktor Zhemchuzhnikov
fd9da60aea Add async support to SelfQueryRetriever (#10175)
### Description

SelfQueryRetriever is missing async support, so I am adding it.
I also removed deprecated predict_and_parse method usage here, and added
some tests.

### Issue
N/A

### Tag maintainer
Not yet

### Twitter handle
N/A
2023-10-05 18:54:21 -07:00
Theron Tau
35297ca0d3 Add feature for extracting images from pdf and recognizing text from images. (#10653)
**Description**

This addresses #10423: it is useful to be able to extract images from a
PDF and recognize the text in them. I have implemented this for
`PyPDFLoader`, `PyPDFium2Loader`, `PyPDFDirectoryLoader`,
`PyMuPDFLoader`, `PDFMinerLoader`, and `PDFPlumberLoader`.
[RapidOCR](https://github.com/RapidAI/RapidOCR.git) is used to recognize
text in the extracted images. OCR is time-consuming, so a boolean
parameter `extract_images` controls whether images are extracted and
recognized. I measured the time taken by each parser with unit tests on
my own laptop (ThinkBook 14+ with an AMD R7-6800H); the results are:

| extract_images | PyPDFParser | PDFMinerParser | PyMuPDFParser | PyPDFium2Parser | PDFPlumberParser |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
| False | 0.27s | 0.39s | 0.06s | 0.08s | 1.01s |
| True | 17.01s | 20.67s | 20.32s | 19.75s | 20.55s |

**Issue**

#10423 

**Dependencies**

rapidocr_onnxruntime in
[RapidOCR](https://github.com/RapidAI/RapidOCR/tree/main)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-05 18:51:59 -07:00
Bagatur
8e3fbc97ca Add vowpal_wabbit RL chain (#11462) 2023-10-05 18:39:45 -07:00
Haris Wang
f1269830a0 Fix bug in MarkdownHeaderTextSplitter for codeblock (#10262)
- Description: The previous version of the MarkdownHeaderTextSplitter
did not take into account the possibility of '#' appearing within code
blocks, which caused segmentation anomalies in these situations. This PR
has fixed this issue.
  - Issue: 
  - Dependencies: No
  - Tag maintainer: 
  - Twitter handle: 
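
The idea behind the fix can be sketched as follows. This is an illustrative, simplified header scanner, not the actual `MarkdownHeaderTextSplitter` implementation; `find_headers` is a hypothetical helper, and it assumes code blocks are delimited by triple-backtick fences only.

```python
FENCE = "`" * 3  # triple backtick, built this way to keep the example readable


def find_headers(markdown: str) -> list:
    """Return (level, title) pairs, ignoring '#' lines inside fenced code."""
    headers = []
    in_fence = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith(FENCE):
            # Entering or leaving a fenced code block.
            in_fence = not in_fence
            continue
        if in_fence:
            # A '#' here is a comment or shell prompt, not a header.
            continue
        if stripped.startswith("#"):
            level = len(stripped) - len(stripped.lstrip("#"))
            title = stripped[level:].strip()
            if 1 <= level <= 6 and title:
                headers.append((level, title))
    return headers


sample = "\n".join(
    ["# Title", FENCE + "python", "# just a comment", FENCE, "## Section"]
)
print(find_headers(sample))
```

Tracking a single `in_fence` flag is enough for this sketch; nested or tilde-delimited fences would need extra handling.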

cc @baskaryan @eyurtsev  @rlancemartin

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-05 18:34:42 -07:00
Eddie Cohen
656d2303f7 add in, nin for pinecone (#10303)
Description: Adds the in and nin comparators for pinecone seen
[here](https://docs.pinecone.io/docs/metadata-filtering#metadata-query-language)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-05 18:31:09 -07:00
Bagatur
a3a2ce623e Revise vowpal_wabbit notebook 2023-10-05 18:18:19 -07:00
Bagatur
8fafa1af91 merge 2023-10-05 18:09:35 -07:00
olgavrou
3b07c0cf3d RL Chain with VowpalWabbit (#10242)
- Description: This PR adds a new chain `rl_chain.PickBest` for learned
prompt variable injection, detailed description and usage can be found
in the example notebook added. It essentially adds a
[VowpalWabbit](https://github.com/VowpalWabbit/vowpal_wabbit) layer
before the llm call in order to learn or personalize prompt variable
selections.

Most of the code is to make the API simple and provide lots of defaults
and data wrangling that is needed to use Vowpal Wabbit, so that the user
of the chain doesn't have to worry about it.

- Dependencies:
[vowpal-wabbit-next](https://pypi.org/project/vowpal-wabbit-next/),
     - sentence-transformers (already a dep)
     - numpy (already a dep)
  - tagging @ataymano who contributed to this chain
  - Tag maintainer: @baskaryan
  - Twitter handle: @olgavrou


Added example notebook and unit tests
2023-10-05 18:07:22 -07:00
Manikanta5112
56048b909f added ContentFormatter escape special characters for message content (#10319)
---------

Co-authored-by: Manikanta5112 <42089393+mani5112@users.noreply.github.com>
2023-10-05 18:02:29 -07:00
Leonid Ganeline
d17416ec79 docstrings callbacks (#11456)
Added missed docstrings to the `callbacks/`

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-10-05 17:13:14 -07:00
Ofer Mendelevitch
3c7653bf0f "source" argument in constructor of Vectara (#11454)
- **Description:** minor update to constructor to allow for
specification of "source"
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** @ofermend
2023-10-05 17:04:14 -07:00
Eugene Yurtsev
d9018ae5f1 Improve CLI ux (#11452)
Improve UX for cli
2023-10-05 19:40:00 -04:00
Jaikanth J
9f85f7c543 fix(cache): use dumps for RedisCache (#10408)
# Description
Attempts to fix RedisCache for ChatGenerations using the `loads` and
`dumps` functions used in the SQLAlchemy cache by @hwchase17. This is
better than a pickle dump because it won't execute arbitrary code during
deserialization.

# Issues
#7722 & #8666 

# Dependencies
None, but removes the warning introduced in #8041 by @baskaryan

Handle: @jaikanthjay46
2023-10-05 16:34:07 -07:00
rodrigo-clickup
5944c1851b Add ClickUp Toolkit (#10662)
- **Description:** Adds a toolkit to interact with the
[ClickUp](https://clickup.com/) [Public API](https://clickup.com/api/)
- **Dependencies:** None
- **Tag maintainer:** @rodrigo-georgian, @rodrigo-clickup,
@aiswaryasankarwork
- **Twitter handle:** 
- Aiswarya (https://twitter.com/Aiswarya_Sankar,
https://www.linkedin.com/in/sankaraiswarya/)
   - Rodrigo (https://www.linkedin.com/in/rodrigo-ceballos-lentini/)


---------

Co-authored-by: Aiswarya Sankar <aiswaryasankar@Aiswaryas-MacBook-Pro.local>
Co-authored-by: aiswaryasankarwork <143119412+aiswaryasankarwork@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-05 16:33:05 -07:00
John Reynolds
68901e1e40 Update output_parser.py (#10430)
- Description: Updated output parser for mrkl to remove any
hallucinated actions after the final answer; this was encountered when
using Anthropic Claude v2 for planning; reopening PR with updated unit
tests
- Issue: #10278 
- Dependencies: N/A
- Twitter handle: @johnreynolds
2023-10-05 15:47:24 -07:00
Joshua Sundance Bailey
790010703b ArcGISLoader: Limit number of results in query (#10615)
Description: this PR changes the `ArcGISLoader` to set
`return_all_records` to `False` when `result_record_count` is provided
as a keyword argument. Previously, `return_all_records` was `True` by
default and this made the API ignore `result_record_count`.

Issue: `ArcGISLoader` would ignore `result_record_count` unless user
also passed `return_all_records=False`.
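
The defaulting behavior described above can be sketched like this. It is an assumption-laden illustration rather than the loader's real code: `resolve_query_kwargs` is a hypothetical helper, and only the two parameter names come from the description.

```python
def resolve_query_kwargs(**kwargs) -> dict:
    """Default `return_all_records` based on `result_record_count` (sketch)."""
    params = dict(kwargs)
    if "result_record_count" in params:
        # A record limit only takes effect when not all records are requested.
        params["return_all_records"] = False
    else:
        # Previous behavior: fetch everything unless told otherwise.
        params.setdefault("return_all_records", True)
    return params


print(resolve_query_kwargs(result_record_count=10))
print(resolve_query_kwargs())
```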
2023-10-05 15:46:02 -07:00
Beck Bekmyradov
f9df55f7d2 Fix a Typo in Documentation (#11453)
- **Description:** This commit corrects a minor typo in the
documentation. It changes "frum" to "from" in the sentence: "The results
from search are passed back to the LLM for synthesis into an answer" in
the file `docs/extras/use_cases/more/agents/agents.ipynb`. This typo fix
enhances the clarity and accuracy of the documentation.
- **Tag maintainer:** @baskaryan
2023-10-05 15:34:06 -07:00
Bagatur
f5ce286932 fix api docs build (#11445) 2023-10-05 15:33:11 -07:00
mrbean
9903a70379 Add youdotcom retriever (#11304)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-05 13:48:11 -07:00
ashish-dahal
1655ff2ded Fix PyMuPDFLoader kwargs (#11434)
- **Description:** Fix the `PyMuPDFLoader` to accept `loader_kwargs`
from the document loader's `loader_kwargs` option. This provides more
flexibility in formatting the output from documents.

- **Issue:** The `loader_kwargs` is not passed into the `load` method
from the document loader, which limits configuration options.

- **Dependencies:**  None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-05 13:25:19 -07:00
Leonid Kuligin
e4a46747dc integration test for DocAI parser (#11424)
- **Description:** added an integration test
  - **Issue:** #11407 

@baskaryan
2023-10-05 12:38:29 -07:00
Aashish Saini
2abbdc6ecb Update bageldb.py (#11421)
I have restructured the code to ensure uniform handling of ImportError.
In place of previously used ValueError, I've adopted the standard
practice of raising ImportError with explanatory messages. This
modification enhances code readability and clarifies that any problems
stem from module importation.
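
A guarded import of this kind might look like the sketch below. `require` is a hypothetical helper written for illustration, not a function from `bageldb.py`, and the install hint format is an assumption.

```python
import importlib


def require(module_name: str, pip_name: str):
    """Import a module or raise ImportError with an install hint (sketch)."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise as ImportError (not ValueError) so callers can tell that
        # the problem is a missing dependency rather than bad input.
        raise ImportError(
            f"Could not import `{module_name}`. "
            f"Please install it with `pip install {pip_name}`."
        ) from exc


json_module = require("json", "json")  # stdlib module, always importable
print(json_module.dumps({"ok": True}))
```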
2023-10-05 12:37:56 -07:00
Syed Ather Rizvi
bfd48925e5 Feature/csharp text splitter doc (#10571)
- **Description:** Just docs related to csharp code splitter
   
- **Issue:** It's related to a request made by @baskaryan in a comment
on my previous PR #10350
  - **Dependencies:** None
  - **Twitter handle:** @ather19

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-05 12:22:54 -07:00
Nuno Campos
2c11302598 Update langchain_release.yml (#11444) 2023-10-05 14:23:27 -04:00
maks-operlejn-ds
2aae1102b0 Instance anonymization (#10501)
### Description

Add instance anonymization - if `John Doe` will appear twice in the
text, it will be treated as the same entity.
The difference between `PresidioAnonymizer` and
`PresidioReversibleAnonymizer` is that only the second one has a
built-in memory, so it will remember anonymization mapping for multiple
texts:

```
>>> anonymizer = PresidioAnonymizer()
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Brett Russell. Hi Brett Russell!'
```
```
>>> anonymizer = PresidioReversibleAnonymizer()
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
```

### Twitter handle
@deepsense_ai / @MaksOpp

### Tag maintainer
@baskaryan @hwchase17 @hinthornw

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-05 11:23:02 -07:00
Kyle Pancamo
203258b4d6 Update pdf.py comment for PyPDFLoader (#10495)
PyPDF does not chunk at the character level to my understanding.

Description: PyPDF does not chunk at the character level, but instead
breaks up content by page. Fixup comment

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-05 11:22:40 -07:00
Juan Daza
4236ae3851 Added Streaming Capability to SageMaker LLMs (#10535)
This PR adds the ability to declare a Streaming response in the
SageMaker LLM by leveraging the `invoke_endpoint_with_response_stream`
capability in `boto3`. It is heavily based on the AWS Blog Post
announcement linked
[here](https://aws.amazon.com/blogs/machine-learning/elevating-the-generative-ai-experience-introducing-streaming-support-in-amazon-sagemaker-hosting/).

It does not add any additional dependencies since it uses the existing
`boto3` version.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-05 11:08:43 -07:00
Laurentiu Piciu
d9670a5945 openai_functions_multi_agent: solved the case when the "arguments" is valid JSON but it does not contain actions key (#10543)
Description: There are cases when the output from the LLM comes fine
(i.e. function_call["arguments"] is a valid JSON object), but it does
not contain the key "actions". So I split the validation in 2 steps:
loading arguments as JSON and then checking for "actions" in it.
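
The two-step validation can be sketched as below. This is a simplified illustration and not the agent's actual parser: `parse_actions` is a hypothetical name, and raising `ValueError` is an assumption about how failures are reported.

```python
import json


def parse_actions(arguments: str) -> list:
    """Validate function-call arguments in two separate steps (sketch)."""
    # Step 1: the payload must be valid JSON.
    try:
        payload = json.loads(arguments)
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments is not valid JSON: {arguments!r}") from exc

    # Step 2: valid JSON may still lack the "actions" key.
    if not isinstance(payload, dict) or "actions" not in payload:
        raise ValueError(f"arguments has no 'actions' key: {arguments!r}")
    return payload["actions"]


print(parse_actions('{"actions": [{"action_name": "search"}]}'))
```

Separating the steps lets each failure produce its own error message instead of a generic parsing error.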

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-05 11:08:09 -07:00
Eugene Yurtsev
fcccde406d Add SymbolicMathChain to experiment in preparation for deprecation (#11129)
Move symbolic math chain to experimental
2023-10-05 13:54:43 -04:00
Holt Skinner
9f73fec057 fix: Update Google Cloud Enterprise Search to Vertex AI Search (#10513)
- Description: Google Cloud Enterprise Search was renamed to Vertex AI
Search
-
https://cloud.google.com/blog/products/ai-machine-learning/vertex-ai-search-and-conversation-is-now-generally-available
- This PR updates the documentation and Retriever class to use the new
terminology.
- Changed retriever class from `GoogleCloudEnterpriseSearchRetriever` to
`GoogleVertexAISearchRetriever`
- Updated documentation to specify that `extractive_segments` requires
the new [Enterprise
edition](https://cloud.google.com/generative-ai-app-builder/docs/about-advanced-features#enterprise-features)
to be enabled.
  - Fixed spelling errors in documentation.
- Change parameter for Retriever from `search_engine_id` to
`data_store_id`
- When this retriever was originally implemented, there was no
distinction between a data store and search engine, but now these have
been split.
- Fixed an issue blocking some users where the api_endpoint can't be set
2023-10-05 10:47:47 -07:00
Patrick Randell
1d678f805f Additional Weaviate Filter Comparators (#10522)
### Description
When using Weaviate Self-Retrievers, certain common filter comparators
generated by user queries were unimplemented, resulting in errors. This
PR implements some of them. All linting and format commands have been
run and tests passed.
### Issue
#10474
### Dependencies
timestamp module

---------

Co-authored-by: Patrick Randell <prandell@deloitte.com.au>
2023-10-05 10:40:04 -07:00
Nuno Campos
79011f835f Remove str() from RunnableConfigurableAlternatives (#11446) 2023-10-05 18:40:00 +01:00
Mateusz Wosinski
656480feb6 Add language detection example (#10540)
### Description

Adds language detection examples based on
[langdetect](https://github.com/Mimino666/langdetect/tree/master/langdetect)
and [fasttext](https://github.com/facebookresearch/fastText/) libraries.
These frameworks can be especially useful together with components that
require selection of the language (e.g. data-anonymizer)

### Twitter handle

@deepsense_ai, @matt_wosinski
2023-10-05 10:39:08 -07:00
Harrison Chase
31d5bd84d7 make vectorstores optional (#11393) 2023-10-05 10:14:05 -07:00
Eugene Yurtsev
8aa545901a Update agent type docs (#11137)
In code docs for agent types
2023-10-05 12:51:14 -04:00
Eugene Yurtsev
3e31d6e35f Start deprecation of LLMBashChain (#11300)
In preparation for migrating LLMBashChain and related tools, add a
deprecation warning to the code.
2023-10-05 12:48:22 -04:00
Bagatur
8b6b8bf68c bump 309 (#11443) 2023-10-05 09:29:14 -07:00
billytrend-cohere
2ff91a46c0 Add cohere /chat integration (#11389)
Add cohere /chat integration and an iPython notebook to demonstrate the
addition.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-05 09:20:47 -07:00
adrienohana
ca346011b7 added interactive login for azure cognitive search vector store (#11360)
**Description:** Previously, if access to Azure Cognitive Search was
not done via an API key, the default credential was used, which doesn't
allow an interactive login. I simply added the option to use
"INTERACTIVE" as a key name, which launches a login window upon
initialization of the AzureSearch object.
2023-10-05 09:20:18 -07:00
ElliotKetchup
53d4f1554a Update aws.mdx (#11431) 2023-10-05 09:07:16 -07:00
Lance Martin
211a74941a Update QA doc w/ Runnables (#11401)
2023-10-05 08:07:38 -07:00
Eugene Yurtsev
5a1f614175 Add docker compose to CLI (#11406)
Add docker compose to cli
2023-10-05 15:58:56 +01:00
Predrag Gruevski
e2d6c41177 Upgrade langchain dependencies. (#11420)
I was hoping this would pick up numpy 1.26, which is required to support
the new Python 3.12 release, but it didn't. It seems that some
transitive dependency requirement on numpy is preventing that, and the
highest we can currently go is 1.24.x.

But to find this out required a 15min `poetry lock`, so I figured we
might as well upgrade the dependencies we can and hopefully make the
next dependency upgrade a bit smaller.
2023-10-05 15:57:20 +01:00
Jacob Lee
71fd6428c5 Remove overridden async not implemented method on embeddings filters and add default async implementation for document compressors (#11415)
@nfcampos @eyurtsev @baskaryan

---------

Co-authored-by: Nuno Campos <nuno@boringbits.io>
2023-10-05 15:56:03 +01:00
Nuno Campos
2f490be09b Fix .dict() for agent/chain (#11436)
2023-10-05 15:51:21 +01:00
Nuno Campos
1e59c44d36 Nc/5oct/runnable release (#11428)
2023-10-05 14:27:50 +01:00
Bagatur
58b7a3ba16 Rm bedrock anthropic error (#11403) 2023-10-04 23:31:51 -04:00
Predrag Gruevski
c9986bc3a9 Tweak type hints to match dependency's behavior. (#11355)
Needs #11353 to merge first, and a new `langchain` to be published with
those changes.
2023-10-04 22:36:58 -04:00
William FH
940b9ae30a Normalize Option in Scoring Chain (#11412) 2023-10-04 15:59:28 -07:00
bholagabbar
b9fad28f5e Fix typing imports in extraction usecase (#11402)
The person class here:
https://python.langchain.com/docs/use_cases/extraction#pydantic-1 has
attributes `dog_breed` and `dog_name` that use `Optional` from typing,
but it hasn't been imported. Fixed the import here
2023-10-04 13:55:02 -07:00
Leonid Ganeline
22165cb2fc merge pages into google and AWS pages (#11312)
There are several pages in `integrations/providers/more` that belong to
Google and AWS `integrations/providers`.
- moved content of these pages into the Google and AWS
`integrations/providers` pages
- removed these individual pages
2023-10-04 13:44:23 -07:00
Eugene Yurtsev
70be04a816 CLI: Readme update (#11404)
Consolidating to a single README for now; it will be easier to maintain,
and we can differentiate between poetry and pip later. Does not seem critical.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-10-04 16:25:37 -04:00
Nuno Campos
fde19c8667 Add CLI command to create a new project (#7837)
First version of CLI command to create a new langchain project template

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-04 15:43:41 -04:00
mhwang-stripe
9cea796671 Make langchain compatible with SQLAlchemy<1.4.0 (#11390)
## Description
Currently SQLAlchemy >=1.4.0 is a hard requirement. We are unable to run
`from langchain.vectorstores import FAISS` with SQLAlchemy <1.4.0 due to
top-level imports, even if we aren't even using parts of the library
that use SQLAlchemy. See Testing section for repro. Let's make it so
that langchain is still compatible with SQLAlchemy <1.4.0, especially if
we aren't using parts of langchain that require it.

The main conflict is that SQLAlchemy removed `declarative_base` from
`sqlalchemy.ext.declarative` in 1.4.0 and moved it to `sqlalchemy.orm`.
We can fix this by try-catching the import. This is the same fix as
applied in https://github.com/langchain-ai/langchain/pull/883.
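
The try/except import fallback can be sketched in a dependency-free way as follows. `import_first` is a hypothetical helper, not code from this PR; the SQLAlchemy call appears only in a comment because the sketch is meant to run without SQLAlchemy installed.

```python
import importlib


def import_first(name: str, module_paths: list):
    """Return `name` from the first module that provides it (sketch)."""
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            continue  # module missing in this environment, try the next one
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"Could not import {name!r} from any of {module_paths}")


# With SQLAlchemy installed, the same idea would read (assumed usage):
# declarative_base = import_first(
#     "declarative_base", ["sqlalchemy.orm", "sqlalchemy.ext.declarative"]
# )

# Stdlib demonstration: the first module does not exist, the second one does.
ordered_dict_cls = import_first("OrderedDict", ["no_such_module_abc", "collections"])
print(ordered_dict_cls.__name__)
```

Listing the newer location first means the older one is only used as a fallback on older versions.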

(I see that there seems to be some refactoring going on about isolating
dependencies, e.g.
c87e9fb2ce,
so if this issue will be eventually fixed by isolating imports in
langchain.vectorstores that also works).

## Issue
I can't find a matching issue.

## Dependencies
No additional dependencies

## Maintainer
@hwchase17 since you reviewed
https://github.com/langchain-ai/langchain/pull/883

## Testing
I didn't add a test, but I manually tested this.

1. Current failure:
```
langchain==0.0.305
sqlalchemy==1.3.24
```

``` python
python -i
>>> from langchain.vectorstores import FAISS
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/pay/src/zoolander/vendor3/lib/python3.8/site-packages/langchain/vectorstores/__init__.py", line 58, in <module>
    from langchain.vectorstores.pgembedding import PGEmbedding
  File "/pay/src/zoolander/vendor3/lib/python3.8/site-packages/langchain/vectorstores/pgembedding.py", line 10, in <module>
    from sqlalchemy.orm import Session, declarative_base, relationship
ImportError: cannot import name 'declarative_base' from 'sqlalchemy.orm' (/pay/src/zoolander/vendor3/lib/python3.8/site-packages/sqlalchemy/orm/__init__.py)
```

2. This fix:
```
langchain==<this PR>
sqlalchemy==1.3.24
```

``` python
python -i
>>> from langchain.vectorstores import FAISS
<succeeds>
```
2023-10-04 15:41:20 -04:00
Bagatur
91941d1f19 mv LCEL up in docs (#11395) 2023-10-04 15:34:06 -04:00
Nuno Campos
4d66756d93 Improve output of Runnable.astream_log() (#11391)
- Make logs a dictionary keyed by run name (and counter for repeats)
- Ensure no output shows up in lc_serializable format
- Fix up repr for RunLog and RunLogPatch

2023-10-04 20:16:37 +01:00
Lester Solbakken
a30f98f534 Add Vespa vector store (#11329)
Addition of Vespa vector store integration including notebook showing
its use.

Maintainer: @lesters 
Twitter handle: LesterSolbakken
2023-10-04 14:59:11 -04:00
Nuno Campos
58a88f3911 Add optional input_types to prompt template (#11385)
- default MessagesPlaceholder one to list of messages

2023-10-04 18:54:53 +01:00
Tomaz Bratanic
71290315cf Add optional Cypher validation tool (#11078)
LLMs have trouble consistently getting the relationship direction
right. That's why I organized a competition on the best and simplest way
to fix it, based on the existing schema, as a post-processing step.
https://github.com/tomasonjo/cypher-direction-competition

I am adding the winner's code in this PR:
https://github.com/sakusaku-rich/cypher-direction-competition
2023-10-04 12:54:37 -04:00
Bagatur
dd514c2781 bump 308 (#11383) 2023-10-04 12:10:09 -04:00
Leonid Kuligin
4f4e0f38fc a better error description when GCP project is not set (#11377)
- **Description:** a little bit better error description
  - **Issue:** #10879
2023-10-04 11:57:47 -04:00
Nuno Campos
0d80226c64 Add _type to json functions output parser (#11381)
2023-10-04 16:56:45 +01:00
Bagatur
106608bc89 add default async (#11141) 2023-10-04 11:40:35 -04:00
Predrag Gruevski
88c5349196 Revert "Rm additional file check for scheduled tests (#11192)" (#11297)
This reverts commit ff90bb59bf.

Requires #11296 to merge first.
2023-10-04 11:35:55 -04:00
Nuno Campos
b0893c7c6a Use an enum for configurable_alternatives to make the generated json schema nicer (#11350) 2023-10-04 11:32:41 -04:00
Bagatur
b499de2926 Anthropic system message fix (#11301)
Removes human prompt prefix before system message for anthropic models

Bedrock anthropic api enforces that Human and Assistant messages must be
interleaved (cannot have same type twice in a row). We currently treat
System Messages as human messages when converting messages -> string
prompt. Our validation when using Bedrock/BedrockChat raises an error
when this happens. For ChatAnthropic we don't validate this so no error
is raised, but perhaps the behavior is still suboptimal
2023-10-04 11:32:24 -04:00
Anatolii Kmetiuk
34a64101cc Add explanations to GoogleDriveLoader how to avoid errors (#11335)
- **Description:** add a paragraph to the GoogleDriveLoader doc on how
to bypass errors on authentication.

For some reason, specifying the credentials path via the `credentials_path`
constructor parameter when creating `GoogleDriveLoader` means that
the OAuth screen never shows up when first using GoogleDriveLoader.
Instead, the `RefreshError: ('invalid_grant: Bad Request', {'error':
'invalid_grant', 'error_description': 'Bad Request'})` error happens.
Setting it via `os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = ...`
solves the problem. Also, the `token_path` constructor parameter is
mandatory; otherwise another error happens when trying to `load()` for
the first time.

These errors are tricky and time-consuming to figure out, so I believe
it's good to mention them in the docs.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-10-04 11:12:54 -04:00
Massimiliano Angelino
2f83350eac Feat bedrock cohere support (#11230)
**Description:**
Added support for Cohere command model via Bedrock.
With this change it is now possible to use the `cohere.command-text-v14`
model via Bedrock API.

About Streaming: Cohere model outputs 2 additional chunks at the end of
the text being generated via streaming: a chunk containing the text
`<EOS_TOKEN>`, and a chunk indicating the end of the stream. In this
implementation I chose to ignore both chunks. An alternative solution
could be to replace `<EOS_TOKEN>` with `\n`
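
Ignoring the two trailing chunks could look roughly like the sketch below. The chunk shape is an assumption made for illustration (dicts with a `text` field and an `is_finished` flag); it is not taken from the Bedrock response format, and `iter_text` is a hypothetical helper.

```python
EOS_TOKEN = "<EOS_TOKEN>"


def iter_text(chunks):
    """Yield generated text, skipping the EOS and end-of-stream chunks."""
    for chunk in chunks:
        if chunk.get("is_finished"):
            continue  # chunk that only signals the end of the stream
        text = chunk.get("text", "")
        if text == EOS_TOKEN or not text:
            continue  # chunk that only carries the EOS marker (or nothing)
        yield text


stream = [
    {"text": "Hello"},
    {"text": " world"},
    {"text": EOS_TOKEN},
    {"is_finished": True},
]
print("".join(iter_text(stream)))
```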

Tests: manually tested that the new model works with both
`llm.generate()` and `llm.stream()`.
Tested with `temperature`, `p` and `stop` parameters.

**Issue:** #11181 

**Dependencies:** No new dependencies

**Tag maintainer:** @baskaryan 

**Twitter handle:** mangelino

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-10-04 11:12:19 -04:00
Predrag Gruevski
37f2f71156 Trigger Docker release workflow after new langchain release is made. (#11290)
We want to publish a new Docker image after a new langchain Python
package version is published.
2023-10-04 10:27:08 -04:00
MattiaSangermano
cdf5259ca9 Fixed import typo (#11278)
Fixed small import typo in react_docstore documentation

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-04 10:18:10 -04:00
Daniel Butler
939bceccb0 GitHubIssuesLoader Custom API URL Support (#11378)
- **Description:** Adds support for custom API URL in the
GitHubIssuesLoader. This allows it to be used with GitHub Enterprise
instances.
2023-10-04 10:17:46 -04:00
Bagatur
16a80779b9 bump 307 (#11380) 2023-10-04 10:03:17 -04:00
mziru
9e3c1d4463 add HTMLHeaderTextSplitter (#11039)
Description: Similar in concept to the `MarkdownHeaderTextSplitter`, the
`HTMLHeaderTextSplitter` is a "structure-aware" chunker that splits text
at the element level and adds metadata for each header "relevant" to any
given chunk. It can return chunks element by element or combine elements
with the same metadata, with the objectives of (a) keeping related text
grouped (more or less) semantically and (b) preserving context-rich
information encoded in document structures. It can be used with other
text splitters as part of a chunking pipeline.
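
To make the idea concrete, here is a toy structure-aware chunker built on the standard library's `html.parser`. It is a sketch of the concept only, not the `HTMLHeaderTextSplitter` implementation (which depends on lxml); the class name `HeaderChunker` and its attributes are hypothetical, and it ignores nested inline tags.

```python
from html.parser import HTMLParser
from typing import Dict, List, Tuple


class HeaderChunker(HTMLParser):
    """Toy chunker: tags each text chunk with the headers above it."""

    def __init__(self, headers: Dict[str, str]) -> None:
        super().__init__()
        self.headers = headers  # ordered, e.g. {"h1": "Header 1", "h2": "Header 2"}
        self.active: Dict[str, str] = {}  # header metadata currently in scope
        self.current_tag = ""
        self.chunks: List[Tuple[str, Dict[str, str]]] = []

    def handle_starttag(self, tag, attrs):
        self.current_tag = tag

    def handle_endtag(self, tag):
        self.current_tag = ""

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self.current_tag in self.headers:
            # A new header clears every header at the same or a deeper level.
            level = list(self.headers).index(self.current_tag)
            for tag in list(self.headers)[level:]:
                self.active.pop(self.headers[tag], None)
            self.active[self.headers[self.current_tag]] = text
        elif self.chunks and self.chunks[-1][1] == self.active:
            # Combine consecutive elements that share the same metadata.
            prev_text, meta = self.chunks[-1]
            self.chunks[-1] = (prev_text + " " + text, meta)
        else:
            self.chunks.append((text, dict(self.active)))


html = "<h1>Intro</h1><p>First.</p><p>Second.</p><h2>Details</h2><p>Third.</p>"
chunker = HeaderChunker({"h1": "Header 1", "h2": "Header 2"})
chunker.feed(html)
chunks = chunker.chunks
```

Here the two paragraphs under `Intro` are merged into one chunk because they carry identical header metadata, while the paragraph under `Details` becomes its own chunk with both headers attached.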

Dependency: lxml python package

Maintainer: @hwchase17

Twitter handle: @MartinZirulnik

---------

Co-authored-by: PresidioVantage <github@presidiovantage.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-04 09:24:25 -04:00
Predrag Gruevski
289de601c8 Use parameterized queries to select SQL schemas. (#11356) 2023-10-04 05:43:30 +01:00
Nuno Campos
b0097f8908 In ProgressBarCallback update the progress counter also when runs fin… (#11332) 2023-10-04 05:04:59 +01:00
William FH
06f39be1c2 Wfh/eval max concurrency (#11368) 2023-10-03 20:18:14 -07:00
Isaac Chung
1165767df2 Clarifai integration doc improvements (#11251)
- **Description:** Doc corrections and resolve notebook rendering issue
on GH
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** `@isaacchung1217`

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-10-03 21:47:57 -04:00
Oleg Sinavski
1ca62b232b Docs: improve similarity search examples (#11298)
**Description:** 

Examples in the "Select by similarity" section were not really
highlighting capabilities of similarity search.
E.g. "# Input is a measurement, so should select the tall/short example"
was still outputting the "mood" example.

I tweaked the inputs a bit and fixed the examples (checking that those
are indeed what the search outputs).

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-10-03 21:47:08 -04:00
Aashish Saini
4adb2b399d Fixed exception type in py files (#11322)
I've refactored the code to ensure that ImportError is consistently
handled. Instead of using ValueError as before, I've now followed the
standard practice of raising ImportError along with clear and
informative error messages. This change enhances the code's clarity and
explicitly signifies that any problems are associated with module
imports.
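
A minimal sketch of the pattern this describes, assuming a hypothetical helper named `import_optional` (the commit itself changes individual `raise` statements and does not necessarily add such a helper):

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, pip_name: str) -> ModuleType:
    """Import an optional dependency, raising ImportError with install guidance."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Raise ImportError (not ValueError) so callers can catch the
        # conventional exception type for a missing package.
        raise ImportError(
            f"Could not import {module_name}. "
            f"Please install it with `pip install {pip_name}`."
        ) from exc


try:
    import_optional("not_a_real_package_xyz", "not-a-real-package-xyz")
except ImportError as err:
    message = str(err)
```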
2023-10-03 21:46:26 -04:00
니콜라스
c6d7124675 Add 'device' to GPT4All (#11216)
Add device to GPT4All

- **Description:** GPT4All now supports GPU. This commit adds the option
to enable it.
- **Issue:** It closes
https://github.com/langchain-ai/langchain/issues/10486

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-10-03 17:37:30 -07:00
LeeJongBeom
92683262f4 Fix documents for RetrievalQAWithSourcesChain (#11292)
- **Description:** Fix typo about `RetrievalQAWithSourceChain` ->
`RetrievalQAWithSourcesChain`
2023-10-03 17:36:16 -07:00
Harrison Chase
6e848b879a add default for async (#11367) 2023-10-03 17:28:14 -07:00
Predrag Gruevski
d21dd72d64 Upgrade CI workflows to poetry 1.6.1. (#11344) 2023-10-03 19:23:54 -04:00
Predrag Gruevski
6a936488db Upgrade root poetry dependencies and upgrade to poetry 1.6.1. (#11343) 2023-10-03 19:23:36 -04:00
Fynn Flügge
0a4baca291 chore: add kotlin code splitter (#11364)

- **Description:** Adds Kotlin language to `TextSplitter`

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-03 18:35:36 -04:00
Ofer Mendelevitch
b93a08079e Updates to Vectara Implementation (#11366)
  - **Description:** updates to documentation and API headers
  - **Tag maintainer:** @baskarya
  - **Twitter handle:** @ofermend
2023-10-03 18:34:39 -04:00
Erick Friis
745e3e29da add getattr case for llms.type_to_cls_dict (#11362)
For external libraries that depend on `type_to_cls_dict`, adds a
workaround to continue using the old format.

Recommend people use `get_type_to_cls_dict()` instead and only resolve
the imports when they're used.
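
One way such a workaround can be written is with a module-level `__getattr__` (PEP 562), sketched below. This is an assumption about the general technique, not a copy of the commit: the throwaway module `llms_sketch`, the helper `_module_getattr`, and the use of `fractions.Fraction` as a stand-in for an LLM class are all hypothetical.

```python
import types
from typing import Callable, Dict


def get_type_to_cls_dict() -> Dict[str, Callable[[], type]]:
    """Map type names to importer functions; nothing is imported until called."""

    def _import_fraction() -> type:
        from fractions import Fraction  # stand-in for a heavy LLM class

        return Fraction

    return {"fraction": _import_fraction}


def _module_getattr(name: str):
    """Module-level __getattr__ (PEP 562): serve the legacy attribute on demand."""
    if name == "type_to_cls_dict":
        # Old format: a plain dict of name -> class, resolved on access.
        return {k: importer() for k, importer in get_type_to_cls_dict().items()}
    raise AttributeError(f"module has no attribute {name!r}")


# Build a throwaway module to demonstrate; in a real package the function
# would simply be defined as `__getattr__` at module top level.
llms = types.ModuleType("llms_sketch")
llms.__getattr__ = _module_getattr
legacy = llms.type_to_cls_dict
```

Code that only calls `get_type_to_cls_dict()` pays the import cost for the entries it actually uses, whereas accessing the legacy attribute resolves every entry.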
2023-10-03 14:34:30 -07:00
Vicente Reyes
f3e13e7e5a Use term keyword according to the official python doc glossary (#11338)
- **Description:** use the term "keyword" according to the official Python
docs glossary, see https://docs.python.org/3/glossary.html
  - **Issue:** not applicable
  - **Dependencies:** not applicable
  - **Tag maintainer:** @hwchase17
  - **Twitter handle:** vreyespue
2023-10-03 12:56:08 -07:00
Leonid Ganeline
39316314fa fallback definition (#10504)
I've added a definition of `fallback` and fixed a couple of misspellings.
It was not really clear what a "fallback" is.
2023-10-03 12:38:59 -07:00
Predrag Gruevski
5d6b83d9cf Make a copy of external data instead of mutating another object's attributes. (#11349)
Fix for a bug surfaced as part of #11339. `mypy` caught this since the
types didn't match up.
2023-10-03 15:27:51 -04:00
Predrag Gruevski
42d979efdd Improve type hints and interface for SQL execution functionality. (#11353)
The previous API of the `_execute()` function had a few rough edges that
this PR addresses:
- The `fetch` argument was type-hinted as being able to take any string,
but any string other than `"all"` or `"one"` would `raise ValueError`.
The new type hints explicitly declare that only those values are
supported.
- The return type was type-hinted as `Sequence` but using `fetch =
"one"` would actually return a single result item. This was incorrectly
suppressed using `# type: ignore`. We now always return a list.
- Using `fetch = "one"` would return a single item if data was found, or
an empty *list* if no data was found. This was confusing, and we now
always return a list to simplify.
- The return type was `Sequence[Any]` which was a bit difficult to use
since it wasn't clear what one could do with the returned rows. I'm
making the new type `Dict[str, Any]` that corresponds to the column
names and their values in the query.

I've updated the use of this method elsewhere in the file to match the
new behavior.
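
The resulting interface can be sketched as below. This is a stand-in built on the standard library's `sqlite3` rather than the actual SQLAlchemy-based `_execute()`; the free function `execute` and the sample `users` table are assumptions for illustration.

```python
import sqlite3
from typing import Any, Dict, List, Literal


def execute(
    conn: sqlite3.Connection,
    command: str,
    fetch: Literal["all", "one"] = "all",
) -> List[Dict[str, Any]]:
    """Run a query and always return a list of column-name -> value dicts."""
    cursor = conn.execute(command)
    columns = [desc[0] for desc in cursor.description or []]
    if fetch == "all":
        rows = cursor.fetchall()
    elif fetch == "one":
        first = cursor.fetchone()
        # Zero or one row, but still a list, so callers handle one shape only.
        rows = [] if first is None else [first]
    else:
        raise ValueError("fetch must be 'all' or 'one'")
    return [dict(zip(columns, row)) for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'ada'), (2, 'bob')")
one = execute(conn, "SELECT id, name FROM users ORDER BY id", fetch="one")
none = execute(conn, "SELECT id, name FROM users WHERE id = 99", fetch="one")
```

With this shape, `fetch="one"` yields either a one-element list or an empty list, so callers never need to distinguish a bare row from a list.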
2023-10-03 15:19:08 -04:00
Mohammad Mohtashim
3bddd708f7 Add memory to sql chain (#8597)
continuation of PR #8550

@hwchase17 please see and merge. And also close the PR #8550.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2023-10-03 12:04:39 -07:00
Harrison Chase
feabf2e0d5 make llm imports optional (#11237) 2023-10-03 09:14:15 -07:00
Harrison Chase
88bad37ec2 fix get_tool_return (#11346) 2023-10-03 09:01:05 -07:00
Ikko Eltociear Ashimine
49b34e2293 Fix typo in agent_structured.ipynb (#11340)
therefor -> therefore

2023-10-03 09:00:38 -07:00
Harrison Chase
bdf865d8e8 better error message on parsing errors (#11342) 2023-10-03 09:00:17 -07:00
Lance Martin
b3c83fdd33 Add prompt hub support for Mistral w/ Ollama (#11315)
Add Mistral example with prompt support
2023-10-03 08:17:46 -07:00
Eugene Yurtsev
2343302fc6 Remove langserve from langchain repo (#11288)
LangServe has been moved to a separate repo
2023-10-03 10:48:35 -04:00
Bagatur
89436de7a7 update sec doc (#11336) 2023-10-03 10:22:53 -04:00
William FH
6950b44bfc Consolidate run collector. Add link helper (#11269)
Instead of:

```
client = Client()
with collect_runs() as cb:
    chain.invoke()
    run = cb.traced_runs[0]
    client.get_run_url(run)
```

it's
```
with tracing_v2_enabled() as cb:
    chain.invoke()
    cb.get_run_url()
```
2023-10-03 06:20:58 -07:00
Nuno Campos
0aedbcf7b2 Pass kwargs in runnable retry (#11324) 2023-10-03 09:55:02 +01:00
Aashish Saini
8a507154ca Update clarifai.mdx (#11318)
@baskaryan, small typo fix
2023-10-02 22:16:00 -07:00
Jacob Lee
933655b4ac Adds Tavily Search API retriever (#11314)
@baskaryan @efriis
2023-10-02 17:12:17 -07:00
David Duong
3ec970cc11 Mark Vertex AI classes as serialisable (#10484)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-10-02 16:48:21 -07:00
David Duong
db36a0ee99 Make Google PaLM classes serialisable (#11121)
Similarly to Vertex classes, PaLM classes weren't marked as
serialisable. Should be working fine with LangSmith.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-10-02 15:46:48 -07:00
CG80499
943e4f30d8 Add scoring chain (#11123)
2023-10-02 15:15:31 -07:00
Predrag Gruevski
cd2479dfae Upgrade langchain dependency versions to resolve dependabot alerts. (#11307) 2023-10-02 18:06:41 -04:00
Nuno Campos
4df3191092 Add .configurable_fields() and .configurable_alternatives() to expose fields of a Runnable to be configured at runtime (#11282) 2023-10-02 21:18:36 +01:00
Eugene Yurtsev
5e2d5047af add LLMBashChain to experimental (#11305)
Add LLMBashChain to experimental
2023-10-02 16:00:14 -04:00
João Carabetta
29b9a890d4 Fix line break in docs imports (#11270)
It is just a straightforward docs fix.
2023-10-02 15:37:16 -04:00
Oleg Sinavski
0b08a17e31 Fix closing bracket in length-based selector snippet (#11294)
**Description:**

Fix a forgotten closing bracket in the length-based selector snippet

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-02 15:36:58 -04:00
Bagatur
38d5b63a10 Bedrock scheduled tests (#11194) 2023-10-02 15:21:54 -04:00
Eugene Yurtsev
f9b565fa8c Bump min version of numexpr (#11302)
Bump min version
2023-10-02 15:06:32 -04:00
William FH
64febf7751 Make numexpr optional (#11049)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-02 14:42:51 -04:00
Eugene Yurtsev
20b7bd497c Add pending deprecation warning (#11133)
This PR uses 2 dedicated LangChain warnings types for deprecations
(mirroring python's built in deprecation and pending deprecation
warnings).

These deprecation types are unsilenced during initialization in
langchain achieving the same default behavior that we have with our
current warnings approach. However, because these warnings have a
dedicated type, users will be able to silence them selectively (I think
this is strictly better than our current handling of warnings).

The PR adds a deprecation warning to llm symbolic math.
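
The benefit of dedicated warning types can be sketched with the standard `warnings` module. The class names `SketchDeprecationWarning` and `SketchPendingDeprecationWarning` and the function `old_api` are hypothetical stand-ins, not the names introduced by the PR.

```python
import warnings


class SketchDeprecationWarning(DeprecationWarning):
    """Hypothetical library-specific deprecation warning type."""


class SketchPendingDeprecationWarning(PendingDeprecationWarning):
    """Hypothetical library-specific pending-deprecation warning type."""


def old_api() -> int:
    warnings.warn("old_api is deprecated", SketchDeprecationWarning, stacklevel=2)
    return 42


with warnings.catch_warnings(record=True) as caught:
    # What a library might do on import: surface its own deprecations.
    warnings.simplefilter("default", SketchDeprecationWarning)
    old_api()
    shown = len(caught)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("default", SketchDeprecationWarning)
    # What a user can do: silence only this library's deprecations,
    # leaving every other DeprecationWarning untouched.
    warnings.simplefilter("ignore", SketchDeprecationWarning)
    old_api()
    silenced = len(caught)
```

Because the filter targets the subclass, a user who adds the `ignore` line does not also hide deprecation warnings raised by other packages.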

---------

Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>
2023-10-02 13:55:16 -04:00
Predrag Gruevski
6212d57f8c Add Google GitHub Action creds file to gitignore. (#11296)
Should resolve the issue here:
https://github.com/langchain-ai/langchain/actions/runs/6342767671/job/17229204508#step:7:36

After this merges, we can revert
https://github.com/langchain-ai/langchain/pull/11192
2023-10-02 13:53:02 -04:00
Nuno Campos
0638f7b83a Create new RunnableSerializable base class in preparation for configurable runnables (#11279)
- Also move RunnableBranch to its own file

2023-10-02 17:41:23 +01:00
Nuno Campos
1cbe7f5450 Small changes to runnable docs (#11293)
2023-10-02 16:27:11 +01:00
Bagatur
8eec43ed91 bump 306 (#11289) 2023-10-02 10:25:08 -04:00
Nuno Campos
32a8b311eb Add base docker image and ci script for building and pushing (#10927) 2023-10-02 15:07:57 +01:00
zhengkai
3d859075d4 Remove extra spaces (#11283)
### Description
When I was reading the document, I found that some examples had extra
spaces and violated "Unexpected spaces around keyword / parameter equals
(E251)" in PEP 8. I removed these extra spaces.
  
### Tag maintainer
@eyurtsev 
### Twitter handle
[billvsme](https://twitter.com/billvsme)
2023-10-02 10:02:30 -04:00
James Odeyale
61cd83bf96 Update quickstart.mdx to add backtick after ChatMessages (#11241)
While going through the documentation I found this small issue and
wanted to contribute!

2023-10-02 10:02:03 -04:00
Nuno Campos
c6a720f256 Lint 2023-10-02 10:34:13 +01:00
Nuno Campos
1d46ddd16d Lint 2023-10-02 10:29:20 +01:00
Nuno Campos
17708fc156 Lint 2023-10-02 10:28:58 +01:00
Nuno Campos
a3b82d1831 Move RunnableWithFallbacks to its own file 2023-10-02 10:26:10 +01:00
Nuno Campos
01dbfc2bc7 Lint 2023-10-02 10:21:40 +01:00
Nuno Campos
a6afd45c63 Lint 2023-10-02 10:14:56 +01:00
Nuno Campos
f7dd10b820 Lint 2023-10-02 10:13:09 +01:00
Nuno Campos
040bb2983d Lint 2023-10-02 10:11:26 +01:00
Nuno Campos
52e5a8b43e Create new RunnableSerializable class in preparation for configurable runnables
- Also move RunnableBranch to its own file
2023-10-02 10:07:30 +01:00
Yeonji-Lim
61ab1b1266 Fix typo in docstring (#11256)
Description: Remove meaningless 's' in docstring
2023-10-01 15:55:11 -04:00
Kazuki Maeda
a363ab5292 rename repo namespace to langchain-ai (#11259)
### Description
renamed several repository links from `hwchase17` to `langchain-ai`.

### Why
I discovered that the README file in the devcontainer contains an old
repository name, so I took the opportunity to rename the old repository
name in all files within the repository, excluding those that do not
require changes.

### Dependencies
none

### Tag maintainer
@baskaryan

### Twitter handle
[kzk_maeda](https://twitter.com/kzk_maeda)
2023-10-01 15:30:58 -04:00
Dayuan Jiang
17cdeb72ef minor fix: remove redundant code from OpenAIFunctionsAgent (#11245)
2023-10-01 13:22:15 -04:00
Leonid Ganeline
5e5039dbd2 docs: updated YouTube and tutorial video links (#10897)
updated `YouTube` and `tutorial` videos with new links.
Removed a couple of duplicates.
Reordered several links by view counters
Some formatting: emphasized the names of products
2023-09-30 16:37:28 -07:00
Leonid Ganeline
cb84f612c9 docs: document_transformers consistency (#10467)
- Updated `document_transformers` examples: titles, descriptions, links
- Added `integrations/providers` for missed document_transformers
2023-09-30 16:36:23 -07:00
Leonid Ganeline
240190db3f docs: integrations/memory consistency (#10255)
- updated titles and descriptions of the `integrations/memory` notebooks
into a consistent and laconic format;
- removed
`docs/extras/integrations/memory/motorhead_memory_managed.ipynb` file as
a duplicate of the
`docs/extras/integrations/memory/motorhead_memory.ipynb`;
- added `integrations/providers` Integration Cards for `dynamodb`,
`motorhead`.
- updated `integrations/providers/redis.mdx` with links
- renamed several notebooks; updated `vercel.json` to reroute new names.
2023-09-30 16:35:55 -07:00
Michael Goin
33eb5f8300 Update DeepSparse LLM (#11236)
**Description:** Adds streaming and many more sampling parameters to the
DeepSparse interface

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-09-29 13:55:19 -07:00
Eugene Yurtsev
f91ce4eddf Bump deps in langserve (#11234)
Bump deps in langserve lockfile
2023-09-29 16:19:37 -04:00
Haozhe
4c97a10bd0 fix code injection vuln (#11233)
- **Description:** Fix a code injection vuln by adding one more keyword
into the filtering list
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Tag maintainer:** 
  - **Twitter handle:**

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-09-29 16:16:00 -04:00
Eugene Yurtsev
aebdb1ad01 Ignore aadd (#11235) 2023-09-29 21:10:53 +01:00
Eugene Yurtsev
8b4cb4eb60 Add type to message chunks (#11232) 2023-09-29 20:14:52 +01:00
Nuno Campos
fb66b392c6 Implement RunnablePassthrough.assign(...) (#11222)
Passes through dict input and assigns additional keys
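
The semantics described here can be illustrated in plain Python. This is a sketch of the behaviour only, not the `RunnablePassthrough.assign` implementation; the standalone `assign` function below is a hypothetical stand-in that works on ordinary callables instead of Runnables.

```python
from typing import Any, Callable, Dict

Mapper = Callable[[Dict[str, Any]], Any]


def assign(**mappers: Mapper) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    """Return a step that passes its dict input through and adds computed keys."""

    def step(inputs: Dict[str, Any]) -> Dict[str, Any]:
        # Each mapper sees the original input; results are merged on top of it.
        extra = {key: fn(inputs) for key, fn in mappers.items()}
        return {**inputs, **extra}

    return step


add_total = assign(total=lambda x: x["a"] + x["b"])
output = add_total({"a": 1, "b": 2})
```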

2023-09-29 20:12:48 +01:00
Nuno Campos
1ddf9f74b2 Add a streaming json parser (#11193)
<img width="1728" alt="Screenshot 2023-09-28 at 20 15 01"
src="https://github.com/langchain-ai/langchain/assets/56902/ed0644c3-6db7-41b9-9543-e34fce46d3e5">


2023-09-29 20:09:52 +01:00
Nuno Campos
ee56c616ff Remove flawed test
- It is not possible to access properties on classes, only on instances, therefore this test is not something we can implement
2023-09-29 20:05:33 +01:00
Nuno Campos
f3f3f71811 Lint 2023-09-29 19:57:40 +01:00
Nuno Campos
f6b0b065d3 Update json.py
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-09-29 19:34:35 +01:00
Nuno Campos
cbe18057b0 Update json.py
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-09-29 19:34:27 +01:00
Nuno Campos
aa8b4120a8 Keep exceptions when not in streaming mode 2023-09-29 19:21:27 +01:00
Nuno Campos
1f30e25681 Lint 2023-09-29 18:03:41 +01:00
Nuno Campos
c9d0f2b984 Combine with existing json output parsers 2023-09-29 17:55:30 +01:00
Eugene Yurtsev
b4354b7694 Make tests stricter, remove old code, fix up pydantic import when using v2 (#11231)
2023-09-29 12:47:02 -04:00
Eugene Yurtsev
572968fee3 Using langchain input types (#11204)
Using langchain input type
2023-09-29 12:37:09 -04:00
Bagatur
77c7c9ab97 bump 305 (#11224) 2023-09-29 08:55:00 -07:00
Nuno Campos
4b8442896b Make test deterministic 2023-09-29 16:50:00 +01:00
Ikko Eltociear Ashimine
33884b2184 Fix typo in gradient.ipynb (#11206)
Enviroment -> Environment

2023-09-29 11:45:40 -04:00
Attila Tőkés
ba9371854f OpenAI gpt-3.5-turbo-instruct cost information (#11218)
Added pricing info for `gpt-3.5-turbo-instruct` for OpenAI and Azure
OpenAI.

Co-authored-by: Attila Tőkés <atokes@rws.com>
2023-09-29 08:44:55 -07:00
Eugene Yurtsev
de69ea26e8 Suppress warnings in interactive env that stem from tab completion (#11190)
Suppress warnings in interactive environments that can arise from users 
relying on tab completion (without even using deprecated modules).

jupyter seems to filter warnings by default (at least for me), but
ipython surfaces them all
2023-09-29 11:44:30 -04:00
Jon Saginaw
715ffda28b mongodb doc loader init (#10645)
- **Description:** A Document Loader for MongoDB
  - **Issue:** n/a
  - **Dependencies:** Motor, the async driver for MongoDB
  - **Tag maintainer:** n/a
  - **Twitter handle:** pigpenblue

Note that an initial mongodb document loader was created 4 months ago,
but the [PR](https://github.com/langchain-ai/langchain/pull/4285) was
never pulled in. @leo-gan had commented on that PR, but given it is
extremely far behind the master branch and a ton has changed in
Langchain since then (including repo name and structure), I rewrote the
branch and issued a new PR with the expectation that the old one can be
closed.

Please reference that old PR for comments/context, but it can be closed
in favor of this one. Thanks!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-09-29 11:44:07 -04:00
Cynthia Yang
523898ab9c Update fireworks features (#11205)
Description
* Update fireworks feature on web page

Issue - Not applicable
Dependencies - None
Tag maintainer - @baskaryan
2023-09-29 08:37:06 -07:00
Nuno Campos
3d8aa88e26 Add async tests and comments 2023-09-29 15:28:46 +01:00
Nuno Campos
4ad0f3de2b Add RunnableGenerator (#11214)
2023-09-29 15:21:37 +01:00
Guy Korland
748a757306 Clean warnings: replace type with isinstance and fix syntax (#11219)
Clean warnings: replace `type` with `isinstance` and fix notebook
syntax
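
As a minimal illustration of why linters warn about `type(...)` comparisons, the sketch below contrasts the two checks. The class names are hypothetical and are not taken from the commit.

```python
class Document:
    """Hypothetical base class used only for this illustration."""


class PDFDocument(Document):
    """Hypothetical subclass of Document."""


doc = PDFDocument()

# Comparing exact types ignores inheritance, so a subclass instance fails.
exact_match = type(doc) == Document

# isinstance() respects inheritance and is the form linters recommend.
subclass_aware = isinstance(doc, Document)
```

The `type(...) ==` form only matches the exact class, which is rarely what calling code intends.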
2023-09-29 10:06:33 -04:00
Nuno Campos
091d8845d5 Backwards compat 2023-09-29 14:18:38 +01:00
Nuno Campos
4e28a7a513 Implement diff 2023-09-29 14:12:48 +01:00
Nuno Campos
5cbe2b7b6a Implement diff 2023-09-29 14:12:18 +01:00
Nuno Campos
6c0a6b70e0 WIP Add tests 2023-09-29 14:11:34 +01:00
Nuno Campos
63f2ef8d1c Implement str one 2023-09-29 14:11:34 +01:00
Nuno Campos
f672b39cc9 Add a streaming json parser 2023-09-29 14:11:34 +01:00
Nuno Campos
2387647d30 Lint 2023-09-29 14:11:03 +01:00
Nuno Campos
0318cdd33c Add tests 2023-09-29 12:25:19 +01:00
Nuno Campos
b67db8deaa Add RunnableGenerator 2023-09-29 12:04:32 +01:00
Nuno Campos
ca5293bf54 Enable creating Tools from any Runnable (#11177)
2023-09-29 12:03:56 +01:00
Nuno Campos
e35ea565d1 Lint 2023-09-29 12:00:56 +01:00
Nuno Campos
7f589ebbc2 Lint 2023-09-29 11:57:01 +01:00
Nuno Campos
8be598f504 Fix invocation 2023-09-29 11:57:01 +01:00
Nuno Campos
6eb6c45c98 Enable creating Tools from any Runnable 2023-09-29 11:57:01 +01:00
Nuno Campos
61b5942adf Implement better reprs for Runnables (#11175)
```
ChatPromptTemplate(messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template='You are a nice assistant.')), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['question'], template='{question}'))])
| RunnableLambda(lambda x: x)
| {
    chat: FakeListChatModel(responses=["i'm a chatbot"]),
    llm: FakeListLLM(responses=["i'm a textbot"])
  }
```

2023-09-29 11:56:28 +01:00
Nuno Campos
e8e2b812c9 Even more 2023-09-29 11:54:22 +01:00
Nuno Campos
fc072100fa skip more 2023-09-29 11:51:48 +01:00
Nuno Campos
7bfee012d5 Skip in py3.8 2023-09-29 11:49:12 +01:00
Nuno Campos
b8e3e1118d Skip for py3.8 2023-09-29 11:45:20 +01:00
William FH
db05ea2b78 Add from_embeddings for opensearch (#10957) 2023-09-29 00:00:58 -07:00
William FH
73693c18fc Add support for project metadata in run_on_dataset (#11200) 2023-09-28 21:26:37 -07:00
James Braza
b11f21c25f Updated LocalAIEmbeddings docstring to better explain why openai (#10946)
Fixes my misgivings in
https://github.com/langchain-ai/langchain/issues/10912
2023-09-28 19:56:42 -07:00
Eugene Yurtsev
2c114fcb5e Fix web-base loader (#11135)
Fix initialization

https://github.com/langchain-ai/langchain/issues/11095
2023-09-28 19:36:46 -07:00
jreinjr
3bc44b01c0 Typo fix to MathpixPDFLoader - changed processed_file_format default … (#10960)
…from mmd to md. https://github.com/langchain-ai/langchain/issues/7282

<!-- 
- **Description:** minor fix to a breaking typo - MathPixPDFLoader
processed_file_format is "mmd" by default, doesn't work, changing to
"md" fixes the issue,
- **Issue:** 7282
(https://github.com/langchain-ai/langchain/issues/7282),
  - **Dependencies:** none,
  - **Tag maintainer:** @hwchase17,
  - **Twitter handle:** none
 -->

Co-authored-by: jare0530 <7915+jare0530@users.noreply.ghe.oculus-rep.com>
2023-09-28 19:03:30 -07:00
Dr. Fabien Tarrade
66415eed6e Support new version of tiktoken that are working with langchain (tag "^0.3.2" => "">=0.3.2,<0.6.0" and python "^3.9" =>">=3.9") (#11006)
- **Description:**
Allow langchain to be used with tiktoken versions other than 0.3.3, e.g.
0.5.1.
  - **Issue:**
The conda-forge version cannot be installed, since it applies all optional
dependencies:
       https://github.com/conda-forge/langchain-feedstock/pull/85  
Replace "^0.3.2" with ">=0.3.2,<0.6.0" and "^3.9" with python=">=3.9".
      Tested with python 3.10, langchain=0.0.288 and tiktoken==0.5.0

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-28 18:53:24 -07:00
Clément Sicard
1b48d6cb8c LlamaCppEmbeddings: adds verbose parameter, similar to llms.LlamaCpp class (#11038)
## Description

As of now, both at instantiation and during inference, `LlamaCppEmbeddings`
prints a lot of verbose output when driven through the Langchain binding,
which is a bit annoying when computing the embeddings of long documents,
for instance.

This PR adds a `verbose` parameter to `LlamaCppEmbeddings` so that the
model's verbose output to `stderr` can be turned **off**. It is natively
supported by `llama-cpp-python` and directly passed to the library, so the
PR is very small.

The value of `verbose` is `True` by default, following the way it is
defined in [`LlamaCpp` (`llamacpp.py`
#L136-L137)](c87e9fb2ce/libs/langchain/langchain/llms/llamacpp.py (L136-L137))

## Issue

_No issue linked_

## Dependencies

_No additional dependency needed_

## To see it in action

```python
from langchain.embeddings import LlamaCppEmbeddings

MODEL_PATH = "<path_to_gguf_file>"

if __name__ == "__main__":
    llm_embeddings = LlamaCppEmbeddings(
        model_path=MODEL_PATH,
        n_gpu_layers=1,
        n_batch=512,
        n_ctx=2048,
        f16_kv=True,
        verbose=False,
    )
```

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-28 18:37:51 -07:00
Noah Czelusta
a00a73ef18 Add last_edited_time and created_time props to NotionDBLoader (#11020)
# Description

Adds logic for NotionDBLoader to correctly populate `last_edited_time`
and `created_time` fields from [page
properties](https://developers.notion.com/reference/page#property-value-object).

There are no relevant tests for this code to be updated.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-28 18:37:34 -07:00
Eugene Yurtsev
e06e84b293 LangServe: Relax requirements (#11198)
Relax requirements
2023-09-28 21:27:19 -04:00
PaperMoose
5d7c6d1bca Synthetic Data generation (#9472)
---------

Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-28 18:16:05 -07:00
Donatas Remeika
a4e0cf6300 SearchApi integration (#11023)
Based on the customers' requests for native langchain integration,
SearchApi is ready to invest in AI and LLM space, especially in
open-source development.

- This is our initial PR and later we want to improve it based on
customers' and langchain users' feedback. Most likely changes will
affect how the final results string is being built.
- We are creating similar native integration in Python and JavaScript.
- The next plan is to integrate into Java, Ruby, Go, and others.
- Feel free to assign @SebastjanPrachovskij as a main reviewer for any
SearchApi-related searches. We will be glad to help and support
langchain development.
2023-09-28 18:08:37 -07:00
Bagatur
8cd18a48e4 fix trubrics lint issue (#11202) 2023-09-28 18:07:50 -07:00
Fynn Flügge
b738ccd91e chore: add support for TypeScript code splitting (#11160)
- **Description:** Adds typescript language to `TextSplitter`

---------

Co-authored-by: Jacob Lee <jacoblee93@gmail.com>
2023-09-28 16:41:51 -07:00
Kenneth Choe
17fcbed92c Support add_embeddings for opensearch (#11050)
- **Description:**
  - Make running the integration tests for opensearch easy
  - Provide a way to use different text for embedding: refer to #11002 for
    more of the use case and design decision.
- **Issue:** N/A
- **Dependencies:** None other than the existing ones.
2023-09-28 16:41:11 -07:00
Jeff Kayne
c586f6dc1b Callback integration for Trubrics (#11059)
After contributing to some examples in the
[langsmith-cookbook](https://github.com/langchain-ai/langsmith-cookbook)
with @hinthornw, here is a PR that adds a callback handler to use
LangChain with [Trubrics](https://github.com/trubrics/trubrics-sdk).
2023-09-28 16:20:19 -07:00
Michael Landis
a8db594012 fix: short-circuit black and mypy calls when no changes made (#11051)
Both black and mypy expect a list of files or directories as input.
As-is, the Makefile computes a list of files changed relative to the last
commit; these are passed to black and mypy in the `format_diff` and
`lint_diff` targets. This is done by way of the Makefile variable
`PYTHON_FILES`. This is to save time by skipping running mypy and black
over the whole source tree.

When no changes have been made, this variable is empty, so the call to
black (and mypy) lacks input files. The call exits with error causing
the Makefile target to error out with:

```bash
$ make format_diff
poetry run black
Usage: black [OPTIONS] SRC ...

One of 'SRC' or 'code' is required.
make: *** [format_diff] Error 1
```

This is unexpected and undesirable, as the naive caller (that's me! 😄 )
will think something else is wrong. This commit smooths over this by
short circuiting when `PYTHON_FILES` is empty.
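
The short-circuit idea can be sketched in Python as follows. This is an analogy under stated assumptions, not the Makefile change itself: the function name `build_format_command` is hypothetical, and the command is only constructed, never executed.

```python
from typing import List, Optional


def build_format_command(python_files: List[str]) -> Optional[List[str]]:
    """Return the black invocation, or None when there is nothing to format.

    Hypothetical helper: it mirrors the idea of skipping the tool when the
    list of changed files is empty, so black never runs without a SRC.
    """
    if not python_files:
        # Short-circuit: no changed files means no command to run.
        return None
    return ["poetry", "run", "black", *python_files]


no_changes = build_format_command([])
with_changes = build_format_command(["langchain/llms/base.py"])
```

Returning `None` for the empty case lets the caller exit cleanly instead of surfacing the formatter's usage error.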
2023-09-28 16:13:07 -07:00
Michael Kim
fbcd8e02f2 Change type annotations from LLMChain to Chain in MultiPromptChain (#11082)
- **Description:** The types of 'destination_chains' and 'default_chain'
in 'MultiPromptChain' were changed from 'LLMChain' to 'Chain', and
variables that duplicated declarations in the parent class were removed.
- **Issue:** When a class that inherits only from Chain and not LLMChain,
such as 'SequentialChain' or 'RetrievalQA', is passed as
'destination_chains' or 'default_chain', a pydantic validation error is
raised.
- Example code:
```python
# doc_retriever, combine_docs_chain, question_gen_chain, router_chain and
# default_chain are assumed to be defined elsewhere.
retrieval_chain = ConversationalRetrievalChain(
    retriever=doc_retriever,
    combine_docs_chain=combine_docs_chain,
    question_generator=question_gen_chain,
)

destination_chains = {
    'retrieval': retrieval_chain,
}

main_chain = MultiPromptChain(
    router_chain=router_chain,
    destination_chains=destination_chains,
    default_chain=default_chain,
    verbose=True,
)
```

 `make format`, `make lint` and `make test`
2023-09-28 15:59:25 -07:00
Nicolas
8ed013d278 docs: Mendable Search Improvements (#11199)
Improvements to the Mendable UI, more accurate responses, and bug fixes.
2023-09-28 15:57:04 -07:00
Piyush Jain
32d09bcd1e Expanded version range for networkx, fixed sample notebook (#11094)
## Description
Expanded the upper bound for `networkx` dependency to allow installation
of latest stable version. Tested the included sample notebook with
version 3.1, and all steps ran successfully.
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-28 15:33:30 -07:00
Piotr Mardziel
b40ecee4b9 Fix eval prompt (#11087)
**Description:** fixes a common typo in some of the eval criteria.
2023-09-28 15:21:15 -07:00
Guy Korland
5564833bd2 Add add_graph_documents support for FalkorDBGraph (#11122)
Adding `add_graph_documents` support for FalkorDBGraph and extending the
`Neo4JGraph` api so it can support `cypher.py`
2023-09-28 15:03:54 -07:00
Tomaz Bratanic
7d25a65b10 add from_existing_graph to neo4j vector (#11124)
This PR adds the option to create a Neo4jvector instance from existing
graph, which embeds existing text in the database and creates relevant
indices.
2023-09-28 15:02:26 -07:00
Noah Stapp
2c952de21a Add support for MongoDB Atlas $vectorSearch vector search (#11139)
Adds support for the `$vectorSearch` operator for
MongoDBAtlasVectorSearch, which was announced at .Local London
(September 26th, 2023). This change maintains breaks compatibility
support for the existing `$search` operator used by the original
integration (https://github.com/langchain-ai/langchain/pull/5338) due to
incompatibilities in the Atlas search implementations.
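
For orientation, a `$vectorSearch` aggregation pipeline might be assembled roughly as below. This is a sketch, not the integration's code: the helper name, the default index name and field path, and the `numCandidates` multiplier are assumptions. The pipeline is only built as plain data; no database call is made.

```python
from typing import Any, Dict, List


def build_vector_search_pipeline(
    query_vector: List[float],
    index_name: str = "default",  # assumed index name
    path: str = "embedding",      # assumed field holding the vectors
    k: int = 4,
) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline that starts with a $vectorSearch stage."""
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": query_vector,
                # Consider more candidates than results (assumed factor of 10).
                "numCandidates": k * 10,
                "limit": k,
            }
        },
        # Expose the similarity score alongside each matched document.
        {"$set": {"score": {"$meta": "vectorSearchScore"}}},
    ]


pipeline = build_vector_search_pipeline([0.1, 0.2, 0.3], k=2)
```

A list like this would typically be passed to a collection's `aggregate` call.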

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-28 15:01:03 -07:00
Hugues
b599f91e33 LLMonitor Callback handler: fix bug (#11128)
Here is a small bug fix for the LLMonitor callback handler. I've also
added user identification capabilities.
2023-09-28 15:00:38 -07:00
William FH
e9b51513e9 Shared Executor (#11028) 2023-09-28 13:30:58 -07:00
Justin Plock
926e4b6bad [Feat] Add optional client-side encryption to DynamoDB chat history memory (#11115)
**Description:** Added optional client-side encryption to the Amazon
DynamoDB chat history memory with an AWS KMS Key ID using the [AWS
Database Encryption SDK for
Python](https://docs.aws.amazon.com/database-encryption-sdk/latest/devguide/python.html)
**Issue:** #7886
**Dependencies:**
[dynamodb-encryption-sdk](https://pypi.org/project/dynamodb-encryption-sdk/)
**Tag maintainer:**  @hwchase17 
**Twitter handle:** [@jplock](https://twitter.com/jplock/)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-28 13:29:46 -07:00
Eugene Yurtsev
4947ac2965 Add langserve version (#11195)
Add langserve version
2023-09-28 16:24:00 -04:00
Bagatur
ef41bcef70 update docs nav (#11146) 2023-09-28 12:44:52 -07:00
Joseph McElroy
822fc590d9 [ElasticsearchStore] Improve migration text to ElasticsearchStore (#11158)
As we move developers to the new `ElasticsearchStore` implementation, we
want to keep the `ElasticVectorSearch` class available while they
transition slowly to the new store.

To speed up this process, I updated the blurb to give them a clearer
recommendation of why they should use `ElasticsearchStore`.
2023-09-28 12:40:18 -07:00
Naveen Tatikonda
9b0029b9c2 [OpenSearch] Add Self Query Retriever Support to OpenSearch (#11184)
### Description
Add Self Query Retriever Support to OpenSearch

### Maintainers
@rlancemartin, @eyurtsev, @navneet1v

### Twitter Handle
@OpenSearchProj

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
2023-09-28 12:36:52 -07:00
Arthur Telders
0da484be2c Add source metadata to OutlookMessageLoader (#11183)
Description: Add "source" metadata to OutlookMessageLoader

This pull request adds the "source" metadata to the OutlookMessageLoader
class in the load method. The "source" metadata is required when
indexing with RecordManager in order to sync the index documents with a
source.

Issue: None

Dependencies: None

Twitter handle: @ATelders

Co-authored-by: Arthur Telders <arthur.telders@roquette.com>
2023-09-28 14:58:12 -04:00
Bagatur
ff90bb59bf Rm additional file check for scheduled tests (#11192)
cc @obi1kenobi Causing issues with GHA creds
https://github.com/langchain-ai/langchain/actions/runs/6342674950/job/17228926776
2023-09-28 11:49:26 -07:00
Bagatur
3508e582f1 add anthropic scheduled tests and unit tests (#11188) 2023-09-28 11:47:29 -07:00
Eugene Yurtsev
fd96878c4b Fix anthropic secret key when passed in via init (#11185)
Fixes anthropic secret key when passed via init

https://github.com/langchain-ai/langchain/issues/11182
2023-09-28 14:21:41 -04:00
Bagatur
f201d80d40 temporarily skip embedding empty string test (#11187) 2023-09-28 11:20:00 -07:00
Eugene Yurtsev
b3cf9c8759 LangServe: Update langchain requirement for publishing (#11186)
Update langchain requirement for publishing
2023-09-28 14:11:58 -04:00
Eugene Yurtsev
176d71dd85 LangServe: Add release workflow (#11178)
Add release workflow to langserve
2023-09-28 13:47:55 -04:00
mani2348
89ddc7cbb6 Update Bedrock service name to "bedrock-runtime" and model identifiers (#11161)
- **Description:** Bedrock updated boto service name to
"bedrock-runtime" for the InvokeModel and InvokeModelWithResponseStream
APIs. This update also includes new model identifiers for Titan text,
embedding and Anthropic.

Co-authored-by: Mani Kumar Adari <maniadar@amazon.com>
2023-09-28 09:42:56 -07:00
Eugene Yurtsev
de3e25683e Expose lc_id as a classmethod (#11176)
* Expose LC id as a class method 
* User should not need to know that the last part of the id is the class
name
2023-09-28 17:25:27 +01:00
Nuno Campos
5ca461160b Lint 2023-09-28 17:12:07 +01:00
Nuno Campos
151f27d502 Lint 2023-09-28 16:42:58 +01:00
Eugene Yurtsev
4ba9c16f74 mypy 2023-09-28 11:27:20 -04:00
Eugene Yurtsev
44489e7029 LangServe: Clean up init files (#11174)
Clean up init files
2023-09-28 11:10:42 -04:00
Akio Nishimura
785b9d47b7 Fix stop key of TextGen. (#11109)
The key of stopping strings used in text-generation-webui api is
[`stopping_strings`](https://github.com/oobabooga/text-generation-webui/blob/main/api-examples/api-example.py#L51),
not `stop`.
2023-09-28 11:05:24 -04:00
Eugene Yurtsev
d1d7d0cb27 x 2023-09-28 10:56:50 -04:00
Eugene Yurtsev
c86b2b5e42 x 2023-09-28 10:53:30 -04:00
Eugene Yurtsev
fe4f3b8fdf x 2023-09-28 10:51:28 -04:00
Eugene Yurtsev
a5b15e9d0f x 2023-09-28 10:51:17 -04:00
Nuno Campos
5c1f462bb9 Implement better reprs for Runnables 2023-09-28 15:24:51 +01:00
Aashish Saini
573c846112 Fixed Typo Error in Update get_started.mdx file by addressing a minor typographical error. (#11154)
Fixed Typo Error in Update get_started.mdx file by addressing a minor
typographical error.

This improvement enhances the readability and correctness of the
notebook, making it easier for users to understand and follow the
demonstration. The commit aims to maintain the quality and accuracy of
the content within the repository.
Please review the change at your convenience.

@baskaryan , @hwaking
2023-09-28 09:54:43 -04:00
Nan LI
53a9d6115e Xata chat memory FIX (#11145)
- **Description:** Changed data type from `text` to `json` in xata for
improved performance. Also corrected the `additionalKwargs` key in the
`messages()` function to `additional_kwargs` to adhere to `BaseMessage`
requirements.
- **Issue:** The chat history's `messages()` returned an empty `{}` for
`additional_kwargs`, because the key was wrongly named `additionalKwargs`.
  - **Dependencies:**  N/A
  - **Tag maintainer:** N/A
  - **Twitter handle:** N/A

My PR is passing linting and testing before submitting.
2023-09-28 09:52:15 -04:00
Apurv Agarwal
7bb6d04fc7 milvus collections (#11148)
Description: There was no information about Milvus collections in the
documentation, so I am adding that.
Maintainer: @eyurtsev
2023-09-28 09:47:58 -04:00
William FH
8ae9b71e41 Async support for OpenAIFunctionsAgentOutputParser (#11140) 2023-09-28 09:42:59 -04:00
Bagatur
ce08f436db Expose loads and dumps in load namespace 2023-09-28 09:34:48 -04:00
Nuno Campos
cfa2203c62 Add input/output schemas to runnables (#11063)
This adds `input_schema` and `output_schema` properties to all
runnables, which are Pydantic models for the input and output types
respectively. These are inferred from the structure of the Runnable as
much as possible; the only manual typing needed is:
- optionally add type hints to lambdas (which get translated to
input/output schemas)
- optionally add type hint to RunnablePassthrough

These schemas can then be used to create JSON Schema descriptions of
input and output types; see the tests.

- [x] Ensure no InputType and OutputType in our classes use abstract
base classes (replace with union of subclasses)
- [x] Implement in BaseChain and LLMChain
- [x] Implement in RunnableBranch
- [x] Implement in RunnableBinding, RunnableMap, RunnablePassthrough,
RunnableEach, RunnableRouter
- [x] Implement in LLM, Prompt, Chat Model, Output Parser, Retriever
- [x] Implement in RunnableLambda from function signature
- [x] Implement in Tool
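
To illustrate the idea of inferring schemas from a function signature, here is a standard-library sketch. It is not the implementation in this PR (which produces Pydantic models); the names `infer_io_schema` and `word_count` and the simplified type mapping are assumptions made for the example.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints

# Simplified mapping from Python types to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def infer_io_schema(func: Callable[..., Any]) -> Dict[str, Dict[str, str]]:
    """Derive a tiny JSON-Schema-like description from a function's type hints."""
    hints = get_type_hints(func)
    params = list(inspect.signature(func).parameters)
    # Treat the first parameter as the input, as a single-input step would.
    input_type = hints.get(params[0]) if params else None
    output_type = hints.get("return")
    return {
        "input": {"type": _JSON_TYPES.get(input_type, "any")},
        "output": {"type": _JSON_TYPES.get(output_type, "any")},
    }


def word_count(text: str) -> int:
    return len(text.split())


typed_schema = infer_io_schema(word_count)
untyped_schema = infer_io_schema(lambda x: x)
```

An untyped lambda falls back to `"any"` on both sides, which is why the optional type hints mentioned above matter.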

2023-09-28 11:05:15 +01:00
Eugene Yurtsev
b05bb9e136 LangServe (#11046)
Adds LangServe package

* Integrate Runnables with FastAPI, creating a server and a RemoteRunnable
client
* Support multiple runnables for a given server
* Support sync/async/batch/abatch/stream/astream/astream_log on the
client side (using async implementations on server)
* Adds validation using annotations (relying on pydantic under the hood)
-- this still has some rough edges -- e.g., OpenAPI docs do NOT
generate correctly at the moment
* Uses pydantic v1 namespace

Known issues: type translation code doesn't handle a lot of types (e.g.,
TypedDicts)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2023-09-28 10:52:44 +01:00
Nuno Campos
77ce9ed6f1 Support using async callback handlers with sync callback manager (#10945)
The current behaviour just calls the handler without awaiting the
coroutine, which results in exceptions/warnings, and obviously doesn't
actually execute whatever the callback handler does
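
The problem and one possible remedy can be sketched with the standard library alone. This is not the code from the PR: the `dispatch` helper and the handler names are hypothetical, and the sketch assumes no event loop is already running in the calling thread.

```python
import asyncio
import inspect
from typing import Any, Callable, List

events: List[str] = []


def sync_handler(text: str) -> None:
    events.append(f"sync:{text}")


async def async_handler(text: str) -> None:
    await asyncio.sleep(0)
    events.append(f"async:{text}")


def dispatch(handler: Callable[[str], Any], text: str) -> None:
    """Call a handler from sync code, running coroutines to completion."""
    if inspect.iscoroutinefunction(handler):
        # Calling handler(text) alone would only create a coroutine object
        # that never runs. Assumes no event loop is running in this thread.
        asyncio.run(handler(text))
    else:
        handler(text)


for handler in (sync_handler, async_handler):
    dispatch(handler, "on_llm_start")
```

Without the coroutine check, the async handler's body would never execute and Python would warn that the coroutine was never awaited.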

2023-09-28 10:39:01 +01:00
Bagatur
48a04aed75 bump 304 (#11147) 2023-09-27 19:24:09 -07:00
Jonathan Evans
23065f54c0 Added prompt wrapping for Claude with Bedrock (#11090)
- **Description:** Prompt wrapping requirements have been implemented on
the service side of AWS Bedrock for the Anthropic Claude models to
provide parity between Anthropic's offering and Bedrock's offering. This
overnight change broke most existing implementations of Claude, Bedrock
and Langchain. This PR just steals the Anthropic LLM implementation
to enforce alias/role wrapping and implements it in the existing
mechanism for building the request body. This has also been tested to
fix the chat_model implementation as well. Happy to answer any further
questions or make changes where necessary to get things patched and up
to PyPI ASAP, TY.
- **Issue:** No issue opened at the moment, though will update when
these roll in.
  - **Dependencies:** None
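
As a rough sketch of what alias/role wrapping means for Claude text prompts, the helper below adds the `Human:` and `Assistant:` markers when they are missing. The function name and the exact checks are assumptions for illustration; the PR's actual logic may differ.

```python
HUMAN_PROMPT = "\n\nHuman:"
AI_PROMPT = "\n\nAssistant:"


def wrap_claude_prompt(prompt: str) -> str:
    """Hypothetical helper: ensure the Human/Assistant markers are present."""
    if not prompt.startswith(HUMAN_PROMPT):
        prompt = f"{HUMAN_PROMPT} {prompt}"
    if not prompt.rstrip().endswith(AI_PROMPT.strip()):
        prompt = f"{prompt}{AI_PROMPT}"
    return prompt


wrapped = wrap_claude_prompt("What is LangChain?")
already_wrapped = wrap_claude_prompt(wrapped)
```

Applying the helper twice leaves the prompt unchanged, so callers that already wrap their prompts are not affected.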

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-27 19:20:07 -07:00
xiaoyu
b87cc8b31e add 3 property types in metadata for notiondb loader (#8509)
### Description: 
NotionDB supports a number of common property types. I have found three
common types that are not handled by the notiondb loader. When pages
with these properties are loaded, some metadata information is not
passed to langchain. Therefore, I added three common types:
- date
- created_time
- last_edit_time.

### Issue: 
no
### Dependencies: 
No dependencies added :)
### Tag maintainer: 
@rlancemartin, @eyurtsev
### Twitter handle: 
@BJTUTC
2023-09-27 17:38:05 -07:00
Harrison Chase
258d67b0ac Revert "improve the performance of base.py" (#11143)
Reverts langchain-ai/langchain#8610

this is actually an oversight - this merges all dfs into one df. we DO
NOT want to do this - the idea is we work and manipulate multiple dfs
2023-09-27 17:37:29 -07:00
Mohamad Zamini
9306394078 improve the performance of base.py (#8610)
This removes the use of the intermediate df list and directly
concatenates the dataframes if path is a list of strings. The pd.concat
function combines the dataframes efficiently, making it faster and more
memory-efficient compared to appending dataframes to a list.


---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-09-27 17:36:03 -07:00
Mincoolee
05b75f3f13 feat: add support for arxiv identifier in ArxivAPIWrapper() (#9318)
- Description: this PR adds the support for arxiv identifier of the
ArxivAPIWrapper. I modified the `run()` and `load()` functions in
`arxiv.py`, using regex to recognize if the query is in the form of
arxiv identifier (see
[https://info.arxiv.org/help/find/index.html](https://info.arxiv.org/help/find/index.html)).
If so, it will directly search the paper corresponding to the arxiv
identifier. I also modified and added tests in `test_arxiv.py`.
  - Issue: #9047 
  - Dependencies: N/A
  - Tag maintainer: N/A
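
A regex check of this kind could look like the sketch below. The pattern is an assumption covering only new-style identifiers (`YYMM.NNNN` or `YYMM.NNNNN`, with an optional version suffix); it is not necessarily the expression used in `arxiv.py`.

```python
import re

# New-style arXiv identifiers only; old-style ones such as "math/0211159"
# are deliberately not handled in this sketch.
ARXIV_ID = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")


def looks_like_arxiv_id(query: str) -> bool:
    """Return True when the query has the shape of a new-style arXiv id."""
    return ARXIV_ID.match(query.strip()) is not None


id_result = looks_like_arxiv_id("1605.08386v1")
text_result = looks_like_arxiv_id("attention is all you need")
```

A query that matches can be looked up directly by identifier, while anything else falls through to a normal text search.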

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-09-27 17:35:16 -07:00
William FH
d3c2ca5656 Enhanced pairwise error (#11131) 2023-09-27 16:04:43 -07:00
Taqi Jaffri
b7e9db5e73 Stop sequences in fireworks, plus notebook updates (#11136)
The new Fireworks and FireworksChat implementations are awesome! Added
in this PR https://github.com/langchain-ai/langchain/pull/11117 thank
you @ZixinYang

However, I think stop words were not plumbed correctly. I've made some
simple changes to do that, and also updated the notebook to be a bit
clearer with what's needed to use both new models.


---------

Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
2023-09-27 16:01:05 -07:00
William FH
33da8bd711 Add Exact match and Regex Match Evaluators (#11132) 2023-09-27 14:18:07 -07:00
Harrison Chase
e355606b11 add more import checks (#11033) 2023-09-27 11:17:12 -07:00
Dan Bolser
efb7c459a2 Update base.py (#10843)
Fixing a typo in the example code in the docstring...

You have to start somewhere though right?

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-09-27 11:15:58 -07:00
Jeremy Naccache
c59a5bae48 Fix intermediate steps example in docs : replaced json.dumps with Langchain's dumps() (#10593)
The intermediate steps example in docs has an example on how to retrieve
and display the intermediate steps.
But the intermediate steps object is of type AgentAction which cannot be
passed to json.dumps (it raises an error).
I replaced it with Langchain's dumps function (from langchain.load.dump
import dumps) which is the preferred way to do so.
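
The failure mode can be reproduced with the standard library alone. In the sketch below, `FakeAgentAction` is a hypothetical stand-in for an object that `json.dumps` cannot serialize, and the `default=vars` fallback is only a stdlib analogy; the docs fix itself uses Langchain's `dumps`.

```python
import json


class FakeAgentAction:
    """Hypothetical stand-in for an object json.dumps cannot serialize."""

    def __init__(self, tool: str, tool_input: str) -> None:
        self.tool = tool
        self.tool_input = tool_input


steps = [(FakeAgentAction("search", "weather"), "sunny")]

try:
    json.dumps(steps)
    failed = False
except TypeError:
    # Plain json.dumps has no rule for arbitrary objects.
    failed = True

# Stdlib analogy only: supply a fallback that turns objects into dicts.
serialized = json.dumps(steps, default=vars)
```

A serializer that knows about the object type, as Langchain's `dumps` does for its own classes, avoids hand-written fallbacks like this one.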
2023-09-27 11:00:29 -07:00
tanujtiwari-at
a79f595543 Support extra tools argument for pandas agent toolkit (#11040)
**Description** 

We already support adding new tools in some toolkits, such as the [SQLAgent
toolkit](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/agents/agent_toolkits/sql/base.py#L27).

Related
[SO](https://stackoverflow.com/questions/76583163/are-langchain-toolkits-able-to-be-modified-can-we-add-tools-to-a-pandas-datafra)
thread.
This PR replicates the same functionality for the pandas agent toolkit, so
users can add custom tools.
2023-09-27 10:57:04 -07:00
Aashish Saini
c4471d1877 Fixing some spelling mistakes (#10881)
@baskaryan

---------

Co-authored-by: AashutoshPathakShorthillsAI <142410372+AashutoshPathakShorthillsAI@users.noreply.github.com>
Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com>
Co-authored-by: Aashish Saini <141953346+AashishSainiShorthillsAI@users.noreply.github.com>
Co-authored-by: ManpreetShorthillsAI <142380984+ManpreetShorthillsAI@users.noreply.github.com>
Co-authored-by: AryamanJaiswalShorthillsAI <142397527+AryamanJaiswalShorthillsAI@users.noreply.github.com>
Co-authored-by: Adarsh Shrivastav <142413097+AdarshKumarShorthillsAI@users.noreply.github.com>
Co-authored-by: Vishal <141389263+VishalYadavShorthillsAI@users.noreply.github.com>
Co-authored-by: ChetnaGuptaShorthillsAI <142381084+ChetnaGuptaShorthillsAI@users.noreply.github.com>
Co-authored-by: PankajKumarShorthillsAI <142473460+PankajKumarShorthillsAI@users.noreply.github.com>
Co-authored-by: AbhishekYadavShorthillsAI <142393903+AbhishekYadavShorthillsAI@users.noreply.github.com>
Co-authored-by: AmitSinghShorthillsAI <142410046+AmitSinghShorthillsAI@users.noreply.github.com>
Co-authored-by: Md Nazish Arman <142379599+MdNazishArmanShorthillsAI@users.noreply.github.com>
Co-authored-by: KamalSharmaShorthillsAI <142474019+KamalSharmaShorthillsAI@users.noreply.github.com>
Co-authored-by: Lakshya <lakshyagupta87@yahoo.com>
Co-authored-by: AnujMauryaShorthillsAI <142393269+AnujMauryaShorthillsAI@users.noreply.github.com>
Co-authored-by: Saransh Sharma <142397365+SaranshSharmaShorthillsAI@users.noreply.github.com>
Co-authored-by: GhayurHamzaShorthillsAI <136243850+GhayurHamzaShorthillsAI@users.noreply.github.com>
Co-authored-by: Puneet Dhiman <142409038+PuneetDhimanShorthillsAI@users.noreply.github.com>
Co-authored-by: Riya Rana <142411643+RiyaRanaShorthillsAI@users.noreply.github.com>
Co-authored-by: Akshay Tripathi <142379735+AkshayTripathiShorthillsAI@users.noreply.github.com>
2023-09-27 10:56:51 -07:00
Bagatur
410ac8129d bump 303 (#11120) 2023-09-27 08:30:33 -07:00
Bagatur
8e4dbae428 Add fireworks chat model (#11117) 2023-09-27 08:22:12 -07:00
Bagatur
657581dbdf Fix ChatFireworks typing 2023-09-27 08:15:40 -07:00
Bagatur
12aad659dd add ChatFireworks to chat_models 2023-09-27 08:11:26 -07:00
Bagatur
872ebdaf90 remove FireworksChat from llms 2023-09-27 08:10:41 -07:00
Bagatur
9451240941 Fix fireworks chat linting issues 2023-09-27 08:09:33 -07:00
Harrison Chase
6b4928ad96 fix-lcel-notebooks (#11111)
fix some missing imports/naming
2023-09-27 06:36:11 -07:00
Tomáš Dvořák
865a21938c speed up enforce_stop_tokens helper function (#10984)
**Description:**

Because `enforce_stop_tokens` returns only the text before the first
occurrence of a stop token, we can speed up the execution by setting the
optional `maxsplit` parameter to 1.
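
The idea can be sketched as follows. This is an illustration written for this log, not the library's code: the function name `enforce_stop_tokens_sketch`, the `re.escape` call, and the empty-list guard are assumptions added here.

```python
import re
from typing import List


def enforce_stop_tokens_sketch(text: str, stop: List[str]) -> str:
    """Return the part of `text` that precedes the first stop sequence."""
    if not stop:
        # Nothing to cut on; an empty pattern would split at position 0.
        return text
    # Escaping is an assumption of this sketch so that stop strings are
    # matched literally rather than as regular expressions.
    pattern = "|".join(re.escape(s) for s in stop)
    # maxsplit=1 ends the scan at the first match; only element [0] is used,
    # so splitting the rest of the text would be wasted work.
    return re.split(pattern, text, maxsplit=1)[0]


truncated = enforce_stop_tokens_sketch("foo STOP bar STOP baz", ["STOP"])
```

Without `maxsplit`, `re.split` would keep scanning and allocating pieces for every later occurrence, even though only the first piece is returned.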

Tag maintainer:
@agola11
@hwchase17


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-27 05:29:29 -07:00
Austin Walker
bb41252dab fix: bump min_unstructured_version for UnstructuredAPIFileLoader (#11025)
**Description:** New metadata fields were added to
`unstructured==0.10.15`, and our hosted api has been updated to reflect
this. When users call `partition_via_api` with an older version of the
library, they'll hit a parsing error related to the new fields.
2023-09-27 05:28:06 -07:00
William FH
75b3893daf Fix runnable branch callbacks (#11091)
We aren't calling on_chain_end here unless we use the default option
2023-09-27 11:38:56 +01:00
Bagatur
6c5251feb0 poetry 2023-09-26 20:12:49 -07:00
Bagatur
5310184f96 poetry 2023-09-26 20:12:29 -07:00
Cynthia Yang
6dd44ff1c0 Refactor Fireworks and add ChatFireworks (#3) (#10597)
Description 
* Refactor Fireworks within Langchain LLMs.
* Remove FireworksChat within Langchain LLMs.
* Add ChatFireworks (which uses chat completion api) to Langchain chat
models.
* Users have to install `fireworks-ai` and register an api key to use
the api.

Issue - Not applicable
Dependencies - None
Tag maintainer - @rlancemartin @baskaryan
2023-09-26 20:11:55 -07:00
Bagatur
5514ebe859 Don't type chains in output_parsers (#11092)
Can't use TYPE_CHECKING style imports for pydantic params because it will try to instantiate the typed object by default.
2023-09-26 17:49:35 -07:00
CG80499
64385c4eae Make pairwise comparison chain more like LLM as a judge (#11013)
- **Description:** Adds LLM as a judge as an eval chain
- **Tag maintainer:** @hwchase17

---------

Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
2023-09-26 13:19:04 -07:00
Joseph McElroy
175ef0a55d [ElasticsearchStore] Enable custom Bulk Args (#11065)
This enables bulk args like `chunk_size` to be passed down from the
ingest methods (from_text, from_documents) to the bulk API.

This helps alleviate issues where bulk importing a large number of
documents into Elasticsearch was resulting in a timeout.
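
As a rough illustration of why a smaller `chunk_size` can help, the sketch below splits a large ingest into batches so that each request stays small. The `chunked` helper is hypothetical and uses only the standard library; it is not the Elasticsearch client's bulk helper or `ElasticsearchStore` code.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `chunk_size` items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, chunk_size))
        if not batch:
            return
        yield batch


# Each batch would correspond to one bulk request in a real ingest.
batches = list(chunked(range(5), 2))
```

Smaller batches mean more round trips but less work per request, which is the trade-off a caller tunes when a single large request times out.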

Contribution Shoutout
- @elastic

- [x] Updated Integration tests

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-26 12:53:50 -07:00
Eugene Yurtsev
d19fd0cfae LogEntry/LogStream use str instead of uuid for id (#11080)
Cast the UUID to a string
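
The commit message does not state the motivation. One plausible reason, assumed here, is that string ids serialize more easily than `UUID` objects. A minimal standard-library sketch of that behaviour (variable names are illustrative only):

```python
import json
import uuid

run_id = uuid.uuid4()

try:
    json.dumps({"id": run_id})  # UUID objects are not JSON serializable
    uuid_is_json_serializable = True
except TypeError:
    uuid_is_json_serializable = False

# Casting to str gives a value that round-trips through JSON unchanged.
payload = json.dumps({"id": str(run_id)})
restored = uuid.UUID(json.loads(payload)["id"])
```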
2023-09-26 20:38:51 +01:00
Bagatur
d85339b9f2 extract sublinks exclude by abs path (#11079) 2023-09-26 12:26:27 -07:00
Bagatur
7ee8b2d1bf exclude dirs in async recursive loading (#11077) 2023-09-26 09:59:04 -07:00
Leonid Ganeline
21199cc7b4 📖 docs: fixed integrations/document loaders toc (#9281)
Fixed navbar:
- renamed several files, so ToC is sorted correctly
- made ToC items consistent: formatted several Titles
- added several links
- reformatted several docs to a consistent format
- renamed several files (removed `_example` suffix)
- added renamed files to the `docs/docs_skeleton/vercel.json`
2023-09-26 09:47:37 -07:00
Bagatur
0ea384d575 fix multiple chains lcel how to (#11074) 2023-09-26 08:39:02 -07:00
olgavrou
3a299b9680 Merge pull request #15 from VowpalWabbit/move_things_around
Move everything into langchain_experimental
2023-09-11 20:46:23 +03:00
olgavrou
32445de365 remove log line 2023-09-11 13:44:24 -04:00
olgavrou
30d02e3a34 fix linting 2023-09-11 13:36:01 -04:00
olgavrou
42d0d485a9 black formatting 2023-09-11 13:33:43 -04:00
olgavrou
ccea1e9147 fix linting error 2023-09-11 13:31:47 -04:00
olgavrou
7185fdc990 check if libcublas is available before running extended tests 2023-09-11 13:26:41 -04:00
olgavrou
248db75cd6 fix linting errors 2023-09-11 13:01:18 -04:00
olgavrou
631289a38d move unit tests into integration tests 2023-09-11 12:46:24 -04:00
olgavrou
a2f29bf595 ignore linting 2023-09-11 12:45:39 -04:00
olgavrou
534f1b63c5 Merge remote-tracking branch 'origin' into move_things_around 2023-09-11 12:23:58 -04:00
olgavrou
3d700aa654 merge from upstream/master 2023-09-11 12:23:03 -04:00
olgavrou
2dba4046fa update experimental poetry lock 2023-09-11 12:20:19 -04:00
olgavrou
b78d672a43 merge from upstream/master 2023-09-11 12:18:23 -04:00
olgavrou
11f20cded1 move everything into experimental 2023-09-11 12:16:08 -04:00
olgavrou
514857c10e Merge pull request #13 from VowpalWabbit/small_dep_fixes
fixes
2023-09-05 13:01:01 -04:00
olgavrou
15d33a144d Merge pull request #14 from VowpalWabbit/notebook_fix
Notebook fix
2023-09-05 12:15:52 -04:00
olgavrou
235dacc74a Merge branch 'langchain-ai:master' into master 2023-09-05 11:14:08 -04:00
olgavrou
3a4c895280 Merge pull request #11 from VowpalWabbit/add_notebook
add random policy and notebook example
2023-09-05 09:36:20 -04:00
olgavrou
327ea43c67 Empty-Commit 2023-09-05 00:14:04 -04:00
olgavrou
1d4e73b9f8 Merge remote-tracking branch 'origin' into small_dep_fixes 2023-09-04 23:55:38 -04:00
olgavrou
d6320cc2c0 .. 2023-09-04 23:47:26 -04:00
olgavrou
7a4387c60d notebook fix 2023-09-04 23:46:04 -04:00
olgavrou
e1791225ae Merge remote-tracking branch 'origin' into small_dep_fixes 2023-09-04 22:49:16 -04:00
olgavrou
fdb611cc42 update poetry 2023-09-04 22:45:50 -04:00
olgavrou
8d3a8fbefe fixes 2023-09-04 22:31:15 -04:00
olgavrou
9c45d5a27e restore hash keys 2023-09-04 20:58:05 -04:00
olgavrou
f22fcb8bcd no cache 2023-09-04 20:52:18 -04:00
olgavrou
8dc5365ee2 no cache key 2023-09-04 20:50:25 -04:00
olgavrou
5b6ebbc825 fixes in notebook 2023-09-04 19:42:43 -04:00
olgavrou
5c2069890f policy fixes 2023-09-04 18:46:45 -04:00
olgavrou
736e0dd46e fix 2023-09-04 18:40:53 -04:00
olgavrou
5b1812f95b fix linting checks 2023-09-04 18:35:59 -04:00
olgavrou
f1d144cd6c run notebook and change location 2023-09-04 18:33:05 -04:00
olgavrou
62cf108700 add random policy and notebook 2023-09-04 18:08:46 -04:00
olgavrou
af4b560b86 fix poetry after merge 2023-09-04 17:28:11 -04:00
olgavrou
00d56fb0fc merge from upstream 2023-09-04 16:48:59 -04:00
olgavrou
b59e2b5afa Merge pull request #10 from VowpalWabbit/dot_prods_auto_embed
Dot prods auto embed
2023-09-05 05:01:42 -04:00
olgavrou
ae5edefdcd cleanup 2023-09-04 16:36:29 -04:00
olgavrou
e10980d445 fix linting error 2023-09-04 08:56:34 -04:00
olgavrou
0f7cde023b fix linting errors 2023-09-04 08:43:48 -04:00
olgavrou
4e9aecda90 formatting 2023-09-04 08:35:29 -04:00
olgavrou
67dc1a9dd2 cleanup 2023-09-04 07:36:47 -04:00
olgavrou
ca163f0ee6 fixes and tests 2023-09-04 07:10:44 -04:00
olgavrou
b162f1c8e1 dot product of encodings as default auto_embed 2023-09-04 05:50:15 -04:00
olgavrou
a9ba6a8cd1 Merge pull request #9 from VowpalWabbit/fix_embedding_w_indexes
proper embeddings and rolling window average
2023-09-01 10:07:53 -04:00
olgavrou
2b90a8afa2 Merge branch 'langchain-ai:master' into master 2023-09-01 04:10:49 -04:00
olgavrou
2c877a4a34 proper embeddings and rolling window average 2023-08-31 20:14:41 -04:00
olgavrou
b7d0e4835e Merge branch 'langchain-ai:master' into master 2023-08-31 08:02:14 -04:00
olgavrou
dfc3295a2c Merge branch 'langchain-ai:master' into master 2023-08-30 04:03:20 -04:00
olgavrou
256849e02a Merge pull request #8 from VowpalWabbit/update_w_score
update score to take entire response object to make it easier for user
2023-08-29 09:18:52 -04:00
olgavrou
d46ad01ee0 Merge pull request #7 from VowpalWabbit/scorer_activate_deactivate
activate and deactivate scorer
2023-08-29 09:12:11 -04:00
olgavrou
5fb781dfde Merge pull request #6 from VowpalWabbit/cb_defaults
cb defaults and some fixes
2023-08-29 08:47:28 -04:00
olgavrou
48aaa27bf7 update score to take entire response object to make it easier for user 2023-08-29 08:46:55 -04:00
olgavrou
c4ccaebbbb activate and deactivate scorer 2023-08-29 08:37:59 -04:00
olgavrou
7eaaad51de cb defaults and some fixes 2023-08-29 07:42:45 -04:00
olgavrou
42bdb003ee Merge pull request #5 from VowpalWabbit/nosockettests
unit tests to use mock encoder
2023-08-29 07:28:03 -04:00
olgavrou
f8b5c2977a restore ci workflow 2023-08-29 07:17:40 -04:00
olgavrou
5727148f2b make sure test don't try to download sentence transformer models 2023-08-29 07:09:58 -04:00
olgavrou
72eab3b37e test 2023-08-29 06:35:27 -04:00
olgavrou
4b930f58e9 test 2023-08-29 06:28:07 -04:00
olgavrou
0a2724d8c7 test 2023-08-29 06:27:56 -04:00
olgavrou
5de212d907 Merge branch 'langchain-ai:master' into master 2023-08-29 05:58:22 -04:00
olgavrou
f7fb083aba Merge pull request #3 from VowpalWabbit/fix_linting
Fix mypy errors
2023-08-29 05:58:03 -04:00
olgavrou
4e6e03ef50 fix mypy complaint 2023-08-29 05:51:52 -04:00
olgavrou
d50c0f139d re order imports 2023-08-29 05:46:56 -04:00
olgavrou
758225dc17 include type 2023-08-29 05:44:09 -04:00
olgavrou
44485c2b26 make input arg type more explicit 2023-08-29 05:42:45 -04:00
olgavrou
8d10a52525 fix linting complaints 2023-08-29 05:36:45 -04:00
olgavrou
b3c0728de2 fix mypy errors in tests 2023-08-29 05:28:43 -04:00
olgavrou
0b8691c6e5 fix all mypy errors and some renaming and refactoring 2023-08-29 05:19:19 -04:00
olgavrou
a11ad11d06 fix all mypy errors 2023-08-29 03:59:01 -04:00
olgavrou
dd6fff1c62 no errors in pick best chain 2023-08-28 08:13:23 -04:00
olgavrou
6a1102d4c0 mypy fixes and formatting 2023-08-28 06:58:33 -04:00
olgavrou
7725192a0d update deps for vw 2023-08-28 04:58:55 -04:00
olgavrou
2bfa73257f sync from upstream master 2023-08-28 04:15:57 -04:00
olgavrou
571ee718ba Merge pull request #2 from VowpalWabbit/fixes
Dependency and import fixes
2023-08-22 13:39:46 -04:00
olgavrou
e9423300d9 Merge pull request #1 from VowpalWabbit/add_rl_chain
Initial commit of rl_chain code
2023-08-22 09:18:23 -04:00
olgavrou
c9e9c0eeae add sentence transformers to extended test deps 2023-08-18 07:56:20 -04:00
olgavrou
44badd0707 add dependency requirements to test file 2023-08-18 07:19:56 -04:00
olgavrou
e276ae2616 linting and formatting 2023-08-18 07:12:39 -04:00
olgavrou
5aafb3bc46 resolving linting and formatting errors 2023-08-18 07:09:30 -04:00
olgavrou
a2f807e055 make vw dependency optional 2023-08-18 05:51:26 -04:00
olgavrou
1ae5a9c7a3 fix lock, imports, deps, test w deps, typo, formatting 2023-08-18 05:45:21 -04:00
olgavrou
a6f9dccc35 rename rl_chain_base to base and update paths and imports 2023-08-18 03:42:17 -04:00
olgavrou
b422dc035f fix imports 2023-08-18 03:23:20 -04:00
olgavrou
c37fd29fd8 move tests to correct directory and cleanup slates examples 2023-08-18 02:22:00 -04:00
olgavrou
56b40beb0e keep only what is needed for first PR 2023-08-18 02:04:35 -04:00
olgavrou
6de1ca4251 Imported changes from repo VowpalWabbit/rl_chain into rl_chain directory 2023-08-18 02:02:01 -04:00
2874 changed files with 264171 additions and 37595 deletions


@@ -5,10 +5,10 @@ This project includes a [dev container](https://containers.dev/), which lets you
You can use the dev container configuration in this folder to build and run the app without needing to install any of its tools locally! You can use it in [GitHub Codespaces](https://github.com/features/codespaces) or the [VS Code Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers).
## GitHub Codespaces
[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/hwchase17/langchain)
[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/langchain-ai/langchain)
You may use the button above, or follow these steps to open this repo in a Codespace:
1. Click the **Code** drop-down menu at the top of https://github.com/hwchase17/langchain.
1. Click the **Code** drop-down menu at the top of https://github.com/langchain-ai/langchain.
1. Click on the **Codespaces** tab.
1. Click **Create codespace on master** .
@@ -17,13 +17,16 @@ For more info, check out the [GitHub documentation](https://docs.github.com/en/f
## VS Code Dev Containers
[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain)
Note: If you click this link you will open the main repo and not your local cloned repo, you can use this link and replace with your username and cloned repo name:
Note: If you click the link above you will open the main repo (langchain-ai/langchain) and not your local cloned repo. This is fine if you only want to run and test the library, but if you want to contribute you can use the link below and replace with your username and cloned repo name:
```
https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/<yourusername>/<yourclonedreponame>
```
Then you will have a local cloned repo where you can contribute and then create pull requests.
If you already have VS Code and Docker installed, you can use the button above to get started. This will cause VS Code to automatically install the Dev Containers extension if needed, clone the source code into a container volume, and spin up a dev container for use.
You can also follow these steps to open this repo in a container using the VS Code Dev Containers extension:
Alternatively you can also follow these steps to open this repo in a container using the VS Code Dev Containers extension:
1. If this is your first time using a development container, please ensure your system meets the pre-reqs (i.e. have Docker installed) in the [getting started steps](https://aka.ms/vscode-remote/containers/getting-started).

132
.github/CODE_OF_CONDUCT.md vendored Normal file

@@ -0,0 +1,132 @@
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the overall
community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or advances of
any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email address,
without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
conduct@langchain.dev.
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series of
actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or permanent
ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within the
community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.1, available at
[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].
For answers to common questions about this code of conduct, see the FAQ at
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
[https://www.contributor-covenant.org/translations][translations].
[homepage]: https://www.contributor-covenant.org
[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations


@@ -1,20 +1,19 @@
# Contributing to LangChain
Hi there! Thank you for even being interested in contributing to LangChain.
As an open source project in a rapidly developing field, we are extremely open
to contributions, whether they be in the form of new features, improved infra, better documentation, or bug fixes.
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether they involve new features, improved infrastructure, better documentation, or bug fixes.
## 🗺️ Guidelines
### 👩‍💻 Contributing Code
To contribute to this project, please follow a ["fork and pull request"](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) workflow.
To contribute to this project, please follow the ["fork and pull request"](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) workflow.
Please do not try to push directly to this repo unless you are a maintainer.
Please follow the checked-in pull request template when opening pull requests. Note related issues and tag relevant
maintainers.
Pull requests cannot land without passing the formatting, linting and testing checks first. See [Testing](#testing) and
Pull requests cannot land without passing the formatting, linting, and testing checks first. See [Testing](#testing) and
[Formatting and Linting](#formatting-and-linting) for how to run these checks locally.
It's essential that we maintain great documentation and testing. If you:
@@ -27,16 +26,14 @@ It's essential that we maintain great documentation and testing. If you:
- Add a demo notebook in `docs/modules`.
- Add unit and integration tests.
We're a small, building-oriented team. If there's something you'd like to add or change, opening a pull request is the
We are a small, progress-oriented team. If there's something you'd like to add or change, opening a pull request is the
best way to get our attention.
### 🚩GitHub Issues
Our [issues](https://github.com/hwchase17/langchain/issues) page is kept up to date
with bugs, improvements, and feature requests.
Our [issues](https://github.com/langchain-ai/langchain/issues) page is kept up to date with bugs, improvements, and feature requests.
There is a taxonomy of labels to help with sorting and discovery of issues of interest. Please use these to help
organize issues.
There is a taxonomy of labels to help with sorting and discovery of issues of interest. Please use these to help organize issues.
If you start working on an issue, please assign it to yourself.
@@ -59,12 +56,12 @@ we do not want these to get in the way of getting good code into the codebase.
## 🚀 Quick Start
This quick start describes running the repository locally.
For a [development container](https://containers.dev/), see the [.devcontainer folder](https://github.com/hwchase17/langchain/tree/master/.devcontainer).
This quick start guide explains how to run the repository locally.
For a [development container](https://containers.dev/), see the [.devcontainer folder](https://github.com/langchain-ai/langchain/tree/master/.devcontainer).
### Dependency Management: Poetry and other env/dependency managers
This project uses [Poetry](https://python-poetry.org/) v1.5.1+ as a dependency manager.
This project utilizes [Poetry](https://python-poetry.org/) v1.6.1+ as a dependency manager.
❗Note: *Before installing Poetry*, if you use `Conda`, create and activate a new Conda env (e.g. `conda create -n langchain python=3.9`)
@@ -75,11 +72,11 @@ tell Poetry to use the virtualenv python environment (`poetry config virtualenvs
### Core vs. Experimental
There are two separate projects in this repository:
- `langchain`: core langchain code, abstractions, and use cases
- `langchain.experimental`: see the [Experimental README](../libs/experimental/README.md) for more information.
This repository contains two separate projects:
- `langchain`: core langchain code, abstractions, and use cases.
- `langchain.experimental`: see the [Experimental README](https://github.com/langchain-ai/langchain/tree/master/libs/experimental/README.md) for more information.
Each of these has their own development environment. Docs are run from the top-level makefile, but development
Each of these has its own development environment. Docs are run from the top-level makefile, but development
is split across separate test & release flows.
For this quickstart, start with langchain core:
@@ -105,8 +102,8 @@ make test
If the tests don't pass, you may need to pip install additional dependencies, such as `numexpr` and `openapi_schema_pydantic`.
If during installation you receive a `WheelFileValidationError` for `debugpy`, please make sure you are running
Poetry v1.5.1+. This bug was present in older versions of Poetry (e.g. 1.4.1) and has been resolved in newer releases.
If you are still seeing this bug on v1.5.1, you may also try disabling "modern installation"
Poetry v1.6.1+. This bug was present in older versions of Poetry (e.g. 1.4.1) and has been resolved in newer releases.
If you are still seeing this bug on v1.6.1, you may also try disabling "modern installation"
(`poetry config installer.modern-installation false`) and re-installing requirements.
See [this `debugpy` issue](https://github.com/microsoft/debugpy/issues/1246) for more details.
@@ -129,7 +126,7 @@ To run unit tests in Docker:
make docker_tests
```
There are also [integration tests and code-coverage](../libs/langchain/tests/README.md) available.
There are also [integration tests and code-coverage](https://github.com/langchain-ai/langchain/tree/master/libs/langchain/tests/README.md) available.
### Formatting and Linting
@@ -137,14 +134,21 @@ Run these locally before submitting a PR; the CI system will check also.
#### Code Formatting
Formatting for this project is done via a combination of [Black](https://black.readthedocs.io/en/stable/) and [ruff](https://docs.astral.sh/ruff/rules/).
Formatting for this project is done via [ruff](https://docs.astral.sh/ruff/rules/).
To run formatting for this project:
To run formatting for docs, cookbook and templates:
```bash
make format
```
To run formatting for a library, run the same command from the relevant library directory:
```bash
cd libs/{LIBRARY}
make format
```
Additionally, you can run the formatter only on the files that have been modified in your current branch as compared to the master branch using the format_diff command:
```bash
@@ -155,14 +159,21 @@ This is especially useful when you have made changes to a subset of the project
#### Linting
Linting for this project is done via a combination of [Black](https://black.readthedocs.io/en/stable/), [ruff](https://docs.astral.sh/ruff/rules/), and [mypy](http://mypy-lang.org/).
Linting for this project is done via a combination of [ruff](https://docs.astral.sh/ruff/rules/) and [mypy](http://mypy-lang.org/).
To run linting for this project:
To run linting for docs, cookbook and templates:
```bash
make lint
```
To run linting for a library, run the same command from the relevant library directory:
```bash
cd libs/{LIBRARY}
make lint
```
In addition, you can run the linter only on the files that have been modified in your current branch as compared to the master branch using the lint_diff command:
```bash
@@ -282,13 +293,20 @@ make docs_build
make api_docs_build
```
Finally, you can run the linkchecker to make sure all links are valid:
Finally, run the link checker to ensure all links are valid:
```bash
make docs_linkcheck
make api_docs_linkcheck
```
### Verify Documentation changes
After pushing documentation changes to the repository, you can preview and verify that the changes are
what you wanted by clicking the `View deployment` or `Visit Preview` buttons on the pull request `Conversation` page.
This will take you to a preview of the documentation changes.
This preview is created by [Vercel](https://vercel.com/docs/getting-started-with-vercel).
## 🏭 Release Process
As of now, LangChain has an ad hoc release process: releases are cut with high frequency by
@@ -300,4 +318,4 @@ even patch releases may contain [non-backwards-compatible changes](https://semve
### 🌟 Recognition
If your contribution has made its way into a release, we will want to give you credit on Twitter (only if you want though)!
If you have a Twitter account you would like us to mention, please let us know in the PR or in another manner.
If you have a Twitter account you would like us to mention, please let us know in the PR or through another means.


@@ -27,4 +27,4 @@ body:
attributes:
label: Your contribution
description: |
Is there any way that you could help, e.g. by submitting a PR? Make sure to read the CONTRIBUTING.MD [readme](https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md)
Is there any way that you could help, e.g. by submitting a PR? Make sure to read the CONTRIBUTING.MD [readme](https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md)


@@ -10,7 +10,7 @@ Replace this entire comment with:
Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally.
See contribution guidelines for more information on how to write/run tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on network access,


@@ -0,0 +1,57 @@
name: compile-integration-test
on:
workflow_call:
inputs:
working-directory:
required: true
type: string
description: "From which folder this pipeline executes"
env:
POETRY_VERSION: "1.6.1"
jobs:
build:
defaults:
run:
working-directory: ${{ inputs.working-directory }}
runs-on: ubuntu-latest
strategy:
matrix:
python-version:
- "3.8"
- "3.9"
- "3.10"
- "3.11"
name: Python ${{ matrix.python-version }}
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ matrix.python-version }}
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: ${{ inputs.working-directory }}
cache-key: compile-integration
- name: Install integration dependencies
shell: bash
run: poetry install --with=test_integration
- name: Check integration tests compile
shell: bash
run: poetry run pytest -m compile tests/integration_tests
- name: Ensure the tests did not create any additional files
shell: bash
run: |
set -eu
STATUS="$(git status)"
echo "$STATUS"
# grep will exit non-zero if the target message isn't found,
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'
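
The last step of this workflow fails the job when the test run leaves new or modified files in the checkout. A minimal Python sketch of the same check, assuming the `git status` text has already been captured; the name `working_tree_is_clean` is hypothetical and does not exist in the repository:

```python
CLEAN_MARKER = "nothing to commit, working tree clean"


def working_tree_is_clean(status_output: str) -> bool:
    """Return True when captured `git status` output reports a clean tree."""
    # Mirrors `grep 'nothing to commit, working tree clean'`:
    # a substring match against any line of the output.
    return any(CLEAN_MARKER in line for line in status_output.splitlines())


# Illustrative inputs, not real CI logs.
clean = "On branch master\nnothing to commit, working tree clean\n"
dirty = "On branch master\nUntracked files:\n\tscratch.txt\n"
print(working_tree_is_clean(clean), working_tree_is_clean(dirty))
```

Matching on the human-readable message ties the check to git's English wording. `git status --porcelain`, which prints nothing for a clean tree, is a possible alternative if that ever becomes a problem.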


@@ -7,20 +7,21 @@ on:
required: true
type: string
description: "From which folder this pipeline executes"
langchain-location:
required: false
type: string
description: "Relative path to the langchain library folder"
env:
POETRY_VERSION: "1.5.1"
POETRY_VERSION: "1.6.1"
WORKDIR: ${{ inputs.working-directory == '' && '.' || inputs.working-directory }}
# This env var allows us to get inline annotations when ruff has complaints.
RUFF_OUTPUT_FORMAT: github
jobs:
build:
runs-on: ubuntu-latest
env:
# This number is set "by eye": we want it to be big enough
# so that it's bigger than the number of commits in any reasonable PR,
# and also as small as possible since increasing the number makes
# the initial `git fetch` slower.
FETCH_DEPTH: 50
strategy:
matrix:
# Only lint on the min and max supported Python versions.
@@ -34,52 +35,7 @@ jobs:
- "3.8"
- "3.11"
steps:
- uses: actions/checkout@v3
with:
# Fetch the last FETCH_DEPTH commits, so the mtime-changing script
# can accurately set the mtimes of files modified in the last FETCH_DEPTH commits.
fetch-depth: ${{ env.FETCH_DEPTH }}
- name: Restore workdir file mtimes to last-edited commit date
id: restore-mtimes
# This is needed to make black caching work.
# Black's cache uses file (mtime, size) to check whether a lookup is a cache hit.
# Without this command, files in the repo would have the current time as the modified time,
# since the previous action step just created them.
# This command resets the mtime to the last time the files were modified in git instead,
# which is a high-quality and stable representation of the last modification date.
run: |
# Important considerations:
# - These commands run at base of the repo, since we never `cd` to the `WORKDIR`.
# - We only want to alter mtimes for Python files, since that's all black checks.
# - We don't need to alter mtimes for directories, since black doesn't look at those.
# - We also only alter mtimes inside the `WORKDIR` since that's all we'll lint.
# - This should run before `poetry install`, because poetry's venv also contains
# Python files, and we don't want to alter their mtimes since they aren't linted.
# Ensure we fail on non-zero exits and on undefined variables.
# Also print executed commands, for easier debugging.
set -eux
# Restore the mtimes of Python files in the workdir based on git history.
.github/tools/git-restore-mtime --no-directories "$WORKDIR/**/*.py"
# Since CI only does a partial fetch (to `FETCH_DEPTH`) for efficiency,
# the local git repo doesn't have full history. There are probably files
# that were last modified in a commit *older than* the oldest fetched commit.
# After `git-restore-mtime`, such files have a mtime set to the oldest fetched commit.
#
# As new commits get added, that timestamp will keep moving forward.
# If left unchanged, this will make `black` think that the files were edited
# more recently than its cache suggests. Instead, we can set their mtime
# to a fixed date in the far past that won't change and won't cause cache misses in black.
#
# For all workdir Python files modified in or before the oldest few fetched commits,
# make their mtime be 2000-01-01 00:00:00.
OLDEST_COMMIT="$(git log --reverse '--pretty=format:%H' | head -1)"
OLDEST_COMMIT_TIME="$(git show -s '--format=%ai' "$OLDEST_COMMIT")"
find "$WORKDIR" -name '*.py' -type f -not -newermt "$OLDEST_COMMIT_TIME" -exec touch -c -m -t '200001010000' '{}' '+'
echo "oldest-commit=$OLDEST_COMMIT" >> "$GITHUB_OUTPUT"
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
@@ -116,22 +72,11 @@ jobs:
- name: Install langchain editable
working-directory: ${{ inputs.working-directory }}
if: ${{ inputs.working-directory != 'libs/langchain' }}
run: |
pip install -e ../langchain
- name: Restore black cache
uses: actions/cache@v3
if: ${{ inputs.langchain-location }}
env:
CACHE_BASE: black-${{ runner.os }}-${{ runner.arch }}-py${{ matrix.python-version }}-${{ inputs.working-directory }}-${{ hashFiles(format('{0}/poetry.lock', env.WORKDIR)) }}
SEGMENT_DOWNLOAD_TIMEOUT_MIN: "1"
with:
path: |
${{ env.WORKDIR }}/.black_cache
key: ${{ env.CACHE_BASE }}-${{ steps.restore-mtimes.outputs.oldest-commit }}
restore-keys:
# If we can't find an exact match for our cache key, accept any with this prefix.
${{ env.CACHE_BASE }}-
LANGCHAIN_LOCATION: ${{ inputs.langchain-location }}
run: |
pip install -e "$LANGCHAIN_LOCATION"
- name: Get .mypy_cache to speed up mypy
uses: actions/cache@v3
@@ -144,7 +89,5 @@ jobs:
- name: Analysing the code with our lint
working-directory: ${{ inputs.working-directory }}
env:
BLACK_CACHE_DIR: .black_cache
run: |
make lint


@@ -9,7 +9,7 @@ on:
description: "From which folder this pipeline executes"
env:
POETRY_VERSION: "1.5.1"
POETRY_VERSION: "1.6.1"
jobs:
build:
@@ -26,7 +26,7 @@ jobs:
- "3.11"
name: Pydantic v1/v2 compatibility - Python ${{ matrix.python-version }}
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"


@@ -9,13 +9,121 @@ on:
description: "From which folder this pipeline executes"
env:
POETRY_VERSION: "1.5.1"
PYTHON_VERSION: "3.10"
POETRY_VERSION: "1.6.1"
jobs:
if_release:
# Disallow publishing from branches that aren't `master`.
build:
if: github.ref == 'refs/heads/master'
runs-on: ubuntu-latest
outputs:
pkg-name: ${{ steps.check-version.outputs.pkg-name }}
version: ${{ steps.check-version.outputs.version }}
steps:
- uses: actions/checkout@v4
- name: Set up Python + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ env.PYTHON_VERSION }}
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: ${{ inputs.working-directory }}
cache-key: release
# We want to keep this build stage *separate* from the release stage,
# so that there's no sharing of permissions between them.
# The release stage has trusted publishing and GitHub repo contents write access,
# and we want to keep the scope of that access limited just to the release job.
# Otherwise, a malicious `build` step (e.g. via a compromised dependency)
# could get access to our GitHub or PyPI credentials.
#
# Per the trusted publishing GitHub Action:
# > It is strongly advised to separate jobs for building [...]
# > from the publish job.
# https://github.com/pypa/gh-action-pypi-publish#non-goals
- name: Build project for distribution
run: poetry build
working-directory: ${{ inputs.working-directory }}
- name: Upload build
uses: actions/upload-artifact@v3
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
- name: Check Version
id: check-version
shell: bash
working-directory: ${{ inputs.working-directory }}
run: |
echo pkg-name="$(poetry version | cut -d ' ' -f 1)" >> $GITHUB_OUTPUT
echo version="$(poetry version --short)" >> $GITHUB_OUTPUT
test-pypi-publish:
needs:
- build
uses:
./.github/workflows/_test_release.yml
with:
working-directory: ${{ inputs.working-directory }}
secrets: inherit
pre-release-checks:
needs:
- build
- test-pypi-publish
runs-on: ubuntu-latest
steps:
# We explicitly *don't* set up caching here. This ensures our tests are
# maximally sensitive to catching breakage.
#
# For example, here's a way that caching can cause a falsely-passing test:
# - Make the langchain package manifest no longer list a dependency package
# as a requirement. This means it won't be installed by `pip install`,
# and attempting to use it would cause a crash.
# - That dependency used to be required, so it may have been cached.
# When restoring the venv packages from cache, that dependency gets included.
# - Tests pass, because the dependency is present even though it wasn't specified.
# - The package is published, and it breaks on the missing dependency when
# used in the real world.
- uses: actions/setup-python@v4
with:
python-version: ${{ env.PYTHON_VERSION }}
- name: Test published package
shell: bash
env:
PKG_NAME: ${{ needs.build.outputs.pkg-name }}
VERSION: ${{ needs.build.outputs.version }}
# Here we specify:
# - The test PyPI index as the *primary* index, meaning that it takes priority.
# - The regular PyPI index as an extra index, so that any dependencies that
# are not found on test PyPI can be resolved and installed anyway.
#
# Without the former, we might install the wrong langchain release.
# Without the latter, we might not be able to install langchain's dependencies.
#
# TODO: add more in-depth pre-publish tests after testing that importing works
run: |
pip install \
--index-url https://test.pypi.org/simple/ \
--extra-index-url https://pypi.org/simple/ \
"$PKG_NAME==$VERSION"
# Replace all dashes in the package name with underscores,
# since that's how Python imports packages with dashes in the name.
IMPORT_NAME="$(echo "$PKG_NAME" | sed s/-/_/g)"
python -c "import $IMPORT_NAME; print(dir($IMPORT_NAME))"
publish:
needs:
- build
- test-pypi-publish
- pre-release-checks
runs-on: ubuntu-latest
permissions:
# This permission is used for trusted publishing:
# https://blog.pypi.org/posts/2023-04-20-introducing-trusted-publishers/
@@ -24,28 +132,65 @@ jobs:
# https://docs.pypi.org/trusted-publishers/adding-a-publisher/
id-token: write
# This permission is needed by `ncipollo/release-action` to create the GitHub release.
contents: write
defaults:
run:
working-directory: ${{ inputs.working-directory }}
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Set up Python + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: "3.10"
python-version: ${{ env.PYTHON_VERSION }}
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: ${{ inputs.working-directory }}
cache-key: release
- name: Build project for distribution
run: poetry build
- name: Check Version
id: check-version
run: |
echo version=$(poetry version --short) >> $GITHUB_OUTPUT
- uses: actions/download-artifact@v3
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
- name: Publish package distributions to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
packages-dir: ${{ inputs.working-directory }}/dist/
verbose: true
print-hash: true
mark-release:
needs:
- build
- test-pypi-publish
- pre-release-checks
- publish
runs-on: ubuntu-latest
permissions:
# This permission is needed by `ncipollo/release-action` to
# create the GitHub release.
contents: write
defaults:
run:
working-directory: ${{ inputs.working-directory }}
steps:
- uses: actions/checkout@v4
- name: Set up Python + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ env.PYTHON_VERSION }}
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: ${{ inputs.working-directory }}
cache-key: release
- uses: actions/download-artifact@v3
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
- name: Create Release
uses: ncipollo/release-action@v1
if: ${{ inputs.working-directory == 'libs/langchain' }}
@@ -54,11 +199,5 @@ jobs:
token: ${{ secrets.GITHUB_TOKEN }}
draft: false
generateReleaseNotes: true
tag: v${{ steps.check-version.outputs.version }}
tag: v${{ needs.build.outputs.version }}
commit: master
- name: Publish package distributions to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
packages-dir: ${{ inputs.working-directory }}/dist/
verbose: true
print-hash: true
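
The `pre-release-checks` job in this workflow installs the freshly published package and imports it, after converting the distribution name to an import name with `sed s/-/_/g`. A small Python sketch of that conversion and of the `pip install` arguments the job uses; both function names are hypothetical, and the version string in the example is illustrative only:

```python
from typing import List


def import_name_for(package_name: str) -> str:
    """Map a distribution name to the module name used in `import`."""
    # Same transformation as `sed s/-/_/g`: every dash becomes an underscore.
    return package_name.replace("-", "_")


def pip_install_args(package_name: str, version: str) -> List[str]:
    """Build the pip arguments used by the job: test PyPI first, PyPI as extra."""
    return [
        "pip",
        "install",
        "--index-url",
        "https://test.pypi.org/simple/",
        "--extra-index-url",
        "https://pypi.org/simple/",
        f"{package_name}=={version}",
    ]


print(import_name_for("langchain-experimental"))
print(" ".join(pip_install_args("langchain-experimental", "0.0.1")))
```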

.github/workflows/_release_docker.yml

@@ -0,0 +1,62 @@
name: release_docker
on:
workflow_call:
inputs:
dockerfile:
required: true
type: string
description: "Path to the Dockerfile to build"
image:
required: true
type: string
description: "Name of the image to build"
env:
TEST_TAG: ${{ inputs.image }}:test
LATEST_TAG: ${{ inputs.image }}:latest
jobs:
docker:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Get git tag
uses: actions-ecosystem/action-get-latest-tag@v1
id: get-latest-tag
- name: Set docker tag
env:
VERSION: ${{ steps.get-latest-tag.outputs.tag }}
run: |
echo "VERSION_TAG=${{ inputs.image }}:${VERSION#v}" >> $GITHUB_ENV
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Build for Test
uses: docker/build-push-action@v5
with:
context: .
file: ${{ inputs.dockerfile }}
load: true
tags: ${{ env.TEST_TAG }}
- name: Test
run: |
docker run --rm ${{ env.TEST_TAG }} python -c "import langchain"
- name: Build and Push to Docker Hub
uses: docker/build-push-action@v5
with:
context: .
file: ${{ inputs.dockerfile }}
# We can only build for the intersection of platforms supported by
# QEMU and base python image, for now build only for
# linux/amd64 and linux/arm64
platforms: linux/amd64,linux/arm64
tags: ${{ env.LATEST_TAG }},${{ env.VERSION_TAG }}
push: true
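
The `Set docker tag` step in this workflow derives the version tag from the latest git tag with the shell expansion `${VERSION#v}`, which drops one leading `v`. A hedged Python sketch of the resulting tag list; `docker_tags` is a hypothetical helper and the git tag in the example is illustrative:

```python
from typing import List


def docker_tags(image: str, git_tag: str) -> List[str]:
    """Return the tags pushed by the workflow: `latest` plus the version."""
    # Equivalent of `${VERSION#v}`: strip a single leading "v" if present.
    version = git_tag[1:] if git_tag.startswith("v") else git_tag
    return [f"{image}:latest", f"{image}:{version}"]


print(docker_tags("langchain/langchain", "v0.0.1"))
```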


@@ -9,7 +9,7 @@ on:
description: "From which folder this pipeline executes"
env:
POETRY_VERSION: "1.5.1"
POETRY_VERSION: "1.6.1"
jobs:
build:
@@ -26,7 +26,7 @@ jobs:
- "3.11"
name: Python ${{ matrix.python-version }}
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"

.github/workflows/_test_release.yml

@@ -0,0 +1,95 @@
name: test-release
on:
workflow_call:
inputs:
working-directory:
required: true
type: string
description: "From which folder this pipeline executes"
env:
POETRY_VERSION: "1.6.1"
PYTHON_VERSION: "3.10"
jobs:
build:
if: github.ref == 'refs/heads/master'
runs-on: ubuntu-latest
outputs:
pkg-name: ${{ steps.check-version.outputs.pkg-name }}
version: ${{ steps.check-version.outputs.version }}
steps:
- uses: actions/checkout@v4
- name: Set up Python + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ env.PYTHON_VERSION }}
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: ${{ inputs.working-directory }}
cache-key: release
# We want to keep this build stage *separate* from the release stage,
# so that there's no sharing of permissions between them.
# The release stage has trusted publishing and GitHub repo contents write access,
# and we want to keep the scope of that access limited just to the release job.
# Otherwise, a malicious `build` step (e.g. via a compromised dependency)
# could get access to our GitHub or PyPI credentials.
#
# Per the trusted publishing GitHub Action:
# > It is strongly advised to separate jobs for building [...]
# > from the publish job.
# https://github.com/pypa/gh-action-pypi-publish#non-goals
- name: Build project for distribution
run: poetry build
working-directory: ${{ inputs.working-directory }}
- name: Upload build
uses: actions/upload-artifact@v3
with:
name: test-dist
path: ${{ inputs.working-directory }}/dist/
- name: Check Version
id: check-version
shell: bash
working-directory: ${{ inputs.working-directory }}
run: |
echo pkg-name="$(poetry version | cut -d ' ' -f 1)" >> $GITHUB_OUTPUT
echo version="$(poetry version --short)" >> $GITHUB_OUTPUT
publish:
needs:
- build
runs-on: ubuntu-latest
permissions:
# This permission is used for trusted publishing:
# https://blog.pypi.org/posts/2023-04-20-introducing-trusted-publishers/
#
# Trusted publishing has to also be configured on PyPI for each package:
# https://docs.pypi.org/trusted-publishers/adding-a-publisher/
id-token: write
steps:
- uses: actions/checkout@v4
- uses: actions/download-artifact@v3
with:
name: test-dist
path: ${{ inputs.working-directory }}/dist/
- name: Publish to test PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
packages-dir: ${{ inputs.working-directory }}/dist/
verbose: true
print-hash: true
repository-url: https://test.pypi.org/legacy/
# We skip any existing distributions with the same name and version instead of failing.
# This is *only for CI use* and is *extremely dangerous* otherwise!
# https://github.com/pypa/gh-action-pypi-publish#tolerating-release-package-file-duplicates
skip-existing: true
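
The `Check Version` step in this workflow takes the package name from the first space-separated field of `poetry version` and the version from `poetry version --short`. A rough Python sketch of the same parsing, assuming `poetry version` prints a single `name version` line; the function names and the sample line are hypothetical:

```python
from typing import List, Tuple


def parse_poetry_version(output: str) -> Tuple[str, str]:
    """Split a `poetry version` line such as 'name 1.2.3' into its two parts."""
    # `cut -d ' ' -f 1` keeps the first field; the remainder is the version.
    name, version = output.strip().split(" ", 1)
    return name, version


def job_output_lines(output: str) -> List[str]:
    """Build the `key=value` lines the step appends to $GITHUB_OUTPUT."""
    name, version = parse_poetry_version(output)
    return [f"pkg-name={name}", f"version={version}"]


print(job_output_lines("langchain 0.0.1\n"))
```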


@@ -17,8 +17,20 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@v3
uses: actions/checkout@v4
- name: Install Dependencies
run: |
pip install toml
- name: Extract Ignore Words List
run: |
# Use a Python script to extract the ignore words list from pyproject.toml
python .github/workflows/extract_ignored_words_list.py
id: extract_ignore_words
- name: Codespell
uses: codespell-project/actions-codespell@v2
with:
skip: guide_imports.json
ignore_words_list: ${{ steps.extract_ignore_words.outputs.ignore_words_list }}


@@ -1,11 +1,17 @@
---
name: Documentation Lint
name: Docs, templates, cookbook lint
on:
push:
branches: [master]
branches: [ master ]
pull_request:
branches: [master]
paths:
- 'docs/**'
- 'templates/**'
- 'cookbook/**'
- '.github/workflows/_lint.yml'
- '.github/workflows/doc_lint.yml'
workflow_dispatch:
jobs:
check:
@@ -13,10 +19,17 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@v2
uses: actions/checkout@v4
- name: Run import check
run: |
# We should not encourage imports directly from main init file
# Except for hub
git grep 'from langchain import' docs/{extras,docs_skeleton,snippets} | grep -vE 'from langchain import (hub)' && exit 1 || exit 0
git grep 'from langchain import' {docs/docs,templates,cookbook} | grep -vE 'from langchain import (hub)' && exit 1 || exit 0
lint:
uses:
./.github/workflows/_lint.yml
with:
working-directory: "."
secrets: inherit
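
The `Run import check` step in this workflow rejects any `from langchain import ...` line unless it imports `hub`. A minimal Python sketch of the same filter, applied to an in-memory list of lines instead of `git grep` output; `offending_lines` is a hypothetical name:

```python
import re
from typing import Iterable, List

# Same patterns as the two grep invocations in the workflow step.
DISALLOWED = re.compile(r"from langchain import")
ALLOWED = re.compile(r"from langchain import (hub)")


def offending_lines(lines: Iterable[str]) -> List[str]:
    """Return lines that import from the top-level package, except for hub."""
    return [
        line
        for line in lines
        if DISALLOWED.search(line) and not ALLOWED.search(line)
    ]


sample = [
    "from langchain import hub",
    "from langchain import OpenAI",
    "from langchain.llms import OpenAI",
]
print(offending_lines(sample))
```

Because the allow pattern is a prefix match, a line such as `from langchain import hubspot` would also pass. Anchoring the pattern with a word boundary would tighten it if that matters.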


@@ -0,0 +1,8 @@
import toml
pyproject_toml = toml.load("pyproject.toml")
# Extract the ignore words list (adjust the key as per your TOML structure)
ignore_words_list = pyproject_toml.get("tool", {}).get("codespell", {}).get("ignore-words-list")
print(f"::set-output name=ignore_words_list::{ignore_words_list}")
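
The `::set-output` workflow command printed by this script has since been deprecated by GitHub in favor of appending `name=value` lines to the file named by the `GITHUB_OUTPUT` environment variable. A hedged sketch of that variant, not the script as committed; the helper names are hypothetical, and the empty-string default for a missing key is an assumption:

```python
import os


def ignore_words_from(pyproject: dict) -> str:
    """Read tool.codespell.ignore-words-list from parsed pyproject data."""
    # Assumption: fall back to an empty string when the key is absent.
    return pyproject.get("tool", {}).get("codespell", {}).get("ignore-words-list", "")


def format_output(name: str, value: str) -> str:
    """Format one step output as the `name=value` line GitHub expects."""
    return f"{name}={value}\n"


def write_output(name: str, value: str) -> None:
    """Append a step output to $GITHUB_OUTPUT, or print it when run locally."""
    path = os.environ.get("GITHUB_OUTPUT")
    if path is None:
        # Outside GitHub Actions there is no output file; just show the line.
        print(format_output(name, value), end="")
        return
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(format_output(name, value))


# Illustrative data standing in for toml.load("pyproject.toml").
parsed = {"tool": {"codespell": {"ignore-words-list": "foo,bar"}}}
print(format_output("ignore_words_list", ignore_words_from(parsed)), end="")
```

The workflow step would keep reading the value through `steps.extract_ignore_words.outputs.ignore_words_list`; only the way the script emits it changes.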


@@ -12,6 +12,7 @@ on:
- '.github/workflows/_test.yml'
- '.github/workflows/_pydantic_compatibility.yml'
- '.github/workflows/langchain_ci.yml'
- 'libs/*'
- 'libs/langchain/**'
workflow_dispatch: # Allows to trigger the workflow manually in GitHub UI
@@ -26,7 +27,7 @@ concurrency:
cancel-in-progress: true
env:
POETRY_VERSION: "1.5.1"
POETRY_VERSION: "1.6.1"
WORKDIR: "libs/langchain"
jobs:
@@ -44,6 +45,13 @@ jobs:
working-directory: libs/langchain
secrets: inherit
compile-integration-tests:
uses:
./.github/workflows/_compile_integration_test.yml
with:
working-directory: libs/langchain
secrets: inherit
pydantic-compatibility:
uses:
./.github/workflows/_pydantic_compatibility.yml
@@ -65,7 +73,7 @@ jobs:
- "3.11"
name: Python ${{ matrix.python-version }} extended tests
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"

.github/workflows/langchain_cli_ci.yml

@@ -0,0 +1,47 @@
---
name: libs/cli CI
on:
push:
branches: [ master ]
pull_request:
paths:
- '.github/actions/poetry_setup/action.yml'
- '.github/tools/**'
- '.github/workflows/_lint.yml'
- '.github/workflows/_test.yml'
- '.github/workflows/_pydantic_compatibility.yml'
- '.github/workflows/langchain_cli_ci.yml'
- 'libs/cli/**'
- 'libs/*'
workflow_dispatch: # Allows to trigger the workflow manually in GitHub UI
# If another push to the same PR or branch happens while this workflow is still running,
# cancel the earlier run in favor of the next run.
#
# There's no point in testing an outdated version of the code. GitHub only allows
# a limited number of job runners to be active at the same time, so it's better to cancel
# pointless jobs early so that more useful jobs can run sooner.
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
env:
POETRY_VERSION: "1.6.1"
WORKDIR: "libs/cli"
jobs:
lint:
uses:
./.github/workflows/_lint.yml
with:
working-directory: libs/cli
langchain-location: ../langchain
secrets: inherit
test:
uses:
./.github/workflows/_test.yml
with:
working-directory: libs/cli
secrets: inherit


@@ -0,0 +1,13 @@
---
name: libs/cli Release
on:
workflow_dispatch: # Allows to trigger the workflow manually in GitHub UI
jobs:
release:
uses:
./.github/workflows/_release.yml
with:
working-directory: libs/cli
secrets: inherit


@@ -11,7 +11,7 @@ on:
- '.github/workflows/_lint.yml'
- '.github/workflows/_test.yml'
- '.github/workflows/langchain_experimental_ci.yml'
- 'libs/langchain/**'
- 'libs/*'
- 'libs/experimental/**'
workflow_dispatch: # Allows to trigger the workflow manually in GitHub UI
@@ -26,7 +26,7 @@ concurrency:
cancel-in-progress: true
env:
POETRY_VERSION: "1.5.1"
POETRY_VERSION: "1.6.1"
WORKDIR: "libs/experimental"
jobs:
@@ -35,6 +35,7 @@ jobs:
./.github/workflows/_lint.yml
with:
working-directory: libs/experimental
langchain-location: ../langchain
secrets: inherit
test:
@@ -44,6 +45,13 @@ jobs:
working-directory: libs/experimental
secrets: inherit
compile-integration-tests:
uses:
./.github/workflows/_compile_integration_test.yml
with:
working-directory: libs/experimental
secrets: inherit
# It's possible that langchain-experimental works fine with the latest *published* langchain,
# but is broken with the langchain on `master`.
#
@@ -62,7 +70,7 @@ jobs:
- "3.11"
name: test with unpublished langchain - Python ${{ matrix.python-version }}
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
@@ -97,7 +105,7 @@ jobs:
- "3.11"
name: Python ${{ matrix.python-version }} extended tests
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"


@@ -0,0 +1,13 @@
---
name: Experimental Test Release
on:
workflow_dispatch: # Allows to trigger the workflow manually in GitHub UI
jobs:
release:
uses:
./.github/workflows/_test_release.yml
with:
working-directory: libs/experimental
secrets: inherit


@@ -11,3 +11,17 @@ jobs:
with:
working-directory: libs/langchain
secrets: inherit
# N.B.: It's possible that PyPI doesn't make the new release visible / available
# immediately after publishing. If that happens, the docker build might not
# create a new docker image for the new release, since it won't see it.
#
# If this ends up being a problem, add a check to the end of the `_release.yml`
# workflow that prevents the workflow from finishing until the new release
# is visible and installable on PyPI.
release-docker:
needs:
- release
uses:
./.github/workflows/langchain_release_docker.yml
secrets: inherit


@@ -0,0 +1,14 @@
---
name: docker/langchain/langchain Release
on:
workflow_dispatch: # Allows to trigger the workflow manually in GitHub UI
workflow_call: # Allows triggering from another workflow
jobs:
release:
uses: ./.github/workflows/_release_docker.yml
with:
dockerfile: docker/Dockerfile.base
image: langchain/langchain
secrets: inherit


@@ -0,0 +1,13 @@
---
name: Test Release
on:
workflow_dispatch: # Allows to trigger the workflow manually in GitHub UI
jobs:
release:
uses:
./.github/workflows/_test_release.yml
with:
working-directory: libs/langchain
secrets: inherit


@@ -6,7 +6,7 @@ on:
- cron: '0 13 * * *'
env:
POETRY_VERSION: "1.5.1"
POETRY_VERSION: "1.6.1"
jobs:
build:
@@ -24,7 +24,7 @@ jobs:
- "3.11"
name: Python ${{ matrix.python-version }}
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: "./.github/actions/poetry_setup"
@@ -40,6 +40,13 @@ jobs:
with:
credentials_json: '${{ secrets.GOOGLE_CREDENTIALS }}'
- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ vars.AWS_REGION }}
- name: Install dependencies
working-directory: libs/langchain
shell: bash
@@ -47,11 +54,22 @@ jobs:
echo "Running scheduled tests, installing dependencies with poetry..."
poetry install --with=test_integration
poetry run pip install google-cloud-aiplatform
poetry run pip install "boto3>=1.28.57"
if [[ ${{ matrix.python-version }} != "3.8" ]]
then
poetry run pip install fireworks-ai
fi
- name: Run tests
shell: bash
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
AZURE_OPENAI_API_VERSION: ${{ secrets.AZURE_OPENAI_API_VERSION }}
AZURE_OPENAI_API_BASE: ${{ secrets.AZURE_OPENAI_API_BASE }}
AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
AZURE_OPENAI_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_DEPLOYMENT_NAME }}
FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
run: |
make scheduled_tests

.github/workflows/templates_ci.yml

@@ -0,0 +1,37 @@
---
name: templates CI
on:
push:
branches: [ master ]
pull_request:
paths:
- '.github/actions/poetry_setup/action.yml'
- '.github/tools/**'
- '.github/workflows/_lint.yml'
- '.github/workflows/templates_ci.yml'
- 'templates/**'
workflow_dispatch: # Allows to trigger the workflow manually in GitHub UI
# If another push to the same PR or branch happens while this workflow is still running,
# cancel the earlier run in favor of the next run.
#
# There's no point in testing an outdated version of the code. GitHub only allows
# a limited number of job runners to be active at the same time, so it's better to cancel
# pointless jobs early so that more useful jobs can run sooner.
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
env:
POETRY_VERSION: "1.6.1"
WORKDIR: "templates"
jobs:
lint:
uses:
./.github/workflows/_lint.yml
with:
working-directory: templates
langchain-location: ../libs/langchain
secrets: inherit

.gitignore

@@ -30,6 +30,12 @@ share/python-wheels/
*.egg
MANIFEST
# Google GitHub Actions credentials files created by:
# https://github.com/google-github-actions/auth
#
# That action recommends adding this gitignore to prevent accidentally committing keys.
gha-creds-*.json
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
@@ -168,6 +174,8 @@ docs/api_reference/*/
!docs/api_reference/_static/
!docs/api_reference/templates/
!docs/api_reference/themes/
docs/docs_skeleton/build
docs/docs_skeleton/node_modules
docs/docs_skeleton/yarn.lock
docs/docs/build
docs/docs/node_modules
docs/docs/yarn.lock
_dist
docs/docs/templates

.gitmodules

@@ -1,4 +0,0 @@
[submodule "docs/_docs_skeleton"]
path = docs/_docs_skeleton
url = https://github.com/langchain-ai/langchain-shared-docs
branch = main


@@ -9,9 +9,14 @@ build:
os: ubuntu-22.04
tools:
python: "3.11"
jobs:
pre_build:
commands:
- python -mvirtualenv $READTHEDOCS_VIRTUALENV_PATH
- python -m pip install --upgrade --no-cache-dir pip setuptools
- python -m pip install --upgrade --no-cache-dir sphinx readthedocs-sphinx-ext
- python -m pip install --exists-action=w --no-cache-dir -r docs/api_reference/requirements.txt
- python docs/api_reference/create_api_rst.py
- cat docs/api_reference/conf.py
- python -m sphinx -T -E -b html -d _build/doctrees -c docs/api_reference docs/api_reference $READTHEDOCS_OUTPUT/html -j auto
# Build documentation in the docs/ directory with Sphinx
sphinx:
@@ -25,5 +30,3 @@ sphinx:
python:
install:
- requirements: docs/api_reference/requirements.txt
- method: pip
path: .


@@ -5,4 +5,4 @@ authors:
given-names: "Harrison"
title: "LangChain"
date-released: 2022-10-17
url: "https://github.com/hwchase17/langchain"
url: "https://github.com/langchain-ai/langchain"


@@ -15,10 +15,10 @@ docs_build:
docs/.local_build.sh
docs_clean:
rm -r docs/_dist
rm -r _dist
docs_linkcheck:
poetry run linkchecker docs/_dist/docs_skeleton/ --ignore-url node_modules
poetry run linkchecker _dist/docs/ --ignore-url node_modules
api_docs_build:
poetry run python docs/api_reference/create_api_rst.py
@@ -37,6 +37,18 @@ spell_check:
spell_fix:
poetry run codespell --toml pyproject.toml -w
######################
# LINTING AND FORMATTING
######################
lint:
poetry run ruff docs templates cookbook
poetry run black docs templates cookbook --diff
format format_diff:
poetry run black docs templates cookbook
poetry run ruff --select I --fix docs templates cookbook
######################
# HELP
######################
@@ -53,4 +65,4 @@ help:
@echo 'api_docs_linkcheck - run linkchecker on the API Reference documentation'
@echo 'spell_check - run codespell on the project'
@echo 'spell_fix - run codespell on the project and fix the errors'
@echo '-- TEST and LINT tasks are within libs/*/ per-package --'
@echo '-- TEST and LINT tasks are within libs/*/ per-package --'


@@ -16,17 +16,18 @@
[![Open Issues](https://img.shields.io/github/issues-raw/langchain-ai/langchain)](https://github.com/langchain-ai/langchain/issues)
Looking for the JS/TS version? Check out [LangChain.js](https://github.com/hwchase17/langchainjs).
Looking for the JS/TS version? Check out [LangChain.js](https://github.com/langchain-ai/langchainjs).
**Production Support:** As you move your LangChains into production, we'd love to offer more hands-on support.
Fill out [this form](https://airtable.com/appwQzlErAS2qiP0L/shrGtGaVBVAz7NcV2) to share more about what you're building, and our team will get in touch.
To help you ship LangChain apps to production faster, check out [LangSmith](https://smith.langchain.com).
[LangSmith](https://smith.langchain.com) is a unified developer platform for building, testing, and monitoring LLM applications.
Fill out [this form](https://airtable.com/appwQzlErAS2qiP0L/shrGtGaVBVAz7NcV2) to get off the waitlist or speak with our sales team.
## 🚨Breaking Changes for select chains (SQLDatabase) on 7/28/23
In an effort to make `langchain` leaner and safer, we are moving select chains to `langchain_experimental`.
This migration has already started, but we are remaining backwards compatible until 7/28.
On that date, we will remove functionality from `langchain`.
Read more about the motivation and the progress [here](https://github.com/hwchase17/langchain/discussions/8043).
Read more about the motivation and the progress [here](https://github.com/langchain-ai/langchain/discussions/8043).
Read how to migrate your code [here](MIGRATE.md).
## Quick Install
@@ -49,7 +50,7 @@ This library aims to assist in the development of those types of applications. C
**💬 Chatbots**
- [Documentation](https://python.langchain.com/docs/use_cases/chatbots/)
- End-to-end Example: [Chat-LangChain](https://github.com/hwchase17/chat-langchain)
- End-to-end Example: [Chat-LangChain](https://github.com/langchain-ai/chat-langchain)
**🤖 Agents**
@@ -92,7 +93,7 @@ Memory refers to persisting state between calls of a chain/agent. LangChain prov
**🧐 Evaluation:**
[BETA] Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.
[BETA] Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is by using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.
For more information on these concepts, please see our [full documentation](https://python.langchain.com).

View File

@@ -0,0 +1,398 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "fc935871-7640-41c6-b798-58514d860fe0",
"metadata": {},
"source": [
"## LLaMA2 chat with SQL\n",
"\n",
"Open source, local LLMs are great to consider for any application that demands data privacy.\n",
"\n",
"SQL is one good example. \n",
"\n",
"This cookbook shows how to perform text-to-SQL using various local versions of LLaMA2 run locally.\n",
"\n",
"## Packages"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "81adcf8b-395a-4f02-8749-ac976942b446",
"metadata": {},
"outputs": [],
"source": [
"! pip install langchain replicate"
]
},
{
"cell_type": "markdown",
"id": "8e13ed66-300b-4a23-b8ac-44df68ee4733",
"metadata": {},
"source": [
"## LLM\n",
"\n",
"There are a few ways to access LLaMA2.\n",
"\n",
"To run locally, we use Ollama.ai. \n",
"\n",
"See [here](https://python.langchain.com/docs/integrations/chat/ollama) for details on installation and setup.\n",
"\n",
"Also, see [here](https://python.langchain.com/docs/guides/local_llms) for our full guide on local LLMs.\n",
" \n",
"To use an external API, which is not private, we can use Replicate."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "6a75a5c6-34ee-4ab9-a664-d9b432d812ee",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Init param `input` is deprecated, please use `model_kwargs` instead.\n"
]
}
],
"source": [
"# Local\n",
"from langchain.chat_models import ChatOllama\n",
"\n",
"llama2_chat = ChatOllama(model=\"llama2:13b-chat\")\n",
"llama2_code = ChatOllama(model=\"codellama:7b-instruct\")\n",
"\n",
"# API\n",
"from getpass import getpass\n",
"from langchain.llms import Replicate\n",
"\n",
"# REPLICATE_API_TOKEN = getpass()\n",
"# os.environ[\"REPLICATE_API_TOKEN\"] = REPLICATE_API_TOKEN\n",
"replicate_id = \"meta/llama-2-13b-chat:f4e2de70d66816a838a89eeeb621910adffb0dd0baba3976c96980970978018d\"\n",
"llama2_chat_replicate = Replicate(\n",
" model=replicate_id, input={\"temperature\": 0.01, \"max_length\": 500, \"top_p\": 1}\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "ce96f7ea-b3d5-44e1-9fa5-a79e04a9e1fb",
"metadata": {},
"outputs": [],
"source": [
"# Simply set the LLM we want to use\n",
"llm = llama2_chat"
]
},
{
"cell_type": "markdown",
"id": "80222165-f353-4e35-a123-5f70fd70c6c8",
"metadata": {},
"source": [
"## DB\n",
"\n",
"Connect to a SQLite DB.\n",
"\n",
"To create this particular DB, you can use the code and follow the steps shown [here](https://github.com/facebookresearch/llama-recipes/blob/main/demo_apps/StructuredLlama.ipynb)."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "025bdd82-3bb1-4948-bc7c-c3ccd94fd05c",
"metadata": {},
"outputs": [],
"source": [
"from langchain.utilities import SQLDatabase\n",
"\n",
"db = SQLDatabase.from_uri(\"sqlite:///nba_roster.db\", sample_rows_in_table_info=0)\n",
"\n",
"\n",
"def get_schema(_):\n",
" return db.get_table_info()\n",
"\n",
"\n",
"def run_query(query):\n",
" return db.run(query)"
]
},
{
"cell_type": "markdown",
"id": "654b3577-baa2-4e12-a393-f40e5db49ac7",
"metadata": {},
"source": [
"## Query a SQL DB \n",
"\n",
"Follow the runnables workflow [here](https://python.langchain.com/docs/expression_language/cookbook/sql_db)."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "5a4933ea-d9c0-4b0a-8177-ba4490c6532b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' SELECT \"Team\" FROM nba_roster WHERE \"NAME\" = \\'Klay Thompson\\';'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Prompt\n",
"from langchain.prompts import ChatPromptTemplate\n",
"\n",
"template = \"\"\"Based on the table schema below, write a SQL query that would answer the user's question:\n",
"{schema}\n",
"\n",
"Question: {question}\n",
"SQL Query:\"\"\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", \"Given an input question, convert it to a SQL query. No pre-amble.\"),\n",
" (\"human\", template),\n",
" ]\n",
")\n",
"\n",
"# Chain to query\n",
"from langchain.schema.output_parser import StrOutputParser\n",
"from langchain.schema.runnable import RunnablePassthrough\n",
"\n",
"sql_response = (\n",
" RunnablePassthrough.assign(schema=get_schema)\n",
" | prompt\n",
" | llm.bind(stop=[\"\\nSQLResult:\"])\n",
" | StrOutputParser()\n",
")\n",
"\n",
"sql_response.invoke({\"question\": \"What team is Klay Thompson on?\"})"
]
},
{
"cell_type": "markdown",
"id": "a0e9e2c8-9b88-4853-ac86-001bc6cc6695",
"metadata": {},
"source": [
"We can review the results:\n",
"\n",
"* [LangSmith trace](https://smith.langchain.com/public/afa56a06-b4e2-469a-a60f-c1746e75e42b/r) LLaMA2-13 Replicate API\n",
"* [LangSmith trace](https://smith.langchain.com/public/2d4ecc72-6b8f-4523-8f0b-ea95c6b54a1d/r) LLaMA2-13 local \n"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "2a2825e3-c1b6-4f7d-b9c9-d9835de323bb",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=' Based on the table schema and SQL query, there are 30 unique teams in the NBA.')"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Chain to answer\n",
"template = \"\"\"Based on the table schema below, question, sql query, and sql response, write a natural language response:\n",
"{schema}\n",
"\n",
"Question: {question}\n",
"SQL Query: {query}\n",
"SQL Response: {response}\"\"\"\n",
"prompt_response = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\n",
" \"system\",\n",
" \"Given an input question and SQL response, convert it to a natural langugae answer. No pre-amble.\",\n",
" ),\n",
" (\"human\", template),\n",
" ]\n",
")\n",
"\n",
"full_chain = (\n",
" RunnablePassthrough.assign(query=sql_response)\n",
" | RunnablePassthrough.assign(\n",
" schema=get_schema,\n",
" response=lambda x: db.run(x[\"query\"]),\n",
" )\n",
" | prompt_response\n",
" | llm\n",
")\n",
"\n",
"full_chain.invoke({\"question\": \"How many unique teams are there?\"})"
]
},
{
"cell_type": "markdown",
"id": "ec17b3ee-6618-4681-b6df-089bbb5ffcd7",
"metadata": {},
"source": [
"We can review the results:\n",
"\n",
"* [LangSmith trace](https://smith.langchain.com/public/10420721-746a-4806-8ecf-d6dc6399d739/r) LLaMA2-13 Replicate API\n",
"* [LangSmith trace](https://smith.langchain.com/public/5265ebab-0a22-4f37-936b-3300f2dfa1c1/r) LLaMA2-13 local "
]
},
{
"cell_type": "markdown",
"id": "1e85381b-1edc-4bb3-a7bd-2ab23f81e54d",
"metadata": {},
"source": [
"## Chat with a SQL DB \n",
"\n",
"Next, we can add memory."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "022868f2-128e-42f5-8d90-d3bb2f11d994",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' SELECT \"Team\" FROM nba_roster WHERE \"NAME\" = \\'Klay Thompson\\';'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Prompt\n",
"from langchain.memory import ConversationBufferMemory\n",
"from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder\n",
"\n",
"template = \"\"\"Given an input question, convert it to a SQL query. No pre-amble. Based on the table schema below, write a SQL query that would answer the user's question:\n",
"{schema}\n",
"\"\"\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", template),\n",
" MessagesPlaceholder(variable_name=\"history\"),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"\n",
"memory = ConversationBufferMemory(return_messages=True)\n",
"\n",
"# Chain to query with memory\n",
"from langchain.schema.runnable import RunnableLambda\n",
"\n",
"sql_chain = (\n",
" RunnablePassthrough.assign(\n",
" schema=get_schema,\n",
" history=RunnableLambda(lambda x: memory.load_memory_variables(x)[\"history\"]),\n",
" )\n",
" | prompt\n",
" | llm.bind(stop=[\"\\nSQLResult:\"])\n",
" | StrOutputParser()\n",
")\n",
"\n",
"\n",
"def save(input_output):\n",
" output = {\"output\": input_output.pop(\"output\")}\n",
" memory.save_context(input_output, output)\n",
" return output[\"output\"]\n",
"\n",
"\n",
"sql_response_memory = RunnablePassthrough.assign(output=sql_chain) | save\n",
"sql_response_memory.invoke({\"question\": \"What team is Klay Thompson on?\"})"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "800a7a3b-f411-478b-af51-2310cd6e0425",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=' Sure! Here\\'s the natural language response based on the given input:\\n\\n\"Klay Thompson\\'s salary is $43,219,440.\"')"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Chain to answer\n",
"template = \"\"\"Based on the table schema below, question, sql query, and sql response, write a natural language response:\n",
"{schema}\n",
"\n",
"Question: {question}\n",
"SQL Query: {query}\n",
"SQL Response: {response}\"\"\"\n",
"prompt_response = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\n",
" \"system\",\n",
" \"Given an input question and SQL response, convert it to a natural langugae answer. No pre-amble.\",\n",
" ),\n",
" (\"human\", template),\n",
" ]\n",
")\n",
"\n",
"full_chain = (\n",
" RunnablePassthrough.assign(query=sql_response_memory)\n",
" | RunnablePassthrough.assign(\n",
" schema=get_schema,\n",
" response=lambda x: db.run(x[\"query\"]),\n",
" )\n",
" | prompt_response\n",
" | llm\n",
")\n",
"\n",
"full_chain.invoke({\"question\": \"What is his salary?\"})"
]
},
{
"cell_type": "markdown",
"id": "b77fee61-f4da-4bb1-8285-14101e505518",
"metadata": {},
"source": [
"Here is the [trace](https://smith.langchain.com/public/54794d18-2337-4ce2-8b9f-3d8a2df89e51/r)."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
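
The notebook above composes its chain in three steps: read the schema, ask the model for a SQL query, then run that query against the database. The following is a minimal sketch of that flow using only the standard library, not the notebook's actual implementation. The in-memory table, its placeholder rows, and the `fake_llm` stub are assumptions made for illustration; they stand in for `nba_roster.db` and for the `ChatOllama` call.

```python
import sqlite3

# Hypothetical in-memory stand-in for nba_roster.db; the rows are placeholders.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE nba_roster ("NAME" TEXT, "Team" TEXT)')
conn.executemany(
    "INSERT INTO nba_roster VALUES (?, ?)",
    [("Player A", "Team X"), ("Player B", "Team Y")],
)


def get_schema(_):
    """Return the CREATE TABLE statements, similar in spirit to db.get_table_info()."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def fake_llm(prompt):
    """Stub for the model call (assumption): always returns one fixed query."""
    return ' SELECT "Team" FROM nba_roster WHERE "NAME" = \'Player A\';'


def sql_response(inputs):
    """Build the prompt from schema and question, then ask the stub for SQL."""
    prompt = (
        "Based on the table schema below, write a SQL query that would "
        "answer the user's question:\n"
        f"{get_schema(inputs)}\n\n"
        f"Question: {inputs['question']}\nSQL Query:"
    )
    return fake_llm(prompt)


def full_chain(inputs):
    """Generate the query, run it, and return everything the answer prompt needs."""
    query = sql_response(inputs)
    response = conn.execute(query).fetchall()
    return {**inputs, "query": query, "response": response}


result = full_chain({"question": "What team is Player A on?"})
print(result["query"])
print(result["response"])
```

Replacing `fake_llm` with a real model call is the only change needed to move from this sketch toward the notebook's behaviour; the schema lookup and query execution steps keep the same shape.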

File diff suppressed because one or more lines are too long

55
cookbook/README.md Normal file
View File

@@ -0,0 +1,55 @@
# LangChain cookbook
Example code for building applications with LangChain, with an emphasis on more applied, end-to-end examples than those in the [main documentation](https://python.langchain.com).
Notebook | Description
:- | :-
[LLaMA2_sql_chat.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/LLaMA2_sql_chat.ipynb) | Build a chat application that interacts with a SQL database using an open source llm (llama2), specifically demonstrated on an SQLite database containing rosters.
[Semi_Structured_RAG.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/Semi_Structured_RAG.ipynb) | Perform retrieval-augmented generation (rag) on documents with semi-structured data, including text and tables, using unstructured for parsing, multi-vector retriever for storing, and lcel for implementing chains.
[Semi_structured_and_multi_moda...](https://github.com/langchain-ai/langchain/tree/master/cookbook/Semi_structured_and_multi_modal_RAG.ipynb) | Perform retrieval-augmented generation (rag) on documents with semi-structured data and images, using unstructured for parsing, multi-vector retriever for storage and retrieval, and lcel for implementing chains.
[Semi_structured_multi_modal_RA...](https://github.com/langchain-ai/langchain/tree/master/cookbook/Semi_structured_multi_modal_RAG_LLaMA2.ipynb) | Perform retrieval-augmented generation (rag) on documents with semi-structured data and images, using various tools and methods such as unstructured for parsing, multi-vector retriever for storing, lcel for implementing chains, and open source language models like llama2, llava, and gpt4all.
[autogpt/autogpt.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/autogpt/autogpt.ipynb) | Implement autogpt, an autonomous agent, with langchain primitives such as llms, prompttemplates, vectorstores, embeddings, and tools.
[autogpt/marathon_times.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/autogpt/marathon_times.ipynb) | Implement autogpt for finding winning marathon times.
[baby_agi.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/baby_agi.ipynb) | Implement babyagi, an ai agent that can generate and execute tasks based on a given objective, with the flexibility to swap out specific vectorstores/model providers.
[baby_agi_with_agent.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/baby_agi_with_agent.ipynb) | Swap out the execution chain in the babyagi notebook with an agent that has access to tools, aiming to obtain more reliable information.
[camel_role_playing.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/camel_role_playing.ipynb) | Implement the camel framework for creating autonomous cooperative agents in large-scale language models, using role-playing and inception prompting to guide chat agents towards task completion.
[causal_program_aided_language_...](https://github.com/langchain-ai/langchain/tree/master/cookbook/causal_program_aided_language_model.ipynb) | Implement the causal program-aided language (cpal) chain, which improves upon the program-aided language (pal) by incorporating causal structure to prevent hallucination in language models, particularly when dealing with complex narratives and math problems with nested dependencies.
[code-analysis-deeplake.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/code-analysis-deeplake.ipynb) | Analyze the langchain code base with the help of gpt and activeloop's deep lake.
[custom_agent_with_plugin_retri...](https://github.com/langchain-ai/langchain/tree/master/cookbook/custom_agent_with_plugin_retrieval.ipynb) | Build a custom agent that can interact with ai plugins by retrieving tools and creating natural language wrappers around openapi endpoints.
[custom_agent_with_plugin_retri...](https://github.com/langchain-ai/langchain/tree/master/cookbook/custom_agent_with_plugin_retrieval_using_plugnplai.ipynb) | Build a custom agent with plugin retrieval functionality, utilizing ai plugins from the `plugnplai` directory.
[databricks_sql_db.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/databricks_sql_db.ipynb) | Connect to databricks runtimes and databricks sql.
[deeplake_semantic_search_over_...](https://github.com/langchain-ai/langchain/tree/master/cookbook/deeplake_semantic_search_over_chat.ipynb) | Perform semantic search and question-answering over a group chat using activeloop's deep lake with gpt4.
[elasticsearch_db_qa.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/elasticsearch_db_qa.ipynb) | Interact with elasticsearch analytics databases in natural language and build search queries via the elasticsearch dsl API.
[extraction_openai_tools.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/extraction_openai_tools.ipynb) | Structured Data Extraction with OpenAI Tools
[forward_looking_retrieval_augm...](https://github.com/langchain-ai/langchain/tree/master/cookbook/forward_looking_retrieval_augmented_generation.ipynb) | Implement the forward-looking active retrieval augmented generation (flare) method, which generates answers to questions, identifies uncertain tokens, generates hypothetical questions based on these tokens, and retrieves relevant documents to continue generating the answer.
[generative_agents_interactive_...](https://github.com/langchain-ai/langchain/tree/master/cookbook/generative_agents_interactive_simulacra_of_human_behavior.ipynb) | Implement a generative agent that simulates human behavior, based on a research paper, using a time-weighted memory object backed by a langchain retriever.
[gymnasium_agent_simulation.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/gymnasium_agent_simulation.ipynb) | Create a simple agent-environment interaction loop in simulated environments like text-based games with gymnasium.
[hugginggpt.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/hugginggpt.ipynb) | Implement hugginggpt, a system that connects language models like chatgpt with the machine learning community via hugging face.
[hypothetical_document_embeddin...](https://github.com/langchain-ai/langchain/tree/master/cookbook/hypothetical_document_embeddings.ipynb) | Improve document indexing with hypothetical document embeddings (hyde), an embedding technique that generates and embeds hypothetical answers to queries.
[learned_prompt_optimization.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/learned_prompt_optimization.ipynb) | Automatically enhance language model prompts by injecting specific terms using reinforcement learning, which can be used to personalize responses based on user preferences.
[llm_bash.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/llm_bash.ipynb) | Perform simple filesystem commands using large language models (llms) and a bash process.
[llm_checker.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/llm_checker.ipynb) | Create a self-checking chain using the llmcheckerchain function.
[llm_math.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/llm_math.ipynb) | Solve complex word math problems using language models and python repls.
[llm_summarization_checker.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/llm_summarization_checker.ipynb) | Check the accuracy of text summaries, with the option to run the checker multiple times for improved results.
[llm_symbolic_math.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/llm_symbolic_math.ipynb) | Solve algebraic equations with the help of llms (large language models) and sympy, a python library for symbolic mathematics.
[meta_prompt.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/meta_prompt.ipynb) | Implement the meta-prompt concept, which is a method for building self-improving agents that reflect on their own performance and modify their instructions accordingly.
[multi_modal_output_agent.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multi_modal_output_agent.ipynb) | Generate multi-modal outputs, specifically images and text.
[multi_player_dnd.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multi_player_dnd.ipynb) | Simulate multi-player dungeons & dragons games, with a custom function determining the speaking schedule of the agents.
[multiagent_authoritarian.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multiagent_authoritarian.ipynb) | Implement a multi-agent simulation where a privileged agent controls the conversation, including deciding who speaks and when the conversation ends, in the context of a simulated news network.
[multiagent_bidding.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multiagent_bidding.ipynb) | Implement a multi-agent simulation where agents bid to speak, with the highest bidder speaking next, demonstrated through a fictitious presidential debate example.
[myscale_vector_sql.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/myscale_vector_sql.ipynb) | Access and interact with the myscale integrated vector database, which can enhance the performance of language model (llm) applications.
[openai_functions_retrieval_qa....](https://github.com/langchain-ai/langchain/tree/master/cookbook/openai_functions_retrieval_qa.ipynb) | Structure response output in a question-answering system by incorporating openai functions into a retrieval pipeline.
[openai_v1_cookbook.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/openai_v1_cookbook.ipynb) | Explore new functionality released alongside the V1 release of the OpenAI Python library.
[petting_zoo.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/petting_zoo.ipynb) | Create multi-agent simulations with simulated environments using the petting zoo library.
[plan_and_execute_agent.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/plan_and_execute_agent.ipynb) | Create plan-and-execute agents that accomplish objectives by planning tasks with a language model (llm) and executing them with a separate agent.
[press_releases.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/press_releases.ipynb) | Retrieve and query company press release data powered by [Kay.ai](https://kay.ai).
[program_aided_language_model.i...](https://github.com/langchain-ai/langchain/tree/master/cookbook/program_aided_language_model.ipynb) | Implement program-aided language models as described in the provided research paper.
[retrieval_in_sql.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/retrieval_in_sql.ipynb) | Perform retrieval-augmented-generation (rag) on a PostgreSQL database using pgvector.
[sales_agent_with_context.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/sales_agent_with_context.ipynb) | Implement a context-aware ai sales agent, salesgpt, that can have natural sales conversations, interact with other systems, and use a product knowledge base to discuss a company's offerings.
[self_query_hotel_search.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/self_query_hotel_search.ipynb) | Build a hotel room search feature with self-querying retrieval, using a specific hotel recommendation dataset.
[smart_llm.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/smart_llm.ipynb) | Implement a smartllmchain, a self-critique chain that generates multiple output proposals, critiques them to find the best one, and then improves upon it to produce a final output.
[tree_of_thought.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/tree_of_thought.ipynb) | Query a large language model using the tree of thought technique.
[twitter-the-algorithm-analysis...](https://github.com/langchain-ai/langchain/tree/master/cookbook/twitter-the-algorithm-analysis-deeplake.ipynb) | Analyze the source code of the Twitter algorithm with the help of gpt4 and activeloop's deep lake.
[two_agent_debate_tools.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/two_agent_debate_tools.ipynb) | Simulate multi-agent dialogues where the agents can utilize various tools.
[two_player_dnd.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/two_player_dnd.ipynb) | Simulate a two-player dungeons & dragons game, where a dialogue simulator class is used to coordinate the dialogue between the protagonist and the dungeon master.
[wikibase_agent.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/wikibase_agent.ipynb) | Create a simple wikibase agent that utilizes sparql generation, with testing done on http://wikidata.org.

File diff suppressed because one or more lines are too long


View File

@@ -10,7 +10,7 @@
"\n",
"The CPAL chain builds on the recent PAL to stop LLM hallucination. The problem with the PAL approach is that it hallucinates on a math problem with a nested chain of dependence. The innovation here is that this new CPAL approach includes causal structure to fix hallucination.\n",
"\n",
"The original [PR's description](https://github.com/hwchase17/langchain/pull/6255) contains a full overview.\n",
"The original [PR's description](https://github.com/langchain-ai/langchain/pull/6255) contains a full overview.\n",
"\n",
"Using the CPAL chain, the LLM translated this\n",
"\n",

View File

@@ -837,7 +837,9 @@
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.chains import ConversationalRetrievalChain\n",
"\n",
"model = ChatOpenAI(model_name=\"gpt-3.5-turbo-0613\") # 'ada' 'gpt-3.5-turbo-0613' 'gpt-4',\n",
"model = ChatOpenAI(\n",
" model_name=\"gpt-3.5-turbo-0613\"\n",
") # 'ada' 'gpt-3.5-turbo-0613' 'gpt-4',\n",
"qa = ConversationalRetrievalChain.from_llm(model, retriever=retriever)"
]
},
@@ -940,7 +942,7 @@
"- DocArrayRetriever\n",
"- ElasticSearchBM25Retriever\n",
"- EnsembleRetriever\n",
"- GoogleCloudEnterpriseSearchRetriever\n",
"- GoogleVertexAISearchRetriever\n",
"- AmazonKendraRetriever\n",
"- KNNRetriever\n",
"- LlamaIndexGraphRetriever and LlamaIndexRetriever\n",
@@ -992,7 +994,7 @@
{
"data": {
"text/plain": [
"{'question': 'LangChain possesses a variety of retrievers including:\\n\\n1. ArxivRetriever\\n2. AzureCognitiveSearchRetriever\\n3. BM25Retriever\\n4. ChaindeskRetriever\\n5. ChatGPTPluginRetriever\\n6. ContextualCompressionRetriever\\n7. DocArrayRetriever\\n8. ElasticSearchBM25Retriever\\n9. EnsembleRetriever\\n10. GoogleCloudEnterpriseSearchRetriever\\n11. AmazonKendraRetriever\\n12. KNNRetriever\\n13. LlamaIndexGraphRetriever\\n14. LlamaIndexRetriever\\n15. MergerRetriever\\n16. MetalRetriever\\n17. MilvusRetriever\\n18. MultiQueryRetriever\\n19. ParentDocumentRetriever\\n20. PineconeHybridSearchRetriever\\n21. PubMedRetriever\\n22. RePhraseQueryRetriever\\n23. RemoteLangChainRetriever\\n24. SelfQueryRetriever\\n25. SVMRetriever\\n26. TFIDFRetriever\\n27. TimeWeightedVectorStoreRetriever\\n28. VespaRetriever\\n29. WeaviateHybridSearchRetriever\\n30. WebResearchRetriever\\n31. WikipediaRetriever\\n32. ZepRetriever\\n33. ZillizRetriever\\n\\nIt also includes self query translators like:\\n\\n1. ChromaTranslator\\n2. DeepLakeTranslator\\n3. MyScaleTranslator\\n4. PineconeTranslator\\n5. QdrantTranslator\\n6. WeaviateTranslator\\n\\nAnd remote retrievers like:\\n\\n1. RemoteLangChainRetriever'}"
"{'question': 'LangChain possesses a variety of retrievers including:\\n\\n1. ArxivRetriever\\n2. AzureCognitiveSearchRetriever\\n3. BM25Retriever\\n4. ChaindeskRetriever\\n5. ChatGPTPluginRetriever\\n6. ContextualCompressionRetriever\\n7. DocArrayRetriever\\n8. ElasticSearchBM25Retriever\\n9. EnsembleRetriever\\n10. GoogleVertexAISearchRetriever\\n11. AmazonKendraRetriever\\n12. KNNRetriever\\n13. LlamaIndexGraphRetriever\\n14. LlamaIndexRetriever\\n15. MergerRetriever\\n16. MetalRetriever\\n17. MilvusRetriever\\n18. MultiQueryRetriever\\n19. ParentDocumentRetriever\\n20. PineconeHybridSearchRetriever\\n21. PubMedRetriever\\n22. RePhraseQueryRetriever\\n23. RemoteLangChainRetriever\\n24. SelfQueryRetriever\\n25. SVMRetriever\\n26. TFIDFRetriever\\n27. TimeWeightedVectorStoreRetriever\\n28. VespaRetriever\\n29. WeaviateHybridSearchRetriever\\n30. WebResearchRetriever\\n31. WikipediaRetriever\\n32. ZepRetriever\\n33. ZillizRetriever\\n\\nIt also includes self query translators like:\\n\\n1. ChromaTranslator\\n2. DeepLakeTranslator\\n3. MyScaleTranslator\\n4. PineconeTranslator\\n5. QdrantTranslator\\n6. WeaviateTranslator\\n\\nAnd remote retrievers like:\\n\\n1. RemoteLangChainRetriever'}"
]
},
"execution_count": 31,
@@ -1124,7 +1126,7 @@
"- DocArrayRetriever\n",
"- ElasticSearchBM25Retriever\n",
"- EnsembleRetriever\n",
"- GoogleCloudEnterpriseSearchRetriever\n",
"- GoogleVertexAISearchRetriever\n",
"- AmazonKendraRetriever\n",
"- KNNRetriever\n",
"- LlamaIndexGraphRetriever and LlamaIndexRetriever\n",

View File

@@ -6,7 +6,7 @@
"source": [
"# Elasticsearch\n",
"\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/use_cases/qa_structured/integrations/elasticsearch.ipynb)\n",
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs/use_cases/qa_structured/integrations/elasticsearch.ipynb)\n",
"\n",
"We can use LLMs to interact with Elasticsearch analytics databases in natural language.\n",
"\n",

View File

@@ -0,0 +1,213 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "2def22ea",
"metadata": {},
"source": [
"# Extraction with OpenAI Tools\n",
"\n",
"Performing extraction has never been easier! OpenAI's tool calling ability is the perfect thing to use as it allows for extracting multiple different elements from text that are different types. \n",
"\n",
"Models after 1106 use tools and support \"parallel function calling\" which makes this super easy."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "5c628496",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.pydantic_v1 import BaseModel\n",
"from typing import Optional, List\n",
"from langchain.chains.openai_tools import create_extraction_chain_pydantic"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "afe9657b",
"metadata": {},
"outputs": [],
"source": [
"# Make sure to use a recent model that supports tools\n",
"model = ChatOpenAI(model=\"gpt-3.5-turbo-1106\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "bc0ca3b6",
"metadata": {},
"outputs": [],
"source": [
"# Pydantic is an easy way to define a schema\n",
"class Person(BaseModel):\n",
" \"\"\"Information about people to extract.\"\"\"\n",
"\n",
" name: str\n",
" age: Optional[int] = None"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "2036af68",
"metadata": {},
"outputs": [],
"source": [
"chain = create_extraction_chain_pydantic(Person, model)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "1748ad21",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Person(name='jane', age=2), Person(name='bob', age=3)]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke({\"input\": \"jane is 2 and bob is 3\"})"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "c8262ce5",
"metadata": {},
"outputs": [],
"source": [
"# Let's define another element\n",
"class Class(BaseModel):\n",
" \"\"\"Information about classes to extract.\"\"\"\n",
"\n",
" teacher: str\n",
" students: List[str]"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "4973c104",
"metadata": {},
"outputs": [],
"source": [
"chain = create_extraction_chain_pydantic([Person, Class], model)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "e976a15e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Person(name='jane', age=2),\n",
" Person(name='bob', age=3),\n",
" Class(teacher='Mrs Sampson', students=['jane', 'bob'])]"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke({\"input\": \"jane is 2 and bob is 3 and they are in Mrs Sampson's class\"})"
]
},
{
"cell_type": "markdown",
"id": "6575a7d6",
"metadata": {},
"source": [
"## Under the hood\n",
"\n",
"Under the hood, this is a simple chain:"
]
},
{
"cell_type": "markdown",
"id": "b8ba83e5",
"metadata": {},
"source": [
"```python\n",
"from typing import Union, List, Type, Optional\n",
"\n",
"from langchain.output_parsers.openai_tools import PydanticToolsParser\n",
"from langchain.utils.openai_functions import convert_pydantic_to_openai_tool\n",
"from langchain.schema.runnable import Runnable\n",
"from langchain.pydantic_v1 import BaseModel\n",
"from langchain.prompts import ChatPromptTemplate\n",
"from langchain.schema.messages import SystemMessage\n",
"from langchain.schema.language_model import BaseLanguageModel\n",
"\n",
"_EXTRACTION_TEMPLATE = \"\"\"Extract and save the relevant entities mentioned \\\n",
"in the following passage together with their properties.\n",
"\n",
"If a property is not present and is not required in the function parameters, do not include it in the output.\"\"\" # noqa: E501\n",
"\n",
"\n",
"def create_extraction_chain_pydantic(\n",
" pydantic_schemas: Union[List[Type[BaseModel]], Type[BaseModel]],\n",
" llm: BaseLanguageModel,\n",
" system_message: str = _EXTRACTION_TEMPLATE,\n",
") -> Runnable:\n",
" if not isinstance(pydantic_schemas, list):\n",
" pydantic_schemas = [pydantic_schemas]\n",
" prompt = ChatPromptTemplate.from_messages([\n",
" (\"system\", system_message),\n",
" (\"user\", \"{input}\")\n",
" ])\n",
" tools = [convert_pydantic_to_openai_tool(p) for p in pydantic_schemas]\n",
" model = llm.bind(tools=tools)\n",
" chain = prompt | model | PydanticToolsParser(tools=pydantic_schemas)\n",
" return chain\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2eac6b68",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
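As a rough illustration of the parsing step described in the notebook's "Under the hood" cell: the model answers with a list of tool calls, and the parser maps each call back to the schema it names. The sketch below mimics that idea with standard-library dataclasses and a hand-written payload instead of a real model response. The payload shape, `TOOLS`, and `parse_tool_calls` are assumptions made for illustration only; they are not LangChain's or OpenAI's actual interfaces.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    name: str
    age: Optional[int] = None


@dataclass
class Class:
    teacher: str
    students: List[str]


# Hypothetical registry: tool name -> schema class.
TOOLS = {cls.__name__: cls for cls in (Person, Class)}

# Hand-written stand-in for a response with parallel tool calls.
# The real response format may differ; this shape is assumed for the sketch.
fake_tool_calls = [
    {"name": "Person", "arguments": json.dumps({"name": "jane", "age": 2})},
    {"name": "Person", "arguments": json.dumps({"name": "bob", "age": 3})},
    {
        "name": "Class",
        "arguments": json.dumps(
            {"teacher": "Mrs Sampson", "students": ["jane", "bob"]}
        ),
    },
]


def parse_tool_calls(tool_calls):
    """Turn each tool call into an instance of the schema it names."""
    results = []
    for call in tool_calls:
        schema = TOOLS[call["name"]]  # KeyError if the tool name is unknown
        results.append(schema(**json.loads(call["arguments"])))
    return results


parsed = parse_tool_calls(fake_tool_calls)
print(parsed)
```

Keying the registry on the schema name is what lets one response carry several element types at once, which is the property the notebook relies on when it passes `[Person, Class]` to the chain.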

View File

@@ -135,9 +135,9 @@
"outputs": [],
"source": [
"# We set this so we can see what exactly is going on\n",
"import langchain\n",
"from langchain.globals import set_verbose\n",
"\n",
"langchain.verbose = True"
"set_verbose(True)"
]
},
{
@@ -489,7 +489,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
"version": "3.10.1"
}
},
"nbformat": 4,

View File

@@ -77,6 +77,7 @@
"source": [
"from langchain.llms import OpenAI\n",
"from langchain_experimental.autonomous_agents import HuggingGPT\n",
"\n",
"# %env OPENAI_API_BASE=http://localhost:8000/v1"
]
},

File diff suppressed because one or more lines are too long

View File

@@ -10,7 +10,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 1,
"metadata": {},
"outputs": [
{
@@ -37,13 +37,13 @@
"'Hello World\\n'"
]
},
"execution_count": 9,
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.chains import LLMBashChain\n",
"from langchain_experimental.llm_bash.base import LLMBashChain\n",
"from langchain.llms import OpenAI\n",
"\n",
"llm = OpenAI(temperature=0)\n",
@@ -65,7 +65,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
@@ -98,7 +98,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 3,
"metadata": {},
"outputs": [
{
@@ -125,7 +125,7 @@
"'Hello World\\n'"
]
},
"execution_count": 11,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
@@ -149,7 +149,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 4,
"metadata": {},
"outputs": [
{
@@ -166,28 +166,24 @@
"cd ..\n",
"```\u001b[0m\n",
"Code: \u001b[33;1m\u001b[1;3m['ls', 'cd ..']\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3mapi.html\t\t\tllm_summarization_checker.html\n",
"constitutional_chain.html\tmoderation.html\n",
"llm_bash.html\t\t\topenai_openapi.yaml\n",
"llm_checker.html\t\topenapi.html\n",
"llm_math.html\t\t\tpal.html\n",
"llm_requests.html\t\tsqlite.html\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3mcpal.ipynb llm_bash.ipynb llm_symbolic_math.ipynb\n",
"index.mdx llm_math.ipynb pal.ipynb\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'api.html\\t\\t\\tllm_summarization_checker.html\\r\\nconstitutional_chain.html\\tmoderation.html\\r\\nllm_bash.html\\t\\t\\topenai_openapi.yaml\\r\\nllm_checker.html\\t\\topenapi.html\\r\\nllm_math.html\\t\\t\\tpal.html\\r\\nllm_requests.html\\t\\tsqlite.html'"
"'cpal.ipynb llm_bash.ipynb llm_symbolic_math.ipynb\\r\\nindex.mdx llm_math.ipynb pal.ipynb'"
]
},
"execution_count": 12,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.utilities.bash import BashProcess\n",
"from langchain_experimental.llm_bash.bash import BashProcess\n",
"\n",
"\n",
"persistent_process = BashProcess(persistent=True)\n",
@@ -200,7 +196,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 5,
"metadata": {},
"outputs": [
{
@@ -217,18 +213,19 @@
"cd ..\n",
"```\u001b[0m\n",
"Code: \u001b[33;1m\u001b[1;3m['ls', 'cd ..']\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3mexamples\t\tgetting_started.html\tindex_examples\n",
"generic\t\t\thow_to_guides.rst\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3m_category_.yml\tdata_generation.ipynb\t\t self_check\n",
"agents\t\tgraph\n",
"code_writing\tlearned_prompt_optimization.ipynb\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'examples\\t\\tgetting_started.html\\tindex_examples\\r\\ngeneric\\t\\t\\thow_to_guides.rst'"
"'_category_.yml\\tdata_generation.ipynb\\t\\t self_check\\r\\nagents\\t\\tgraph\\r\\ncode_writing\\tlearned_prompt_optimization.ipynb'"
]
},
"execution_count": 13,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
@@ -237,13 +234,6 @@
"# Run the same command again and see that the state is maintained between calls\n",
"bash_chain.run(text)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -262,7 +252,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
"version": "3.11.4"
}
},
"nbformat": 4,

View File

@@ -10,12 +10,12 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import OpenAI\n",
"from langchain.chains.llm_symbolic_math.base import LLMSymbolicMathChain\n",
"from langchain_experimental.llm_symbolic_math.base import LLMSymbolicMathChain\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"llm_symbolic_math = LLMSymbolicMathChain.from_llm(llm)"
@@ -30,7 +30,7 @@
},
{
"cell_type": "code",
"execution_count": 23,
"execution_count": 4,
"metadata": {},
"outputs": [
{
@@ -39,7 +39,7 @@
"'Answer: exp(x)*sin(x) + exp(x)*cos(x)'"
]
},
"execution_count": 23,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
@@ -50,7 +50,7 @@
},
{
"cell_type": "code",
"execution_count": 18,
"execution_count": 5,
"metadata": {},
"outputs": [
{
@@ -59,7 +59,7 @@
"'Answer: exp(x)*sin(x)'"
]
},
"execution_count": 18,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
@@ -79,7 +79,7 @@
},
{
"cell_type": "code",
"execution_count": 19,
"execution_count": 6,
"metadata": {},
"outputs": [
{
@@ -88,7 +88,7 @@
"'Answer: Eq(y(t), C2*exp(-t) + (C1 + t/2)*exp(t))'"
]
},
"execution_count": 19,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
@@ -99,7 +99,7 @@
},
{
"cell_type": "code",
"execution_count": 21,
"execution_count": 7,
"metadata": {},
"outputs": [
{
@@ -108,7 +108,7 @@
"'Answer: {0, -sqrt(3)*I/3, sqrt(3)*I/3}'"
]
},
"execution_count": 21,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
@@ -119,7 +119,7 @@
},
{
"cell_type": "code",
"execution_count": 22,
"execution_count": 8,
"metadata": {},
"outputs": [
{
@@ -128,7 +128,7 @@
"'Answer: (3 - sqrt(7), -sqrt(7) - 2, 1 - sqrt(7)), (sqrt(7) + 3, -2 + sqrt(7), 1 + sqrt(7))'"
]
},
"execution_count": 22,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
@@ -140,9 +140,9 @@
],
"metadata": {
"kernelspec": {
"display_name": "venv",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "venv"
"name": "python3"
},
"language_info": {
"codemirror_mode": {
@@ -154,9 +154,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
"version": "3.11.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -414,7 +414,7 @@
"1. define a format they will produce their outputs in\n",
"2. parse their outputs\n",
"\n",
"We can subclass the [RegexParser](https://github.com/hwchase17/langchain/blob/master/langchain/output_parsers/regex.py) to implement our own custom output parser for bids."
"We can subclass the [RegexParser](https://github.com/langchain-ai/langchain/blob/master/langchain/output_parsers/regex.py) to implement our own custom output parser for bids."
]
},
{

View File

@@ -27,11 +27,12 @@
"metadata": {},
"outputs": [],
"source": [
"\n",
"from os import environ\n",
"import getpass\n",
"from typing import Dict, Any\n",
"from langchain.llms import OpenAI\nfrom langchain.utilities import SQLDatabase\nfrom langchain.chains import LLMChain\n",
"from langchain.llms import OpenAI\n",
"from langchain.utilities import SQLDatabase\n",
"from langchain.chains import LLMChain\n",
"from langchain_experimental.sql.vector_sql import VectorSQLDatabaseChain\n",
"from sqlalchemy import create_engine, Column, MetaData\n",
"from langchain.prompts import PromptTemplate\n",
@@ -39,7 +40,7 @@
"\n",
"from sqlalchemy import create_engine\n",
"\n",
"MYSCALE_HOST = \"msc-1decbcc9.us-east-1.aws.staging.myscale.cloud\"\n",
"MYSCALE_HOST = \"msc-4a9e710a.us-east-1.aws.staging.myscale.cloud\"\n",
"MYSCALE_PORT = 443\n",
"MYSCALE_USER = \"chatdata\"\n",
"MYSCALE_PASSWORD = \"myscale_rocks\"\n",
@@ -76,7 +77,6 @@
"metadata": {},
"outputs": [],
"source": [
"\n",
"from langchain.llms import OpenAI\n",
"from langchain.callbacks import StdOutCallbackHandler\n",
"\n",
@@ -124,8 +124,9 @@
"from langchain.chains.qa_with_sources.retrieval import RetrievalQAWithSourcesChain\n",
"\n",
"from langchain_experimental.sql.vector_sql import VectorSQLDatabaseChain\n",
"from langchain_experimental.retrievers.vector_sql_database \\\n",
" import VectorSQLDatabaseChainRetriever\n",
"from langchain_experimental.retrievers.vector_sql_database import (\n",
" VectorSQLDatabaseChainRetriever,\n",
")\n",
"from langchain_experimental.sql.prompt import MYSCALE_PROMPT\n",
"from langchain_experimental.sql.vector_sql import VectorSQLRetrieveAllOutputParser\n",
"\n",
@@ -144,7 +145,9 @@
")\n",
"\n",
"# You need all those keys to get docs\n",
"retriever = VectorSQLDatabaseChainRetriever(sql_db_chain=chain, page_content_key=\"abstract\")\n",
"retriever = VectorSQLDatabaseChainRetriever(\n",
" sql_db_chain=chain, page_content_key=\"abstract\"\n",
")\n",
"\n",
"document_with_metadata_prompt = PromptTemplate(\n",
" input_variables=[\"page_content\", \"id\", \"title\", \"authors\", \"pubdate\", \"categories\"],\n",
@@ -162,8 +165,10 @@
" },\n",
" return_source_documents=True,\n",
")\n",
"ans = chain(\"Please give me 10 papers to ask what is PageRank?\",\n",
" callbacks=[StdOutCallbackHandler()])\n",
"ans = chain(\n",
" \"Please give me 10 papers to ask what is PageRank?\",\n",
" callbacks=[StdOutCallbackHandler()],\n",
")\n",
"print(ans[\"answer\"])"
]
},

View File

@@ -0,0 +1,506 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "f970f757-ec76-4bf0-90cd-a2fb68b945e3",
"metadata": {},
"source": [
"# Exploring OpenAI V1 functionality\n",
"\n",
"On 11.06.23 OpenAI released a number of new features, and along with it bumped their Python SDK to 1.0.0. This notebook shows off the new features and how to use them with LangChain."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ee897729-263a-4073-898f-bb4cf01ed829",
"metadata": {},
"outputs": [],
"source": [
"# need openai>=1.1.0, langchain>=0.0.333, langchain-experimental>=0.0.39\n",
"!pip install -U openai langchain langchain-experimental"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "c3e067ce-7a43-47a7-bc89-41f1de4cf136",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.schema.messages import HumanMessage, SystemMessage"
]
},
{
"cell_type": "markdown",
"id": "fa7e7e95-90a1-4f73-98fe-10c4b4e0951b",
"metadata": {},
"source": [
"## [Vision](https://platform.openai.com/docs/guides/vision)\n",
"\n",
"OpenAI released multi-modal models, which can take a sequence of text and images as input."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "1c8c3965-d3c9-4186-b5f3-5e67855ef916",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='The image appears to be a diagram representing the architecture or components of a software system or framework related to language processing, possibly named LangChain or associated with a project or product called LangChain, based on the prominent appearance of that term. The diagram is organized into several layers or aspects, each containing various elements or modules:\\n\\n1. **Protocol**: This may be the foundational layer, which includes \"LCEL\" and terms like parallelization, fallbacks, tracing, batching, streaming, async, and composition. These seem related to communication and execution protocols for the system.\\n\\n2. **Integrations Components**: This layer includes \"Model I/O\" with elements such as the model, output parser, prompt, and example selector. It also has a \"Retrieval\" section with a document loader, retriever, embedding model, vector store, and text splitter. Lastly, there\\'s an \"Agent Tooling\" section. These components likely deal with the interaction with external data, models, and tools.\\n\\n3. **Application**: The application layer features \"LangChain\" with chains, agents, agent executors, and common application logic. This suggests that the system uses a modular approach with chains and agents to process language tasks.\\n\\n4. **Deployment**: This contains \"Lang')"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chat = ChatOpenAI(model=\"gpt-4-vision-preview\", max_tokens=256)\n",
"chat.invoke(\n",
" [\n",
" HumanMessage(\n",
" content=[\n",
" {\"type\": \"text\", \"text\": \"What is this image showing\"},\n",
" {\n",
" \"type\": \"image_url\",\n",
" \"image_url\": {\n",
" \"url\": \"https://raw.githubusercontent.com/langchain-ai/langchain/master/docs/static/img/langchain_stack.png\",\n",
" \"detail\": \"auto\",\n",
" },\n",
" },\n",
" ]\n",
" )\n",
" ]\n",
")"
]
},
{
"cell_type": "markdown",
"id": "210f8248-fcf3-4052-a4a3-0684e08f8785",
"metadata": {},
"source": [
"## [OpenAI assistants](https://platform.openai.com/docs/assistants/overview)\n",
"\n",
"> The Assistants API allows you to build AI assistants within your own applications. An Assistant has instructions and can leverage models, tools, and knowledge to respond to user queries. The Assistants API currently supports three types of tools: Code Interpreter, Retrieval, and Function calling\n",
"\n",
"\n",
"You can interact with OpenAI Assistants using OpenAI tools or custom tools. When using exclusively OpenAI tools, you can just invoke the assistant directly and get final answers. When using custom tools, you can run the assistant and tool execution loop using the built-in AgentExecutor or easily write your own executor.\n",
"\n",
"Below we show the different ways to interact with Assistants. As a simple example, let's build a math tutor that can write and run code."
]
},
{
"cell_type": "markdown",
"id": "318da28d-4cec-42ab-ae3e-76d95bb34fa5",
"metadata": {},
"source": [
"### Using only OpenAI tools"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "a9064bbe-d9f7-4a29-a7b3-73933b3197e7",
"metadata": {},
"outputs": [],
"source": [
"from langchain_experimental.openai_assistant import OpenAIAssistantRunnable"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "7a20a008-49ac-46d2-aa26-b270118af5ea",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[ThreadMessage(id='msg_g9OJv0rpPgnc3mHmocFv7OVd', assistant_id='asst_hTwZeNMMphxzSOqJ01uBMsJI', content=[MessageContentText(text=Text(annotations=[], value='The result of \\\\(10 - 4^{2.7}\\\\) is approximately \\\\(-32.224\\\\).'), type='text')], created_at=1699460600, file_ids=[], metadata={}, object='thread.message', role='assistant', run_id='run_nBIT7SiAwtUfSCTrQNSPLOfe', thread_id='thread_14n4GgXwxgNL0s30WJW5F6p0')]"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"interpreter_assistant = OpenAIAssistantRunnable.create_assistant(\n",
" name=\"langchain assistant\",\n",
" instructions=\"You are a personal math tutor. Write and run code to answer math questions.\",\n",
" tools=[{\"type\": \"code_interpreter\"}],\n",
" model=\"gpt-4-1106-preview\",\n",
")\n",
"output = interpreter_assistant.invoke({\"content\": \"What's 10 - 4 raised to the 2.7\"})\n",
"output"
]
},
{
"cell_type": "markdown",
"id": "a8ddd181-ac63-4ab6-a40d-a236120379c1",
"metadata": {},
"source": [
"### As a LangChain agent with arbitrary tools\n",
"\n",
"Now let's recreate this functionality using our own tools. For this example we'll use the [E2B sandbox runtime tool](https://e2b.dev/docs?ref=landing-page-get-started)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ee4cc355-f2d6-4c51-bcf7-f502868357d3",
"metadata": {},
"outputs": [],
"source": [
"!pip install e2b duckduckgo-search"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "48681ac7-b267-48d4-972c-8a7df8393a21",
"metadata": {},
"outputs": [],
"source": [
"from langchain.tools import E2BDataAnalysisTool, DuckDuckGoSearchRun\n",
"\n",
"tools = [E2BDataAnalysisTool(api_key=\"...\"), DuckDuckGoSearchRun()]"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "1c01dd79-dd3e-4509-a2e2-009a7f99f16a",
"metadata": {},
"outputs": [],
"source": [
"agent = OpenAIAssistantRunnable.create_assistant(\n",
" name=\"langchain assistant e2b tool\",\n",
" instructions=\"You are a personal math tutor. Write and run code to answer math questions. You can also search the internet.\",\n",
" tools=tools,\n",
" model=\"gpt-4-1106-preview\",\n",
" as_agent=True,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "1ac71d8b-4b4b-4f98-b826-6b3c57a34166",
"metadata": {},
"source": [
"#### Using AgentExecutor"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "1f137f94-801f-4766-9ff5-2de9df5e8079",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'content': \"What's the weather in SF today divided by 2.7\",\n",
" 'output': \"The weather in San Francisco today is reported to have temperatures as high as 66 °F. To get the temperature divided by 2.7, we will calculate that:\\n\\n66 °F / 2.7 = 24.44 °F\\n\\nSo, when the high temperature of 66 °F is divided by 2.7, the result is approximately 24.44 °F. Please note that this doesn't have a meteorological meaning; it's purely a mathematical operation based on the given temperature.\"}"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.agents import AgentExecutor\n",
"\n",
"agent_executor = AgentExecutor(agent=agent, tools=tools)\n",
"agent_executor.invoke({\"content\": \"What's the weather in SF today divided by 2.7\"})"
]
},
{
"cell_type": "markdown",
"id": "2d0a0b1d-c1b3-4b50-9dce-1189b51a6206",
"metadata": {},
"source": [
"#### Custom execution"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "c0475fa7-b6c1-4331-b8e2-55407466c724",
"metadata": {},
"outputs": [],
"source": [
"agent = OpenAIAssistantRunnable.create_assistant(\n",
" name=\"langchain assistant e2b tool\",\n",
" instructions=\"You are a personal math tutor. Write and run code to answer math questions.\",\n",
" tools=tools,\n",
" model=\"gpt-4-1106-preview\",\n",
" as_agent=True,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "b76cb669-6aba-4827-868f-00aa960026f2",
"metadata": {},
"outputs": [],
"source": [
"from langchain.schema.agent import AgentFinish\n",
"\n",
"\n",
"def execute_agent(agent, tools, input):\n",
" tool_map = {tool.name: tool for tool in tools}\n",
" response = agent.invoke(input)\n",
" while not isinstance(response, AgentFinish):\n",
" tool_outputs = []\n",
" for action in response:\n",
" tool_output = tool_map[action.tool].invoke(action.tool_input)\n",
" print(action.tool, action.tool_input, tool_output, end=\"\\n\\n\")\n",
" tool_outputs.append(\n",
" {\"output\": tool_output, \"tool_call_id\": action.tool_call_id}\n",
" )\n",
" response = agent.invoke(\n",
" {\n",
" \"tool_outputs\": tool_outputs,\n",
" \"run_id\": action.run_id,\n",
" \"thread_id\": action.thread_id,\n",
" }\n",
" )\n",
"\n",
" return response"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "7946116a-b82f-492e-835e-ca958a8949a5",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"e2b_data_analysis {'python_code': 'print(10 - 4 ** 2.7)'} {\"stdout\": \"-32.22425314473263\", \"stderr\": \"\", \"artifacts\": []}\n",
"\n",
"\\( 10 - 4^{2.7} \\) is approximately \\(-32.22425314473263\\).\n"
]
}
],
"source": [
"response = execute_agent(agent, tools, {\"content\": \"What's 10 - 4 raised to the 2.7\"})\n",
"print(response.return_values[\"output\"])"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "f2744a56-9f4f-4899-827a-fa55821c318c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"e2b_data_analysis {'python_code': 'result = 10 - 4 ** 2.7\\nprint(result + 17.241)'} {\"stdout\": \"-14.983253144732629\", \"stderr\": \"\", \"artifacts\": []}\n",
"\n",
"When you add \\( 17.241 \\) to \\( 10 - 4^{2.7} \\), the result is approximately \\( -14.98325314473263 \\).\n"
]
}
],
"source": [
"next_response = execute_agent(\n",
" agent, tools, {\"content\": \"now add 17.241\", \"thread_id\": response.thread_id}\n",
")\n",
"print(next_response.return_values[\"output\"])"
]
},
{
"cell_type": "markdown",
"id": "71c34763-d1e7-4b9a-a9d7-3e4cc0dfc2c4",
"metadata": {},
"source": [
"## [JSON mode](https://platform.openai.com/docs/guides/text-generation/json-mode)\n",
"\n",
"Constrain the model to only generate valid JSON. Note that you must include a system message with instructions to use JSON for this mode to work.\n",
"\n",
"Only works with certain models. "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "db6072c4-f3f3-415d-872b-71ea9f3c02bb",
"metadata": {},
"outputs": [],
"source": [
"chat = ChatOpenAI(model=\"gpt-3.5-turbo-1106\").bind(\n",
" response_format={\"type\": \"json_object\"}\n",
")\n",
"\n",
"output = chat.invoke(\n",
" [\n",
" SystemMessage(\n",
" content=\"Extract the 'name' and 'origin' of any companies mentioned in the following statement. Return a JSON list.\"\n",
" ),\n",
" HumanMessage(\n",
" content=\"Google was founded in the USA, while Deepmind was founded in the UK\"\n",
" ),\n",
" ]\n",
")\n",
"print(output.content)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "08e00ccf-b991-4249-846b-9500a0ccbfa0",
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"json.loads(output.content)"
]
},
{
"cell_type": "markdown",
"id": "aa9a94d9-4319-4ab7-a979-c475ce6b5f50",
"metadata": {},
"source": [
"## [System fingerprint](https://platform.openai.com/docs/guides/text-generation/reproducible-outputs)\n",
"\n",
"OpenAI sometimes changes model configurations in a way that impacts outputs. Whenever this happens, the system_fingerprint associated with a generation will change."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1281883c-bf8f-4665-89cd-4f33ccde69ab",
"metadata": {},
"outputs": [],
"source": [
"chat = ChatOpenAI(model=\"gpt-3.5-turbo-1106\")\n",
"output = chat.generate(\n",
" [\n",
" [\n",
" SystemMessage(\n",
" content=\"Extract the 'name' and 'origin' of any companies mentioned in the following statement. Return a JSON list.\"\n",
" ),\n",
" HumanMessage(\n",
" content=\"Google was founded in the USA, while Deepmind was founded in the UK\"\n",
" ),\n",
" ]\n",
" ]\n",
")\n",
"print(output.llm_output)"
]
},
{
"cell_type": "markdown",
"id": "aa6565be-985d-4127-848e-c3bca9d7b434",
"metadata": {},
"source": [
"## Breaking changes to Azure classes\n",
"\n",
"OpenAI V1 rewrote their clients and separated Azure and OpenAI clients. This has led to some changes in LangChain interfaces when using OpenAI V1.\n",
"\n",
"BREAKING CHANGES:\n",
"- To use Azure embeddings with OpenAI V1, you'll need to use the new `AzureOpenAIEmbeddings` instead of the existing `OpenAIEmbeddings`. `OpenAIEmbeddings` continue to work when using Azure with `openai<1`.\n",
"```python\n",
"from langchain.embeddings import AzureOpenAIEmbeddings\n",
"```\n",
"\n",
"\n",
"RECOMMENDED CHANGES:\n",
"- When using AzureChatOpenAI, if passing in an Azure endpoint (eg https://example-resource.azure.openai.com/) this should be specified via the `azure_endpoint` parameter or the `AZURE_OPENAI_ENDPOINT`. We're maintaining backwards compatibility for now with specifying this via `openai_api_base`/`base_url` or env var `OPENAI_API_BASE` but this shouldn't be relied upon.\n",
"- When using Azure chat or embedding models, pass in API keys either via `openai_api_key` parameter or `AZURE_OPENAI_API_KEY` parameter. We're maintaining backwards compatibility for now with specifying this via `OPENAI_API_KEY` but this shouldn't be relied upon."
]
},
{
"cell_type": "markdown",
"id": "49944887-3972-497e-8da2-6d32d44345a9",
"metadata": {},
"source": [
"## Tools\n",
"\n",
"Use tools for parallel function calling."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "916292d8-0f89-40a6-af1c-5a1122327de8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[GetCurrentWeather(location='New York, NY', unit='fahrenheit'),\n",
" GetCurrentWeather(location='Los Angeles, CA', unit='fahrenheit'),\n",
" GetCurrentWeather(location='San Francisco, CA', unit='fahrenheit')]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from typing import Literal\n",
"\n",
"from langchain.output_parsers.openai_tools import PydanticToolsParser\n",
"from langchain.utils.openai_functions import convert_pydantic_to_openai_tool\n",
"from langchain.prompts import ChatPromptTemplate\n",
"from langchain.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"class GetCurrentWeather(BaseModel):\n",
" \"\"\"Get the current weather in a location.\"\"\"\n",
"\n",
" location: str = Field(description=\"The city and state, e.g. San Francisco, CA\")\n",
" unit: Literal[\"celsius\", \"fahrenheit\"] = Field(\n",
" default=\"fahrenheit\", description=\"The temperature unit, default to fahrenheit\"\n",
" )\n",
"\n",
"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [(\"system\", \"You are a helpful assistant\"), (\"user\", \"{input}\")]\n",
")\n",
"model = ChatOpenAI(model=\"gpt-3.5-turbo-1106\").bind(\n",
" tools=[convert_pydantic_to_openai_tool(GetCurrentWeather)]\n",
")\n",
"chain = prompt | model | PydanticToolsParser(tools=[GetCurrentWeather])\n",
"\n",
"chain.invoke({\"input\": \"what's the weather in NYC, LA, and SF\"})"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "poetry-venv",
"language": "python",
"name": "poetry-venv"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
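The "Custom execution" cell in the notebook above runs a loop: invoke the agent, run whatever tools it asks for, feed the outputs back, and stop once a finish object comes back. As a minimal sketch of that control flow with no network calls, the example below swaps the assistant for a scripted stub. `FakeAgent`, `FakeAction`, `FakeFinish`, and `calculator` are all hypothetical stand-ins invented for this sketch, and the `run_id`/`thread_id` bookkeeping of the real loop is left out.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class FakeAction:
    """Stand-in for an agent action that requests one tool call."""

    tool: str
    tool_input: Dict[str, Any]
    tool_call_id: str


@dataclass
class FakeFinish:
    """Stand-in for AgentFinish: carries the final answer."""

    output: str


class FakeAgent:
    """Scripted agent: asks for one calculation, then finishes."""

    def invoke(self, payload: Dict[str, Any]):
        if "tool_outputs" not in payload:
            # First turn: request a tool call instead of answering.
            return [FakeAction("calculator", {"expr": (10, 4, 2.7)}, "call_1")]
        # Later turn: tool results are available, so produce the answer.
        value = payload["tool_outputs"][0]["output"]
        return FakeFinish(output=f"The result is {value:.3f}")


def calculator(tool_input: Dict[str, Any]) -> float:
    """Hypothetical tool: computes a - b ** c."""
    a, b, c = tool_input["expr"]
    return a - b**c


def execute_agent(agent, tools: Dict[str, Callable], payload: Dict[str, Any]):
    """Run the agent until it returns a finish object."""
    response = agent.invoke(payload)
    while not isinstance(response, FakeFinish):
        tool_outputs = []
        for action in response:
            result = tools[action.tool](action.tool_input)
            tool_outputs.append(
                {"output": result, "tool_call_id": action.tool_call_id}
            )
        # Hand every tool result back to the agent in one call.
        response = agent.invoke({"tool_outputs": tool_outputs})
    return response


final = execute_agent(
    FakeAgent(),
    {"calculator": calculator},
    {"content": "What's 10 - 4 raised to the 2.7"},
)
print(final.output)
```

Collecting all tool outputs before the next `invoke` mirrors the notebook's loop, where one response may request several tool calls and they are submitted together.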

View File

@@ -0,0 +1,258 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "0ddfef23-3c74-444c-81dd-6753722997fa",
"metadata": {},
"source": [
"# Plan-and-execute\n",
"\n",
"Plan-and-execute agents accomplish an objective by first planning what to do, then executing the sub tasks. This idea is largely inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) and then the [\"Plan-and-Solve\" paper](https://arxiv.org/abs/2305.04091).\n",
"\n",
"The planning is almost always done by an LLM.\n",
"\n",
"The execution is usually done by a separate agent (equipped with tools)."
]
},
{
"cell_type": "markdown",
"id": "a7ecb22a-7009-48ec-b14e-f0fa5aac1cd0",
"metadata": {},
"source": [
"## Imports"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "5fbbd4ee-bfe8-4a25-afe4-8d1a552a3d2e",
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents.tools import Tool\n",
"from langchain.chains import LLMMathChain\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.llms import OpenAI\n",
"from langchain.utilities import DuckDuckGoSearchAPIWrapper\n",
"from langchain_experimental.plan_and_execute import (\n",
" PlanAndExecute,\n",
" load_agent_executor,\n",
" load_chat_planner,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "e0e995e5-af9d-4988-bcd0-467a2a2e18cd",
"metadata": {},
"source": [
"## Tools"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "1d789f4e-54e3-4602-891a-f076e0ab9594",
"metadata": {},
"outputs": [],
"source": [
"search = DuckDuckGoSearchAPIWrapper()\n",
"llm = OpenAI(temperature=0)\n",
"llm_math_chain = LLMMathChain.from_llm(llm=llm, verbose=True)\n",
"tools = [\n",
" Tool(\n",
" name=\"Search\",\n",
" func=search.run,\n",
" description=\"useful for when you need to answer questions about current events\",\n",
" ),\n",
" Tool(\n",
" name=\"Calculator\",\n",
" func=llm_math_chain.run,\n",
" description=\"useful for when you need to answer questions about math\",\n",
" ),\n",
"]"
]
},
{
"cell_type": "markdown",
"id": "04dc6452-a07f-49f9-be12-95be1e2afccc",
"metadata": {},
"source": [
"## Planner, Executor, and Agent\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "d8f49c03-c804-458b-8122-c92b26c7b7dd",
"metadata": {},
"outputs": [],
"source": [
"model = ChatOpenAI(temperature=0)\n",
"planner = load_chat_planner(model)\n",
"executor = load_agent_executor(model, tools, verbose=True)\n",
"agent = PlanAndExecute(planner=planner, executor=executor)"
]
},
{
"cell_type": "markdown",
"id": "78ba03dd-0322-4927-b58d-a7e2027fdbb3",
"metadata": {},
"source": [
"## Run example"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "a57f7efe-7866-47a7-bce5-9c7b1047964e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mAction:\n",
"{\n",
" \"action\": \"Search\",\n",
" \"action_input\": \"current prime minister of the UK\"\n",
"}\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mAction:\n",
"```\n",
"{\n",
" \"action\": \"Search\",\n",
" \"action_input\": \"current prime minister of the UK\"\n",
"}\n",
"```\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mBottom right: Rishi Sunak is the current prime minister and the first non-white prime minister. The prime minister of the United Kingdom is the principal minister of the crown of His Majesty's Government, and the head of the British Cabinet. 3 min. British Prime Minister Rishi Sunak asserted his stance on gender identity in a speech Wednesday, stating it was \"common sense\" that \"a man is a man and a woman is a woman\" — a ... The former chancellor Rishi Sunak is the UK's new prime minister. Here's what you need to know about him. He won after running for the second time this year He lost to Liz Truss in September,... Isaeli Prime Minister Benjamin Netanyahu spoke with US President Joe Biden on Wednesday, the prime minister's office said in a statement. Netanyahu \"thanked the President for the powerful words of ... By Yasmeen Serhan/London Updated: October 25, 2022 12:56 PM EDT | Originally published: October 24, 2022 9:17 AM EDT S top me if you've heard this one before: After a tumultuous period of political...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe search results indicate that Rishi Sunak is the current prime minister of the UK. However, it's important to note that this information may not be accurate or up to date.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mAction:\n",
"```\n",
"{\n",
" \"action\": \"Search\",\n",
" \"action_input\": \"current age of the prime minister of the UK\"\n",
"}\n",
"```\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mHow old is Rishi Sunak? Mr Sunak was born on 12 May, 1980, making him 42 years old. He first became an MP in 2015, aged 34, and has served the constituency of Richmond in Yorkshire ever since. He... Prime Ministers' ages when they took office From oldest to youngest, the ages of the PMs were as follows: Winston Churchill - 65 years old James Callaghan - 64 years old Clement Attlee - 62 years... Anna Kaufman USA TODAY Just a few days after Liz Truss resigned as prime minister, the UK has a new prime minister. Truss, who lasted a mere 45 days in office, will be replaced by Rishi... Advertisement Rishi Sunak is the youngest British prime minister of modern times. Mr. Sunak is 42 and started out in Parliament in 2015. Rishi Sunak was appointed as chancellor of the Exchequer... The first prime minister of the current United Kingdom of Great Britain and Northern Ireland upon its effective creation in 1922 (when 26 Irish counties seceded and created the Irish Free State) was Bonar Law, [10] although the country was not renamed officially until 1927, when Stanley Baldwin was the serving prime minister. [11]\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mBased on the search results, it seems that Rishi Sunak is the current prime minister of the UK. However, I couldn't find any specific information about his age. Would you like me to search again for the current age of the prime minister?\n",
"\n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"Search\",\n",
" \"action_input\": \"age of Rishi Sunak\"\n",
"}\n",
"```\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mRishi Sunak is 42 years old, making him the youngest person to hold the office of prime minister in modern times. How tall is Rishi Sunak? How Old Is Rishi Sunak? Rishi Sunak was born on May 12, 1980, in Southampton, England. Parents and Nationality Sunak's parents were born to Indian-origin families in East Africa before... Born on May 12, 1980, Rishi is currently 42 years old. He has been a member of parliament since 2015 where he was an MP for Richmond and has served in roles including Chief Secretary to the Treasury and the Chancellor of Exchequer while Boris Johnson was PM. Family Murty, 42, is the daughter of the Indian billionaire NR Narayana Murthy, often described as the Bill Gates of India, who founded the software company Infosys. According to reports, his... Sunak became the first non-White person to lead the country and, at age 42, the youngest to take on the role in more than a century. Like most politicians, Sunak is revered by some and...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mBased on the search results, Rishi Sunak is currently 42 years old. He was born on May 12, 1980.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: To calculate the age raised to the power of 0.43, I can use the calculator tool.\n",
"\n",
"Action:\n",
"```json\n",
"{\n",
" \"action\": \"Calculator\",\n",
" \"action_input\": \"42^0.43\"\n",
"}\n",
"```\u001b[0m\n",
"\n",
"\u001b[1m> Entering new LLMMathChain chain...\u001b[0m\n",
"42^0.43\u001b[32;1m\u001b[1;3m```text\n",
"42**0.43\n",
"```\n",
"...numexpr.evaluate(\"42**0.43\")...\n",
"\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3m4.9888126515157\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 4.9888126515157\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe age raised to the power of 0.43 is approximately 4.9888126515157.\n",
"\n",
"Final Answer:\n",
"```json\n",
"{\n",
" \"action\": \"Final Answer\",\n",
" \"action_input\": \"The age raised to the power of 0.43 is approximately 4.9888126515157.\"\n",
"}\n",
"```\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mAction:\n",
"```\n",
"{\n",
" \"action\": \"Final Answer\",\n",
" \"action_input\": \"The current prime minister of the UK is Rishi Sunak. His age raised to the power of 0.43 is approximately 4.9888126515157.\"\n",
"}\n",
"```\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The current prime minister of the UK is Rishi Sunak. His age raised to the power of 0.43 is approximately 4.9888126515157.'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\n",
" \"Who is the current prime minister of the UK? What is their current age raised to the 0.43 power?\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0ef78a07-1a2a-46f8-9bc9-ae45f9bd706c",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "poetry-venv",
"language": "python",
"name": "poetry-venv"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
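
The calculator step in the trace above (`42^0.43`) can be checked without an LLM. This is a minimal sketch in plain Python, assuming the age of 42 reported in the search observation; the variable names are illustrative and not part of the notebook.

```python
import math

age = 42         # age taken from the search observation in the trace (assumption)
exponent = 0.43  # exponent from the user's question

# Direct exponentiation, the same expression numexpr evaluated in the trace.
result = age ** exponent

# Cross-check with the identity a**b == exp(b * ln(a)).
via_logs = math.exp(exponent * math.log(age))

print(result)
```

The printed value should agree with the `4.9888126515157` shown in the `LLMMathChain` output to within floating-point rounding.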


@@ -0,0 +1,156 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "62ee82e4-2ad8-498b-8438-fac388afe1a2",
"metadata": {},
"source": [
"Press Releases Data\n",
"=\n",
"\n",
"Press Releases data powered by [Kay.ai](https://kay.ai).\n",
"\n",
">Press releases are used by companies to announce something noteworthy, including product launches, financial performance reports, partnerships, and other significant news. They are widely used by analysts to track corporate strategy, operational updates and financial performance.\n",
"Kay.ai obtains press releases of all US public companies from a variety of sources, which include the company's official press room and partnerships with various data API providers. \n",
"With free access, this data is updated through Sept 30th. If you want access to the real-time feed, reach out to us at hello@kay.ai or [tweet at us](https://twitter.com/vishalrohra_)."
]
},
{
"cell_type": "markdown",
"id": "8183d85d-365f-4672-a963-52b533547de0",
"metadata": {},
"source": [
"Setup\n",
"=\n",
"\n",
"First you will need to install the `kay` package. You will also need an API key: you can get one for free at [https://kay.ai](https://kay.ai/). Once you have an API key, you must set it as an environment variable `KAY_API_KEY`.\n",
"\n",
"In this example we're going to use the `KayAiRetriever`. Take a look at the [kay notebook](/docs/integrations/retrievers/kay) for more detailed information on the parameters that it accepts."
]
},
{
"cell_type": "markdown",
"id": "02ec21c7-49fe-4844-b58a-bf064ad40b2a",
"metadata": {},
"source": [
"Examples\n",
"="
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "bf0395f7-6ebe-4136-8b0d-00b9dea3becd",
"metadata": {},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n",
" ········\n"
]
}
],
"source": [
"# Setup API keys for Kay and OpenAI\n",
"from getpass import getpass\n",
"\n",
"KAY_API_KEY = getpass()\n",
"OPENAI_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "f7fcaf70-29a4-444b-8f07-9784f808c300",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"KAY_API_KEY\"] = KAY_API_KEY\n",
"os.environ[\"OPENAI_API_KEY\"] = OPENAI_API_KEY"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "ac00bf93-3635-4ffe-b9a6-a8b4f35c0c85",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains import ConversationalRetrievalChain\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.retrievers import KayAiRetriever\n",
"\n",
"model = ChatOpenAI(model_name=\"gpt-3.5-turbo\")\n",
"retriever = KayAiRetriever.create(\n",
" dataset_id=\"company\", data_types=[\"PressRelease\"], num_contexts=6\n",
")\n",
"qa = ConversationalRetrievalChain.from_llm(model, retriever=retriever)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "8d9d927c-35b2-4a7b-8ea7-4d0350797941",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"-> **Question**: How is the healthcare industry adopting generative AI tools? \n",
"\n",
"**Answer**: The healthcare industry is adopting generative AI tools to improve various aspects of patient care and administrative tasks. Companies like HCA Healthcare Inc, Amazon Com Inc, and Mayo Clinic have collaborated with technology providers like Google Cloud, AWS, and Microsoft to implement generative AI solutions.\n",
"\n",
"HCA Healthcare is testing a nurse handoff tool that generates draft reports quickly and accurately, which nurses have shown interest in using. They are also exploring the use of Google's medically-tuned Med-PaLM 2 LLM to support caregivers in asking complex medical questions.\n",
"\n",
"Amazon Web Services (AWS) has introduced AWS HealthScribe, a generative AI-powered service that automatically creates clinical documentation. However, integrating multiple AI systems into a cohesive solution requires significant engineering resources, including access to AI experts, healthcare data, and compute capacity.\n",
"\n",
"Mayo Clinic is among the first healthcare organizations to deploy Microsoft 365 Copilot, a generative AI service that combines large language models with organizational data from Microsoft 365. This tool has the potential to automate tasks like form-filling, relieving administrative burdens on healthcare providers and allowing them to focus more on patient care.\n",
"\n",
"Overall, the healthcare industry is recognizing the potential benefits of generative AI tools in improving efficiency, automating tasks, and enhancing patient care. \n",
"\n"
]
}
],
"source": [
"# More sample questions in the Playground on https://kay.ai\n",
"questions = [\n",
"    \"How is the healthcare industry adopting generative AI tools?\",\n",
"    # \"What are some recent challenges faced by the renewable energy sector?\",\n",
"]\n",
"chat_history = []\n",
"\n",
"for question in questions:\n",
"    result = qa({\"question\": question, \"chat_history\": chat_history})\n",
"    chat_history.append((question, result[\"answer\"]))\n",
"    print(f\"-> **Question**: {question} \\n\")\n",
"    print(f\"**Answer**: {result['answer']} \\n\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
}
},
"nbformat": 4,
"nbformat_minor": 5
}


@@ -0,0 +1,168 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# RAG based on Qianfan and BES"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook implements retrieval-augmented generation (RAG) using the Baidu Qianfan Platform combined with Baidu ElasticSearch, where the original data is stored on BOS.\n",
"## Baidu Qianfan\n",
"Baidu AI Cloud Qianfan Platform is a one-stop large model development and service operation platform for enterprise developers. Qianfan provides not only models such as Wenxin Yiyan (ERNIE-Bot) and third-party open-source models, but also various AI development tools and a complete development environment, making it easy for customers to use and develop large model applications.\n",
"\n",
"## Baidu ElasticSearch\n",
"[Baidu Cloud VectorSearch](https://cloud.baidu.com/doc/BES/index.html?from=productToDoc) is a fully managed, enterprise-level distributed search and analysis service that is 100% compatible with open source. Baidu Cloud VectorSearch provides a low-cost, high-performance, and reliable retrieval and analysis platform for structured and unstructured data. As a vector database, it supports multiple index types and similarity distance methods. "
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Installation and Setup\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#!pip install qianfan\n",
"#!pip install bce-python-sdk\n",
"#!pip install elasticsearch==7.11.0"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Imports"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sentence_transformers\n",
"from baidubce.bce_client_configuration import BceClientConfiguration\n",
"from baidubce.auth.bce_credentials import BceCredentials\n",
"from langchain.chains import RetrievalQA\n",
"from langchain.document_loaders.baiducloud_bos_directory import BaiduBOSDirectoryLoader\n",
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain.embeddings.huggingface import HuggingFaceEmbeddings\n",
"from langchain.vectorstores import BESVectorStore\n",
"from langchain.llms.baidu_qianfan_endpoint import QianfanLLMEndpoint"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Document loading"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"bos_host = \"your bos endpoint\"\n",
"access_key_id = \"your bos access ak\"\n",
"secret_access_key = \"your bos access sk\"\n",
"\n",
"# create BceClientConfiguration\n",
"config = BceClientConfiguration(credentials=BceCredentials(access_key_id, secret_access_key), endpoint = bos_host)\n",
"\n",
"loader = BaiduBOSDirectoryLoader(conf=config, bucket=\"llm-test\", prefix=\"llm/\")\n",
"documents = loader.load()\n",
"\n",
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=0)\n",
"split_docs = text_splitter.split_documents(documents)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Embedding and VectorStore"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"embeddings = HuggingFaceEmbeddings(model_name=\"shibing624/text2vec-base-chinese\")\n",
"embeddings.client = sentence_transformers.SentenceTransformer(embeddings.model_name)\n",
"\n",
"db = BESVectorStore.from_documents(\n",
" documents=split_docs, embedding=embeddings, bes_url=\"your bes url\", index_name='test-index', vector_query_field='vector'\n",
" )\n",
"\n",
"db.client.indices.refresh(index='test-index')\n",
"retriever = db.as_retriever()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## QA Retriever"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm = QianfanLLMEndpoint(model=\"ERNIE-Bot\", qianfan_ak='your qianfan ak', qianfan_sk='your qianfan sk', streaming=True)\n",
"qa = RetrievalQA.from_chain_type(llm=llm, chain_type=\"refine\", retriever=retriever, return_source_documents=True)\n",
"\n",
"query = \"什么是张量?\"  # \"What is a tensor?\"\n",
"# With return_source_documents=True the chain has two output keys, so call it directly instead of using .run()\n",
"print(qa({\"query\": query})[\"result\"])"
]
},
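{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"> A tensor is a mathematical concept used to represent multidimensional data. It is an array that can hold multiple values and can be a scalar, a vector, a matrix, and so on. In deep learning and artificial intelligence, tensors are commonly used to represent the inputs, outputs, and weights of neural networks."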
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.9.17"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "aee8b7b246df8f9039afb4144a1f6fd8d2ca17a180786b69acc140d282b71a49"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}

cookbook/rag_fusion.ipynb Normal file

@@ -0,0 +1,272 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "993c2768",
"metadata": {},
"source": [
"# RAG Fusion\n",
"\n",
"Re-implemented from [this GitHub repo](https://github.com/Raudaschl/rag-fusion); all credit to the original author.\n",
"\n",
"> RAG-Fusion, a search methodology that aims to bridge the gap between traditional search paradigms and the multifaceted dimensions of human queries. Inspired by the capabilities of Retrieval Augmented Generation (RAG), this project goes a step further by employing multiple query generation and Reciprocal Rank Fusion to re-rank search results."
]
},
{
"cell_type": "markdown",
"id": "ebcc6791",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"For this example, we will use Pinecone and some fake data"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "661a1c36",
"metadata": {},
"outputs": [],
"source": [
"import pinecone\n",
"from langchain.vectorstores import Pinecone\n",
"from langchain.embeddings import OpenAIEmbeddings\n",
"\n",
"pinecone.init(api_key=\"...\", environment=\"...\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "48ef7e93",
"metadata": {},
"outputs": [],
"source": [
"all_documents = {\n",
" \"doc1\": \"Climate change and economic impact.\",\n",
" \"doc2\": \"Public health concerns due to climate change.\",\n",
" \"doc3\": \"Climate change: A social perspective.\",\n",
" \"doc4\": \"Technological solutions to climate change.\",\n",
" \"doc5\": \"Policy changes needed to combat climate change.\",\n",
" \"doc6\": \"Climate change and its impact on biodiversity.\",\n",
" \"doc7\": \"Climate change: The science and models.\",\n",
" \"doc8\": \"Global warming: A subset of climate change.\",\n",
" \"doc9\": \"How climate change affects daily weather.\",\n",
" \"doc10\": \"The history of climate change activism.\",\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fde89f0b",
"metadata": {},
"outputs": [],
"source": [
"vectorstore = Pinecone.from_texts(\n",
" list(all_documents.values()), OpenAIEmbeddings(), index_name=\"rag-fusion\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "22ddd041",
"metadata": {},
"source": [
"## Define the Query Generator\n",
"\n",
"We will now define a chain to do the query generation"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "1d547524",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.prompts import ChatPromptTemplate\n",
"from langchain.schema.output_parser import StrOutputParser"
]
},
{
"cell_type": "code",
"execution_count": 68,
"id": "af9ab4db",
"metadata": {},
"outputs": [],
"source": [
"from langchain import hub\n",
"\n",
"prompt = hub.pull(\"langchain-ai/rag-fusion-query-generation\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "3628b552",
"metadata": {},
"outputs": [],
"source": [
"# prompt = ChatPromptTemplate.from_messages([\n",
"# (\"system\", \"You are a helpful assistant that generates multiple search queries based on a single input query.\"),\n",
"# (\"user\", \"Generate multiple search queries related to: {original_query}\"),\n",
"# (\"user\", \"OUTPUT (4 queries):\")\n",
"# ])"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "8d6cbb73",
"metadata": {},
"outputs": [],
"source": [
"generate_queries = (\n",
" prompt | ChatOpenAI(temperature=0) | StrOutputParser() | (lambda x: x.split(\"\\n\"))\n",
")"
]
},
{
"cell_type": "markdown",
"id": "ee2824cd",
"metadata": {},
"source": [
"## Define the full chain\n",
"\n",
"We can now put it all together and define the full chain. This chain:\n",
" \n",
" 1. Generates a bunch of queries\n",
" 2. Looks up each query in the retriever\n",
" 3. Joins all the results together using reciprocal rank fusion\n",
" \n",
" \n",
"Note that it does NOT do a final generation step"
]
},
{
"cell_type": "code",
"execution_count": 50,
"id": "ca0bfec4",
"metadata": {},
"outputs": [],
"source": [
"original_query = \"impact of climate change\""
]
},
{
"cell_type": "code",
"execution_count": 75,
"id": "02437d65",
"metadata": {},
"outputs": [],
"source": [
"vectorstore = Pinecone.from_existing_index(\"rag-fusion\", OpenAIEmbeddings())\n",
"retriever = vectorstore.as_retriever()"
]
},
{
"cell_type": "code",
"execution_count": 76,
"id": "46a9a0e6",
"metadata": {},
"outputs": [],
"source": [
"from langchain.load import dumps, loads\n",
"\n",
"\n",
"def reciprocal_rank_fusion(results: list[list], k=60):\n",
"    fused_scores = {}\n",
"    for docs in results:\n",
"        # Assumes the docs are returned in sorted order of relevance\n",
"        for rank, doc in enumerate(docs):\n",
"            # Serialize the document so it can be used as a dict key\n",
"            doc_str = dumps(doc)\n",
"            if doc_str not in fused_scores:\n",
"                fused_scores[doc_str] = 0\n",
"            fused_scores[doc_str] += 1 / (rank + k)\n",
"\n",
"    reranked_results = [\n",
"        (loads(doc), score)\n",
"        for doc, score in sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)\n",
"    ]\n",
"    return reranked_results"
]
},
{
"cell_type": "code",
"execution_count": 77,
"id": "3f9d4502",
"metadata": {},
"outputs": [],
"source": [
"chain = generate_queries | retriever.map() | reciprocal_rank_fusion"
]
},
{
"cell_type": "code",
"execution_count": 78,
"id": "d70c4fcd",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[(Document(page_content='Climate change and economic impact.'),\n",
" 0.06558258417063283),\n",
" (Document(page_content='Climate change: A social perspective.'),\n",
" 0.06400409626216078),\n",
" (Document(page_content='How climate change affects daily weather.'),\n",
" 0.04787506400409626),\n",
" (Document(page_content='Climate change and its impact on biodiversity.'),\n",
" 0.03306010928961749),\n",
" (Document(page_content='Public health concerns due to climate change.'),\n",
" 0.016666666666666666),\n",
" (Document(page_content='Technological solutions to climate change.'),\n",
" 0.016666666666666666),\n",
" (Document(page_content='Policy changes needed to combat climate change.'),\n",
" 0.01639344262295082)]"
]
},
"execution_count": 78,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke({\"original_query\": original_query})"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7866e551",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
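
The `reciprocal_rank_fusion` step in the notebook above can be tried without Pinecone or OpenAI. The following is a minimal standalone sketch of the same scoring rule, where each document accumulates `1 / (rank + k)` over every ranked list it appears in, with 0-based ranks as in the notebook. It uses plain strings in place of `Document` objects, and the `rankings` data is hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(results, k=60):
    """Fuse several ranked lists; each item scores sum(1 / (rank + k))."""
    fused_scores = defaultdict(float)
    for docs in results:
        # Each inner list is assumed to be sorted from most to least relevant.
        for rank, doc in enumerate(docs):  # rank is 0-based, as in the notebook
            fused_scores[doc] += 1 / (rank + k)
    # Highest fused score first.
    return sorted(fused_scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical retrieval results for three generated queries.
rankings = [
    ["economic impact", "social perspective", "daily weather"],
    ["social perspective", "economic impact", "biodiversity"],
    ["economic impact", "daily weather", "public health"],
]

fused = reciprocal_rank_fusion(rankings)
for doc, score in fused:
    print(f"{score:.4f}  {doc}")
```

A document that ranks near the top of several lists ends up above one that appears only once, which is why "economic impact" leads here. The constant `k=60` damps the difference between adjacent ranks.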


@@ -0,0 +1,688 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Incorporating semantic similarity in tabular databases\n",
"\n",
"In this notebook we will cover how to run semantic search over a specific table column within a single SQL query, combining tabular query with RAG.\n",
"\n",
"\n",
"### Overall workflow\n",
"\n",
"1. Generating embeddings for a specific column\n",
"2. Storing the embeddings in a new column (if column has low cardinality, it's better to use another table containing unique values and their embeddings)\n",
"3. Querying using standard SQL queries with [PGVector](https://github.com/pgvector/pgvector) extension which allows using L2 distance (`<->`), Cosine distance (`<=>` or cosine similarity using `1 - <=>`) and Inner product (`<#>`)\n",
"4. Running standard SQL query\n",
"\n",
"### Requirements\n",
"\n",
"We will need a PostgreSQL database with [pgvector](https://github.com/pgvector/pgvector) extension enabled. For this example, we will use a `Chinook` database using a local PostgreSQL server."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = os.environ.get(\"OPENAI_API_KEY\") or getpass.getpass(\n",
" \"OpenAI API Key:\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.sql_database import SQLDatabase\n",
"from langchain.chat_models import ChatOpenAI\n",
"\n",
"CONNECTION_STRING = \"postgresql+psycopg2://postgres:test@localhost:5432/vectordb\" # Replace with your own\n",
"db = SQLDatabase.from_uri(CONNECTION_STRING)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Embedding the song titles"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"For this example, we will run queries based on semantic meaning of song titles. In order to do this, let's start by adding a new column in the table for storing the embeddings:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# db.run('ALTER TABLE \"Track\" ADD COLUMN \"embeddings\" vector;')"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's generate the embedding for each *track title* and store it as a new column in our \"Track\" table"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings import OpenAIEmbeddings\n",
"\n",
"embeddings_model = OpenAIEmbeddings()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3503"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
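],
"source": [
"import ast\n",
"\n",
"tracks = db.run('SELECT \"Name\" FROM \"Track\"')\n",
"# db.run returns the rows as a string; parse it safely instead of using eval\n",
"song_titles = [s[0] for s in ast.literal_eval(tracks)]\n",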
"title_embeddings = embeddings_model.embed_documents(song_titles)\n",
"len(title_embeddings)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's insert the embeddings into the new column of our table:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"from tqdm import tqdm\n",
"\n",
"for i in tqdm(range(len(title_embeddings))):\n",
"    # Escape single quotes so the title is a valid SQL string literal\n",
"    title = song_titles[i].replace(\"'\", \"''\")\n",
"    embedding = title_embeddings[i]\n",
"    sql_command = (\n",
"        f'UPDATE \"Track\" SET \"embeddings\" = ARRAY{embedding} WHERE \"Name\" ='\n",
"        + f\"'{title}'\"\n",
"    )\n",
"    db.run(sql_command)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"We can test the semantic search by running the following query:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'[(\"Tomorrow\\'s Dream\",), (\\'Remember Tomorrow\\',), (\\'Remember Tomorrow\\',), (\\'The Best Is Yet To Come\\',), (\"Thinking \\'Bout Tomorrow\",)]'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"embedded_title = embeddings_model.embed_query(\"hope about the future\")\n",
"query = (\n",
"    'SELECT \"Track\".\"Name\" FROM \"Track\" WHERE \"Track\".\"embeddings\" IS NOT NULL ORDER BY \"embeddings\" <-> '\n",
"    + f\"'{embedded_title}' LIMIT 5\"\n",
")\n",
"db.run(query)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating the SQL Chain"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's start by defining helper functions that fetch the schema from the database and run a query:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"def get_schema(_):\n",
"    return db.get_table_info()\n",
"\n",
"\n",
"def run_query(query):\n",
"    return db.run(query)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's build the **prompt** we will use. This prompt extends the [text-to-postgres-sql](https://smith.langchain.com/hub/jacob/text-to-postgres-sql?organizationId=f9b614b8-5c3a-4e7c-afbc-6d7ad4fd8892) prompt."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import ChatPromptTemplate\n",
"\n",
"template = \"\"\"You are a Postgres expert. Given an input question, first create a syntactically correct Postgres query to run, then look at the results of the query and return the answer to the input question.\n",
"Unless the user specifies in the question a specific number of examples to obtain, query for at most 5 results using the LIMIT clause as per Postgres. You can order the results to return the most informative data in the database.\n",
"Never query for all columns from a table. You must query only the columns that are needed to answer the question. Wrap each column name in double quotes (\") to denote them as delimited identifiers.\n",
"Pay attention to use only the column names you can see in the tables below. Be careful to not query for columns that do not exist. Also, pay attention to which column is in which table.\n",
"Pay attention to use date('now') function to get the current date, if the question involves \"today\".\n",
"\n",
"You can use an extra extension which allows you to run semantic similarity using <-> operator on tables containing columns named \"embeddings\".\n",
"<-> operator can ONLY be used on embeddings columns.\n",
"The embeddings value for a given row typically represents the semantic meaning of that row.\n",
"The vector represents an embedding representation of the question, given below. \n",
"Do NOT fill in the vector values directly, but rather specify a `[search_word]` placeholder, which should contain the word that would be embedded for filtering.\n",
"For example, if the user asks for songs about 'the feeling of loneliness' the query could be:\n",
"'SELECT \"[whatever_table_name]\".\"SongName\" FROM \"[whatever_table_name]\" ORDER BY \"embeddings\" <-> '[loneliness]' LIMIT 5'\n",
"\n",
"Use the following format:\n",
"\n",
"Question: <Question here>\n",
"SQLQuery: <SQL Query to run>\n",
"SQLResult: <Result of the SQLQuery>\n",
"Answer: <Final answer here>\n",
"\n",
"Only use the following tables:\n",
"\n",
"{schema}\n",
"\"\"\"\n",
"\n",
"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [(\"system\", template), (\"human\", \"{question}\")]\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"And we can create the chain using **[LangChain Expression Language](https://python.langchain.com/docs/expression_language/)**:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.schema.output_parser import StrOutputParser\n",
"from langchain.schema.runnable import RunnablePassthrough\n",
"\n",
"db = SQLDatabase.from_uri(\n",
" CONNECTION_STRING\n",
") # We reconnect to db so the new columns are loaded as well.\n",
"llm = ChatOpenAI(model_name=\"gpt-4\", temperature=0)\n",
"\n",
"sql_query_chain = (\n",
" RunnablePassthrough.assign(schema=get_schema)\n",
" | prompt\n",
" | llm.bind(stop=[\"\\nSQLResult:\"])\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'SQLQuery: SELECT \"Track\".\"Name\" FROM \"Track\" JOIN \"Genre\" ON \"Track\".\"GenreId\" = \"Genre\".\"GenreId\" WHERE \"Genre\".\"Name\" = \\'Rock\\' ORDER BY \"Track\".\"embeddings\" <-> \\'[dispair]\\' LIMIT 5'"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sql_query_chain.invoke(\n",
" {\n",
" \"question\": \"Which are the 5 rock songs with titles about deep feeling of dispair?\"\n",
" }\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"This chain simply generates the query. Now we will create the full chain that also handles the execution and the final result for the user:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"import re\n",
"from langchain.schema.runnable import RunnableLambda\n",
"\n",
"\n",
"def replace_brackets(match):\n",
" words_inside_brackets = match.group(1).split(\", \")\n",
" embedded_words = [\n",
" str(embeddings_model.embed_query(word)) for word in words_inside_brackets\n",
" ]\n",
" return \"', '\".join(embedded_words)\n",
"\n",
"\n",
"def get_query(query):\n",
" sql_query = re.sub(r\"\\[([\\w\\s,]+)\\]\", replace_brackets, query)\n",
" return sql_query\n",
"\n",
"\n",
"template = \"\"\"Based on the table schema below, question, sql query, and sql response, write a natural language response:\n",
"{schema}\n",
"\n",
"Question: {question}\n",
"SQL Query: {query}\n",
"SQL Response: {response}\"\"\"\n",
"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [(\"system\", template), (\"human\", \"{question}\")]\n",
")\n",
"\n",
"full_chain = (\n",
" RunnablePassthrough.assign(query=sql_query_chain)\n",
" | RunnablePassthrough.assign(\n",
" schema=get_schema,\n",
" response=RunnableLambda(lambda x: db.run(get_query(x[\"query\"]))),\n",
" )\n",
" | prompt\n",
" | llm\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using the Chain"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example 1: Filtering a column based on semantic meaning"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's say we want to retrieve songs that express `deep feeling of dispair`, but filtering based on genre:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"The 5 rock songs with titles that convey a deep feeling of despair are 'Sea Of Sorrow', 'Surrender', 'Indifference', 'Hard Luck Woman', and 'Desire'.\")"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"full_chain.invoke(\n",
" {\n",
" \"question\": \"Which are the 5 rock songs with titles about deep feeling of dispair?\"\n",
" }\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"What is substantially different in implementing this method is that we have combined:\n",
"- Semantic search (songs that have titles with some semantic meaning)\n",
"- Traditional tabular querying (running JOIN statements to filter track based on genre)\n",
"\n",
"This is something we _could_ potentially achieve using metadata filtering, but it's more complex to do so (we would need to use a vector database containing the embeddings, and use metadata filtering based on genre).\n",
"\n",
"However, for other use cases metadata filtering **wouldn't be enough**."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example 2: Combining filters"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"The three albums which have the most amount of songs in the top 150 saddest songs are 'International Superhits' with 5 songs, 'Ten' with 4 songs, and 'Album Of The Year' with 3 songs.\")"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"full_chain.invoke(\n",
" {\n",
" \"question\": \"I want to know the 3 albums which have the most amount of songs in the top 150 saddest songs\"\n",
" }\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"So we have result for 3 albums with most amount of songs in top 150 saddest ones. This **wouldn't** be possible using only standard metadata filtering. Without this _hybdrid query_, we would need some postprocessing to get the result.\n",
"\n",
"Another similar exmaple:"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"The 6 albums with the shortest titles that contain songs which are in the 20 saddest song list are 'Ten', 'Core', 'Big Ones', 'One By One', 'Black Album', and 'Miles Ahead'.\")"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"full_chain.invoke(\n",
" {\n",
" \"question\": \"I need the 6 albums with shortest title, as long as they contain songs which are in the 20 saddest song list.\"\n",
" }\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's see what the query looks like to double check:"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"WITH \"SadSongs\" AS (\n",
" SELECT \"TrackId\" FROM \"Track\" \n",
" ORDER BY \"embeddings\" <-> '[sad]' LIMIT 20\n",
"),\n",
"\"SadAlbums\" AS (\n",
" SELECT DISTINCT \"AlbumId\" FROM \"Track\" \n",
" WHERE \"TrackId\" IN (SELECT \"TrackId\" FROM \"SadSongs\")\n",
")\n",
"SELECT \"Album\".\"Title\" FROM \"Album\" \n",
"WHERE \"AlbumId\" IN (SELECT \"AlbumId\" FROM \"SadAlbums\") \n",
"ORDER BY \"title_len\" ASC \n",
"LIMIT 6\n"
]
}
],
"source": [
"print(\n",
" sql_query_chain.invoke(\n",
" {\n",
" \"question\": \"I need the 6 albums with shortest title, as long as they contain songs which are in the 20 saddest song list.\"\n",
" }\n",
" )\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example 3: Combining two separate semantic searches\n",
"\n",
"One interesting aspect of this approach which is **substantially different from using standar RAG** is that we can even **combine** two semantic search filters:\n",
"- _Get 5 saddest songs..._\n",
"- _**...obtained from albums with \"lovely\" titles**_\n",
"\n",
"This could generalize to **any kind of combined RAG** (paragraphs discussing _X_ topic belonging from books about _Y_, replies to a tweet about _ABC_ topic that express _XYZ_ feeling)\n",
"\n",
"We will combine semantic search on songs and album titles, so we need to do the same for `Album` table:\n",
"1. Generate the embeddings\n",
"2. Add them to the table as a new column (which we need to add in the table)"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {},
"outputs": [],
"source": [
"# db.run('ALTER TABLE \"Album\" ADD COLUMN \"embeddings\" vector;')"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 347/347 [00:01<00:00, 179.64it/s]\n"
]
}
],
"source": [
"albums = db.run('SELECT \"Title\" FROM \"Album\"')\n",
"album_titles = [title[0] for title in eval(albums)]\n",
"album_title_embeddings = embeddings_model.embed_documents(album_titles)\n",
"for i in tqdm(range(len(album_title_embeddings))):\n",
" album_title = album_titles[i].replace(\"'\", \"''\")\n",
" album_embedding = album_title_embeddings[i]\n",
" sql_command = (\n",
" f'UPDATE \"Album\" SET \"embeddings\" = ARRAY{album_embedding} WHERE \"Title\" ='\n",
" + f\"'{album_title}'\"\n",
" )\n",
" db.run(sql_command)"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"\"[('Realize',), ('Morning Dance',), ('Into The Light',), ('New Adventures In Hi-Fi',), ('Miles Ahead',)]\""
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"embeded_title = embeddings_model.embed_query(\"hope about the future\")\n",
"query = (\n",
" 'SELECT \"Album\".\"Title\" FROM \"Album\" WHERE \"Album\".\"embeddings\" IS NOT NULL ORDER BY \"embeddings\" <-> '\n",
" + f\"'{embeded_title}' LIMIT 5\"\n",
")\n",
"db.run(query)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can combine both filters:"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [],
"source": [
"db = SQLDatabase.from_uri(\n",
" CONNECTION_STRING\n",
") # We reconnect to dbso the new columns are loaded as well."
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='The songs about breakouts obtained from the top 5 albums about love are \\'Royal Orleans\\', \"Nobody\\'s Fault But Mine\", \\'Achilles Last Stand\\', \\'For Your Life\\', and \\'Hots On For Nowhere\\'.')"
]
},
"execution_count": 49,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"full_chain.invoke(\n",
" {\n",
" \"question\": \"I want to know songs about breakouts obtained from top 5 albums about love\"\n",
" }\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"This is something **different** that **couldn't be achieved** using standard metadata filtering over a vectordb."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.18"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
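
The `get_query` step in the notebook above swaps each `[search_word]` placeholder for an embedding literal before the SQL is executed. A minimal sketch of that substitution follows, assuming a hypothetical `fake_embed` function in place of `embeddings_model.embed_query` so that it runs without an API key. The toy vectors are illustrative only and carry no semantic meaning.

```python
import re


def fake_embed(text):
    """Hypothetical stand-in for embeddings_model.embed_query (assumption)."""
    # Deterministic toy vector: [length, number of lowercase vowels].
    return [float(len(text)), float(sum(ch in "aeiou" for ch in text))]


def replace_brackets(match):
    # "a, b" inside the brackets is split into separate search words.
    words = match.group(1).split(", ")
    return "', '".join(str(fake_embed(word)) for word in words)


def get_query(query):
    # Only bracketed runs of word characters, whitespace, and commas are replaced.
    return re.sub(r"\[([\w\s,]+)\]", replace_brackets, query)


sql = """SELECT "Name" FROM "Track" ORDER BY "embeddings" <-> '[sad]' LIMIT 5"""
filled = get_query(sql)
print(filled)
```

Because `re.sub` does not rescan text it has already substituted, the bracketed vector that replaces `[sad]` is left untouched.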

353
cookbook/rewrite.ipynb Normal file

@@ -0,0 +1,353 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "260629f9",
"metadata": {},
"source": [
"# Rewrite-Retrieve-Read\n",
"\n",
"**Rewrite-Retrieve-Read** is a method proposed in the paper [Query Rewriting for Retrieval-Augmented Large Language Models](https://arxiv.org/pdf/2305.14283.pdf)\n",
"\n",
"> Because the original query can not be always optimal to retrieve for the LLM, especially in the real world... we first prompt an LLM to rewrite the queries, then conduct retrieval-augmented reading\n",
"\n",
"We show how you can easily do that with LangChain Expression Language"
]
},
{
"cell_type": "markdown",
"id": "eda93712",
"metadata": {},
"source": [
"## Baseline\n",
"\n",
"Baseline RAG (**Retrieve-and-read**) can be done like the following:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "1d2edbd2",
"metadata": {},
"outputs": [],
"source": [
"from operator import itemgetter\n",
"\n",
"from langchain.prompts import ChatPromptTemplate\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.schema.output_parser import StrOutputParser\n",
"from langchain.schema.runnable import RunnablePassthrough, RunnableLambda\n",
"from langchain.utilities import DuckDuckGoSearchAPIWrapper"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "86a46aa9",
"metadata": {},
"outputs": [],
"source": [
"template = \"\"\"Answer the users question based only on the following context:\n",
"\n",
"<context>\n",
"{context}\n",
"</context>\n",
"\n",
"Question: {question}\n",
"\"\"\"\n",
"prompt = ChatPromptTemplate.from_template(template)\n",
"\n",
"model = ChatOpenAI(temperature=0)\n",
"\n",
"search = DuckDuckGoSearchAPIWrapper()\n",
"\n",
"\n",
"def retriever(query):\n",
" return search.run(query)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "8566d48e",
"metadata": {},
"outputs": [],
"source": [
"chain = (\n",
" {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
" | prompt\n",
" | model\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "5c57f9ee",
"metadata": {},
"outputs": [],
"source": [
"simple_query = \"what is langchain?\""
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "37c5f962",
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"\"LangChain is a powerful and versatile Python library that enables developers and researchers to create, experiment with, and analyze language models and agents. It simplifies the development of language-based applications by providing a suite of features for artificial general intelligence. It can be used to build chatbots, perform document analysis and summarization, and streamline interaction with various large language model providers. LangChain's unique proposition is its ability to create logical links between one or more language models, known as Chains. It is an open-source library that offers a generic interface to foundation models and allows prompt management and integration with other components and tools.\""
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke(simple_query)"
]
},
{
"cell_type": "markdown",
"id": "23bdb9bd",
"metadata": {},
"source": [
"While this is fine for well formatted queries, it can break down for more complicated queries"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "8df6a814",
"metadata": {},
"outputs": [],
"source": [
"distracted_query = \"man that sam bankman fried trial was crazy! what is langchain?\""
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "16d7db64",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Based on the given context, there is no information provided about \"langchain.\"'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke(distracted_query)"
]
},
{
"cell_type": "markdown",
"id": "0b4f8b93",
"metadata": {},
"source": [
"This is because the retriever does a bad job with these \"distracted\" queries"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "3439d8dc",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Business She\\'s the star witness against Sam Bankman-Fried. Her testimony was explosive Gary Wang, who co-founded both FTX and Alameda Research, said Bankman-Fried directed him to change a... The Verge, following the trial\\'s Oct. 4 kickoff: \"Is Sam Bankman-Fried\\'s Defense Even Trying to Win?\". CBS Moneywatch, from Thursday: \"Sam Bankman-Fried\\'s Lawyer Struggles to Poke ... Sam Bankman-Fried, FTX\\'s founder, responded with a single word: \"Oof.\". Less than a year later, Mr. Bankman-Fried, 31, is on trial in federal court in Manhattan, fighting criminal charges ... July 19, 2023. A U.S. judge on Wednesday overruled objections by Sam Bankman-Fried\\'s lawyers and allowed jurors in the FTX founder\\'s fraud trial to see a profane message he sent to a reporter days ... Sam Bankman-Fried, who was once hailed as a virtuoso in cryptocurrency trading, is on trial over the collapse of FTX, the financial exchange he founded. Bankman-Fried is accused of...'"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"retriever(distracted_query)"
]
},
{
"cell_type": "markdown",
"id": "7eb748ac",
"metadata": {},
"source": [
"## Rewrite-Retrieve-Read Implementation\n",
"\n",
"The main part is a rewriter to rewrite the search query"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "88ae702e",
"metadata": {},
"outputs": [],
"source": [
"template = \"\"\"Provide a better search query for \\\n",
"web search engine to answer the given question, end \\\n",
"the queries with **. Question: \\\n",
"{x} Answer:\"\"\"\n",
"rewrite_prompt = ChatPromptTemplate.from_template(template)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "184e1bcb",
"metadata": {},
"outputs": [],
"source": [
"from langchain import hub\n",
"\n",
"rewrite_prompt = hub.pull(\"langchain-ai/rewrite\")"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "a4c23d40",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Provide a better search query for web search engine to answer the given question, end the queries with **. Question {x} Answer:\n"
]
}
],
"source": [
"print(rewrite_prompt.template)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "f55cd010",
"metadata": {},
"outputs": [],
"source": [
"# Parser to remove the `**`\n",
"\n",
"\n",
"def _parse(text):\n",
" return text.strip(\"**\")"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "c9c34bef",
"metadata": {},
"outputs": [],
"source": [
"rewriter = rewrite_prompt | ChatOpenAI(temperature=0) | StrOutputParser() | _parse"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "fb17fb3d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'What is the definition and purpose of Langchain?'"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"rewriter.invoke({\"x\": distracted_query})"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "f83edb09",
"metadata": {},
"outputs": [],
"source": [
"rewrite_retrieve_read_chain = (\n",
" {\n",
" \"context\": {\"x\": RunnablePassthrough()} | rewriter | retriever,\n",
" \"question\": RunnablePassthrough(),\n",
" }\n",
" | prompt\n",
" | model\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "43096322",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Based on the given context, LangChain is an open-source framework designed to simplify the creation of applications using large language models (LLMs). It enables LLM models to generate responses based on up-to-date online information and simplifies the organization of large volumes of data for easy access by LLMs. LangChain offers a standard interface for chains, integrations with other tools, and end-to-end chains for common applications. It is a robust library that streamlines interaction with various LLM providers. LangChain\\'s unique proposition is its ability to create logical links between one or more LLMs, known as Chains. It is an AI framework with features that simplify the development of language-based applications and offers a suite of features for artificial general intelligence. However, the context does not provide any information about the \"sam bankman fried trial\" mentioned in the question.'"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"rewrite_retrieve_read_chain.invoke(distracted_query)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "59874b4f",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
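
The data flow of the rewrite-retrieve-read chain above can be sketched without any LLM or search dependency. In this sketch, `fake_rewriter` and `fake_retriever` are hypothetical stand-ins (assumptions, not LangChain APIs) for the rewriter chain and the DuckDuckGo wrapper. The point is that the rewritten query feeds retrieval while the original question still reaches the reader prompt.

```python
def parse_rewrite(text):
    # str.strip treats its argument as a set of characters, so strip("**")
    # and strip("*") behave the same: all leading/trailing asterisks go.
    return text.strip("*").strip()


def fake_rewriter(question):
    """Hypothetical stand-in for the LLM rewriter (assumption)."""
    # Keep only the last sentence, mimicking a rewrite that drops the distraction.
    last = question.replace("!", ".").split(".")[-1].strip()
    return f"{last}**"


def fake_retriever(query):
    """Hypothetical stand-in for the web search call (assumption)."""
    return f"search results for: {query}"


def rewrite_retrieve_read(question):
    query = parse_rewrite(fake_rewriter(question))
    # Rewritten query -> retrieval; original question -> reader prompt.
    return {"context": fake_retriever(query), "question": question}


result = rewrite_retrieve_read("man that trial was crazy! what is langchain?")
print(result["context"])
```

Note that the notebook's `_parse` removes only asterisks. If the model wraps its rewritten query in quotes or adds trailing whitespace, further cleanup may be needed.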


@@ -12,14 +12,14 @@
"\n",
"SalesGPT is context-aware, which means it can understand what section of a sales conversation it is in and act accordingly.\n",
" \n",
"As such, this agent can have a natural sales conversation with a prospect and behaves based on the conversation stage. Hence, this notebook demonstrates how we can use AI to automate sales development representatives activites, such as outbound sales calls. \n",
"As such, this agent can have a natural sales conversation with a prospect and behaves based on the conversation stage. Hence, this notebook demonstrates how we can use AI to automate sales development representatives activities, such as outbound sales calls. \n",
"\n",
"Additionally, the AI Sales agent has access to tools, which allow it to interact with other systems.\n",
"\n",
"Here, we show how the AI Sales Agent can use a **Product Knowledge Base** to speak about a particular's company offerings,\n",
"hence increasing relevance and reducing hallucinations.\n",
"\n",
"We leverage the [`langchain`](https://github.com/hwchase17/langchain) library in this implementation, specifically [Custom Agent Configuration](https://langchain-langchain.vercel.app/docs/modules/agents/how_to/custom_agent_with_tool_retrieval) and are inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) architecture ."
"We leverage the [`langchain`](https://github.com/langchain-ai/langchain) library in this implementation, specifically [Custom Agent Configuration](https://langchain-langchain.vercel.app/docs/modules/agents/how_to/custom_agent_with_tool_retrieval) and are inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) architecture ."
]
},
{
@@ -66,7 +66,7 @@
"metadata": {},
"outputs": [],
"source": [
"# install aditional dependencies\n",
"# install additional dependencies\n",
"# ! pip install chromadb openai tiktoken"
]
},
@@ -150,7 +150,7 @@
" {conversation_history}\n",
" ===\n",
"\n",
" Now determine what should be the next immediate conversation stage for the agent in the sales conversation by selecting ony from the following options:\n",
" Now determine what should be the next immediate conversation stage for the agent in the sales conversation by selecting only from the following options:\n",
" 1. Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional.\n",
" 2. Qualification: Qualify the prospect by confirming if they are the right person to talk to regarding your product/service. Ensure that they have the authority to make purchasing decisions.\n",
" 3. Value proposition: Briefly explain how your product/service can benefit the prospect. Focus on the unique selling points and value proposition of your product/service that sets it apart from competitors.\n",
@@ -277,7 +277,7 @@
" \n",
" ===\n",
"\n",
" Now determine what should be the next immediate conversation stage for the agent in the sales conversation by selecting ony from the following options:\n",
" Now determine what should be the next immediate conversation stage for the agent in the sales conversation by selecting only from the following options:\n",
" 1. Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional.\n",
" 2. Qualification: Qualify the prospect by confirming if they are the right person to talk to regarding your product/service. Ensure that they have the authority to make purchasing decisions.\n",
" 3. Value proposition: Briefly explain how your product/service can benefit the prospect. Focus on the unique selling points and value proposition of your product/service that sets it apart from competitors.\n",


@@ -0,0 +1,177 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "e93283d1",
"metadata": {},
"source": [
"# Selecting LLMs based on Context Length\n",
"\n",
"Different LLMs have different context lengths. As a very immediate an practical example, OpenAI has two versions of GPT-3.5-Turbo: one with 4k context, another with 16k context. This notebook shows how to route between them based on input."
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "cc453450",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain.schema.prompt import PromptValue\n",
"from langchain.schema.messages import BaseMessage\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.schema.output_parser import StrOutputParser\n",
"from typing import Union, Sequence"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "1cec6a10",
"metadata": {},
"outputs": [],
"source": [
"short_context_model = ChatOpenAI(model=\"gpt-3.5-turbo\")\n",
"long_context_model = ChatOpenAI(model=\"gpt-3.5-turbo-16k\")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "772da153",
"metadata": {},
"outputs": [],
"source": [
"def get_context_length(prompt: PromptValue):\n",
" messages = prompt.to_messages()\n",
" tokens = short_context_model.get_num_tokens_from_messages(messages)\n",
" return tokens"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "db771e20",
"metadata": {},
"outputs": [],
"source": [
"prompt = PromptTemplate.from_template(\"Summarize this passage: {context}\")"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "af057e2f",
"metadata": {},
"outputs": [],
"source": [
"def choose_model(prompt: PromptValue):\n",
" context_len = get_context_length(prompt)\n",
" if context_len < 30:\n",
" print(\"short model\")\n",
" return short_context_model\n",
" else:\n",
" print(\"long model\")\n",
" return long_context_model"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "84f3e07d",
"metadata": {},
"outputs": [],
"source": [
"chain = prompt | choose_model | StrOutputParser()"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "d8b14f8f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"short model\n"
]
},
{
"data": {
"text/plain": [
"'The passage mentions that a frog visited a pond.'"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke({\"context\": \"a frog went to a pond\"})"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "70ebd3dd",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"long model\n"
]
},
{
"data": {
"text/plain": [
"'The passage describes a frog that moved from one pond to another and perched on a log.'"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke(\n",
" {\"context\": \"a frog went to a pond and sat on a log and went to a different pond\"}\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a7e29fef",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
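
The routing idea in the notebook above comes down to counting tokens and picking a model by threshold. Here is a dependency-free sketch, assuming a hypothetical whitespace-based `count_tokens` in place of `get_num_tokens_from_messages` and using plain strings instead of model objects.

```python
def count_tokens(text):
    """Hypothetical token counter: whitespace-separated words (assumption)."""
    return len(text.split())


def choose_model(prompt_text, threshold=30):
    # Route to the larger-context model only when the prompt needs it.
    if count_tokens(prompt_text) < threshold:
        return "short_context_model"
    return "long_context_model"


short_prompt = "Summarize this passage: a frog went to a pond"
long_prompt = "Summarize this passage: " + "a frog went to a pond " * 10
print(choose_model(short_prompt), choose_model(long_prompt))
```

The threshold of 30 matches the notebook's demo value. In practice it would likely sit nearer the smaller model's context window, minus room for the generated output.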

File diff suppressed because it is too large


@@ -17,7 +17,7 @@
"\n",
"Note that SmartLLMChains\n",
"- use more LLM passes (ie n+2 instead of just 1)\n",
"- only work then the underlying LLM has the capability for reflection, whicher smaller models often don't\n",
"- only work then the underlying LLM has the capability for reflection, which smaller models often don't\n",
"- only work with underlying models that return exactly 1 output, not multiple\n",
"\n",
"This notebook demonstrates how to use a SmartLLMChain."
@@ -241,7 +241,7 @@
" ideation_llm=ChatOpenAI(temperature=0.9, model_name=\"gpt-4\"),\n",
" llm=ChatOpenAI(\n",
" temperature=0, model_name=\"gpt-4\"\n",
" ), # will be used for critqiue and resolution as no specific llms are given\n",
" ), # will be used for critique and resolution as no specific llms are given\n",
" prompt=prompt,\n",
" n_ideas=3,\n",
" verbose=True,\n",


@@ -1,3 +1,7 @@
# SQL Database Chain
This example demonstrates the use of the `SQLDatabaseChain` for answering questions over a SQL database.
Under the hood, LangChain uses SQLAlchemy to connect to SQL databases. The `SQLDatabaseChain` can therefore be used with any SQL dialect supported by SQLAlchemy, such as MS SQL, MySQL, MariaDB, PostgreSQL, Oracle SQL, [Databricks](/docs/ecosystem/integrations/databricks.html) and SQLite. Please refer to the SQLAlchemy documentation for more information about requirements for connecting to your database. For example, a connection to MySQL requires an appropriate connector such as PyMySQL. A URI for a MySQL connection might look like: `mysql+pymysql://user:pass@some_mysql_db_address/db_name`.
This demonstration uses SQLite and the example Chinook database.
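
Connection URIs like the MySQL one above break if the password contains characters such as `@` or `/`. A minimal sketch of building a URI with escaped credentials follows. The `build_uri` helper and the sample credentials are hypothetical and used only for illustration.

```python
from urllib.parse import quote_plus


def build_uri(driver, user, password, host, db_name):
    # Escape characters such as "@" or "/" that would otherwise break URI parsing.
    return f"{driver}://{quote_plus(user)}:{quote_plus(password)}@{host}/{db_name}"


uri = build_uri("mysql+pymysql", "user", "p@ss/word", "some_mysql_db_address", "db_name")
print(uri)
```

The resulting string can then be passed to `SQLDatabase.from_uri` in the same way as the SQLite URI used in this demonstration.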
@@ -31,8 +35,8 @@ db_chain.run("How many employees are there?")
<CodeOutputBlock lang="python">
```
> Entering new SQLDatabaseChain chain...
How many employees are there?
SQLQuery:
@@ -71,8 +75,8 @@ db_chain.run("How many albums by Aerosmith?")
<CodeOutputBlock lang="python">
```
> Entering new SQLDatabaseChain chain...
How many albums by Aerosmith?
SQLQuery:SELECT COUNT(*) FROM Album WHERE ArtistId = 3;
@@ -129,8 +133,8 @@ db_chain.run("How many employees are there in the foobar table?")
<CodeOutputBlock lang="python">
```
> Entering new SQLDatabaseChain chain...
How many employees are there in the foobar table?
SQLQuery:SELECT COUNT(*) FROM Employee;
@@ -165,8 +169,8 @@ result["intermediate_steps"]
<CodeOutputBlock lang="python">
```
> Entering new SQLDatabaseChain chain...
How many employees are there in the foobar table?
SQLQuery:SELECT COUNT(*) FROM Employee;
@@ -191,6 +195,112 @@ result["intermediate_steps"]
</CodeOutputBlock>
## Adding Memory
How to add memory to a SQLDatabaseChain:
```python
from langchain.llms import OpenAI
from langchain.utilities import SQLDatabase
from langchain_experimental.sql import SQLDatabaseChain
```
Set up the SQLDatabase and LLM
```python
db = SQLDatabase.from_uri("sqlite:///../../../../notebooks/Chinook.db")
llm = OpenAI(temperature=0, verbose=True)
```
Set up the memory
```python
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory()
```
Now we need to add a place for memory in the prompt template
```python
from langchain.prompts import PromptTemplate
PROMPT_SUFFIX = """Only use the following tables:
{table_info}
Previous Conversation:
{history}
Question: {input}"""
_DEFAULT_TEMPLATE = """Given an input question, first create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer. Unless the user specifies in his question a specific number of examples he wishes to obtain, always limit your query to at most {top_k} results. You can order the results by a relevant column to return the most interesting examples in the database.
Never query for all the columns from a specific table, only ask for a the few relevant columns given the question.
Pay attention to use only the column names that you can see in the schema description. Be careful to not query for columns that do not exist. Also, pay attention to which column is in which table.
Use the following format:
Question: Question here
SQLQuery: SQL Query to run
SQLResult: Result of the SQLQuery
Answer: Final answer here
"""
PROMPT = PromptTemplate.from_template(
_DEFAULT_TEMPLATE + PROMPT_SUFFIX,
)
```
Now let's create and run our chain
```python
db_chain = SQLDatabaseChain.from_llm(llm, db, prompt=PROMPT, verbose=True, memory=memory)
db_chain.run("name one employee")
```
<CodeOutputBlock lang="python">
```
> Entering new SQLDatabaseChain chain...
name one employee
SQLQuery:SELECT FirstName, LastName FROM Employee LIMIT 1
SQLResult: [('Andrew', 'Adams')]
Answer:Andrew Adams
> Finished chain.
'Andrew Adams'
```
</CodeOutputBlock>
```python
db_chain.run("how many letters in their name?")
```
<CodeOutputBlock lang="python">
```
> Entering new SQLDatabaseChain chain...
how many letters in their name?
SQLQuery:SELECT LENGTH(FirstName) + LENGTH(LastName) AS 'NameLength' FROM Employee WHERE FirstName = 'Andrew' AND LastName = 'Adams'
SQLResult: [(11,)]
Answer:Andrew Adams has 11 letters in their name.
> Finished chain.
'Andrew Adams has 11 letters in their name.'
```
</CodeOutputBlock>
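
To see why the follow-up question above can resolve "their name", it helps to look at what ends up in the `{history}` slot. The sketch below uses a hypothetical `SimpleBufferMemory` class as a simplified stand-in for `ConversationBufferMemory` (an assumption, not the library's implementation) and formats only the prompt suffix.

```python
PROMPT_SUFFIX = """Only use the following tables:
{table_info}

Previous Conversation:
{history}

Question: {input}"""


class SimpleBufferMemory:
    """Hypothetical, simplified stand-in for ConversationBufferMemory (assumption)."""

    def __init__(self):
        self.turns = []

    def save(self, question, answer):
        # Store each exchange as a Human/AI pair.
        self.turns.append(f"Human: {question}\nAI: {answer}")

    @property
    def history(self):
        return "\n".join(self.turns)


memory = SimpleBufferMemory()
memory.save("name one employee", "Andrew Adams")

prompt_text = PROMPT_SUFFIX.format(
    table_info='CREATE TABLE "Employee" (...)',
    history=memory.history,
    input="how many letters in their name?",
)
print(prompt_text)
```

With the first exchange in the prompt, the model has enough context to write a query filtered on `Andrew` and `Adams`.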
## Choosing how to limit the number of rows returned
If you are querying for several rows of a table you can select the maximum number of results you want to get by using the 'top_k' parameter (default is 10). This is useful for avoiding query results that exceed the prompt max length or consume tokens unnecessarily.
@@ -207,8 +317,8 @@ db_chain.run("What are some example tracks by composer Johann Sebastian Bach?")
<CodeOutputBlock lang="python">
```
> Entering new SQLDatabaseChain chain...
What are some example tracks by composer Johann Sebastian Bach?
SQLQuery:SELECT Name FROM Track WHERE Composer = 'Johann Sebastian Bach' LIMIT 3
@@ -246,23 +356,23 @@ print(db.table_info)
<CodeOutputBlock lang="python">
```
CREATE TABLE "Track" (
"TrackId" INTEGER NOT NULL,
"Name" NVARCHAR(200) NOT NULL,
"AlbumId" INTEGER,
"MediaTypeId" INTEGER NOT NULL,
"GenreId" INTEGER,
"Composer" NVARCHAR(220),
"Milliseconds" INTEGER NOT NULL,
"Bytes" INTEGER,
"UnitPrice" NUMERIC(10, 2) NOT NULL,
PRIMARY KEY ("TrackId"),
FOREIGN KEY("MediaTypeId") REFERENCES "MediaType" ("MediaTypeId"),
FOREIGN KEY("GenreId") REFERENCES "Genre" ("GenreId"),
"TrackId" INTEGER NOT NULL,
"Name" NVARCHAR(200) NOT NULL,
"AlbumId" INTEGER,
"MediaTypeId" INTEGER NOT NULL,
"GenreId" INTEGER,
"Composer" NVARCHAR(220),
"Milliseconds" INTEGER NOT NULL,
"Bytes" INTEGER,
"UnitPrice" NUMERIC(10, 2) NOT NULL,
PRIMARY KEY ("TrackId"),
FOREIGN KEY("MediaTypeId") REFERENCES "MediaType" ("MediaTypeId"),
FOREIGN KEY("GenreId") REFERENCES "Genre" ("GenreId"),
FOREIGN KEY("AlbumId") REFERENCES "Album" ("AlbumId")
)
/*
2 rows from Track table:
TrackId Name AlbumId MediaTypeId GenreId Composer Milliseconds Bytes UnitPrice
@@ -286,8 +396,8 @@ db_chain.run("What are some example tracks by Bach?")
<CodeOutputBlock lang="python">
```
> Entering new SQLDatabaseChain chain...
What are some example tracks by Bach?
SQLQuery:SELECT "Name", "Composer" FROM "Track" WHERE "Composer" LIKE '%Bach%' LIMIT 5
@@ -305,7 +415,7 @@ db_chain.run("What are some example tracks by Bach?")
</CodeOutputBlock>
### Custom Table Info
In some cases, it can be useful to provide custom table information instead of using the automatically generated table definitions and the first `sample_rows_in_table_info` sample rows. For example, if you know that the first few rows of a table are uninformative, it could help to manually provide example rows that are more diverse or provide more information to the model. It is also possible to limit the columns that will be visible to the model if there are unnecessary columns.
In some cases, it can be useful to provide custom table information instead of using the automatically generated table definitions and the first `sample_rows_in_table_info` sample rows. For example, if you know that the first few rows of a table are uninformative, it could help to manually provide example rows that are more diverse or provide more information to the model. It is also possible to limit the columns that will be visible to the model if there are unnecessary columns.
This information can be provided as a dictionary with table names as the keys and table information as the values. For example, let's provide a custom definition and sample rows for the Track table with only a few columns:
@@ -313,7 +423,7 @@ This information can be provided as a dictionary with table names as the keys an
```python
custom_table_info = {
"Track": """CREATE TABLE Track (
"TrackId" INTEGER NOT NULL,
"TrackId" INTEGER NOT NULL,
"Name" NVARCHAR(200) NOT NULL,
"Composer" NVARCHAR(220),
PRIMARY KEY ("TrackId")
@@ -342,22 +452,22 @@ print(db.table_info)
<CodeOutputBlock lang="python">
```
CREATE TABLE "Playlist" (
"PlaylistId" INTEGER NOT NULL,
"Name" NVARCHAR(120),
"PlaylistId" INTEGER NOT NULL,
"Name" NVARCHAR(120),
PRIMARY KEY ("PlaylistId")
)
/*
2 rows from Playlist table:
PlaylistId Name
1 Music
2 Movies
*/
CREATE TABLE Track (
"TrackId" INTEGER NOT NULL,
"TrackId" INTEGER NOT NULL,
"Name" NVARCHAR(200) NOT NULL,
"Composer" NVARCHAR(220),
PRIMARY KEY ("TrackId")
@@ -384,8 +494,8 @@ db_chain.run("What are some example tracks by Bach?")
<CodeOutputBlock lang="python">
```
> Entering new SQLDatabaseChain chain...
What are some example tracks by Bach?
SQLQuery:SELECT "Name" FROM Track WHERE "Composer" LIKE '%Bach%' LIMIT 5;
@@ -395,31 +505,31 @@ db_chain.run("What are some example tracks by Bach?")
Unless the user specifies in the question a specific number of examples to obtain, query for at most 5 results using the LIMIT clause as per SQLite. You can order the results to return the most informative data in the database.
Never query for all columns from a table. You must query only the columns that are needed to answer the question. Wrap each column name in double quotes (") to denote them as delimited identifiers.
Pay attention to use only the column names you can see in the tables below. Be careful to not query for columns that do not exist. Also, pay attention to which column is in which table.
Use the following format:
Question: "Question here"
SQLQuery: "SQL Query to run"
SQLResult: "Result of the SQLQuery"
Answer: "Final answer here"
Only use the following tables:
CREATE TABLE "Playlist" (
"PlaylistId" INTEGER NOT NULL,
"Name" NVARCHAR(120),
"PlaylistId" INTEGER NOT NULL,
"Name" NVARCHAR(120),
PRIMARY KEY ("PlaylistId")
)
/*
2 rows from Playlist table:
PlaylistId Name
1 Music
2 Movies
*/
CREATE TABLE Track (
"TrackId" INTEGER NOT NULL,
"TrackId" INTEGER NOT NULL,
"Name" NVARCHAR(200) NOT NULL,
"Composer" NVARCHAR(220),
PRIMARY KEY ("TrackId")
@@ -431,7 +541,7 @@ db_chain.run("What are some example tracks by Bach?")
2 Balls to the Wall None
3 My favorite song ever The coolest composer of all time
*/
Question: What are some example tracks by Bach?
SQLQuery:SELECT "Name" FROM Track WHERE "Composer" LIKE '%Bach%' LIMIT 5;
SQLResult: [('American Woman',), ('Concerto for 2 Violins in D Minor, BWV 1043: I. Vivace',), ('Aria Mit 30 Veränderungen, BWV 988 "Goldberg Variations": Aria',), ('Suite for Solo Cello No. 1 in G Major, BWV 1007: I. Prélude',), ('Toccata and Fugue in D Minor, BWV 565: I. Toccata',)]
@@ -451,7 +561,7 @@ db_chain.run("What are some example tracks by Bach?")
### SQL Views
In some cases, the table schema can be hidden behind a JSON or JSONB column. Adding row samples to the prompt might help, but won't always describe the data perfectly.
In some cases, the table schema can be hidden behind a JSON or JSONB column. Adding row samples to the prompt might help, but won't always describe the data perfectly.
For this reason, custom SQL views can help.
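A minimal sketch of the idea, assuming SQLite with its JSON functions (`json_extract`) available. The `accounts` table, its `stats` column, and the `accounts_v` view are made-up names for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A table whose interesting fields are buried in a JSON text column.
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, email TEXT, stats TEXT)")
conn.executemany(
    "INSERT INTO accounts (email, stats) VALUES (?, ?)",
    [
        ("a@example.com", '{"total_post": 12, "total_comments": 3}'),
        ("b@example.com", '{"total_post": 0, "total_comments": 7}'),
    ],
)

# The view flattens the JSON fields into ordinary typed columns.
conn.execute(
    """
    CREATE VIEW accounts_v AS
    SELECT id,
           email,
           CAST(json_extract(stats, '$.total_post') AS INTEGER) AS total_post,
           CAST(json_extract(stats, '$.total_comments') AS INTEGER) AS total_comments
    FROM accounts
    """
)

view_rows = conn.execute(
    "SELECT email, total_post FROM accounts_v ORDER BY id"
).fetchall()
print(view_rows)
```

A model shown the view's columns sees `total_post` and `total_comments` directly, rather than an opaque JSON blob it has to guess the shape of.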
@@ -503,19 +613,19 @@ chain.run("How many employees are also customers?")
<CodeOutputBlock lang="python">
```
> Entering new SQLDatabaseSequentialChain chain...
Table names to use:
['Employee', 'Customer']
> Entering new SQLDatabaseChain chain...
How many employees are also customers?
SQLQuery:SELECT COUNT(*) FROM Employee e INNER JOIN Customer c ON e.EmployeeId = c.SupportRepId;
SQLResult: [(59,)]
Answer:59 employees are also customers.
> Finished chain.
> Finished chain.
@@ -586,8 +696,8 @@ local_chain("How many customers are there?")
<CodeOutputBlock lang="python">
```
> Entering new SQLDatabaseChain chain...
How many customers are there?
SQLQuery:
@@ -773,8 +883,8 @@ print("\n" + yaml_example)
<CodeOutputBlock lang="python">
```
> Entering new SQLDatabaseChain chain...
List all the customer first names that start with 'a'
SQLQuery:
@@ -794,7 +904,7 @@ print("\n" + yaml_example)
[('François', 'František', 'Helena', 'Astrid', 'Daan', 'Kara', 'Eduardo', 'Alexandre', 'Fernanda', 'Mark', 'Frank', 'Jack', 'Dan', 'Kathy', 'Heather', 'Frank', 'Richard', 'Patrick', 'Julia', 'Edward', 'Martha', 'Aaron', 'Madalena', 'Hannah', 'Niklas', 'Camille', 'Marc', 'Wyatt', 'Isabelle', 'Ladislav', 'Lucas', 'Johannes', 'Stanisław', 'Joakim', 'Emma', 'Mark', 'Manoj', 'Puja']
> Finished chain.
*** Query succeeded
answer: '[(''François'', ''František'', ''Helena'', ''Astrid'', ''Daan'', ''Kara'',
''Eduardo'', ''Alexandre'', ''Fernanda'', ''Mark'', ''Frank'', ''Jack'', ''Dan'',
''Kathy'', ''Heather'', ''Frank'', ''Richard'', ''Patrick'', ''Julia'', ''Edward'',
@@ -825,7 +935,7 @@ print("\n" + yaml_example)
None\tGermany\t70174\t+49 0711 2842222\tNone\tleonekohler@surfeu.de\t5\n3\tFrançois\t\
Tremblay\tNone\t1498 rue Bélanger\tMontréal\tQC\tCanada\tH2G 1A7\t+1 (514) 721-4711\t\
None\tftremblay@gmail.com\t3\n*/"
```
</CodeOutputBlock>
@@ -838,20 +948,20 @@ YAML_EXAMPLES = """
- input: How many customers are not from Brazil?
table_info: |
CREATE TABLE "Customer" (
"CustomerId" INTEGER NOT NULL,
"FirstName" NVARCHAR(40) NOT NULL,
"LastName" NVARCHAR(20) NOT NULL,
"Company" NVARCHAR(80),
"Address" NVARCHAR(70),
"City" NVARCHAR(40),
"State" NVARCHAR(40),
"Country" NVARCHAR(40),
"PostalCode" NVARCHAR(10),
"Phone" NVARCHAR(24),
"Fax" NVARCHAR(24),
"Email" NVARCHAR(60) NOT NULL,
"SupportRepId" INTEGER,
PRIMARY KEY ("CustomerId"),
"CustomerId" INTEGER NOT NULL,
"FirstName" NVARCHAR(40) NOT NULL,
"LastName" NVARCHAR(20) NOT NULL,
"Company" NVARCHAR(80),
"Address" NVARCHAR(70),
"City" NVARCHAR(40),
"State" NVARCHAR(40),
"Country" NVARCHAR(40),
"PostalCode" NVARCHAR(10),
"Phone" NVARCHAR(24),
"Fax" NVARCHAR(24),
"Email" NVARCHAR(60) NOT NULL,
"SupportRepId" INTEGER,
PRIMARY KEY ("CustomerId"),
FOREIGN KEY("SupportRepId") REFERENCES "Employee" ("EmployeeId")
)
sql_cmd: SELECT COUNT(*) FROM "Customer" WHERE NOT "Country" = "Brazil";
@@ -860,8 +970,8 @@ YAML_EXAMPLES = """
- input: list all the genres that start with 'r'
table_info: |
CREATE TABLE "Genre" (
"GenreId" INTEGER NOT NULL,
"Name" NVARCHAR(120),
"GenreId" INTEGER NOT NULL,
"Name" NVARCHAR(120),
PRIMARY KEY ("GenreId")
)
@@ -874,7 +984,7 @@ YAML_EXAMPLES = """
*/
sql_cmd: SELECT "Name" FROM "Genre" WHERE "Name" LIKE 'r%';
sql_result: "[('Rock',), ('Rock and Roll',), ('Reggae',), ('R&B/Soul',)]"
answer: The genres that start with 'r' are Rock, Rock and Roll, Reggae and R&B/Soul.
answer: The genres that start with 'r' are Rock, Rock and Roll, Reggae and R&B/Soul.
"""
```
@@ -940,8 +1050,8 @@ result = local_chain("How many customers are from Brazil?")
<CodeOutputBlock lang="python">
```
> Entering new SQLDatabaseChain chain...
How many customers are from Brazil?
SQLQuery:SELECT count(*) FROM Customer WHERE Country = "Brazil";
@@ -960,8 +1070,8 @@ result = local_chain("How many customers are not from Brazil?")
<CodeOutputBlock lang="python">
```
> Entering new SQLDatabaseChain chain...
How many customers are not from Brazil?
SQLQuery:SELECT count(*) FROM customer WHERE country NOT IN (SELECT country FROM customer WHERE country = 'Brazil')
@@ -980,8 +1090,8 @@ result = local_chain("How many customers are there in total?")
<CodeOutputBlock lang="python">
```
> Entering new SQLDatabaseChain chain...
How many customers are there in total?
SQLQuery:SELECT count(*) FROM Customer;

351
cookbook/stepback-qa.ipynb Normal file

@@ -0,0 +1,351 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "83ef724e",
"metadata": {},
"source": [
"# Step-Back Prompting (Question-Answering)\n",
"\n",
"One prompting technique called \"Step-Back\" prompting can improve performance on complex questions by first asking a \"step back\" question. This can be combined with regular question-answering applications by then doing retrieval on both the original and step-back question.\n",
"\n",
"Read the paper [here](https://arxiv.org/abs/2310.06117)\n",
"\n",
"See an excellent blog post on this by Cobus Greyling [here](https://cobusgreyling.medium.com/a-new-prompt-engineering-technique-has-been-introduced-called-step-back-prompting-b00e8954cacb)\n",
"\n",
"In this cookbook we will replicate this technique. We modify the prompts used slightly to work better with chat models."
]
},
{
"cell_type": "code",
"execution_count": 85,
"id": "67b5cdac",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate\n",
"from langchain.schema.output_parser import StrOutputParser\n",
"from langchain.schema.runnable import RunnableLambda"
]
},
{
"cell_type": "code",
"execution_count": 86,
"id": "7e017c44",
"metadata": {},
"outputs": [],
"source": [
"# Few Shot Examples\n",
"examples = [\n",
" {\n",
" \"input\": \"Could the members of The Police perform lawful arrests?\",\n",
" \"output\": \"what can the members of The Police do?\",\n",
" },\n",
" {\n",
" \"input\": \"Jan Sindels was born in what country?\",\n",
" \"output\": \"what is Jan Sindels personal history?\",\n",
" },\n",
"]\n",
"# We now transform these to example messages\n",
"example_prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"human\", \"{input}\"),\n",
" (\"ai\", \"{output}\"),\n",
" ]\n",
")\n",
"few_shot_prompt = FewShotChatMessagePromptTemplate(\n",
" example_prompt=example_prompt,\n",
" examples=examples,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 87,
"id": "206415ee",
"metadata": {},
"outputs": [],
"source": [
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\n",
" \"system\",\n",
" \"\"\"You are an expert at world knowledge. Your task is to step back and paraphrase a question to a more generic step-back question, which is easier to answer. Here are a few examples:\"\"\",\n",
" ),\n",
" # Few shot examples\n",
" few_shot_prompt,\n",
" # New question\n",
" (\"user\", \"{question}\"),\n",
" ]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 88,
"id": "d643a85c",
"metadata": {},
"outputs": [],
"source": [
"question_gen = prompt | ChatOpenAI(temperature=0) | StrOutputParser()"
]
},
{
"cell_type": "code",
"execution_count": 182,
"id": "5ba21b2a",
"metadata": {},
"outputs": [],
"source": [
"question = \"was chatgpt around while trump was president?\""
]
},
{
"cell_type": "code",
"execution_count": 183,
"id": "5992c8ca",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'when was ChatGPT developed?'"
]
},
"execution_count": 183,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question_gen.invoke({\"question\": question})"
]
},
{
"cell_type": "code",
"execution_count": 190,
"id": "32667424",
"metadata": {},
"outputs": [],
"source": [
"from langchain.utilities import DuckDuckGoSearchAPIWrapper\n",
"\n",
"\n",
"search = DuckDuckGoSearchAPIWrapper(max_results=4)\n",
"\n",
"\n",
"def retriever(query):\n",
" return search.run(query)"
]
},
{
"cell_type": "code",
"execution_count": 191,
"id": "ffc28c91",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'This includes content about former President Donald Trump. According to further tests, ChatGPT successfully wrote poems admiring all recent U.S. presidents, but failed when we entered a query for ... On Wednesday, a Twitter user posted screenshots of him asking OpenAI\\'s chatbot, ChatGPT, to write a positive poem about former President Donald Trump, to which the chatbot declined, citing it ... While impressive in many respects, ChatGPT also has some major flaws. ... [President\\'s Name],\" refused to write a poem about ex-President Trump, but wrote one about President Biden ... During the Trump administration, Altman gained new attention as a vocal critic of the president. It was against that backdrop that he was rumored to be considering a run for California governor.'"
]
},
"execution_count": 191,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"retriever(question)"
]
},
{
"cell_type": "code",
"execution_count": 192,
"id": "00c77443",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"Will Douglas Heaven March 3, 2023 Stephanie Arnett/MITTR | Envato When OpenAI launched ChatGPT, with zero fanfare, in late November 2022, the San Francisco-based artificial-intelligence company... ChatGPT, which stands for Chat Generative Pre-trained Transformer, is a large language model -based chatbot developed by OpenAI and launched on November 30, 2022, which enables users to refine and steer a conversation towards a desired length, format, style, level of detail, and language. ChatGPT is an artificial intelligence (AI) chatbot built on top of OpenAI's foundational large language models (LLMs) like GPT-4 and its predecessors. This chatbot has redefined the standards of... June 4, 2023 ⋅ 4 min read 124 SHARES 13K At the end of 2022, OpenAI introduced the world to ChatGPT. Since its launch, ChatGPT hasn't shown significant signs of slowing down in developing new...\""
]
},
"execution_count": 192,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"retriever(question_gen.invoke({\"question\": question}))"
]
},
{
"cell_type": "code",
"execution_count": 193,
"id": "b257bc06",
"metadata": {},
"outputs": [],
"source": [
"# response_prompt_template = \"\"\"You are an expert of world knowledge. I am going to ask you a question. Your response should be comprehensive and not contradicted with the following context if they are relevant. Otherwise, ignore them if they are not relevant.\n",
"\n",
"# {normal_context}\n",
"# {step_back_context}\n",
"\n",
"# Original Question: {question}\n",
"# Answer:\"\"\"\n",
"# response_prompt = ChatPromptTemplate.from_template(response_prompt_template)"
]
},
{
"cell_type": "code",
"execution_count": 203,
"id": "f48c65b2",
"metadata": {},
"outputs": [],
"source": [
"from langchain import hub\n",
"\n",
"response_prompt = hub.pull(\"langchain-ai/stepback-answer\")"
]
},
{
"cell_type": "code",
"execution_count": 204,
"id": "97a6d5ab",
"metadata": {},
"outputs": [],
"source": [
"chain = (\n",
" {\n",
" # Retrieve context using the normal question\n",
" \"normal_context\": RunnableLambda(lambda x: x[\"question\"]) | retriever,\n",
" # Retrieve context using the step-back question\n",
" \"step_back_context\": question_gen | retriever,\n",
" # Pass on the question\n",
" \"question\": lambda x: x[\"question\"],\n",
" }\n",
" | response_prompt\n",
" | ChatOpenAI(temperature=0)\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 205,
"id": "ce554cb0",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"No, ChatGPT was not around while Donald Trump was president. ChatGPT was launched on November 30, 2022, which is after Donald Trump's presidency. The context provided mentions that during the Trump administration, Altman, the CEO of OpenAI, gained attention as a vocal critic of the president. This suggests that ChatGPT was not developed or available during that time.\""
]
},
"execution_count": 205,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke({\"question\": question})"
]
},
{
"cell_type": "markdown",
"id": "a9fb8dd2",
"metadata": {},
"source": [
"## Baseline"
]
},
{
"cell_type": "code",
"execution_count": 206,
"id": "00db8a15",
"metadata": {},
"outputs": [],
"source": [
"response_prompt_template = \"\"\"You are an expert of world knowledge. I am going to ask you a question. Your response should be comprehensive and not contradicted with the following context if they are relevant. Otherwise, ignore them if they are not relevant.\n",
"\n",
"{normal_context}\n",
"\n",
"Original Question: {question}\n",
"Answer:\"\"\"\n",
"response_prompt = ChatPromptTemplate.from_template(response_prompt_template)"
]
},
{
"cell_type": "code",
"execution_count": 207,
"id": "06335ebb",
"metadata": {},
"outputs": [],
"source": [
"chain = (\n",
" {\n",
" # Retrieve context using the normal question\n",
" \"normal_context\": RunnableLambda(lambda x: x[\"question\"]) | retriever,\n",
" # Pass on the question\n",
" \"question\": lambda x: x[\"question\"],\n",
" }\n",
" | response_prompt\n",
" | ChatOpenAI(temperature=0)\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 208,
"id": "15e0e741",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"Yes, ChatGPT was around while Donald Trump was president. However, it is important to note that the specific context you provided mentions that ChatGPT refused to write a positive poem about former President Donald Trump. This suggests that while ChatGPT was available during Trump's presidency, it may have had limitations or biases in its responses regarding him.\""
]
},
"execution_count": 208,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke({\"question\": question})"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e7b9e5d6",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
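Stripped of the LangChain and OpenAI dependencies, the flow in this notebook can be sketched in plain Python. Everything here is a stand-in: `step_back`, `retrieve`, and `build_prompt` are hypothetical stubs (a lookup table replaces the model and a dictionary replaces web search), so the sketch shows only how the two retrieval results are combined, not real model behavior.

```python
def step_back(question):
    """Stub for the question-generation chain: map a question to a broader one."""
    canned = {
        "was chatgpt around while trump was president?": "when was ChatGPT developed?",
    }
    return canned.get(question, question)


def retrieve(query):
    """Stub retriever: look the query up in a tiny in-memory 'search index'."""
    index = {
        "when was ChatGPT developed?": "ChatGPT was launched on November 30, 2022.",
    }
    return index.get(query, "")


def build_prompt(question):
    # Retrieve once with the original question and once with the step-back question.
    normal_context = retrieve(question)
    step_back_context = retrieve(step_back(question))
    return (
        f"{normal_context}\n{step_back_context}\n\n"
        f"Original Question: {question}\nAnswer:"
    )


prompt_text = build_prompt("was chatgpt around while trump was president?")
print(prompt_text)
```

In this toy setup the original question retrieves nothing useful, while the broader step-back question retrieves the launch date, which mirrors the difference between the baseline and step-back answers in the notebook.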

View File

@@ -51,7 +51,7 @@
}
],
"source": [
"sudoku_puzzle = \"3,*,*,2|1,*,3,*|*,1,*,3|4,*,*,1\"\n",
"sudoku_puzzle = \"3,*,*,2|1,*,3,*|*,1,*,3|4,*,*,1\"\n",
"sudoku_solution = \"3,4,1,2|1,2,3,4|2,1,4,3|4,3,2,1\"\n",
"problem_description = f\"\"\"\n",
"{sudoku_puzzle}\n",
@@ -64,7 +64,7 @@
"- Keep the known digits from previous valid thoughts in place.\n",
"- Each thought can be a partial or the final solution.\n",
"\"\"\".strip()\n",
"print(problem_description)\n"
"print(problem_description)"
]
},
{
@@ -89,8 +89,11 @@
"from langchain_experimental.tot.thought import ThoughtValidity\n",
"import re\n",
"\n",
"\n",
"class MyChecker(ToTChecker):\n",
" def evaluate(self, problem_description: str, thoughts: Tuple[str, ...] = ()) -> ThoughtValidity:\n",
" def evaluate(\n",
" self, problem_description: str, thoughts: Tuple[str, ...] = ()\n",
" ) -> ThoughtValidity:\n",
" last_thought = thoughts[-1]\n",
" clean_solution = last_thought.replace(\" \", \"\").replace('\"', \"\")\n",
" regex_solution = clean_solution.replace(\"*\", \".\").replace(\"|\", \"\\\\|\")\n",
@@ -116,10 +119,22 @@
"outputs": [],
"source": [
"checker = MyChecker()\n",
"assert checker.evaluate(\"\", (\"3,*,*,2|1,*,3,*|*,1,*,3|4,*,*,1\",)) == ThoughtValidity.VALID_INTERMEDIATE\n",
"assert checker.evaluate(\"\", (\"3,4,1,2|1,2,3,4|2,1,4,3|4,3,2,1\",)) == ThoughtValidity.VALID_FINAL\n",
"assert checker.evaluate(\"\", (\"3,4,1,2|1,2,3,4|2,1,4,3|4,3,*,1\",)) == ThoughtValidity.VALID_INTERMEDIATE\n",
"assert checker.evaluate(\"\", (\"3,4,1,2|1,2,3,4|2,1,4,3|4,*,3,1\",)) == ThoughtValidity.INVALID"
"assert (\n",
" checker.evaluate(\"\", (\"3,*,*,2|1,*,3,*|*,1,*,3|4,*,*,1\",))\n",
" == ThoughtValidity.VALID_INTERMEDIATE\n",
")\n",
"assert (\n",
" checker.evaluate(\"\", (\"3,4,1,2|1,2,3,4|2,1,4,3|4,3,2,1\",))\n",
" == ThoughtValidity.VALID_FINAL\n",
")\n",
"assert (\n",
" checker.evaluate(\"\", (\"3,4,1,2|1,2,3,4|2,1,4,3|4,3,*,1\",))\n",
" == ThoughtValidity.VALID_INTERMEDIATE\n",
")\n",
"assert (\n",
" checker.evaluate(\"\", (\"3,4,1,2|1,2,3,4|2,1,4,3|4,*,3,1\",))\n",
" == ThoughtValidity.INVALID\n",
")"
]
},
{
@@ -203,7 +218,9 @@
"source": [
"from langchain_experimental.tot.base import ToTChain\n",
"\n",
"tot_chain = ToTChain(llm=llm, checker=MyChecker(), k=30, c=5, verbose=True, verbose_llm=False)\n",
"tot_chain = ToTChain(\n",
" llm=llm, checker=MyChecker(), k=30, c=5, verbose=True, verbose_llm=False\n",
")\n",
"tot_chain.run(problem_description=problem_description)"
]
},

View File

@@ -35,7 +35,7 @@
"tags": []
},
"source": [
"### API keys and other secrats\n",
"### API keys and other secrets\n",
"\n",
"We use an `.ini` file, like this: \n",
"```\n",

3
docker/Dockerfile.base Normal file

@@ -0,0 +1,3 @@
FROM python:3.11
RUN pip install langchain

View File

@@ -8,11 +8,14 @@ set -o xtrace
SCRIPT_DIR="$(cd "$(dirname "$0")"; pwd)"
cd "${SCRIPT_DIR}"
mkdir -p _dist/docs_skeleton
cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
poetry run nbdoc_build
poetry run python generate_api_reference_links.py
mkdir -p ../_dist
cp -r . ../_dist
cd ../_dist
poetry run python scripts/model_feat_table.py
poetry run nbdoc_build --srcdir docs
cp ../cookbook/README.md src/pages/cookbook.mdx
cp ../.github/CONTRIBUTING.md docs/contributing.md
wget https://raw.githubusercontent.com/langchain-ai/langserve/main/README.md -O docs/langserve.md
poetry run python scripts/generate_api_reference_links.py
yarn install
yarn start

View File

@@ -42,7 +42,7 @@ If you are using GitHub pages for hosting, this command is a convenient way to b
### Continuous Integration
Some common defaults for linting/formatting have been set for you. If you integrate your project with an open source Continuous Integration system (e.g. Travis CI, CircleCI), you may check for issues using the following command.
Some common defaults for linting/formatting have been set for you. If you integrate your project with an open-source Continuous Integration system (e.g. Travis CI, CircleCI), you may check for issues using the following command.
```
$ yarn ci

View File

@@ -3,7 +3,7 @@
# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXOPTS ?= -j auto
SPHINXBUILD ?= sphinx-build
SPHINXAUTOBUILD ?= sphinx-autobuild
SOURCEDIR = .

View File

@@ -2,9 +2,9 @@
import importlib
import inspect
import typing
from pathlib import Path
from typing import TypedDict, Sequence, List, Dict, Literal, Union, Optional
from enum import Enum
from pathlib import Path
from typing import Dict, List, Literal, Optional, Sequence, TypedDict, Union
from pydantic import BaseModel
@@ -122,8 +122,7 @@ def _merge_module_members(
def _load_package_modules(
package_directory: Union[str, Path],
submodule: Optional[str] = None
package_directory: Union[str, Path], submodule: Optional[str] = None
) -> Dict[str, ModuleMembers]:
"""Recursively load modules of a package based on the file system.
@@ -171,7 +170,8 @@ def _load_package_modules(
# different way
if submodule is not None:
module_members = _load_module_members(
f"{package_name}.{submodule}.{namespace}", f"{submodule}.{namespace}"
f"{package_name}.{submodule}.{namespace}",
f"{submodule}.{namespace}",
)
else:
module_members = _load_module_members(
@@ -280,18 +280,9 @@ Functions
return full_doc
def main() -> None:
"""Generate the reference.rst file for each package."""
lc_members = _load_package_modules(PKG_DIR)
# Put some packages at top level
tools = _load_package_modules(PKG_DIR, "tools")
lc_members['tools.render'] = tools['render']
agents = _load_package_modules(PKG_DIR, "agents")
lc_members['agents.output_parsers'] = agents['output_parsers']
lc_members['agents.format_scratchpad'] = agents['format_scratchpad']
lc_doc = ".. _api_reference:\n\n" + _construct_doc("langchain", lc_members)
with open(WRITE_FILE, "w") as f:
f.write(lc_doc)
def _document_langchain_experimental() -> None:
"""Document the langchain_experimental package."""
# Generate experimental_api_reference.rst
exp_members = _load_package_modules(EXP_DIR)
exp_doc = ".. _experimental_api_reference:\n\n" + _construct_doc(
"langchain_experimental", exp_members
@@ -300,5 +291,36 @@ def main() -> None:
f.write(exp_doc)
def _document_langchain_core() -> None:
"""Document the main langchain package."""
# load top level module members
lc_members = _load_package_modules(PKG_DIR)
# Add additional packages
tools = _load_package_modules(PKG_DIR, "tools")
agents = _load_package_modules(PKG_DIR, "agents")
schema = _load_package_modules(PKG_DIR, "schema")
lc_members.update(
{
"agents.output_parsers": agents["output_parsers"],
"agents.format_scratchpad": agents["format_scratchpad"],
"tools.render": tools["render"],
"schema.runnable": schema["runnable"],
}
)
lc_doc = ".. _api_reference:\n\n" + _construct_doc("langchain", lc_members)
with open(WRITE_FILE, "w") as f:
f.write(lc_doc)
def main() -> None:
"""Generate the reference.rst file for each package."""
_document_langchain_core()
_document_langchain_experimental()
if __name__ == "__main__":
main()

File diff suppressed because one or more lines are too long

Some files were not shown because too many files have changed in this diff