Compare commits

..

233 Commits

Author SHA1 Message Date
William Fu-Hinthorn
812bf93818 Add vectorstore type info 2023-07-20 11:31:09 -07:00
William FH
e2a99bd169 Different error strings (#8010) 2023-07-20 09:58:25 -07:00
Bagatur
ec4f93b629 bump 238 (#8012) 2023-07-20 09:21:15 -07:00
vrushankportkey
5f10d2ea1d Add Portkey LLMOps integration (#7877)
Integrating Portkey, which adds production features like caching,
tracing, tagging, retries, etc. to langchain apps.

  - Dependencies: None
  - Twitter handle: https://twitter.com/portkeyai
  - test_portkey.py added for tests
  - example notebook added in new utilities folder in modules
  
 Also fixed a bug with OpenAIEmbeddings where headers weren't passing.

cc @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 09:08:44 -07:00
Boris Nieuwenhuis
095937ad52 Add google place ID to google places tool response (#7789)
- Description: this change will add the google place ID of the found
location to the response of the GooglePlacesTool
  - Issue: Not applicable
  - Dependencies: no dependencies
  - Tag maintainer: @hinthornw
  - Twitter handle: Not applicable
2023-07-20 09:04:31 -07:00
Bagatur
7c24a6b9d1 Bagatur/apify (#8008)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

---------

Co-authored-by: Jiří Moravčík <jiri.moravcik@gmail.com>
Co-authored-by: Jan Čurn <jan.curn@gmail.com>
2023-07-20 08:36:01 -07:00
Aiden Le
1d7414a371 Feature: Add openai_api_model attribute to Doctran models (#7868)
- Description: Added the ability to define the open AI model.
- Issue: Currently the Doctran instance uses gpt-4 by default, this does
not work if the user has no access to gpt -4.
  - rlancemartin, @eyurtsev, @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 07:27:56 -07:00
Dwai Banerjee
d8c40253c3 Adding endpoint_url to embeddings/bedrock.py and updated docs (#7927)
BedrockEmbeddings does not have endpoint_url so that switching to custom
endpoint is not possible. I have access to Bedrock custom endpoint and
cannot use BedrockEmbeddings

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 07:25:59 -07:00
Bagatur
ea028b66ab undo vectstore memory bug (#8007) 2023-07-20 07:25:23 -07:00
Mohammad Mohtashim
453d4c3a99 VectorStoreRetrieverMemory exclude additional input keys feature (#7941)
- Description: Added a parameter in VectorStoreRetrieverMemory which
filters the input given by the key when constructing the buffering the
document for Vector. This feature is helpful if you have certain inputs
apart from the VectorMemory's own memory_key that needs to be ignored
e.g when using combined memory, we might need to filter the memory_key
of the other memory, Please see the issue.
  - Issue: #7695
  - Tag maintainer: @rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 07:23:27 -07:00
Constantin Musca
d593833e4d Add Golden Query Tool (#7930)
**Description:** Golden Query is a wrapper on top of the [Golden Query
API](https://docs.golden.com/reference/query-api) which enables
programmatic access to query results on entities across Golden's
Knowledge Base. For more information about Golden API, please see the
[Golden API Getting
Started](https://docs.golden.com/reference/getting-started) page.
**Issue:** None
**Dependencies:** requests(already present in project)
**Tag maintainer:** @hinthornw

Signed-off-by: Constantin Musca <constantin.musca@gmail.com>
2023-07-20 07:03:20 -07:00
eahova
aea97efe8b Adding code to allow pandas to show all columns instead of truncating… (#7901)
- Description: Adding code to set pandas dataframe to display all the
columns. Otherwise, some data get truncated (it puts a "..." in the
middle and just shows the first 4 and last 4 columns) and the LLM
doesn't realize it isn't getting the full data. Default value is 8, so
this helps Dataframes larger than that.
  - Issue: none
  - Dependencies: none
  - Tag maintainer: @hinthornw 
  - Twitter handle: none
2023-07-20 07:02:01 -07:00
Santiago Delgado
c416dbe8e0 Amadeus Flight and Travel Search Tool (#7890)
## Background
With the addition on email and calendar tools, LangChain is continuing
to complete its functionality to automate business processes.

## Challenge
One of the pieces of business functionality that LangChain currently
doesn't have is the ability to search for flights and travel in order to
book business travel.

## Changes
This PR implements an integration with the
[Amadeus](https://developers.amadeus.com/) travel search API for
LangChain, enabling seamless search for flights with a single
authentication process.

## Who can review?
@hinthornw

## Appendix
@tsolakoua and @minjikarin, I utilized your
[amadeus-python](https://github.com/amadeus4dev/amadeus-python) library
extensively. Given the rising popularity of LangChain and similar AI
frameworks, the convergence of libraries like amadeus-python and tools
like this one is likely. So, I wanted to keep you updated on our
progress.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 06:59:29 -07:00
Hanit
ea149dbd89 Allowing outside parameters for Qdrant. (#7910)
@baskaryan @rlancemartin, @eyurtsev
2023-07-20 06:58:54 -07:00
Sheik Irfan Basha
d6493590da Add Verbose support (#7982) (#7984)
- Description: Add verbose support for the extraction_chain
- Issue: Fixes #7982 
- Dependencies: NA
- Twitter handle: sheikirfanbasha
@hwchase17 and @agola11

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 06:52:13 -07:00
Junlin Zhou
812a1643db chore(hf-text-gen): extract default params for reusing (#7929)
This PR extract common code (default generation params) for
`HuggingFaceTextGenInference`.

Co-authored-by: Junlin Zhou <jlzhou@zjuici.com>
2023-07-20 06:49:12 -07:00
Yun Kim
54e02e4392 Add datadog-langchain integration doc (#7955)
## Description
Added a doc about the [Datadog APM integration for
LangChain](https://github.com/DataDog/dd-trace-py/pull/6137).
Note that the integration is on `ddtrace`'s end and so no code is
introduced/required by this integration into the langchain library. For
that reason I've refrained from adding an example notebook (although
I've added setup instructions for enabling the integration in the doc)
as no code is technically required to enable the integration.

Tagging @baskaryan as reviewer on this PR, thank you very much!

## Dependencies
Datadog APM users will need to have `ddtrace` installed, but the
integration is on `ddtrace` end and so does not introduce any external
dependencies to the LangChain project.


Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-20 06:44:58 -07:00
Wian Stipp
0ffb7fc10c One Line Fix: missing text output with huggingface TGI LLM (#7972)
Small bug fix. The async _call method was missing a line to return the
generated text.

@baskaryan
2023-07-20 06:44:29 -07:00
Jithin James
493cbc9410 docs: fix a couple of small indentation errors in the strings (#7951)
Fixed a few indentations I came across in the docs @baskaryan
2023-07-20 06:34:01 -07:00
Bhashithe Abeysinghe
73901ef132 Added windows specific instructions to Llama.cpp documentation. (#8000)
- Description: Added windows specific instructions on llama.cpp in the
notebook file
  - Issue: #6356 
  - Dependencies: None
  - Tag maintainer: @baskaryan
2023-07-20 06:31:25 -07:00
Leonid Ganeline
24b26a922a docstrings for embeddings (#7973)
Added/updated docstrings for the `embeddings`

@baskaryan
2023-07-20 06:26:44 -07:00
Leonid Ganeline
0613ed5b95 docstrings for LLMs (#7976)
docstrings for the `llms/`:
- added missed docstrings
- update existing docstrings to consistent format (no `Wrappers`!)
@baskaryan
2023-07-20 06:26:16 -07:00
Jeff Huber
5694e7b8cf Update chroma notebook (#7978)
Fix up the Chroma notebook
- remove `.persist()` -- this is no longer in Chroma as of `0.4.0`
- update output to match `0.4.0`
- other cleanup work
2023-07-20 06:25:31 -07:00
Harutaka Kawamura
4a5894db47 Fix incorrect field name in MLflow AI Gateway config example (#7983) 2023-07-20 06:24:59 -07:00
Kacper Łukawski
19e8472521 Add async Qdrant to async_agent.ipynb (#7993)
I added Qdrant to the async API docs. This is the only vector store that
supports full async API.

@baskaryan @rlancemartin, @eyurtsev
2023-07-20 06:23:15 -07:00
Nuno Campos
8edb1db9dc Fix key errors in weaviate hybrid retriever init (#7988) 2023-07-20 06:22:18 -07:00
Harrison Chase
df84e1bb64 pass callbacks along baby ai (#7908) 2023-07-19 22:40:33 -07:00
William FH
a4c5914c9a Bump LS Version (#7970) 2023-07-19 17:12:16 -07:00
Bagatur
5d021c0962 nb fix (#7962) 2023-07-19 15:27:43 -07:00
Julien Salinas
3adab5e5be Integrate NLP Cloud embeddings endpoint (#7931)
Add embeddings for [NLPCloud](https://docs.nlpcloud.com/#embeddings).

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Lance Martin <lance@langchain.dev>
2023-07-19 15:27:34 -07:00
Bagatur
854a2be0ca Add debugging guide (#7956) 2023-07-19 14:15:11 -07:00
Brendan Collins
9aef79c2e3 Add Geopandas.GeoDataFrame Document Loader (#3817)
Work in Progress.
WIP
Not ready...

Adds Document Loader support for
[Geopandas.GeoDataFrames](https://geopandas.org/)

Example:
- [x] stub out `GeoDataFrameLoader` class
- [x] stub out integration tests
- [ ] Experiment with different geometry text representations
- [ ] Verify CRS is successfully added in metadata
- [ ] Test effectiveness of searches on geometries
- [ ] Test with different geometry types (point, line, polygon with
multi-variants).
- [ ] Add documentation

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Lance Martin <122662504+rlancemartin@users.noreply.github.com>
2023-07-19 12:14:41 -07:00
Lance Martin
dfc533aa74 Add llama-v2 to local document QA (#7952) 2023-07-19 11:15:47 -07:00
Bagatur
d9b5bcd691 bump (#7948) 2023-07-19 10:23:21 -07:00
Bagatur
f97535b33e fix (#7947) 2023-07-19 10:23:10 -07:00
Adilkhan Sarsen
7bb843477f Removed kwargs from add_texts (#7595)
Removing **kwargs argument from add_texts method in DeepLake vectorstore
as it confuses users and doesn't fail when user is typing incorrect
parameters.

Also added small test to ensure the change is applies correctly.

Guys could pls take a look: @rlancemartin, @eyurtsev, this is a small
PR.

Thx so much!
2023-07-19 09:23:49 -07:00
Bagatur
4d8b48bdb3 bump 236 (#7938) 2023-07-19 07:51:40 -07:00
Harutaka Kawamura
f6839a8682 Add integration for MLflow AI Gateway (#7113)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->


- Adds integration for MLflow AI Gateway (this will be shipped in MLflow
2.5 this week).


Manual testing:

```sh
# Move to mlflow repo
cd /path/to/mlflow

# install langchain
pip install git+https://github.com/harupy/langchain.git@gateway-integration

# launch gateway service
mlflow gateway start --config-path examples/gateway/openai/config.yaml

# Then, run the examples in this PR
```
2023-07-19 07:40:55 -07:00
David Preti
6792a3557d Update openai.py compatibility with azure 2023-07-01-preview (#7937)
Fixed missing "content" field in azure. 
Added a check for "content" in _dict (missing for azure
api=2023-07-01-preview)
@baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-19 07:31:18 -07:00
王斌(Bin Wang)
b65102bdb2 fix: pgvector search_type of similarity_score_threshold not working (#7771)
- Description: VectorStoreRetriever->similarity_score_threshold with
search_type of "similarity_score_threshold" not working with the
following two minor issues,
- Issue: 1. In line 237 of `vectorstores/base.py`, "score_threshold" is
passed to `_similarity_search_with_relevance_scores` as in the kwargs,
while score_threshold is not a valid argument of this method. As a fix,
before calling `_similarity_search_with_relevance_scores`,
score_threshold is popped from kwargs. 2. In line 596 to 607 of
`vectorstores/pgvector.py`, it's checking the distance_strategy against
the string in Enum. However, self.distance_strategy will get the
property of distance_strategy from line 316, where the callable function
is passed. To solve this issue, self.distance_strategy is changed to
self._distance_strategy to avoid calling the property method.,
  - Dependencies: No,
  - Tag maintainer: @rlancemartin, @eyurtsev,
  - Twitter handle: No

---------

Co-authored-by: Bin Wang <bin@arcanum.ai>
2023-07-19 07:20:52 -07:00
William FH
9d7e57f5c0 Docs Nit (#7918) 2023-07-18 21:47:28 -07:00
Wilson Leao Neto
8bb33f2296 Exposes Kendra result item DocumentAttributes in the document metadata (#7781)
- Description: exposes the ResultItem DocumentAttributes as document
metadata with key 'document_attributes' and refactors
AmazonKendraRetriever by providing a ResultItem base class in order to
avoid duplicate code;
- Tag maintainer: @3coins @hupe1980 @dev2049 @baskaryan
- Twitter handle: wilsonleao

### Why?
Some use cases depend on specific document attributes returned by the
retriever in order to improve the quality of the overall completion and
adjust what will be displayed to the user. For the sake of consistency,
we need to expose the DocumentAttributes as document metadata so we are
sure that we are using the values returned by the kendra request issued
by langchain.

I would appreciate your review @3coins @hupe1980 @dev2049. Thank you in
advance!

### References
- [Amazon Kendra
DocumentAttribute](https://docs.aws.amazon.com/kendra/latest/APIReference/API_DocumentAttribute.html)
- [Amazon Kendra
DocumentAttributeValue](https://docs.aws.amazon.com/kendra/latest/APIReference/API_DocumentAttributeValue.html)

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
2023-07-18 18:46:38 -07:00
Wilson Leao Neto
efa67ed0ef fix #7782: check title and excerpt separately for page_content (#7783)
- Description: check title and excerpt separately for page_content so
that if title is empty but excerpt is present, the page_content will
only contain the excerpt
  - Issue: #7782 
  - Tag maintainer: @3coins @baskaryan 
  - Twitter handle: wilsonleao
2023-07-18 18:46:23 -07:00
Leonid Ganeline
d92926cbc2 docstrings chains (#7892)
Added/updated docstrings.
2023-07-18 18:25:42 -07:00
Leonid Ganeline
4a810756f8 docstrings chains (#7892)
Added/updated docstrings.

@baskaryan
2023-07-18 18:25:27 -07:00
Jarek Kazmierczak
f2ef3ff54a Google Cloud Enterprise Search retriever (#7857)
Added a retriever that encapsulated Google Cloud Enterprise Search.


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 18:24:08 -07:00
Alonso Silva Allende
1152f4d48b Allow chat models that do not return token usage (#7907)
- Description: It allows to use chat models that do not return token
usage
- Issue: [#7900](https://github.com/hwchase17/langchain/issues/7900)
- Dependencies: None
- Tag maintainer: @agola11 @hwchase17 
- Twitter handle: @alonsosilva

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
2023-07-18 18:12:09 -07:00
Zizhong Zhang
bdf0c2267f docs(custom_chain) fix typo (#7898)
Fix typo in the document of custom_chain
2023-07-18 18:03:19 -07:00
Jeff Huber
2139d0197e upgrade chroma to 0.4.0 (#7749)
** This should land Monday the 17th ** 

Chroma is upgrading from `0.3.29` to `0.4.0`. `0.4.0` is easier to
build, more durable, faster, smaller, and more extensible. This comes
with a few changes:

1. A simplified and improved client setup. Instead of having to remember
weird settings, users can just do `EphemeralClient`, `PersistentClient`
or `HttpClient` (the underlying direct `Client` implementation is also
still accessible)

2. We migrated data stores away from `duckdb` and `clickhouse`. This
changes the api for the `PersistentClient` that used to reference
`chroma_db_impl="duckdb+parquet"`. Now we simply set
`is_persistent=true`. `is_persistent` is set for you to `true` if you
use `PersistentClient`.

3. Because we migrated away from `duckdb` and `clickhouse` - this also
means that users need to migrate their data into the new layout and
schema. Chroma is committed to providing extension notification and
tooling around any schema and data migrations (for example - this PR!).

After upgrading to `0.4.0` - if users try to access their data that was
stored in the previous regime, the system will throw an `Exception` and
instruct them how to use the migration assistant to migrate their data.
The migration assitant is a pip installable CLI: `pip install
chroma_migrate`. And is runnable by calling `chroma_migrate`

-- TODO ADD here is a short video demonstrating how it works. 

Please reference the readme at
[chroma-core/chroma-migrate](https://github.com/chroma-core/chroma-migrate)
to see a full write-up of our philosophy on migrations as well as more
details about this particular migration.

Please direct any users facing issues upgrading to our Discord channel
called
[#get-help](https://discord.com/channels/1073293645303795742/1129200523111841883).
We have also created a [email
listserv](https://airtable.com/shrHaErIs1j9F97BE) to notify developers
directly in the future about breaking changes.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 17:20:54 -07:00
Gergely Papp
10246375a5 Gpapp/chromadb (#7891)
- Description: version check to make sure chromadb >=0.4.0 does not
throw an error, and uses the default sqlite persistence engine when the
directory is set,
  - Issue: the issue #7887 

For attention of
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 17:03:42 -07:00
Lance Martin
41c841ec85 Add Llama-v2 to Llama.cpp notebook (#7913) 2023-07-18 15:13:27 -07:00
Bagatur
b9639f6067 fix docs (#7911) 2023-07-18 14:25:45 -07:00
Jeff Huber
dc8b790214 Improve vector store onboarding exp (#6698)
This PR
- fixes the `similarity_search_by_vector` example, makes the code run
and adds the example to mirror `similarity_search`
- reverts back to chroma from faiss to remove sharp edges / create a
happy path for new developers. (1) real metadata filtering, (2) expected
functionality like `update`, `delete`, etc to serve beyond the most
trivial use cases

@hwchase17

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 13:48:42 -07:00
Bagatur
25a2bdfb70 add pr template instructions (#7904) 2023-07-18 13:22:28 -07:00
Hanit
0d23c0c82a Allowing additional params for OpenAIEmbeddings. (#7752)
(#7654)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 12:14:51 -07:00
Lance Martin
862268175e Add llama-v2 to docs (#7893) 2023-07-18 12:09:09 -07:00
TRY-ER
21d1c988a9 Try er/redis index retrieval retry00 (#7773)
Replace this comment with:
- Description: Modified the code to return the document id from the
redis document search as metadata.
  - Issue: the issue # it fixes retrieval of id as metadata as string 
  - Tag maintainer: @rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 10:49:50 -07:00
shibuiwilliam
177baef3a1 Add test for svm retriever (#7768)
# What
- This is to add unit test for svm retriever.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 09:57:24 -07:00
Filip Michalsky
69b9db2b5e Notebook update: sales agent with tools (#7753)
- Description: This is an update to a previously published notebook. 
Sales Agent now has access to tools, and this notebook shows how to use
a Product Knowledge base
  to reduce hallucinations and act as a better sales person!
  - Issue: N/A
  - Dependencies: `chromadb openai tiktoken`
  - Tag maintainer:  @baskaryan @hinthornw
  - Twitter handle: @FilipMichalsky
2023-07-18 09:53:12 -07:00
shibuiwilliam
f29a5d4bcc add test for knn retriever (#7769)
# What
- This is to add test for knn retriever.
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 09:52:11 -07:00
Orgil
75d3f1e5e6 remove unused import in voice assistant doc (#7757)
Description: Removed unused import in voice_assistant doc. 
Tag maintainer: @baskaryan
2023-07-18 09:51:28 -07:00
maciej-skorupka
c6d1d6d7fc feat: moving azure OpenAI API version to the latest 2023-05-15 (#7764)
Moving to the latest non-preview Azure OpenAI API version=2023-05-15.
The previous 2023-03-15-preview doesn't have support, SLA etc. For
instance, OpenAI SDK has moved to this version
https://github.com/openai/openai-python/releases/tag/v0.27.7

@baskaryan
2023-07-18 09:50:15 -07:00
satorioh
259a409998 docs(zilliz): connection_args add token description for serverless cl… (#7810)
Description:

Currently, Zilliz only support dedicated clusters using a pair of
username and password for connection. Regarding serverless clusters,
they can connect to them by using API keys( [ see official note
detail](https://docs.zilliz.com/docs/manage-cluster-credentials)), so I
add API key(token) description in Zilliz docs to make it more obvious
and convenient for this group of users to better utilize Zilliz. No
changes done to code.

---------

Co-authored-by: Robin.Wang <3Jg$94sbQ@q1>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 09:31:39 -07:00
shibuiwilliam
235264a246 Add/test faiss (#7809)
# What
- Add missing test cases to faiss vectore stores
2023-07-18 08:30:35 -07:00
maciej-skorupka
5de7815310 docs: added comment from azure llm to azure chat about GPT-4 (#7884)
Azure GPT-4 models can't be accessed via LLM model. It's easy to miss
that and a lot of discussions about that are on the Internet. Therefore
I added a comment in Azure LLM docs that mentions that and points to
Azure Chat OpenAI docs.
@baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-18 08:05:41 -07:00
Leonid Ganeline
4a05b7f772 docstrings prompts (#7844)
Added missed docstrings in `prompts`
@baskaryan
2023-07-18 07:58:22 -07:00
Bill Zhang
dda11d2a05 WeaviateHybridSearchRetriever option to enable scores. (#7861)
Description: This PR adds the option to retrieve scores and explanations
in the WeaviateHybridSearchRetriever. This feature improves the
usability of the retriever by allowing users to understand the scoring
logic behind the search results and further refine their search queries.

Issue: This PR is a solution to the issue #7855 
Dependencies: This PR does not introduce any new dependencies.

Tag maintainer: @rlancemartin, @eyurtsev

I have included a unit test for the added feature, ensuring that it
retrieves scores and explanations correctly. I have also included an
example notebook demonstrating its use.
2023-07-18 07:57:17 -07:00
Leonid Ganeline
527210972e docstrings output_parsers (#7859)
Added/updated the docstrings from `output_parsers`
 @baskaryan
2023-07-18 07:51:44 -07:00
Jonathan Pedoeem
c460c29a64 Adding Docs for PromptLayerCallbackHandler (#7860)
Here I am adding documentation for the `PromptLayerCallbackHandler`.
When we created the initial PR for the callback handler the docs were
causing issues, so we merged without the docs.
2023-07-18 07:51:16 -07:00
ljeagle
3902b85657 Add metadata and page_content filters of documents in AwaDB (#7862)
1. Add the metadata filter of documents.
2. Add the text page_content filter of documents
3. fix the bug of similarity_search_with_score

Improvement and fix bug of AwaDB
Fix the conflict https://github.com/hwchase17/langchain/pull/7840
@rlancemartin @eyurtsev  Thanks!

---------

Co-authored-by: vincent <awadb.vincent@gmail.com>
2023-07-18 07:50:17 -07:00
German Martin
f1eaa9b626 Lost in the middle: We have been ordering documents the WRONG way. (for long context) (#7520)
Motivation, it seems that when dealing with a long context and "big"
number of relevant documents we must avoid using out of the box score
ordering from vector stores.
See: https://arxiv.org/pdf/2306.01150.pdf

So, I added an additional parameter that allows you to reorder the
retrieved documents so we can work around this performance degradation.
The relevance respect the original search score but accommodates the
lest relevant document in the middle of the context.
Extract from the paper (one image speaks 1000 tokens):

![image](https://github.com/hwchase17/langchain/assets/1821407/fafe4843-6e18-4fa6-9416-50cc1d32e811)
This seems to be common to all diff arquitectures. SO I think we need a
good generic way to implement this reordering and run some test in our
already running retrievers.
It could be that my approach is not the best one from the architecture
point of view, happy to have a discussion about that.
For me this was the best place to introduce the change and start
retesting diff implementations.

@rlancemartin, @eyurtsev

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
2023-07-18 07:45:15 -07:00
Bagatur
6a32f93669 add ls link (#7847) 2023-07-18 07:39:26 -07:00
Leonid Ganeline
17956ff08e docstrings agents (#7866)
Added/Updated docstrings for `agents`
@baskaryan
2023-07-18 02:23:24 -07:00
William FH
c6f2d27789 Docs Nits (#7874)
Add links to reference docs
2023-07-18 01:50:14 -07:00
William FH
3179ee3a56 Evals docs (#7460)
Still don't have good "how to's", and the guides / examples section
could be further pruned and improved, but this PR adds a couple examples
for each of the common evaluator interfaces.

- [x] Example docs for each implemented evaluator
- [x] "how to make a custom evalutor" notebook for each low level APIs
(comparison, string, agent)
- [x] Move docs to modules area
- [x] Link to reference docs for more information
- [X] Still need to finish the evaluation index page
- ~[ ] Don't have good data generation section~
- ~[ ] Don't have good how to section for other common scenarios / FAQs
like regression testing, testing over similar inputs to measure
sensitivity, etc.~
2023-07-18 01:00:01 -07:00
William FH
d87564951e LS0010 (#7871)
Bump langsmith version. Has some additional UX improvements
2023-07-18 00:28:37 -07:00
William FH
e294ba475a Some mitigations for RCE in PAL chain (#7870)
Some docstring / small nits to #6003

---------

Co-authored-by: BoazWasserman <49598618+boazwasserman@users.noreply.github.com>
Co-authored-by: HippoTerrific <49598618+HippoTerrific@users.noreply.github.com>
Co-authored-by: Or Raz <orraz1994@gmail.com>
2023-07-17 22:58:47 -07:00
Nicolas
46330da2e7 docs: Mendable: Fixes pretty sources not working (#7863)
This new version fixes the"Verified Sources" display that got broken.
Instead of displaying the full URL, it shows the title of the page the
source is from.
2023-07-17 18:23:46 -07:00
Leonid Ganeline
f5ae8f1980 docstrings tools (#7848)
Added docstrings in `tools`.

 @baskaryan
2023-07-17 17:50:19 -07:00
Leonid Ganeline
74b701f42b docstrings retrievers (#7858)
Added/updated docstrings `retrievers`

@baskaryan
2023-07-17 17:47:17 -07:00
Jasper
5b4d53e8ef Add text_content kwarg to BrowserlessLoader (#7856)
Added keyword argument to toggle between getting the text content of a
site versus its HTML when using the `BrowserlessLoader`
2023-07-17 17:02:19 -07:00
William FH
2aa3cf4e5f update notebook (#7852) 2023-07-17 14:46:42 -07:00
Matt Robinson
3c489be773 feat: optional post-processing for Unstructured loaders (#7850)
### Summary

Adds a post-processing method for Unstructured loaders that allows users
to optionally modify or clean extracted elements.

### Testing

```python
from langchain.document_loaders import UnstructuredFileLoader
from unstructured.cleaners.core import clean_extra_whitespace

loader = UnstructuredFileLoader(
    "./example_data/layout-parser-paper.pdf",
    mode="elements",
    post_processors=[clean_extra_whitespace],
)

docs = loader.load()
docs[:5]
```


### Reviewrs
  - @rlancemartin
  - @eyurtsev
  - @hwchase17
2023-07-17 12:13:05 -07:00
Bagatur
2a315dbee9 fix nb (#7843) 2023-07-17 09:39:11 -07:00
Bagatur
3f1302a4ab bump 235 (#7836) 2023-07-17 09:37:20 -07:00
Mike Lambert
9cdea4e0e1 Update to Anthropic's claude-v2 (#7793) 2023-07-17 08:55:49 -07:00
Bagatur
98c48f303a fix (#7838) 2023-07-17 07:53:11 -07:00
Bagatur
111bd7ddbe specify comparators (#7805) 2023-07-17 07:30:48 -07:00
Dayuan Jiang
ee40d37098 add bm25 module (#7779)
- Description: Add a BM25 Retriever that do not need Elastic search
- Dependencies: rank_bm25(if it is not installed it will be install by
using pip, just like TFIDFRetriever do)
  - Tag maintainer: @rlancemartin, @eyurtsev
  - Twitter handle: DayuanJian21687

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-17 07:30:17 -07:00
Liu Ming
fa0a9e502a Add LLM for ChatGLM(2)-6B API (#7774)
Description:
Add LLM for ChatGLM-6B & ChatGLM2-6B API

Related Issue: 
Will the langchain support ChatGLM? #4766
Add support for selfhost models like ChatGLM or transformer models #1780

Dependencies: 
No extra library install required. 
It wraps api call to a ChatGLM(2)-6B server(start with api.py), so api
endpoint is required to run.

Tag maintainer:  @mlot 

Any comments on this PR would be appreciated.
---------

Co-authored-by: mlot <limpo2000@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-17 07:27:17 -07:00
sseide
25e3d3f283 Support Redis Sentinel database connections (#5196)
# Support Redis Sentinel database connections

This PR adds the support to connect not only to Redis standalone servers
but High Availability Replication sets too
(https://redis.io/docs/management/sentinel/)
Redis Replica Sets have on Master allowing to write data and 2+ replicas
with read-only access to the data. The additional Redis Sentinel
instances monitor all server and reconfigure the RW-Master on the fly if
it comes unavailable.

Therefore all connections must be made through the Sentinels the query
the current master for a read-write connection. This PR adds basic
support to also allow a redis connection url specifying a Sentinel as
Redis connection.

Redis documentation and Jupyter notebook with Redis examples are updated
to mention how to connect to a redis Replica Set with Sentinels

        - 

Remark - i did not found test cases for Redis server connections to add
new cases here. Therefor i tests the new utility class locally with
different kind of setups to make sure different connection urls are
working as expected. But no test case here as part of this PR.
2023-07-17 07:18:51 -07:00
Yifei Song
2e47412073 Add Xorbits agent (#7647)
- [Xorbits](https://doc.xorbits.io/en/latest/) is an open-source
computing framework that makes it easy to scale data science and machine
learning workloads in parallel. Xorbits can leverage multi cores or GPUs
to accelerate computation on a single machine, or scale out up to
thousands of machines to support processing terabytes of data.

- This PR added support for the Xorbits agent, which allows langchain to
interact with Xorbits Pandas dataframe and Xorbits Numpy array.
- Dependencies: This change requires the Xorbits library to be installed
in order to be used.
`pip install xorbits`
- Request for review: @hinthornw
- Twitter handle: https://twitter.com/Xorbitsio
2023-07-17 07:09:51 -07:00
Ankush Gola
ff3aada0b2 minor langsmith notebook fixes (#7814)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-16 21:27:03 -07:00
William FH
ca79044948 Export Tracer from callbacks (#7812)
Improve discoverability
2023-07-16 20:58:13 -07:00
William FH
beb38f4f4d Share client in evaluation callback (#7807)
Guarantee the evaluator traces go to same endpoint
2023-07-16 17:47:38 -07:00
William FH
1db13e8a85 Fix chat example output mapper (#7808)
Was only serializing when no key was provided
2023-07-16 17:47:05 -07:00
William FH
c58d35765d Add examples to docstrings (#7796)
and:
- remove dataset name from autogenerated project name
- print out project name to view
2023-07-16 12:05:56 -07:00
William FH
ed97af423c Accept LLM via constructor (#7794) 2023-07-16 08:46:36 -07:00
Ankush Gola
c4ece52dac update LangSmith notebook (#7767)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-15 21:05:09 -07:00
Kenny
0d058d4046 Add try except block to OpenAIWhisperParser (#7505) 2023-07-15 15:42:00 -07:00
William FH
4cb9f1eda8 Update langsmith version (#7759) 2023-07-15 12:01:41 -07:00
Lance Martin
1d06eee3b5 Fix ntbk link in docs (#7755)
Minor fix to running to
[docs](https://python.langchain.com/docs/use_cases/question_answering/local_retrieval_qa).
2023-07-15 09:11:18 -07:00
William FH
2e3d77c34e Fix eval loader when overriding arguments (#7734)
- Update the negative criterion descriptions to prevent bad predictions
- Add support for normalizing the string distance
- Fix potential json deserializing into float issues in the example
mapper
2023-07-15 08:30:32 -07:00
Bagatur
c871c04270 bump 234 (#7754) 2023-07-15 10:49:51 -04:00
Gordon Clark
96f3dff050 MediaWiki docloader improvements + unit tests (#5879)
Starting over from #5654 because I utterly borked the poetry.lock file.

Adds new paramerters for to the MWDumpLoader class:

* skip_redirecst (bool) Tells the loader to skip articles that redirect
to other articles. False by default.
* stop_on_error (bool) Tells the parser to skip any page that causes a
parse error. True by default.
* namespaces (List[int]) Tells the parser which namespaces to parse.
Contains namespaces from -2 to 15 by default.

Default values are chosen to preserve backwards compatibility.

Sample dump XML and full unit test coverage (with extended tests that
pass!) also included!

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-15 10:49:36 -04:00
Xavier
4c8106311f Add pip install langsmith for Quick Install part of README (#7694)
**Issue**
When I use conda to install langchain, a dependency error throwed -
"ModuleNotFoundError: No module named 'langsmith'"

**Updated**
Run `pip install langsmith` when install langchain with conda

Co-authored-by: xaver.xu <xavier.xu@batechworks.com>
2023-07-15 10:27:32 -04:00
Mohammad Mohtashim
b8b8a138df Simple Import fix in Tools Exception Docs (#7740)
Issue: #7720
 @hinthornw
2023-07-15 10:25:34 -04:00
Nicolas
43f900fd38 docs: Mendable Search Improvements (#7744)
- New pin-to-side (button). This functionality allows you to search the
docs while asking the AI for questions
- Fixed the search bar in Firefox that won't detect a mouse click
- Fixes and improvements overall in the model's performance
2023-07-15 10:19:21 -04:00
rjarun8
b7c409152a Document loader/debug (#7750)
Description: Added debugging output in DirectoryLoader to identify the
file being processed.
Issue: [Need a trace or debug feature in Lanchain DirectoryLoader
#7725](https://github.com/hwchase17/langchain/issues/7725)
Dependencies: No additional dependencies are required.
Tag maintainer: @rlancemartin, @eyurtsev
This PR enhances the DirectoryLoader with debugging output to help
diagnose issues when loading documents. This new feature does not add
any dependencies and has been tested on a local machine.
2023-07-15 10:18:27 -04:00
Lance Martin
b015647e31 Add GPT4All embeddings (#7743)
Support for [GPT4All
embeddings](https://docs.gpt4all.io/gpt4all_python_embedding.html)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-15 10:04:29 -04:00
Chang Sau Sheong
b6a7f40ad3 added support for Google Images search (#7751)
- Description: Added Google Image Search support for SerpAPIWrapper 
  - Issue: NA
  - Dependencies: None
  - Tag maintainer: @hinthornw
  - Twitter handle: @sausheong

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-15 10:04:18 -04:00
Kacper Łukawski
1ff5b67025 Implement async API for Qdrant vector store (#7704)
Inspired by #5550, I implemented full async API support in Qdrant. The
docs were extended to mention the existence of asynchronous operations
in Langchain. I also used that chance to restructure the tests of Qdrant
and provided a suite of tests for the async version. Async API requires
the GRPC protocol to be enabled. Thus, it doesn't work on local mode
yet, but we're considering including the support to be consistent.
2023-07-15 09:33:26 -04:00
Bearnardd
275b926cf7 add missing import (#7730)
Just a nit documentation fix

 @baskaryan
2023-07-14 20:03:23 -04:00
Bearnardd
9800c6051c add support for truncate arg for HuggingFaceTextGenInference class (#7728)
Fixes https://github.com/hwchase17/langchain/issues/7650

* add support for `truncate` argument of `HugginFaceTextGenInference`

@baskaryan
2023-07-14 16:23:56 -04:00
Lorenzo
77e6bbe6f0 fix typo in deeplake.ipynb (#7718)
- Fixing typos in deeplake documentation
- @baskaryan
2023-07-14 13:38:31 -04:00
Samuel Berthe
2be3515a66 SQLDatabase: adding security disclamer (#7710)
It might be obvious to most engineers, but I think everybody should be
cautious when using such a chain.

![image](https://github.com/hwchase17/langchain/assets/2951285/a1df6567-9d56-4c12-98ea-767401ae2ac8)
2023-07-14 13:38:16 -04:00
William FH
fcf98dc4c1 Check for Tiktoken (#7705) 2023-07-14 09:49:01 -07:00
Bagatur
bae93682f6 update docs (#7714) 2023-07-14 11:49:09 -04:00
Bagatur
b065da6933 Bagatur/docs nit (#7712) 2023-07-14 11:13:02 -04:00
Bagatur
87d81b6acc Redirect old text splitter page (#7708)
related to #7665
2023-07-14 11:12:18 -04:00
Aarav Borthakur
210296a71f Integrate Rockset as a document loader (#7681)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

Integrate [Rockset](https://rockset.com/docs/) as a document loader.

Issue: None
Dependencies: Nothing new (rockset's dependency was already added
[here](https://github.com/hwchase17/langchain/pull/6216))
Tag maintainer: @rlancemartin

I have added a test for the integration and an example notebook showing
its use. I ran `make lint` and everything looks good.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-14 07:58:13 -07:00
Bagatur
ad7d97670b bump 233 (#7707) 2023-07-14 10:38:13 -04:00
Samuel Berthe
7d4843fe84 feat(chains): adding ElasticsearchDatabaseChain for interacting with analytics database (#7686)
This pull request adds a ElasticsearchDatabaseChain chain for
interacting with analytics database, in the manner of the
SQLDatabaseChain.

Maintainer: @samber
Twitter handler: samuelberthe

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-14 10:30:57 -04:00
Daniel
6d88b23ef7 Update pgembedding.ipynb (#7699)
Update the extension name. It changed from pg_hnsw to pg_embedding.

Thank you. I missed this in my previous commit.
2023-07-14 08:39:01 -04:00
Eric Speidel
663b0933e4 Allow passing auth objects in TextRequestsWrapper (#7701)
- Description: This allows passing auth objects in request wrappers.
Currently, we can handle auth by editing headers in the
RequestsWrappers, but more complex auth methods, such as Kerberos, could
be handled better by using existing functionality within the requests
library. There are many authentication options supported both natively
and by extensions, such as requests-kerberos or requests-ntlm.
  
  - Issue: Fixes #7542
  - Dependencies: none

Co-authored-by: eric.speidel@de.bosch.com <eric.speidel@de.bosch.com>
2023-07-14 08:38:24 -04:00
Nuno Campos
1e40427755 Enabled nesting chain group (#7697)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->
2023-07-14 10:03:16 +01:00
Leonid Kuligin
85e1c9b348 Added support for examples for VertexAI chat models. (#7636)
#5278

Co-authored-by: Leonid Kuligin <kuligin@google.com>
2023-07-14 02:03:04 -04:00
Richy Wang
45bb414be2 Add LLM for Alibaba's Damo Academy's Tongyi Qwen API (#7477)
- Add langchain.llms.Tonyi for text completion, in examples into the
Tonyi Text API,
- Add system tests.

Note async completion for the Text API is not yet supported and will be
included in a future PR.

Dependencies: dashscope. It will be installed manually cause it is not
need by everyone.

Happy for feedback on any aspect of this PR @hwchase17 @baskaryan.
2023-07-14 01:58:22 -04:00
Lance Martin
6325a3517c Make recursive loader yield while crawling (#7568)
Support actual lazy_load since it can take a while to crawl larger
directories.
2023-07-13 21:55:20 -07:00
UmerHA
82f3e32d8d [Small upgrade] Allow document limit in AzureCognitiveSearchRetriever (#7690)
Multiple people have asked in #5081 for a way to limit the documents
returned from an AzureCognitiveSearchRetriever. This PR adds the `top_n`
parameter to allow that.


Twitter handle:
 [@UmerHAdil](twitter.com/umerHAdil)
2023-07-13 23:04:40 -04:00
AI-Chef
af6d333147 Fix same issue #7524 in FileCallbackHandler (#7687)
Fix for Serializable class to include name, used in FileCallbackHandler
as same issue #7524

Description: Fixes the Serializable class to include 'name' attribute
(class_name) in the dict created,
This is used in Callbacks, specifically the StdOutCallbackHandler,
FileCallbackHandler.
Issue: As described in issue #7524
Dependencies: None
Tag maintainer: SInce this is related to the callback module, tagging
@agola11 @idoru
Comments:

Glad to see issue #7524 fixed in pull #6124, but you forget to change
the same place in FileCallbackHandler
2023-07-13 22:39:21 -04:00
Ben Perry
3874bb256e Weaviate: Batch embed texts (#5903)
When a custom Embeddings object is set, embed all given texts in a batch
instead of passing them through individually. Any code calling add_texts
can then appropriately size the chunks of texts that are passed through
to take full advantage of the hardware it's running on.
2023-07-13 20:57:58 -04:00
Charles P
574698a5fb Make so explicit class constructor is called in ElasticVectorSearch from_texts (#6199)
Fixes #6198 

ElasticKnnSearch.from_texts is actually ElasticVectorSearch.from_texts
and throws because it calls ElasticKnnSearch constructor with the wrong
arguments.

Now ElasticKnnSearch has its own from_texts, which constructs a proper
ElasticKnnSearch.

---------

Co-authored-by: Charles Parker <charlesparker@FiltaMacbook.local>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-13 19:55:20 -04:00
Daniel
854f3fe9b1 Update pgembedding.ipynb (#7682)
Correct links to the pg_embedding repository and the Neon documentation.
2023-07-13 19:54:07 -04:00
William FH
051fac1e66 Improve walkthrough links for sphinx (#7672)
Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>
2023-07-13 16:08:31 -07:00
Bagatur
5db4dba526 add integrations hub link to docs (#7675) 2023-07-13 18:44:10 -04:00
Kenton Parton
9124221d31 Fixed handling of absolute URLs in RecursiveUrlLoader (#7677)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description:
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

## Description
This PR addresses a bug in the RecursiveUrlLoader class where absolute
URLs were being treated as relative URLs, causing malformed URLs to be
produced. The fix involves using the urljoin function from the
urllib.parse module to correctly handle both absolute and relative URLs.

@rlancemartin @eyurtsev

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
2023-07-13 15:34:00 -07:00
EllieRoseS
c087ce74f7 Added matching async load func to PlaywrightURLLoader (#5938)
Fixes # (issue)

The existing PlaywrightURLLoader load() function uses a synchronous
browser which is not compatible with jupyter.
This PR adds a sister function aload() which can be run insisde a
notebook.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-07-13 17:51:38 -04:00
William FH
ae7714f1ba Configure Tracer Workers (#7676)
Mainline the tracer to avoid calling feedback before run is posted.
Chose a bool over `max_workers` arg for configuring since we don't want
to support > 1 for now anyway. At some point may want to manage the pool
ourselves (ordering only really matters within a run and with parent
runs)
2023-07-13 14:00:14 -07:00
Jasper
fbc97a77ed add browserless loader (#7562)
# Browserless

Added support for Browserless' `/content` endpoint as a document loader.

### About Browserless

Browserless is a cloud service that provides access to headless Chrome
browsers via a REST API. It allows developers to automate Chromium in a
serverless fashion without having to configure and maintain their own
Chrome infrastructure.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Lance Martin <lance@langchain.dev>
2023-07-13 13:18:28 -07:00
mebstyne-msft
120c52589b Enabled Azure Active Directory token-based auth access to OpenAI completions (#6313)
With AzureOpenAI openai_api_type defaulted to "azure" the logic in
utils' get_from_dict_or_env() function triggered by the root validator
never looks to environment for the user's runtime openai_api_type
values. This inhibits folks using token-based auth, or really any auth
model other than "azure."

By removing the "default" value, this allows environment variables to be
pulled at runtime for the openai_api_type and thus enables the other
api_types which are expected to work.

---------

Co-authored-by: Ebo <mebstyne@microsoft.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-07-13 16:05:47 -04:00
frangin2003
c7b687e944 Simplify GraphQL Tool Initialization documentation by Removing 'llm' Argument (#7651)
This PR is aimed at enhancing the clarity of the documentation in the
langchain project.

**Description**:
In the graphql.ipynb file, I have removed the unnecessary 'llm' argument
from the initialization process of the GraphQL tool (of type
_EXTRA_OPTIONAL_TOOLS). The 'llm' argument is not required for this
process. Its presence could potentially confuse users. This modification
simplifies the understanding of tool initialization and minimizes
potential confusion.

**Issue**: Not applicable, as this is a documentation improvement.

**Dependencies**: None.

**I kindly request a review from the following maintainer**: @hinthornw,
who is responsible for Agents / Tools / Toolkits.

No new integration is being added in this PR, hence no need for a test
or an example notebook.

Please see the changes for more detail and let me know if any further
modification is necessary.
2023-07-13 14:52:07 -04:00
William FH
aab2a7cd4b Normalize Trajectory Eval Score (#7668) 2023-07-13 09:58:28 -07:00
William FH
5f03cc3511 spelling nit (#7667) 2023-07-13 09:12:57 -07:00
Bagatur
3dd0704e38 bump 232 (#7659) 2023-07-13 10:32:39 -04:00
Tamas Molnar
24c1654208 Fix SQLAlchemy LLM cache clear (#7653)
Fixes #7652 

Description: 
This is a fix for clearing the cache for SQL Alchemy based LLM caches. 

The langchain.llm_cache.clear() did not take effect for SQLite cache. 
Reason: it didn't commit the deletion database change.

See SQLAlchemy documentation for proper usage:

https://docs.sqlalchemy.org/en/20/orm/session_basics.html#opening-and-closing-a-session
https://docs.sqlalchemy.org/en/20/orm/session_basics.html#deleting

@hwchase17 @baskaryan

---------

Co-authored-by: Tamas Molnar <tamas.molnar@nagarro.com>
2023-07-13 09:39:04 -04:00
Bagatur
c17a80f11c fix chroma updated upsert interface (#7643)
new chroma release seems to not support empty dicts for metadata.

related to #7633
2023-07-13 09:27:14 -04:00
William FH
a673a51efa [Breaking] Update Evaluation Functionality (#7388)
- Migrate from deprecated langchainplus_sdk to `langsmith` package
- Update the `run_on_dataset()` API to use an eval config
- Update a number of evaluators, as well as the loading logic
- Update docstrings / reference docs
- Update tracer to share single HTTP session
2023-07-13 02:13:06 -07:00
Sam Coward
224199083b Fix missing chain classname in StdOutCallbackHandler.on_chain_start (#6124)
Retrieves the name of the class from new location as of commit
18af149e91


Co-authored-by: Zander Chase <130414180+vowelparrot@users.noreply.github.com>
2023-07-13 03:05:36 -04:00
lucasiscovici
af3f401015 update base class of ListStepContainer to BaseStepContainer (#6232)
update base class of ListStepContainer to BaseStepContainer

Fixes #6231
2023-07-13 03:03:02 -04:00
Matt Adams
98e1bbfbbd Add missing dependencies to apify.ipynb (#6331)
Fixes errors caused by missing dependencies when running the notebook.
2023-07-13 03:02:23 -04:00
Ma Donghao
6f62e5461c Update the parser regex of map_rerank (#6419)
Sometimes the score responded by chatgpt would be like 'Respone
example\nScore: 90 (fully answers the question, but could provide more
detail on the specific error message)'
For the score contains not only numbers, it raise a ValueError like 


Update the RegexParser from `.*` to `\d*` would help us to ignore the
text after number.

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-13 03:01:42 -04:00
Bagatur
b08f903755 fix chroma init bug (#7639) 2023-07-13 03:00:33 -04:00
Nir Gazit
f307ca094b fix(memory): allow internal chains to use memory (#6769)
Fixed #6768.

This is a workaround only. I think a better longer-term solution is for
chains to declare how many input variables they *actually* need (as
opposed to ones that are in the prompt, where some may be satisfied by
the memory). Then, a wrapping chain can check the input match against
the actual input variables.

@hwchase17
2023-07-13 02:47:44 -04:00
Francisco Ingham
488d2d5da9 Entity extraction improvements (#6342)
Added fix to avoid irrelevant attributes being returned plus an example
of extracting unrelated entities and an exampe of using an 'extra_info'
attribute to extract unstructured data for an entity.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-13 02:16:05 -04:00
Nir Gazit
a8bbfb2da3 feat(agents): allow trimming of intermediate steps to last N (#6476)
Added an option to trim intermediate steps to last N steps. This is
especially useful for long-running agents. Users can explicitly specify
N or provide a function that does custom trimming/manipulation on
intermediate steps. I've mimicked the API of the `handle_parsing_errors`
parameter.
2023-07-13 02:09:25 -04:00
Zeeland
92ef77da35 fix: remove useless variable k (#6524)
remove useless variable k

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-13 01:58:36 -04:00
Bagatur
7f8ff2a317 add tagger nb (#7637) 2023-07-13 01:48:23 -04:00
Sidchat95
c5e50c40c9 Fix Document Similarity Check with passed Threshold (#6845)
Converting the Similarity obtained in the
similarity_search_with_score_by_vector method whilst comparing to the
passed
threshold. This is because the passed threshold is a number between 0 to
1 and is already in the relevance_score_fn format.
As of now, the function is comparing two different scoring parameters
and that wouldn't work.

Dependencies
None

Issue:
Different scores being compared in
similarity_search_with_score_by_vector method in FAISS.

Tag maintainer
@hwchase17



<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @dev2049
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @dev2049
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @vowelparrot
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-13 01:30:47 -04:00
Jacob Ajit
a08baa97c5 Use modern OpenAI endpoints for embeddings (#6573)
- Description: 

LangChain passes
[engine](https://github.com/hwchase17/langchain/blob/master/langchain/embeddings/openai.py#L256)
and not `model` as a field when making OpenAI requests. Within the
`openai` Python library, for OpenAI requests, this [makes a
call](https://github.com/openai/openai-python/blob/main/openai/api_resources/abstract/engine_api_resource.py#L58)
to an endpoint of the form
`https://api.openai.com/v1/engines/{engine_id}/embeddings`.

These endpoints are
[deprecated](https://help.openai.com/en/articles/6283125-what-happened-to-engines)
in favor of endpoints of the format
`https://api.openai.com/v1/embeddings`, where `model` is passed as a
parameter in the request body.

While these deprecated endpoints continue to function for now, they may
not be supported indefinitely and should be avoided in favor of the
newer API format.

It appears that `engine` was passed in instead of `model` to make both
Azure OpenAI and OpenAI calls work similarly. However, the inclusion of
`engine`
[causes](https://github.com/openai/openai-python/blob/main/openai/api_resources/abstract/engine_api_resource.py#L58)
OpenAI to use the deprecated endpoint, requiring a diverging code path
for Azure OpenAI calls where `engine` is passed in additionally (Azure
OpenAI requires `engine` to specify a deployment, and can optionally
take in `model`).

In the long-term, it may be worth considering spinning off Azure OpenAI
embeddings into a separate class for ease of use and maintenance,
similar to the [implementation for chat
models](https://github.com/hwchase17/langchain/blob/master/langchain/chat_models/azure_openai.py).
2023-07-13 01:23:17 -04:00
Jacob Lee
cdb93ab5ca Adds OpenAI functions powered document metadata tagger (#7521)
Adds a new document transformer that automatically extracts metadata for
a document based on an input schema. I also moved
`document_transformers.py` to `document_transformers/__init__.py` to
group it with this new transformer - it didn't seem to cause issues in
the notebook, but let me know if I've done something wrong there.

Also had a linter issue I couldn't figure out:

```
MacBook-Pro:langchain jacoblee$ make lint
poetry run mypy .
docs/dist/conf.py: error: Duplicate module named "conf" (also at "./docs/api_reference/conf.py")
docs/dist/conf.py: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#mapping-file-paths-to-modules for more info
docs/dist/conf.py: note: Common resolutions include: a) using `--exclude` to avoid checking one of them, b) adding `__init__.py` somewhere, c) using `--explicit-package-bases` or adjusting MYPYPATH
Found 1 error in 1 file (errors prevented further checking)
make: *** [lint] Error 2
```

@rlancemartin @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-13 01:12:41 -04:00
Jason Fan
8effd90be0 Add new types of document transformers (#7379)
- Description: Add two new document transformers that translates
documents into different languages and converts documents into q&a
format to improve vector search results. Uses OpenAI function calling
via the [doctran](https://github.com/psychic-api/doctran/tree/main)
library.
  - Issue: N/A
  - Dependencies: `doctran = "^0.0.5"`
  - Tag maintainer: @rlancemartin @eyurtsev @hwchase17 
  - Twitter handle: @psychicapi or @jfan001

Notes
- Adheres to the `DocumentTransformer` abstraction set by @dev2049 in
#3182
- refactored `EmbeddingsRedundantFilter` to put it in a file under a new
`document_transformers` module
- Added basic docs for `DocumentInterrogator`, `DocumentTransformer` as
well as the existing `EmbeddingsRedundantFilter`

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-12 23:53:30 -04:00
Piyush Jain
f11d845dee Fixed validation error when credentials_profile_name, or region_name is not passed (#7629)
## Summary
This PR corrects the checks for credentials_profile_name, and
region_name attributes. This was causing validation exceptions when
either of these values were missing during creation of the retriever
class.

Fixes #7571 

#### Requested reviewers:
@baskaryan
2023-07-12 23:47:35 -04:00
Jamie Broomall
0e1d7a27c6 WhyLabsCallbackHandler updates (#7621)
Updates to the WhyLabsCallbackHandler and example notebook
- Update dependency to langkit 0.0.6 which defines new helper methods
for callback integrations
- Update WhyLabsCallbackHandler to use the new `get_callback_instance`
so that the callback is mostly defined in langkit
- Remove much of the implementation of the WhyLabsCallbackHandler here
in favor of the callback instance

This does not change the behavior of the whylabs callback handler
implementation but is a reorganization that moves some of the
implementation externally to our optional dependency package, and should
make future updates easier.

@agola11
2023-07-12 23:46:56 -04:00
Gaurang Pawar
53722dcfdc Fixed a typo in pinecone_hybrid_search.ipynb (#7627)
Fixed a small typo in documentation
2023-07-12 23:46:41 -04:00
Bagatur
1d4db1327a fix openai structured chain with pydantic (#7622)
should return pydantic class
2023-07-12 23:46:13 -04:00
Bagatur
ee70d4a0cd mv tutorials (#7614) 2023-07-12 17:33:36 -04:00
William FH
9b215e761e Stop warning when parent run ID not present (#7611) 2023-07-12 14:04:32 -07:00
William FH
2f848294cb Rm Warning that Tracing is Experimental (#7612) 2023-07-12 14:04:28 -07:00
Yaohui Wang
d85c33a5c3 Fix the markdown rendering issue with a code block inside a markdown code block (#6625)
### Description

- Fix the markdown rendering issue with a code block inside a markdown,
using a different number of backticks for the delimiters.

Current doc site:
<https://python.langchain.com/docs/modules/data_connection/document_transformers/text_splitters/code_splitter#markdown>

After fix:
<img width="480" alt="image"
src="https://github.com/hwchase17/langchain/assets/3115235/d9921d59-64e6-4a34-9c62-79743667f528">


### Who can review

PTAL @dev2049 

Co-authored-by: Yaohui Wang <wangyaohui.01@bytedance.com>
2023-07-12 16:29:25 -04:00
Yaroslav Halchenko
0d92a7f357 codespell: workflow, config + some (quite a few) typos fixed (#6785)
Probably the most  boring PR to review ;)

Individual commits might be easier to digest

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2023-07-12 16:20:08 -04:00
Sam
931e68692e Adds a chain around sympy for symbolic math (#6834)
- Description: Adds a new chain that acts as a wrapper around Sympy to
give LLMs the ability to do some symbolic math.
- Dependencies: SymPy

---------

Co-authored-by: sreiswig <sreiswig@github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-12 15:17:32 -04:00
Bharat Ramanathan
be29a6287d feat: add model architecture back to wandb tracer (#6806)
# Description

This PR adds model architecture to the `WandbTracer` from the Serialized
Run kwargs. This allows visualization of the calling parameters of an
Agent, LLM and Tool in Weights & Biases.
    1. Safely serialize the run objects to WBTraceTree model_dict
    2. Refactors the run processing logic to be more organized.

- Twitter handle: @parambharat

---------

Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-12 15:00:18 -04:00
Alex Iribarren
adc96d60b6 Implement Function Callback tracer (#6835)
Description: I wanted to be able to redirect debug output to a function,
but it wasn't very easy. I figured it would make sense to implement a
`FunctionCallbackHandler`, and reimplement `ConsoleCallbackHandler` as a
subclass that calls the `print` function. Now I can create a simple
subclass in my project that calls `logging.info` or whatever I need.

Tag maintainer: @agola11
Twitter handle: `@andandaraalex`
2023-07-12 14:38:41 -04:00
Ducasse-Arthur
93a84f6182 Update bedrock.py - support of other endpoint url (esp. for users of … (#7592)
Added an _endpoint_url_ attribute to Bedrock(LLM) class - I have access
to Bedrock only via us-west-2 endpoint and needed to change the endpoint
url, this could be useful to other users
2023-07-12 10:43:23 -04:00
Bagatur
22525bad65 bump 231 (#7584) 2023-07-12 10:43:12 -04:00
Subsegment
6e1000dc8d docs : Use more meaningful cnosdb examples (#7587)
This change makes the ecosystem integrations cnosdb documentation more
realistic and easy to understand.

- change examples of question and table
- modify typo and format
2023-07-12 10:31:55 -04:00
Samuel ROZE
f3c9bf5e4b fix(typo): Clarify the point of llm_chain (#7593)
Fixes a typo introduced in
https://github.com/hwchase17/langchain/pull/7080 by @hwchase17.

In the example (visible on [the online
documentation](https://api.python.langchain.com/en/latest/chains/langchain.chains.conversational_retrieval.base.ConversationalRetrievalChain.html#langchain-chains-conversational-retrieval-base-conversationalretrievalchain)),
the `llm_chain` variable is unused as opposed to being used for the
question generator. This change makes it clearer.
2023-07-12 10:31:00 -04:00
Alec Flett
6cdd4b5edc only add handlers if they are new (#7504)
When using callbacks, there are times when callbacks can be added
redundantly: for instance sometimes you might need to create an llm with
specific callbacks, but then also create and agent that uses a chain
that has those callbacks already set. This means that "callbacks" might
get passed down again to the llm at predict() time, resulting in
duplicate calls to the `on_llm_start` callback.

For the sake of simplicity, I made it so that langchain never adds an
exact handler/callbacks object in `add_handler`, thus avoiding the
duplicate handler issue.

Tagging @hwchase17 for callback review

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-12 03:48:29 -04:00
ausboss
50316f6477 Adding LLM wrapper for Kobold AI (#7560)
- Description: add wrapper that lets you use KoboldAI api in langchain
  - Issue: n/a
  - Dependencies: none extra, just what exists in lanchain
  - Tag maintainer: @baskaryan 
  - Twitter handle: @zanzibased
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-12 03:48:12 -04:00
Rohit Kumar Singh
603a0bea29 Fixes incorrect docstore creation in faiss.py (#7026)
- **Description**: Current implementation assumes that the length of
`texts` and `ids` should be same but if the passed `ids` length is not
equal to the passed length of `texts`, current code
`dict(zip(index_to_id.values(), documents))` is not failing or giving
any warning and silently creating docstores only for the passed `ids`
i.e. if `ids = ['A']` and `texts=["I love Open Source","I love
langchain"]` then only one `docstore` will be created. But either two
docstores should be created assuming same id value for all the elements
of `texts` or an error should be raised.
  
- **Issue**: My change fixes this by using dictionary comprehension
instead of `zip`. This was if lengths of `ids` and `texts` mismatches an
explicit `IndexError` will be raised.
  
@rlancemartin, @eyurtsev
2023-07-12 03:35:49 -04:00
Tommy Hyeonwoo Kim
3f7213586e add supported properties for notiondb document loader's metadata (#7570)
fix #7569

add following properties for Notion DB document loader's metadata
- `unique_id`
- `status`
- `people`

@rlancemartin, @eyurtsev (Since this is a change related to
`DataLoaders`)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-12 03:34:54 -04:00
Junlin Zhou
5f17c57174 Update chat agents' output parser to extract action by regex (#7511)
Currently `ChatOutputParser` extracts actions by splitting the text on
"```", and then load the second part as a json string.

But sometimes the LLM will wrap the action in markdown code block like:

````markdown
```json
{
  "action": "foo",
  "action_input": "bar"
}
```
````

Splitting text on "```" will cause `OutputParserException` in such case.

This PR changes the behaviour to extract the `$JSON_BLOB` by regex, so
that it can handle both ` ``` ``` ` and ` ```json ``` `

@hinthornw

---------

Co-authored-by: Junlin Zhou <jlzhou@zjuici.com>
2023-07-12 03:12:02 -04:00
Bagatur
ebcb144342 unit test sqlalachemy (#7582) 2023-07-12 03:03:16 -04:00
Harrison Chase
641fd74baa Harrison/pg vector move (#7580) 2023-07-12 02:22:34 -04:00
os1ma
2667ddc686 Fix make docs_build and related scripts (#7276)
**Description: a description of the change**

Fixed `make docs_build` and related scripts which caused errors. There
are several changes.

First, I made the build of the documentation and the API Reference into
two separate commands. This is because it takes less time to build. The
commands for documents are `make docs_build`, `make docs_clean`, and
`make docs_linkcheck`. The commands for API Reference are `make
api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`.

It looked like `docs/.local_build.sh` could be used to build the
documentation, so I used that. Since `.local_build.sh` was also building
API Rerefence internally, I removed that process. `.local_build.sh` also
added some Bash options to stop in error or so. Futher more added `cd
"${SCRIPT_DIR}"` at the beginning so that the script will work no matter
which directory it is executed in.

`docs/api_reference/api_reference.rst` is removed, because which is
generated by `docs/api_reference/create_api_rst.py`, and added it to
.gitignore.

Finally, the description of CONTRIBUTING.md was modified.

**Issue: the issue # it fixes (if applicable)**

https://github.com/hwchase17/langchain/issues/6413

**Dependencies: any dependencies required for this change**

`nbdoc` was missing in group docs so it was added. I installed it with
the `poetry add --group docs nbdoc` command. I am concerned if any
modifications are needed to poetry.lock. I would greatly appreciate it
if you could pay close attention to this file during the review.

**Tag maintainer**
- General / Misc / if you don't know who to tag: @baskaryan

If this PR needs any additional changes, I'll be happy to make them!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-11 22:05:14 -04:00
Pharbie
74c28df363 Update Pinecone Upsert method usage (#7358)
Description: Refactor the upsert method in the Pinecone class to allow
for additional keyword arguments. This change adds flexibility and
extensibility to the method, allowing for future modifications or
enhancements. The upsert method now accepts the `**kwargs` parameter,
which can be used to pass any additional arguments to the Pinecone
index. This change has been made in both the `upsert` method in the
`Pinecone` class and the `upsert` method in the
`similarity_search_with_score` class method. Falls in line with the
usage of the upsert method in
[Pinecone-Python-Client](4640c4cf27/pinecone/index.py (L73))
Issue: [This feature request in Pinecone
Repo](https://github.com/pinecone-io/pinecone-python-client/issues/184)

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - Memory: @hwchase17

---------

Co-authored-by: kwesi <22204443+yankskwesi@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Lance Martin <122662504+rlancemartin@users.noreply.github.com>
2023-07-11 21:14:42 -04:00
Kazuki Maeda
5c3fe8b0d1 Enhance Makefile with 'format_diff' Option and Improved Readability (#7394)
### Description:

This PR introduces a new option format_diff to the existing Makefile.
This option allows us to apply the formatting tools (Black and isort)
only to the changed Python and ipynb files since the last commit. This
will make our development process more efficient as we only format the
codes that we modify. Along with this change, comments were added to
make the Makefile more understandable and maintainable.

### Issue:

N/A

### Dependencies:

Add dependency to black.

### Tag maintainer:

@baskaryan

### Twitter handle:

[kzk_maeda](https://twitter.com/kzk_maeda)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-11 21:03:17 -04:00
Bagatur
2babe3069f Revert pinecone v4 support (#7566)
Revert 9d13dcd
2023-07-11 20:58:59 -04:00
schop-rob
e811c5e8c6 Add OpenAI organization ID to docs (#7398)
Description: I added an example of how to reference the OpenAI API
Organization ID, because I couldn't find it before. In the example, it
is mentioned how to achieve this using environment variables as well as
parameters for the OpenAI()-class
Issue: -
Dependencies: -
Twitter @schop-rob
2023-07-11 20:51:58 -04:00
Kenny
8741e55e7c Template formats documentation (#7404)
Simple addition to the documentation, adding the correct import
statement & showcasing using Python FStrings.
2023-07-11 18:24:24 -04:00
Fielding Johnston
00c466627a minor bug fix: properly await AsyncRunManager's method call in MulitRouteChain (#7487)
This simply awaits `AsyncRunManager`'s method call in `MulitRouteChain`.
Noticed this while playing around with Langchain's implementation of
`MultiPromptChain`. @baskaryan

cheers

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-11 18:18:47 -04:00
tonomura
cc0585af42 Improvement/add finish reason to generation info in chat open ai (#7478)
Description: ChatOpenAI model does not return finish_reason in
generation_info.
Issue: #2702
Dependencies: None
Tag maintainer: @baskaryan 

Thank you

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-11 18:12:57 -04:00
Junlin Zhou
b96ac13f3d Minor update to reference other sql tool by tool names instead of hard coded string. (#7514)
<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

Currently there are 4 tools in SQL agent-toolkits, and 2 of them have
reference to the other 2.

This PR change the reference from hard coded string to `{tool.name}`

Co-authored-by: Junlin Zhou <jlzhou@zjuici.com>
2023-07-11 17:44:23 -04:00
OwenElliott
9cb2347453 Fix broken link from Marqo Ecosystem (#7510)
Small fix to a link from the Marqo page in the ecosystem.

The link was not updated correctly when the documentation structure
changed to html pages instead of links to notebooks.
2023-07-11 17:15:15 -04:00
Matt Robinson
c4d53f98dc docs: update unstructured docstrings (#7561)
### Summary

Updates the docstrings in the Unstructured document loaders to display
more useful information on the integrations page.
2023-07-11 17:12:05 -04:00
Ben Auffarth
2c2f0e15a6 clarify about api key (#7540)
I found it unclear, where to get the API keys for JinaChat. Mentioning
this in the docstring should be helpful.
#7490 

Twitter handle: benji1a

@delgermurun
2023-07-11 16:46:06 -04:00
Jona Sassenhagen
0ea7224535 [Minor] Remove tagger from spacy sentencizer (#7534)
@svlandeg gave me a tip for how to improve a bit on
https://github.com/hwchase17/langchain/pull/7442 for some extra speed
and memory gains. The tagger isn't needed for sentencization, so can be
disabled too.
2023-07-11 16:43:46 -04:00
Kacper Łukawski
1f83b5f47e Reuse the existing collection if configured properly in Qdrant.from_texts (#7530)
This PR changes the behavior of `Qdrant.from_texts` so the collection is
reused if not requested to recreate it. Previously, calling
`Qdrant.from_texts` or `Qdrant.from_documents` resulted in removing the
old data which was confusing for many.
2023-07-11 16:24:35 -04:00
Leonid Kuligin
6674b33cf5 Added support for chat_history (#7555)
#7469

Co-authored-by: Leonid Kuligin <kuligin@google.com>
2023-07-11 15:27:26 -04:00
Felix Brockmeier
406a9dc11f Add notebook example for Lemon AI NLP Workflow Automation (#7556)
- Description: Added notebook to LangChain docs that explains how to use
Lemon AI NLP Workflow Automation tool with Langchain
  
- Issue: not applicable
  
- Dependencies: not applicable
  
- Tag maintainer: @agola11
  
- Twitter handle: felixbrockm
2023-07-11 15:15:11 -04:00
Lance Martin
9e067b8cc9 Add env setup (#7550)
Include setup
2023-07-11 09:48:40 -07:00
Bagatur
3c4338470e bump 230 (#7544) 2023-07-11 11:24:08 -04:00
Bagatur
d2137eea9f fix cpal docs (#7545) 2023-07-11 11:07:45 -04:00
Boris
9129318466 CPAL (#6255)
# Causal program-aided language (CPAL) chain

## Motivation

This builds on the recent [PAL](https://arxiv.org/abs/2211.10435) to
stop LLM hallucination. The problem with the
[PAL](https://arxiv.org/abs/2211.10435) approach is that it hallucinates
on a math problem with a nested chain of dependence. The innovation here
is that this new CPAL approach includes causal structure to fix
hallucination.

For example, using the below word problem, PAL answers with 5, and CPAL
answers with 13.

    "Tim buys the same number of pets as Cindy and Boris."
    "Cindy buys the same number of pets as Bill plus Bob."
    "Boris buys the same number of pets as Ben plus Beth."
    "Bill buys the same number of pets as Obama."
    "Bob buys the same number of pets as Obama."
    "Ben buys the same number of pets as Obama."
    "Beth buys the same number of pets as Obama."
    "If Obama buys one pet, how many pets total does everyone buy?"

The CPAL chain represents the causal structure of the above narrative as
a causal graph or DAG, which it can also plot, as shown below.


![complex-graph](https://github.com/hwchase17/langchain/assets/367522/d938db15-f941-493d-8605-536ad530f576)

.

The two major sections below are:

1. Technical overview
2. Future application

Also see [this jupyter
notebook](https://github.com/borisdev/langchain/blob/master/docs/extras/modules/chains/additional/cpal.ipynb)
doc.


## 1. Technical overview

### CPAL versus PAL

Like [PAL](https://arxiv.org/abs/2211.10435), CPAL intends to reduce
large language model (LLM) hallucination.

The CPAL chain is different from the PAL chain for a couple of reasons. 

* CPAL adds a causal structure (or DAG) to link entity actions (or math
expressions).
* The CPAL math expressions are modeling a chain of cause and effect
relations, which can be intervened upon, whereas for the PAL chain math
expressions are projected math identities.

PAL's generated python code is wrong. It hallucinates when complexity
increases.

```python
def solution():
    """Tim buys the same number of pets as Cindy and Boris.Cindy buys the same number of pets as Bill plus Bob.Boris buys the same number of pets as Ben plus Beth.Bill buys the same number of pets as Obama.Bob buys the same number of pets as Obama.Ben buys the same number of pets as Obama.Beth buys the same number of pets as Obama.If Obama buys one pet, how many pets total does everyone buy?"""
    obama_pets = 1
    tim_pets = obama_pets
    cindy_pets = obama_pets + obama_pets
    boris_pets = obama_pets + obama_pets
    total_pets = tim_pets + cindy_pets + boris_pets
    result = total_pets
    return result  # math result is 5
```

CPAL's generated python code is correct.

```python
story outcome data
    name                                   code  value      depends_on
0  obama                                   pass    1.0              []
1   bill               bill.value = obama.value    1.0         [obama]
2    bob                bob.value = obama.value    1.0         [obama]
3    ben                ben.value = obama.value    1.0         [obama]
4   beth               beth.value = obama.value    1.0         [obama]
5  cindy   cindy.value = bill.value + bob.value    2.0     [bill, bob]
6  boris   boris.value = ben.value + beth.value    2.0     [ben, beth]
7    tim  tim.value = cindy.value + boris.value    4.0  [cindy, boris]

query data
{
    "question": "how many pets total does everyone buy?",
    "expression": "SELECT SUM(value) FROM df",
    "llm_error_msg": ""
}
# query result is 13
```

Based on the comments below, CPAL's intended location in the library is
`experimental/chains/cpal` and PAL's location is`chains/pal`.

### CPAL vs Graph QA

Both the CPAL chain and the Graph QA chain extract entity-action-entity
relations into a DAG.

The CPAL chain is different from the Graph QA chain for a few reasons.

* Graph QA does not connect entities to math expressions
* Graph QA does not associate actions in a sequence of dependence.
* Graph QA does not decompose the narrative into these three parts:
  1. Story plot or causal model
  4. Hypothetical question
  5. Hypothetical condition 

### Evaluation

Preliminary evaluation on simple math word problems shows that this CPAL
chain generates less hallucination than the PAL chain on answering
questions about a causal narrative. Two examples are in [this jupyter
notebook](https://github.com/borisdev/langchain/blob/master/docs/extras/modules/chains/additional/cpal.ipynb)
doc.

## 2. Future application

### "Describe as Narrative, Test as Code"

The thesis here is that the Describe as Narrative, Test as Code approach
allows you to represent a causal mental model both as code and as a
narrative, giving you the best of both worlds.

#### Why describe a causal mental mode as a narrative?

The narrative form is quick. At a consensus building meeting, people use
narratives to persuade others of their causal mental model, aka. plan.
You can share, version control and index a narrative.

#### Why test a causal mental model as a code?

Code is testable, complex narratives are not. Though fast, narratives
are problematic as their complexity increases. The problem is LLMs and
humans are prone to hallucination when predicting the outcomes of a
narrative. The cost of building a consensus around the validity of a
narrative outcome grows as its narrative complexity increases. Code does
not require tribal knowledge or social power to validate.

Code is composable, complex narratives are not. The answer of one CPAL
chain can be the hypothetical conditions of another CPAL Chain. For
stochastic simulations, a composable plan can be integrated with the
[DoWhy library](https://github.com/py-why/dowhy). Lastly, for the
futuristic folk, a composable plan as code allows ordinary community
folk to design a plan that can be integrated with a blockchain for
funding.

An explanation of a dependency planning application is
[here.](https://github.com/borisdev/cpal-llm-chain-demo)

--- 
Twitter handle: @boris_dev

---------

Co-authored-by: Boris Dev <borisdev@Boriss-MacBook-Air.local>
2023-07-11 10:11:21 -04:00
Alejandra De Luna
2e4047e5e7 feat: support generate as an early stopping method for OpenAIFunctionsAgent (#7229)
This PR proposes an implementation to support `generate` as an
`early_stopping_method` for the new `OpenAIFunctionsAgent` class.

The motivation behind is to facilitate the user to set a maximum number
of actions the agent can take with `max_iterations` and force a final
response with this new agent (as with the `Agent` class).

The following changes were made:

- The `OpenAIFunctionsAgent.return_stopped_response` method was
overwritten to support `generate` as an `early_stopping_method`
- A boolean `with_functions` parameter was added to the
`OpenAIFunctionsAgent.plan` method

This way the `OpenAIFunctionsAgent.return_stopped_response` method can
call the `OpenAIFunctionsAgent.plan` method with `with_function=False`
when the `early_stopping_method` is set to `generate`, making a call to
the LLM with no functions and forcing a final response from the
`"assistant"`.

  - Relevant maintainer: @hinthornw
  - Twitter handle: @aledelunap

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-11 09:25:02 -04:00
Hashem Alsaket
1dd4236177 Fix HF endpoint returns blank for text-generation (#7386)
Description: Current `_call` function in the
`langchain.llms.HuggingFaceEndpoint` class truncates response when
`task=text-generation`. Same error discussed a few days ago on Hugging
Face: https://huggingface.co/tiiuae/falcon-40b-instruct/discussions/51
Issue: Fixes #7353 
Tag maintainer: @hwchase17 @baskaryan @hinthornw

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-11 03:06:05 -04:00
Lance Martin
4a94f56258 Minor edits to QA docs (#7507)
Small clean-ups
2023-07-10 22:15:05 -07:00
Raymond Yuan
5171c3bcca Refactor vector storage to correctly handle relevancy scores (#6570)
Description: This pull request aims to support generating the correct
generic relevancy scores for different vector stores by refactoring the
relevance score functions and their selection in the base class and
subclasses of VectorStore. This is especially relevant with VectorStores
that require a distance metric upon initialization. Note many of the
current implenetations of `_similarity_search_with_relevance_scores` are
not technically correct, as they just return
`self.similarity_search_with_score(query, k, **kwargs)` without applying
the relevant score function

Also includes changes associated with:
https://github.com/hwchase17/langchain/pull/6564 and
https://github.com/hwchase17/langchain/pull/6494

See more indepth discussion in thread in #6494 

Issue: 
https://github.com/hwchase17/langchain/issues/6526
https://github.com/hwchase17/langchain/issues/6481
https://github.com/hwchase17/langchain/issues/6346

Dependencies: None

The changes include:
- Properly handling score thresholding in FAISS
`similarity_search_with_score_by_vector` for the corresponding distance
metric.
- Refactoring the `_similarity_search_with_relevance_scores` method in
the base class and removing it from the subclasses for incorrectly
implemented subclasses.
- Adding a `_select_relevance_score_fn` method in the base class and
implementing it in the subclasses to select the appropriate relevance
score function based on the distance strategy.
- Updating the `__init__` methods of the subclasses to set the
`relevance_score_fn` attribute.
- Removing the `_default_relevance_score_fn` function from the FAISS
class and using the base class's `_euclidean_relevance_score_fn`
instead.
- Adding the `DistanceStrategy` enum to the `utils.py` file and updating
the imports in the vector store classes.
- Updating the tests to import the `DistanceStrategy` enum from the
`utils.py` file.

---------

Co-authored-by: Hanit <37485638+hanit-com@users.noreply.github.com>
2023-07-10 20:37:03 -07:00
Lance Martin
bd0c6381f5 Minor update to clarify map-reduce custom prompt usage (#7453)
Update docs for map-reduce custom prompt usage
2023-07-10 16:43:44 -07:00
Lance Martin
28d2b213a4 Update landing page for "question answering over documents" (#7152)
Improve documentation for a central use-case, qa / chat over documents.

This will be merged as an update to `index.mdx`
[here](https://python.langchain.com/docs/use_cases/question_answering/).

Testing w/ local Docusaurus server:

```
From `docs` directory:
mkdir _dist
cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
yarn install
yarn start
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-10 14:15:13 -07:00
William FH
dd648183fa Rm create_project line (#7486)
not needed
2023-07-10 10:49:55 -07:00
Leonid Ganeline
5eec74d9a5 docstrings document_loaders 3 (#6937)
- Updated docstrings for `document_loaders`
- Mass update `"""Loader that loads` to `"""Loads`

@baskaryan  - please, review
2023-07-10 08:56:53 -07:00
Stanko Kuveljic
9d13dcd17c Pinecone: Add V4 support (#7473) 2023-07-10 08:39:47 -07:00
Adilkhan Sarsen
5debd5043e Added deeplake use case examples of the new features (#6528)
<!--
Thank you for contributing to LangChain! Your PR will appear in our
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.

Finally, we'd love to show appreciation for your contribution - if you'd
like us to shout you out on Twitter, please also include your handle!
-->

<!-- Remove if not applicable -->

Fixes # (issue)

#### Before submitting

<!-- If you're adding a new integration, please include:

1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use


See contribution guidelines for more information on how to write tests,
lint
etc:


https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

#### Who can review?

Tag maintainers/contributors who might be interested:

<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @hwchase17

  VectorStores / Retrievers / Memory
  - @dev2049

 -->
 
 1. Added use cases of the new features
 2. Done some code refactoring

---------

Co-authored-by: Ivo Stranic <istranic@gmail.com>
2023-07-10 07:04:29 -07:00
Bagatur
9b615022e2 bump 229 (#7467) 2023-07-10 04:38:55 -04:00
Kazuki Maeda
92b4418c8c Datadog logs loader (#7356)
### Description
Created a Loader to get a list of specific logs from Datadog Logs.

### Dependencies
`datadog_api_client` is required.

### Twitter handle
[kzk_maeda](https://twitter.com/kzk_maeda)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-10 04:27:55 -04:00
Yifei Song
7d29bb2c02 Add Xorbits Dataframe as a Document Loader (#7319)
- [Xorbits](https://doc.xorbits.io/en/latest/) is an open-source
computing framework that makes it easy to scale data science and machine
learning workloads in parallel. Xorbits can leverage multi cores or GPUs
to accelerate computation on a single machine, or scale out up to
thousands of machines to support processing terabytes of data.

- This PR added support for the Xorbits document loader, which allows
langchain to leverage Xorbits to parallelize and distribute the loading
of data.
- Dependencies: This change requires the Xorbits library to be installed
in order to be used.
`pip install xorbits`
- Request for review: @rlancemartin, @eyurtsev
- Twitter handle: https://twitter.com/Xorbitsio

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-10 04:24:47 -04:00
Sergio Moreno
21a353e9c2 feat: ctransformers support async chain (#6859)
- Description: Adding async method for CTransformers 
- Issue: I've found impossible without this code to run Websockets
inside a FastAPI micro service and a CTransformers model.
  - Tag maintainer: Not necessary yet, I don't like to mention directly 
  - Twitter handle: @_semoal
2023-07-10 04:23:41 -04:00
Paul-Emile Brotons
d2cf0d16b3 adding max_marginal_relevance_search method to MongoDBAtlasVectorSearch (#7310)
Adding a maximal_marginal_relevance method to the
MongoDBAtlasVectorSearch vectorstore enhances the user experience by
providing more diverse search results

Issue: #7304
2023-07-10 04:04:19 -04:00
Bagatur
04cddfba0d Add lark import error (#7465) 2023-07-10 03:21:23 -04:00
Matt Robinson
bcab894f4e feat: Add UnstructuredTSVLoader (#7367)
### Summary

Adds an `UnstructuredTSVLoader` for TSV files. Also updates the doc
strings for `UnstructuredCSV` and `UnstructuredExcel` loaders.

### Testing

```python
from langchain.document_loaders.tsv import UnstructuredTSVLoader

loader = UnstructuredTSVLoader(
    file_path="example_data/mlb_teams_2012.csv", mode="elements"
)
docs = loader.load()
```
2023-07-10 03:07:10 -04:00
Ronald Li
490f4a9ff0 Fixes KeyError in AmazonKendraRetriever initializer (#7464)
### Description
argument variable client is marked as required in commit
81e5b1ad36 which breaks the default way of
initialization providing only index_id. This commit avoid KeyError
exception when it is initialized without a client variable
### Dependencies
no dependency required
2023-07-10 03:02:36 -04:00
Jona Sassenhagen
7ffc431b3a Add spacy sentencizer (#7442)
`SpacyTextSplitter` currently uses spacy's statistics-based
`en_core_web_sm` model for sentence splitting. This is a good splitter,
but it's also pretty slow, and in this case it's doing a lot of work
that's not needed given that the spacy parse is then just thrown away.
However, there is also a simple rules-based spacy sentencizer. Using
this is at least an order of magnitude faster than using
`en_core_web_sm` according to my local tests.
Also, spacy sentence tokenization based on `en_core_web_sm` can be sped
up in this case by not doing the NER stage. This shaves some cycles too,
both when loading the model and when parsing the text.

Consequently, this PR adds the option to use the basic spacy
sentencizer, and it disables the NER stage for the current approach,
*which is kept as the default*.

Lastly, when extracting the tokenized sentences, the `text` attribute is
called directly instead of doing the string conversion, which is IMO a
bit more idiomatic.
2023-07-10 02:52:05 -04:00
charosen
50a9fcccb0 feat(module): add param ids to ElasticVectorSearch.from_texts method (#7425)
# add param ids to ElasticVectorSearch.from_texts method.

- Description: add param ids to ElasticVectorSearch.from_texts method.
- Issue: NA. It seems `add_texts` already supports passing in document
ids, but param `ids` is omitted in `from_texts` classmethod,
- Dependencies: None,
- Tag maintainer: @rlancemartin, @eyurtsev please have a look, thanks

```
    # ElasticVectorSearch add_texts
    def add_texts(
        self,
        texts: Iterable[str],
        metadatas: Optional[List[dict]] = None,
        refresh_indices: bool = True,
        ids: Optional[List[str]] = None,
        **kwargs: Any,
    ) -> List[str]:
        ...

```

```
    # ElasticVectorSearch from_texts
    @classmethod
    def from_texts(
        cls,
        texts: List[str],
        embedding: Embeddings,
        metadatas: Optional[List[dict]] = None,
        elasticsearch_url: Optional[str] = None,
        index_name: Optional[str] = None,
        refresh_indices: bool = True,
        **kwargs: Any,
    ) -> ElasticVectorSearch:

```


Co-authored-by: charosen <charosen@bupt.cn>
2023-07-10 02:25:35 -04:00
James Yin
a5fd8873b1 fix: type hint of get_chat_history in BaseConversationalRetrievalChain (#7461)
The type hint of `get_chat_history` property in
`BaseConversationalRetrievalChain` is incorrect. @baskaryan
2023-07-10 02:14:00 -04:00
nikkie
dfc3f83b0f docs(vectorstores/integrations/chroma): Fix loading and saving (#7437)
- Description: Fix loading and saving code about Chroma
- Issue: the issue #7436 
- Dependencies: -
- Twitter handle: https://twitter.com/ftnext
2023-07-10 02:05:15 -04:00
Daniel Chalef
c7f7788d0b Add ZepMemory; improve ZepChatMessageHistory handling of metadata; Fix bugs (#7444)
Hey @hwchase17 - 

This PR adds a `ZepMemory` class, improves handling of Zep's message
metadata, and makes it easier for folks building custom chains to
persist metadata alongside their chat history.

We've had plenty confused users unfamiliar with ChatMessageHistory
classes and how to wrap the `ZepChatMessageHistory` in a
`ConversationBufferMemory`. So we've created the `ZepMemory` class as a
light wrapper for `ZepChatMessageHistory`.

Details:
- add ZepMemory, modify notebook to demo use of ZepMemory
- Modify summary to be SystemMessage
- add metadata argument to add_message; add Zep metadata to
Message.additional_kwargs
- support passing in metadata
2023-07-10 01:53:49 -04:00
Saurabh Chaturvedi
8f8e8d701e Fix info about YouTube (#7447)
(Unintentionally mean 😅) nit: YouTube wasn't created by Google, this PR
fixes the mention in docs.
2023-07-10 01:52:55 -04:00
Leonid Ganeline
560c4dfc98 docstrings: docstore and client (#6783)
updated docstrings in `docstore/` and `client/`

@baskaryan
2023-07-09 01:34:28 -04:00
Jeroen Van Goey
f5bd88757e Fix typo (#7416)
`quesitons` -> `questions`.
2023-07-09 00:54:48 -04:00
Alejandro Garrido Mota
ea9c3cc9c9 Fix syntax erros in documentation (#7409)
- Description: Tiny documentation fix. In Python, when defining function
parameters or providing arguments to a function or class constructor, we
do not use the `:` character.
- Issue: N/A
- Dependencies: N/A,
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @mogaal
2023-07-08 19:52:01 -04:00
Nolan
5da9f9abcb docs(agents/toolkits): Fix error in document_comparison_toolkit.ipynb (#7417)
Replace this comment with:
- Description: Removes unneeded output warning in documentation at
https://python.langchain.com/docs/modules/agents/toolkits/document_comparison_toolkit
  - Issue: -
  - Dependencies: -
  - Tag maintainer: @baskaryan
  - Twitter handle: @finnless
2023-07-08 19:51:08 -04:00
921 changed files with 36624 additions and 15219 deletions

View File

@@ -95,6 +95,14 @@ To run formatting for this project:
make format
```
Additionally, you can run the formatter only on the files that have been modified in your current branch as compared to the master branch using the format_diff command:
```bash
make format_diff
```
This is especially useful when you have made changes to a subset of the project and want to ensure your changes are properly formatted without affecting the rest of the codebase.
### Linting
Linting for this project is done via a combination of [Black](https://black.readthedocs.io/en/stable/), [isort](https://pycqa.github.io/isort/), [flake8](https://flake8.pycqa.org/en/latest/), and [mypy](http://mypy-lang.org/).
@@ -105,8 +113,42 @@ To run linting for this project:
make lint
```
In addition, you can run the linter only on the files that have been modified in your current branch as compared to the master branch using the lint_diff command:
```bash
make lint_diff
```
This can be very helpful when you've made changes to only certain parts of the project and want to ensure your changes meet the linting standards without having to check the entire codebase.
We recognize linting can be annoying - if you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.
### Spellcheck
Spellchecking for this project is done via [codespell](https://github.com/codespell-project/codespell).
Note that `codespell` finds common typos, so could have false-positive (correctly spelled but rarely used) and false-negatives (not finding misspelled) words.
To check spelling for this project:
```bash
make spell_check
```
To fix spelling in place:
```bash
make spell_fix
```
If codespell is incorrectly flagging a word, you can skip spellcheck for that word by adding it to the codespell config in the `pyproject.toml` file.
```python
[tool.codespell]
...
# Add here:
ignore-words-list = 'momento,collison,ned,foor,reworkd,parth,whats,aapply,mysogyny,unsecure'
```
### Coverage
Code coverage (i.e. the amount of code that is covered by unit tests) helps identify areas of the code that are potentially more or less brittle.
@@ -208,30 +250,38 @@ When you run `poetry install`, the `langchain` package is installed as editable
### Contribute Documentation
Docs are largely autogenerated by [sphinx](https://www.sphinx-doc.org/en/master/) from the code.
The docs directory contains Documentation and API Reference.
Documentation is built using [Docusaurus 2](https://docusaurus.io/).
API Reference are largely autogenerated by [sphinx](https://www.sphinx-doc.org/en/master/) from the code.
For that reason, we ask that you add good documentation to all classes and methods.
Similar to linting, we recognize documentation can be annoying. If you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.
### Build Documentation Locally
In the following commands, the prefix `api_` indicates that those are operations for the API Reference.
Before building the documentation, it is always a good idea to clean the build directory:
```bash
make docs_clean
make api_docs_clean
```
Next, you can run the linkchecker to make sure all links are valid:
```bash
make docs_linkcheck
```
Finally, you can build the documentation as outlined below:
Next, you can build the documentation as outlined below:
```bash
make docs_build
make api_docs_build
```
Finally, you can run the linkchecker to make sure all links are valid:
```bash
make docs_linkcheck
make api_docs_linkcheck
```
## 🏭 Release Process

View File

@@ -7,6 +7,8 @@ Replace this comment with:
- Tag maintainer: for a quicker response, tag the relevant maintainer (see below),
- Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out!
Please make sure you're PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally.
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on network access,
2. an example notebook showing its use.

22
.github/workflows/codespell.yml vendored Normal file
View File

@@ -0,0 +1,22 @@
---
name: Codespell
on:
push:
branches: [master]
pull_request:
branches: [master]
permissions:
contents: read
jobs:
codespell:
name: Check for spelling errors
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v3
- name: Codespell
uses: codespell-project/actions-codespell@v2

5
.gitignore vendored
View File

@@ -161,7 +161,12 @@ docs/node_modules/
docs/.docusaurus/
docs/.cache-loader/
docs/_dist
docs/api_reference/api_reference.rst
docs/api_reference/_build
docs/api_reference/*/
!docs/api_reference/_static/
!docs/api_reference/templates/
!docs/api_reference/themes/
docs/docs_skeleton/build
docs/docs_skeleton/node_modules
docs/docs_skeleton/yarn.lock

View File

@@ -1,40 +1,47 @@
.PHONY: all clean format lint test tests test_watch integration_tests docker_tests help extended_tests
.PHONY: all clean docs_build docs_clean docs_linkcheck api_docs_build api_docs_clean api_docs_linkcheck format lint test tests test_watch integration_tests docker_tests help extended_tests
# Default target executed when no arguments are given to make.
all: help
######################
# TESTING AND COVERAGE
######################
# Run unit tests and generate a coverage report.
coverage:
poetry run pytest --cov \
--cov-config=.coveragerc \
--cov-report xml \
--cov-report term-missing:skip-covered
clean: docs_clean
######################
# DOCUMENTATION
######################
clean: docs_clean api_docs_clean
docs_compile:
poetry run nbdoc_build --srcdir $(srcdir)
docs_build:
cd docs && poetry run make html
docs/.local_build.sh
docs_clean:
cd docs && poetry run make clean
rm -r docs/_dist
docs_linkcheck:
poetry run linkchecker docs/_build/html/index.html
poetry run linkchecker docs/_dist/docs_skeleton/ --ignore-url node_modules
format:
poetry run black .
poetry run ruff --select I --fix .
api_docs_build:
poetry run python docs/api_reference/create_api_rst.py
cd docs/api_reference && poetry run make html
PYTHON_FILES=.
lint: PYTHON_FILES=.
lint_diff: PYTHON_FILES=$(shell git diff --name-only --diff-filter=d master | grep -E '\.py$$')
api_docs_clean:
rm -f docs/api_reference/api_reference.rst
cd docs/api_reference && poetry run make clean
lint lint_diff:
poetry run mypy $(PYTHON_FILES)
poetry run black $(PYTHON_FILES) --check
poetry run ruff .
api_docs_linkcheck:
poetry run linkchecker docs/api_reference/_build/html/index.html
# Define a variable for the test file path.
TEST_FILE ?= tests/unit_tests/
test:
@@ -56,6 +63,34 @@ docker_tests:
docker build -t my-langchain-image:test .
docker run --rm my-langchain-image:test
######################
# LINTING AND FORMATTING
######################
# Define a variable for Python and notebook files.
PYTHON_FILES=.
lint format: PYTHON_FILES=.
lint_diff format_diff: PYTHON_FILES=$(shell git diff --name-only --diff-filter=d master | grep -E '\.py$$|\.ipynb$$')
lint lint_diff:
poetry run mypy $(PYTHON_FILES)
poetry run black $(PYTHON_FILES) --check
poetry run ruff .
format format_diff:
poetry run black $(PYTHON_FILES)
poetry run ruff --select I --fix $(PYTHON_FILES)
spell_check:
poetry run codespell --toml pyproject.toml
spell_fix:
poetry run codespell --toml pyproject.toml -w
######################
# HELP
######################
help:
@echo '----'
@echo 'coverage - run unit tests and generate coverage report'

View File

@@ -25,7 +25,7 @@ Please fill out [this form](https://forms.gle/57d8AmXBYp8PP8tZA) and we'll set u
`pip install langchain`
or
`conda install langchain -c conda-forge`
`pip install langsmith && conda install langchain -c conda-forge`
## 🤔 What is this?

View File

@@ -1,10 +1,15 @@
mkdir _dist
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail
set -o xtrace
SCRIPT_DIR="$(cd "$(dirname "$0")"; pwd)"
cd "${SCRIPT_DIR}"
mkdir -p _dist/docs_skeleton
cp -r {docs_skeleton,snippets} _dist
mkdir -p _dist/docs_skeleton/static/api_reference
cd api_reference
poetry run make html
cp -r _build/* ../_dist/docs_skeleton/static/api_reference
cd ..
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
poetry run nbdoc_build

File diff suppressed because it is too large Load Diff

View File

@@ -20,7 +20,9 @@ def load_members() -> dict:
cls = re.findall(r"^class ([^_].*)\(", line)
members[top_level]["classes"].extend([module + "." + c for c in cls])
func = re.findall(r"^def ([^_].*)\(", line)
members[top_level]["functions"].extend([module + "." + f for f in func])
afunc = re.findall(r"^async def ([^_].*)\(", line)
func_strings = [module + "." + f for f in func + afunc]
members[top_level]["functions"].extend(func_strings)
return members

View File

@@ -0,0 +1,9 @@
Evaluation
=======================
LangChain has a number of convenient evaluation chains you can use off the shelf to grade your models' oupputs.
.. automodule:: langchain.evaluation
:members:
:undoc-members:
:inherited-members:

View File

@@ -3,6 +3,8 @@ sidebar_position: 0
---
# Integrations
Visit the [Integrations Hub](https://integrations.langchain.com) to further explore, upvote and request integrations across key LangChain components.
import DocCardList from "@theme/DocCardList";
<DocCardList />

View File

@@ -0,0 +1,12 @@
# LangSmith
import DocCardList from "@theme/DocCardList";
LangSmith helps you trace and evaluate your language model applications and intelligent agents to help you
move from prototype to production.
Check out the [interactive walkthrough](walkthrough) below to get started.
For more information, please refer to the [LangSmith documentation](https://docs.smith.langchain.com/)
<DocCardList />

View File

@@ -24,7 +24,7 @@ That means there are two different axes along which you can customize your text
1. How the text is split
2. How the chunk size is measured
## Get started with text splitters
### Get started with text splitters
import GetStarted from "@snippets/modules/data_connection/document_transformers/get_started.mdx"

View File

@@ -8,7 +8,7 @@ Many LLM applications require user-specific data that is not part of the model's
building blocks to load, transform, store and query your data via:
- [Document loaders](/docs/modules/data_connection/document_loaders/): Load documents from many different sources
- [Document transformers](/docs/modules/data_connection/document_transformers/): Split documents, drop redundant documents, and more
- [Document transformers](/docs/modules/data_connection/document_transformers/): Split documents, convert documents into Q&A format, drop redundant documents, and more
- [Text embedding models](/docs/modules/data_connection/text_embedding/): Take unstructured text and turn it into a list of floating point numbers
- [Vector stores](/docs/modules/data_connection/vectorstores/): Store and search over embedded data
- [Retrievers](/docs/modules/data_connection/retrievers/): Query your data

View File

@@ -8,6 +8,8 @@ vectors, and then at query time to embed the unstructured query and retrieve the
'most similar' to the embedded query. A vector store takes care of storing embedded data and performing vector search
for you.
![vector store diagram](/img/vector_stores.jpg)
## Get started
This walkthrough showcases basic functionality related to VectorStores. A key part of working with vector stores is creating the vector to put in them, which is usually created via embeddings. Therefore, it is recommended that you familiarize yourself with the [text embedding model](/docs/modules/data_connection/text_embedding/) interfaces before diving into this.
@@ -15,3 +17,11 @@ This walkthrough showcases basic functionality related to VectorStores. A key pa
import GetStarted from "@snippets/modules/data_connection/vectorstores/get_started.mdx"
<GetStarted/>
## Asynchronous operations
Vector stores are usually run as a separate service that requires some IO operations, and therefore they might be called asynchronously. That gives performance benefits as you don't waste time waiting for responses from external services. That might also be important if you work with an asynchronous framework, such as [FastAPI](https://fastapi.tiangolo.com/).
import AsyncVectorStore from "@snippets/modules/data_connection/vectorstores/async.mdx"
<AsyncVectorStore/>

View File

@@ -0,0 +1,8 @@
---
sidebar_position: 3
---
# Comparison Evaluators
import DocCardList from "@theme/DocCardList";
<DocCardList />

View File

@@ -0,0 +1,12 @@
---
sidebar_position: 5
---
# Examples
🚧 _Docs under construction_ 🚧
Below are some examples for inspecting and checking different chains.
import DocCardList from "@theme/DocCardList";
<DocCardList />

View File

@@ -0,0 +1,28 @@
---
sidebar_position: 6
---
import DocCardList from "@theme/DocCardList";
# Evaluation
Language models can be unpredictable. This makes it challenging to ship reliable applications to production, where repeatable, useful outcomes across diverse inputs are a minimum requirement. Tests help demonstrate each component in an LLM application can produce the required or expected functionality. These tests also safeguard against regressions while you improve interconnected pieces of an integrated system. However, measuring the quality of generated text can be challenging. It can be hard to agree on the right set of metrics for your application, and it can be difficult to translate those into better performance. Furthermore, it's common to lack sufficient evaluation data adequately test the range of inputs and expected outputs for each component when you're just getting started. The LangChain community is building open source tools and guides to help address these challenges.
LangChain exposes different types of evaluators for common types of evaluation. Each type has off-the-shelf implementations you can use to get started, as well as an
extensible API so you can create your own or contribute improvements for everyone to use. The following sections have example notebooks for you to get started.
- [String Evaluators](/docs/modules/evaluation/string/): Evaluate the predicted string for a given input, usually against a reference string
- [Trajectory Evaluators](/docs/modules/evaluation/trajectory/): Evaluate the whole trajectory of agent actions
- [Comparison Evaluators](/docs/modules/evaluation/comparison/): Compare predictions from two runs on a common input
This section also provides some additional examples of how you could use these evaluators for different scenarios or apply to different chain implementations in the LangChain library. Some examples include:
- [Preference Scoring Chain Outputs](/docs/modules/evaluation/examples/comparisons): An example using a comparison evaluator on different models or prompts to select statistically significant differences in aggregate preference scores
## Reference Docs
For detailed information of the available evaluators, including how to instantiate, configure, and customize them. Check out the [reference documentation](https://api.python.langchain.com/en/latest/api_reference.html#module-langchain.evaluation) directly.
<DocCardList />

View File

@@ -0,0 +1,8 @@
---
sidebar_position: 2
---
# String Evaluators
import DocCardList from "@theme/DocCardList";
<DocCardList />

View File

@@ -0,0 +1,8 @@
---
sidebar_position: 4
---
# Trajectory Evaluators
import DocCardList from "@theme/DocCardList";
<DocCardList />

View File

@@ -17,4 +17,6 @@ Let chains choose which tools to use given high-level directives
#### [Memory](/docs/modules/memory/)
Persist application state between runs of a chain
#### [Callbacks](/docs/modules/callbacks/)
Log and stream intermediate steps of any chain
Log and stream intermediate steps of any chain
#### [Evaluation](/docs/modules/evaluation/)
Evaluate the performance of a chain.

View File

@@ -148,6 +148,11 @@ const config = {
navbar: {
title: "🦜️🔗 LangChain",
items: [
{
to: "https://smith.langchain.com",
label: "LangSmith",
position: "right",
},
{
to: "https://js.langchain.com/docs",
label: "JS/TS Docs",

File diff suppressed because it is too large Load Diff

View File

@@ -23,7 +23,7 @@
"@docusaurus/preset-classic": "2.4.0",
"@docusaurus/remark-plugin-npm2yarn": "^2.4.0",
"@mdx-js/react": "^1.6.22",
"@mendable/search": "^0.0.112-beta.7",
"@mendable/search": "^0.0.125",
"clsx": "^1.2.1",
"json-loader": "^0.5.7",
"process": "^0.11.10",

View File

@@ -22,6 +22,7 @@ export default function SearchBarWrapper() {
placeholder="Search..."
dialogPlaceholder="How do I use a LLM Chain?"
messageSettings={{ openSourcesInNewTab: false, prettySources: true }}
isPinnable
showSimpleSearch
/>
</div>

Binary file not shown.

After

Width:  |  Height:  |  Size: 116 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 483 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 291 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 237 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 173 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 164 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.0 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 118 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 858 KiB

View File

@@ -1300,6 +1300,10 @@
"source": "/en/latest/modules/indexes/text_splitters/examples/markdown_header_metadata.html",
"destination": "/docs/modules/data_connection/document_transformers/text_splitters/markdown_header_metadata"
},
{
"source": "/en/latest/modules/indexes/text_splitters.html",
"destination": "/docs/modules/data_connection/document_transformers/"
},
{
"source": "/en/latest/modules/indexes/retrievers/examples/chroma_self_query.html",
"destination": "/docs/modules/data_connection/retrievers/how_to/self_query/chroma_self_query"

View File

@@ -120,7 +120,8 @@
" history = []\n",
" while True:\n",
" user_input = input(\"\\n>>> input >>>\\n>>>: \")\n",
" if user_input == 'q': break\n",
" if user_input == \"q\":\n",
" break\n",
" history.append(HumanMessage(content=user_input))\n",
" history.append(llm(history))"
]

View File

@@ -22,7 +22,7 @@ import os
os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_BASE"] = "https://<your-endpoint.openai.azure.com/"
os.environ["OPENAI_API_KEY"] = "your AzureOpenAI key"
os.environ["OPENAI_API_VERSION"] = "2023-03-15-preview"
os.environ["OPENAI_API_VERSION"] = "2023-05-15"
```
## LLM

View File

@@ -8,7 +8,7 @@ pip install cnos-connector
```
## Connecting to CnosDB
You can connect to CnosDB using the SQLDatabase.from_cnosdb() method.
You can connect to CnosDB using the `SQLDatabase.from_cnosdb()` method.
### Syntax
```python
def SQLDatabase.from_cnosdb(url: str = "127.0.0.1:8902",
@@ -31,7 +31,6 @@ Args:
## Examples
```python
# Connecting to CnosDB with SQLDatabase Wrapper
from cnosdb_connector import make_cnosdb_langchain_uri
from langchain import SQLDatabase
db = SQLDatabase.from_cnosdb()
@@ -43,7 +42,7 @@ from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo")
```
### SQL Chain
### SQL Database Chain
This example demonstrates the use of the SQL Chain for answering a question over a CnosDB.
```python
from langchain import SQLDatabaseChain
@@ -51,15 +50,15 @@ from langchain import SQLDatabaseChain
db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)
db_chain.run(
"What is the average fa of test table that time between November 3,2022 and November 4, 2022?"
"What is the average temperature of air at station XiaoMaiDao between October 19, 2022 and Occtober 20, 2022?"
)
```
```shell
> Entering new chain...
What is the average fa of test table that time between November 3, 2022 and November 4, 2022?
SQLQuery:SELECT AVG(fa) FROM test WHERE time >= '2022-11-03' AND time < '2022-11-04'
SQLResult: [(2.0,)]
Answer:The average fa of the test table between November 3, 2022, and November 4, 2022, is 2.0.
What is the average temperature of air at station XiaoMaiDao between October 19, 2022 and Occtober 20, 2022?
SQLQuery:SELECT AVG(temperature) FROM air WHERE station = 'XiaoMaiDao' AND time >= '2022-10-19' AND time < '2022-10-20'
SQLResult: [(68.0,)]
Answer:The average temperature of air at station XiaoMaiDao between October 19, 2022 and October 20, 2022 is 68.0.
> Finished chain.
```
### SQL Database Agent
@@ -73,36 +72,39 @@ agent = create_sql_agent(llm=llm, toolkit=toolkit, verbose=True)
```
```python
agent.run(
"What is the average fa of test table that time between November 3, 2022 and November 4, 2022?"
"What is the average temperature of air at station XiaoMaiDao between October 19, 2022 and Occtober 20, 2022?"
)
```
```shell
> Entering new chain...
Action: sql_db_list_tables
Action Input: ""
Observation: test
Thought:The relevant table is "test". I should query the schema of this table to see the column names.
Observation: air
Thought:The "air" table seems relevant to the question. I should query the schema of the "air" table to see what columns are available.
Action: sql_db_schema
Action Input: "test"
Action Input: "air"
Observation:
CREATE TABLE test (
CREATE TABLE air (
pressure FLOAT,
station STRING,
temperature FLOAT,
time TIMESTAMP,
fa BIGINT
visibility FLOAT
)
/*
3 rows from test table:
fa time
1 2022-11-03T06:20:11
2 2022-11-03T06:20:11.000000001
3 2022-11-03T06:20:11.000000002
3 rows from air table:
pressure station temperature time visibility
75.0 XiaoMaiDao 67.0 2022-10-19T03:40:00 54.0
77.0 XiaoMaiDao 69.0 2022-10-19T04:40:00 56.0
76.0 XiaoMaiDao 68.0 2022-10-19T05:40:00 55.0
*/
Thought:The relevant column is "fa" in the "test" table. I can now construct the query to calculate the average "fa" between the specified time range.
Thought:The "temperature" column in the "air" table is relevant to the question. I can query the average temperature between the specified dates.
Action: sql_db_query
Action Input: "SELECT AVG(fa) FROM test WHERE time >= '2022-11-03' AND time < '2022-11-04'"
Observation: [(2.0,)]
Thought:The average "fa" of the "test" table between November 3, 2022 and November 4, 2022 is 2.0.
Final Answer: 2.0
Action Input: "SELECT AVG(temperature) FROM air WHERE station = 'XiaoMaiDao' AND time >= '2022-10-19' AND time <= '2022-10-20'"
Observation: [(68.0,)]
Thought:The average temperature of air at station XiaoMaiDao between October 19, 2022 and October 20, 2022 is 68.0.
Final Answer: 68.0
> Finished chain.
```

View File

@@ -6,22 +6,28 @@ The [Databricks](https://www.databricks.com/) Lakehouse Platform unifies data, a
Databricks embraces the LangChain ecosystem in various ways:
1. Databricks connector for the SQLDatabase Chain: SQLDatabase.from_databricks() provides an easy way to query your data on Databricks through LangChain
2. Databricks-managed MLflow integrates with LangChain: Tracking and serving LangChain applications with fewer steps
3. Databricks as an LLM provider: Deploy your fine-tuned LLMs on Databricks via serving endpoints or cluster driver proxy apps, and query it as langchain.llms.Databricks
4. Databricks Dolly: Databricks open-sourced Dolly which allows for commercial use, and can be accessed through the Hugging Face Hub
2. Databricks MLflow integrates with LangChain: Tracking and serving LangChain applications with fewer steps
3. Databricks MLflow AI Gateway
4. Databricks as an LLM provider: Deploy your fine-tuned LLMs on Databricks via serving endpoints or cluster driver proxy apps, and query it as langchain.llms.Databricks
5. Databricks Dolly: Databricks open-sourced Dolly which allows for commercial use, and can be accessed through the Hugging Face Hub
Databricks connector for the SQLDatabase Chain
----------------------------------------------
You can connect to [Databricks runtimes](https://docs.databricks.com/runtime/index.html) and [Databricks SQL](https://www.databricks.com/product/databricks-sql) using the SQLDatabase wrapper of LangChain. See the notebook [Connect to Databricks](/docs/ecosystem/integrations/databricks/databricks.html) for details.
Databricks-managed MLflow integrates with LangChain
---------------------------------------------------
Databricks MLflow integrates with LangChain
-------------------------------------------
MLflow is an open source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry. See the notebook [MLflow Callback Handler](/docs/ecosystem/integrations/mlflow_tracking.ipynb) for details about MLflow's integration with LangChain.
Databricks provides a fully managed and hosted version of MLflow integrated with enterprise security features, high availability, and other Databricks workspace features such as experiment and run management and notebook revision capture. MLflow on Databricks offers an integrated experience for tracking and securing machine learning model training runs and running machine learning projects. See [MLflow guide](https://docs.databricks.com/mlflow/index.html) for more details.
Databricks-managed MLflow makes it more convenient to develop LangChain applications on Databricks. For MLflow tracking, you don't need to set the tracking uri. For MLflow Model Serving, you can save LangChain Chains in the MLflow langchain flavor, and then register and serve the Chain with a few clicks on Databricks, with credentials securely managed by MLflow Model Serving.
Databricks MLflow makes it more convenient to develop LangChain applications on Databricks. For MLflow tracking, you don't need to set the tracking uri. For MLflow Model Serving, you can save LangChain Chains in the MLflow langchain flavor, and then register and serve the Chain with a few clicks on Databricks, with credentials securely managed by MLflow Model Serving.
Databricks MLflow AI Gateway
----------------------------
See [MLflow AI Gateway](/docs/ecosystem/integrations/mlflow_ai_gateway).
Databricks as an LLM provider
-----------------------------

View File

@@ -0,0 +1,88 @@
# Datadog Tracing
>[ddtrace](https://github.com/DataDog/dd-trace-py) is a Datadog application performance monitoring (APM) library which provides an integration to monitor your LangChain application.
Key features of the ddtrace integration for LangChain:
- Traces: Capture LangChain requests, parameters, prompt-completions, and help visualize LangChain operations.
- Metrics: Capture LangChain request latency, errors, and token/cost usage (for OpenAI LLMs and Chat Models).
- Logs: Store prompt completion data for each LangChain operation.
- Dashboard: Combine metrics, logs, and trace data into a single plane to monitor LangChain requests.
- Monitors: Provide alerts in response to spikes in LangChain request latency or error rate.
Note: The ddtrace LangChain integration currently provides tracing for LLMs, Chat Models, Text Embedding Models, Chains, and Vectorstores.
## Installation and Setup
1. Enable APM and StatsD in your Datadog Agent, along with a Datadog API key. For example, in Docker:
```
docker run -d --cgroupns host \
--pid host \
-v /var/run/docker.sock:/var/run/docker.sock:ro \
-v /proc/:/host/proc/:ro \
-v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
-e DD_API_KEY=<DATADOG_API_KEY> \
-p 127.0.0.1:8126:8126/tcp \
-p 127.0.0.1:8125:8125/udp \
-e DD_DOGSTATSD_NON_LOCAL_TRAFFIC=true \
-e DD_APM_ENABLED=true \
gcr.io/datadoghq/agent:latest
```
2. Install the Datadog APM Python library.
```
pip install ddtrace>=1.17
```
3. The LangChain integration can be enabled automatically when you prefix your LangChain Python application command with `ddtrace-run`:
```
DD_SERVICE="my-service" DD_ENV="staging" DD_API_KEY=<DATADOG_API_KEY> ddtrace-run python <your-app>.py
```
**Note**: If the Agent is using a non-default hostname or port, be sure to also set `DD_AGENT_HOST`, `DD_TRACE_AGENT_PORT`, or `DD_DOGSTATSD_PORT`.
Additionally, the LangChain integration can be enabled programmatically by adding `patch_all()` or `patch(langchain=True)` before the first import of `langchain` in your application.
Note that using `ddtrace-run` or `patch_all()` will also enable the `requests` and `aiohttp` integrations which trace HTTP requests to LLM providers, as well as the `openai` integration which traces requests to the OpenAI library.
```python
from ddtrace import config, patch
# Note: be sure to configure the integration before calling ``patch()``!
# eg. config.langchain["logs_enabled"] = True
patch(langchain=True)
# to trace synchronous HTTP requests
# patch(langchain=True, requests=True)
# to trace asynchronous HTTP requests (to the OpenAI library)
# patch(langchain=True, aiohttp=True)
# to include underlying OpenAI spans from the OpenAI integration
# patch(langchain=True, openai=True)patch_all
```
See the [APM Python library documentation][https://ddtrace.readthedocs.io/en/stable/installation_quickstart.html] for more advanced usage.
## Configuration
See the [APM Python library documentation][https://ddtrace.readthedocs.io/en/stable/integrations.html#langchain] for all the available configuration options.
### Log Prompt & Completion Sampling
To enable log prompt and completion sampling, set the `DD_LANGCHAIN_LOGS_ENABLED=1` environment variable. By default, 10% of traced requests will emit logs containing the prompts and completions.
To adjust the log sample rate, see the [APM library documentation][https://ddtrace.readthedocs.io/en/stable/integrations.html#langchain].
**Note**: Logs submission requires `DD_API_KEY` to be specified when running `ddtrace-run`.
## Troubleshooting
Need help? Create an issue on [ddtrace](https://github.com/DataDog/dd-trace-py) or contact [Datadog support][https://docs.datadoghq.com/help/].

View File

@@ -0,0 +1,19 @@
# Datadog Logs
>[Datadog](https://www.datadoghq.com/) is a monitoring and analytics platform for cloud-scale applications.
## Installation and Setup
```bash
pip install datadog_api_client
```
We must initialize the loader with the Datadog API key and APP key, and we need to set up the query to extract the desired logs.
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/datadog_logs.html).
```python
from langchain.document_loaders import DatadogLogsLoader
```

View File

@@ -0,0 +1,36 @@
# Golden Query
>Golden Query is a wrapper on top of the [Golden Query API](https://docs.golden.com/reference/query-api) which enables programmatic access to query results on entities across Golden's Knowledge Base.
>See the [Golden Query API docs](https://docs.golden.com/reference/query-api) for more information.
This page covers how to use `Golden Query` within LangChain.
## Installation and Setup
- Go to the [Golden API docs](https://docs.golden.com/) to get an overview about the Golden API.
- Create a Golden account if you don't have one on the [Golden Website](https://golden.com).
- Get your API key from the [Golden API Settings](https://golden.com/settings/api) page.
- Save your API key into GOLDEN_API_KEY env variable
## Wrappers
### Utility
There exists a GoldenQueryAPIWrapper utility which wraps this API. To import this utility:
```python
from langchain.utilities.golden_query import GoldenQueryAPIWrapper
```
For a more detailed walkthrough of this wrapper, see [this notebook](/docs/modules/agents/tools/integrations/golden_query.html).
### Tool
You can also easily load this wrapper as a Tool (to use with an Agent).
You can do this with:
```python
from langchain.agents import load_tools
tools = load_tools(["golden-query"])
```
For more information on tools, see [this page](/docs/modules/agents/tools/).

View File

@@ -1,7 +1,7 @@
# Grobid
This page covers how to use the Grobid to parse articles for LangChain.
It is seperated into two parts: installation and running the server
It is separated into two parts: installation and running the server
## Installation and Setup
#Ensure You have Java installed

View File

@@ -10,7 +10,7 @@ For Feedback, Issues, Contributions - please raise an issue here:
Main principles and benefits:
- more `pythonic` way of writing code
- write multiline prompts that wont break your code flow with indentation
- write multiline prompts that won't break your code flow with indentation
- making use of IDE in-built support for **hinting**, **type checking** and **popup with docs** to quickly peek in the function to see the prompt, parameters it consumes etc.
- leverage all the power of 🦜🔗 LangChain ecosystem
- adding support for **optional parameters**
@@ -31,7 +31,7 @@ def write_me_short_post(topic:str, platform:str="twitter", audience:str = "devel
"""
return
# run it naturaly
# run it naturally
write_me_short_post(topic="starwars")
# or
write_me_short_post(topic="starwars", platform="redit")
@@ -122,7 +122,7 @@ await write_me_short_post(topic="old movies")
# Simplified streaming
If we wan't to leverage streaming:
If we want to leverage streaming:
- we need to define prompt as async function
- turn on the streaming on the decorator, or we can define PromptType with streaming on
- capture the stream using StreamingContext
@@ -149,7 +149,7 @@ async def write_me_short_post(topic:str, platform:str="twitter", audience:str =
# just an arbitrary function to demonstrate the streaming... wil be some websockets code in the real world
# just an arbitrary function to demonstrate the streaming... will be some websockets code in the real world
tokens=[]
def capture_stream_func(new_token:str):
tokens.append(new_token)
@@ -250,7 +250,7 @@ the roles here are model native roles (assistant, user, system for chatGPT)
# Optional sections
- you can define a whole sections of your prompt that should be optional
- if any input in the section is missing, the whole section wont be rendered
- if any input in the section is missing, the whole section won't be rendered
the syntax for this is as follows:
@@ -273,7 +273,7 @@ def prompt_with_optional_partials():
# Output parsers
- llm_prompt decorator natively tries to detect the best output parser based on the output type. (if not set, it returns the raw string)
- list, dict and pydantic outputs are also supported natively (automaticaly)
- list, dict and pydantic outputs are also supported natively (automatically)
``` python
# this code example is complete and should run as it is

View File

@@ -28,4 +28,4 @@ To import this vectorstore:
from langchain.vectorstores import Marqo
```
For a more detailed walkthrough of the Marqo wrapper and some of its unique features, see [this notebook](../modules/data_connection/vectorstores/integrations/marqo.ipynb)
For a more detailed walkthrough of the Marqo wrapper and some of its unique features, see [this notebook](/docs/modules/data_connection/vectorstores/integrations/marqo.html)

View File

@@ -0,0 +1,116 @@
# MLflow AI Gateway
The MLflow AI Gateway service is a powerful tool designed to streamline the usage and management of various large language model (LLM) providers, such as OpenAI and Anthropic, within an organization. It offers a high-level interface that simplifies the interaction with these services by providing a unified endpoint to handle specific LLM related requests. See [the MLflow AI Gateway documentation](https://mlflow.org/docs/latest/gateway/index.html) for more details.
## Installation and Setup
Install `mlflow` with MLflow AI Gateway dependencies:
```sh
pip install 'mlflow[gateway]'
```
Set the OpenAI API key as an environment variable:
```sh
export OPENAI_API_KEY=...
```
Create a configuration file:
```yaml
routes:
- name: completions
route_type: llm/v1/completions
model:
provider: openai
name: text-davinci-003
config:
openai_api_key: $OPENAI_API_KEY
- name: embeddings
route_type: llm/v1/embeddings
model:
provider: openai
name: text-embedding-ada-002
config:
openai_api_key: $OPENAI_API_KEY
```
Start the Gateway server:
```sh
mlflow gateway start --config-path /path/to/config.yaml
```
## Completions Example
```python
import mlflow
from langchain import LLMChain, PromptTemplate
from langchain.llms import MlflowAIGateway
gateway = MlflowAIGateway(
gateway_uri="http://127.0.0.1:5000",
route="completions",
params={
"temperature": 0.0,
"top_p": 0.1,
},
)
llm_chain = LLMChain(
llm=gateway,
prompt=PromptTemplate(
input_variables=["adjective"],
template="Tell me a {adjective} joke",
),
)
result = llm_chain.run(adjective="funny")
print(result)
with mlflow.start_run():
model_info = mlflow.langchain.log_model(chain, "model")
model = mlflow.pyfunc.load_model(model_info.model_uri)
print(model.predict([{"adjective": "funny"}]))
```
## Embeddings Example
```python
from langchain.embeddings import MlflowAIGatewayEmbeddings
embeddings = MlflowAIGatewayEmbeddings(
gateway_uri="http://127.0.0.1:5000",
route="embeddings",
)
print(embeddings.embed_query("hello"))
print(embeddings.embed_documents(["hello"]))
```
## Databricks MLflow AI Gateway
Databricks MLflow AI Gateway is in private preview.
Please contact a Databricks representative to enroll in the preview.
```python
from langchain import LLMChain, PromptTemplate
from langchain.llms import MlflowAIGateway
gateway = MlflowAIGateway(
gateway_uri="databricks",
route="completions",
)
llm_chain = LLMChain(
llm=gateway,
prompt=PromptTemplate(
input_variables=["adjective"],
template="Tell me a {adjective} joke",
),
)
result = llm_chain.run(adjective="funny")
print(result)
```

View File

@@ -18,7 +18,7 @@ We also deliver with live demo on huggingface! Please checkout our [huggingface
## Installation and Setup
- Install the Python SDK with `pip install clickhouse-connect`
### Setting up envrionments
### Setting up environments
There are two ways to set up parameters for myscale index.

View File

@@ -0,0 +1,107 @@
# Portkey
## LLMOps for Langchain
Portkey brings production readiness to Langchain. With Portkey, you can
- [x] view detailed **metrics & logs** for all requests,
- [x] enable **semantic cache** to reduce latency & costs,
- [x] implement automatic **retries & fallbacks** for failed requests,
- [x] add **custom tags** to requests for better tracking and analysis and [more](https://docs.portkey.ai).
### Using Portkey with Langchain
Using Portkey is as simple as just choosing which Portkey features you want, enabling them via `headers=Portkey.Config` and passing it in your LLM calls.
To start, get your Portkey API key by [signing up here](https://app.portkey.ai/login). (Click the profile icon on the top left, then click on "Copy API Key")
For OpenAI, a simple integration with logging feature would look like this:
```python
from langchain.llms import OpenAI
from langchain.utilities import Portkey
# Add the Portkey API Key from your account
headers = Portkey.Config(
api_key = "<PORTKEY_API_KEY>"
)
llm = OpenAI(temperature=0.9, headers=headers)
llm.predict("What would be a good company name for a company that makes colorful socks?")
```
Your logs will be captured on your [Portkey dashboard](https://app.portkey.ai).
A common Portkey X Langchain use case is to **trace a chain or an agent** and view all the LLM calls originating from that request.
### **Tracing Chains & Agents**
```python
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import OpenAI
from langchain.utilities import Portkey
# Add the Portkey API Key from your account
headers = Portkey.Config(
api_key = "<PORTKEY_API_KEY>",
trace_id = "fef659"
)
llm = OpenAI(temperature=0, headers=headers)
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
# Let's test it out!
agent.run("What was the high temperature in SF yesterday in Fahrenheit? What is that number raised to the .023 power?")
```
**You can see the requests' logs along with the trace id on Portkey dashboard:**
<img src="/img/portkey-dashboard.gif" height="250"/>
<img src="/img/portkey-tracing.png" height="250"/>
## Advanced Features
1. **Logging:** Log all your LLM requests automatically by sending them through Portkey. Each request log contains `timestamp`, `model name`, `total cost`, `request time`, `request json`, `response json`, and additional Portkey features.
2. **Tracing:** Trace id can be passed along with each request and is visibe on the logs on Portkey dashboard. You can also set a **distinct trace id** for each request. You can [append user feedback](https://docs.portkey.ai/key-features/feedback-api) to a trace id as well.
3. **Caching:** Respond to previously served customers queries from cache instead of sending them again to OpenAI. Match exact strings OR semantically similar strings. Cache can save costs and reduce latencies by 20x.
4. **Retries:** Automatically reprocess any unsuccessful API requests **`upto 5`** times. Uses an **`exponential backoff`** strategy, which spaces out retry attempts to prevent network overload.
5. **Tagging:** Track and audit each user interaction in high detail with predefined tags.
| Feature | Config Key | Value (Type) | Required/Optional |
| -- | -- | -- | -- |
| API Key | `api_key` | API Key (`string`) | ✅ Required |
| [Tracing Requests](https://docs.portkey.ai/key-features/request-tracing) | `trace_id` | Custom `string` | ❔ Optional |
| [Automatic Retries](https://docs.portkey.ai/key-features/automatic-retries) | `retry_count` | `integer` [1,2,3,4,5] | ❔ Optional |
| [Enabling Cache](https://docs.portkey.ai/key-features/request-caching) | `cache` | `simple` OR `semantic` | ❔ Optional |
| Cache Force Refresh | `cache_force_refresh` | `True` | ❔ Optional |
| Set Cache Expiry | `cache_age` | `integer` (in seconds) | ❔ Optional |
| [Add User](https://docs.portkey.ai/key-features/custom-metadata) | `user` | `string` | ❔ Optional |
| [Add Organisation](https://docs.portkey.ai/key-features/custom-metadata) | `organisation` | `string` | ❔ Optional |
| [Add Environment](https://docs.portkey.ai/key-features/custom-metadata) | `environment` | `string` | ❔ Optional |
| [Add Prompt (version/id/string)](https://docs.portkey.ai/key-features/custom-metadata) | `prompt` | `string` | ❔ Optional |
## **Enabling all Portkey Features:**
```py
headers = Portkey.Config(
# Mandatory
api_key="<PORTKEY_API_KEY>",
# Cache Options
cache="semantic",
cache_force_refresh="True",
cache_age=1729,
# Advanced
retry_count=5,
trace_id="langchain_agent",
# Metadata
environment="production",
user="john",
organisation="acme",
prompt="Frost"
)
```
For detailed information on each feature and how to use it, [please refer to the Portkey docs](https://docs.portkey.ai). If you have any questions or need further assistance, [reach out to us on Twitter.](https://twitter.com/portkeyai).

View File

@@ -0,0 +1,242 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Log, Trace, and Monitor Langchain LLM Calls\n",
"\n",
"When building apps or agents using Langchain, you end up making multiple API calls to fulfill a single user request. However, these requests are not chained when you want to analyse them. With [**Portkey**](/docs/ecosystem/integrations/portkey), all the embeddings, completion, and other requests from a single user request will get logged and traced to a common ID, enabling you to gain full visibility of user interactions.\n",
"\n",
"This notebook serves as a step-by-step guide on how to integrate and use Portkey in your Langchain app."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, let's import Portkey, OpenAI, and Agent tools"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"from langchain.agents import AgentType, initialize_agent, load_tools\n",
"from langchain.llms import OpenAI\n",
"from langchain.utilities import Portkey"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Paste your OpenAI API key below. [(You can find it here)](https://platform.openai.com/account/api-keys)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"OPENAI_API_KEY\"] = \"<OPENAI_API_KEY>\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get Portkey API Key\n",
"1. Sign up for [Portkey here](https://app.portkey.ai/login)\n",
"2. On your [dashboard](https://app.portkey.ai/), click on the profile icon on the top left, then click on \"Copy API Key\"\n",
"3. Paste it below"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"PORTKEY_API_KEY = \"<PORTKEY_API_KEY>\" # Paste your Portkey API Key here"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set Trace ID\n",
"1. Set the trace id for your request below\n",
"2. The Trace ID can be common for all API calls originating from a single request"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"TRACE_ID = \"portkey_langchain_demo\" # Set trace id here"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Generate Portkey Headers"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"headers = Portkey.Config(\n",
" api_key=PORTKEY_API_KEY,\n",
" trace_id=TRACE_ID,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Run your agent as usual. The **only** change is that we will **include the above headers** in the request now."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0, headers=headers)\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm)\n",
"agent = initialize_agent(\n",
" tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True\n",
")\n",
"\n",
"# Let's test it out!\n",
"agent.run(\n",
" \"What was the high temperature in SF yesterday in Fahrenheit? What is that number raised to the .023 power?\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## How Logging & Tracing Works on Portkey\n",
"\n",
"**Logging**\n",
"- Sending your request through Portkey ensures that all of the requests are logged by default\n",
"- Each request log contains `timestamp`, `model name`, `total cost`, `request time`, `request json`, `response json`, and additional Portkey features\n",
"\n",
"**Tracing**\n",
"- Trace id is passed along with each request and is visibe on the logs on Portkey dashboard\n",
"- You can also set a **distinct trace id** for each request if you want\n",
"- You can append user feedback to a trace id as well. [More info on this here](https://docs.portkey.ai/key-features/feedback-api)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Advanced LLMOps Features - Caching, Tagging, Retries\n",
"\n",
"In addition to logging and tracing, Portkey provides more features that add production capabilities to your existing workflows:\n",
"\n",
"**Caching**\n",
"\n",
"Respond to previously served customers queries from cache instead of sending them again to OpenAI. Match exact strings OR semantically similar strings. Cache can save costs and reduce latencies by 20x.\n",
"\n",
"**Retries**\n",
"\n",
"Automatically reprocess any unsuccessful API requests **`upto 5`** times. Uses an **`exponential backoff`** strategy, which spaces out retry attempts to prevent network overload.\n",
"\n",
"| Feature | Config Key | Value (Type) |\n",
"| -- | -- | -- |\n",
"| [🔁 Automatic Retries](https://docs.portkey.ai/key-features/automatic-retries) | `retry_count` | `integer` [1,2,3,4,5] |\n",
"| [🧠 Enabling Cache](https://docs.portkey.ai/key-features/request-caching) | `cache` | `simple` OR `semantic` |\n",
"\n",
"**Tagging**\n",
"\n",
"Track and audit ach user interaction in high detail with predefined tags.\n",
"\n",
"| Tag | Config Key | Value (Type) |\n",
"| -- | -- | -- |\n",
"| User Tag | `user` | `string` |\n",
"| Organisation Tag | `organisation` | `string` |\n",
"| Environment Tag | `environment` | `string` |\n",
"| Prompt Tag (version/id/string) | `prompt` | `string` |"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Code Example With All Features"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"headers = Portkey.Config(\n",
" # Mandatory\n",
" api_key=\"<PORTKEY_API_KEY>\",\n",
" # Cache Options\n",
" cache=\"semantic\",\n",
" cache_force_refresh=\"True\",\n",
" cache_age=1729,\n",
" # Advanced\n",
" retry_count=5,\n",
" trace_id=\"langchain_agent\",\n",
" # Metadata\n",
" environment=\"production\",\n",
" user=\"john\",\n",
" organisation=\"acme\",\n",
" prompt=\"Frost\",\n",
")\n",
"\n",
"llm = OpenAI(temperature=0.9, headers=headers)\n",
"\n",
"print(llm(\"Two roads diverged in the yellow woods\"))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -8,6 +8,36 @@ It is broken into two parts: installation and setup, and then references to spec
## Wrappers
All wrappers needing a redis url connection string to connect to the database support either a stand alone Redis server
or a High-Availability setup with Replication and Redis Sentinels.
### Redis Standalone connection url
For standalone Redis server the official redis connection url formats can be used as describe in the python redis modules
"from_url()" method [Redis.from_url](https://redis-py.readthedocs.io/en/stable/connections.html#redis.Redis.from_url)
Example: `redis_url = "redis://:secret-pass@localhost:6379/0"`
### Redis Sentinel connection url
For [Redis sentinel setups](https://redis.io/docs/management/sentinel/) the connection scheme is "redis+sentinel".
This is an un-offical extensions to the official IANA registered protocol schemes as long as there is no connection url
for Sentinels available.
Example: `redis_url = "redis+sentinel://:secret-pass@sentinel-host:26379/mymaster/0"`
The format is `redis+sentinel://[[username]:[password]]@[host-or-ip]:[port]/[service-name]/[db-number]`
with the default values of "service-name = mymaster" and "db-number = 0" if not set explicit.
The service-name is the redis server monitoring group name as configured within the Sentinel.
The current url format limits the connection string to one sentinel host only (no list can be given) and
booth Redis server and sentinel must have the same password set (if used).
### Redis Cluster connection url
Redis cluster is not supported right now for all methods requiring a "redis_url" parameter.
The only way to use a Redis Cluster is with LangChain classes accepting a preconfigured Redis client like `RedisCache`
(example below).
### Cache
The Cache wrapper allows for [Redis](https://redis.io) to be used as a remote, low-latency, in-memory cache for LLM prompts and responses.

View File

@@ -17,3 +17,10 @@ See a [usage example](/docs/modules/data_connection/vectorstores/integrations/ro
```python
from langchain.vectorstores import RocksetDB
```
## Document Loader
See a [usage example](docs/modules/data_connection/document_loaders/integrations/rockset).
```python
from langchain.document_loaders import RocksetLoader
```

View File

@@ -39,7 +39,7 @@ vectara = Vectara(
```
The customer_id, corpus_id and api_key are optional, and if they are not supplied will be read from the environment variables `VECTARA_CUSTOMER_ID`, `VECTARA_CORPUS_ID` and `VECTARA_API_KEY`, respectively.
Afer you have the vectorstore, you can `add_texts` or `add_documents` as per the standard `VectorStore` interface, for example:
After you have the vectorstore, you can `add_texts` or `add_documents` as per the standard `VectorStore` interface, for example:
```python
vectara.add_texts(["to be or not to be", "that is the question"])

View File

@@ -1,6 +1,7 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -16,6 +17,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -28,10 +30,11 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install langkit -q"
"%pip install langkit openai langchain"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -54,6 +57,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"tags": []
@@ -63,6 +67,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -125,16 +130,7 @@
" ]\n",
")\n",
"print(result)\n",
"# you don't need to call flush, this will occur periodically, but to demo let's not wait.\n",
"whylabs.flush()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# you don't need to call close to write profiles to WhyLabs, upload will occur periodically, but to demo let's not wait.\n",
"whylabs.close()"
]
}
@@ -155,7 +151,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.8.10"
},
"vscode": {
"interpreter": {

View File

@@ -1,6 +1,6 @@
# YouTube
>[YouTube](https://www.youtube.com/) is an online video sharing and social media platform created by Google.
>[YouTube](https://www.youtube.com/) is an online video sharing and social media platform by Google.
> We download the `YouTube` transcripts and video information.
## Installation and Setup

View File

@@ -0,0 +1,661 @@
# Debugging
If you're building with LLMs, at some point something will break, and you'll need to debug. A model call will fail, or the model output will be misformatted, or there will be some nested model calls and it won't be clear where along the way an incorrect output was created.
Here's a few different tools and functionalities to aid in debugging.
<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! Instead, edit the notebook w/the location & name as this file. -->
## Tracing
Platforms with tracing capabilities like [LangSmith](/docs/guides/langsmith/) and [WandB](/docs/ecosystem/integrations/agent_with_wandb_tracing) are the most comprehensive solutions for debugging. These platforms make it easy to not only log and visualize LLM apps, but also to actively debug, test and refine them.
For anyone building production-grade LLM applications, we highly recommend using a platform like this.
![LangSmith run](/img/run_details.png)
## `langchain.debug` and `langchain.verbose`
If you're prototyping in Jupyter Notebooks or running Python scripts, it can be helpful to print out the intermediate steps of a Chain run.
There's a number of ways to enable printing at varying degrees of verbosity.
Let's suppose we have a simple agent and want to visualize the actions it takes and tool outputs it receives. Without any debugging, here's what we see:
```python
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(model_name="gpt-4", temperature=0)
tools = load_tools(["ddg-search", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)
```
```python
agent.run("Who directed the 2023 film Oppenheimer and what is their age? What is their age in days (assume 365 days per year)?")
```
<CodeOutputBlock lang="python">
```
'The director of the 2023 film Oppenheimer is Christopher Nolan and he is approximately 19345 days old in 2023.'
```
</CodeOutputBlock>
### `langchain.debug = True`
Setting the global `debug` flag will cause all LangChain components with callback support (chains, models, agents, tools, retrievers) to print the inputs they receive and outputs they generate. This is the most verbose setting and will fully log raw inputs and outputs.
```python
import langchain
langchain.debug = True
agent.run("Who directed the 2023 film Oppenheimer and what is their age? What is their age in days (assume 365 days per year)?")
```
<details> <summary>Console output</summary>
<CodeOutputBlock lang="python">
```
[chain/start] [1:RunTypeEnum.chain:AgentExecutor] Entering Chain run with input:
{
"input": "Who directed the 2023 film Oppenheimer and what is their age? What is their age in days (assume 365 days per year)?"
}
[chain/start] [1:RunTypeEnum.chain:AgentExecutor > 2:RunTypeEnum.chain:LLMChain] Entering Chain run with input:
{
"input": "Who directed the 2023 film Oppenheimer and what is their age? What is their age in days (assume 365 days per year)?",
"agent_scratchpad": "",
"stop": [
"\nObservation:",
"\n\tObservation:"
]
}
[llm/start] [1:RunTypeEnum.chain:AgentExecutor > 2:RunTypeEnum.chain:LLMChain > 3:RunTypeEnum.llm:ChatOpenAI] Entering LLM run with input:
{
"prompts": [
"Human: Answer the following questions as best you can. You have access to the following tools:\n\nduckduckgo_search: A wrapper around DuckDuckGo Search. Useful for when you need to answer questions about current events. Input should be a search query.\nCalculator: Useful for when you need to answer questions about math.\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [duckduckgo_search, Calculator]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat N times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nBegin!\n\nQuestion: Who directed the 2023 film Oppenheimer and what is their age? What is their age in days (assume 365 days per year)?\nThought:"
]
}
[llm/end] [1:RunTypeEnum.chain:AgentExecutor > 2:RunTypeEnum.chain:LLMChain > 3:RunTypeEnum.llm:ChatOpenAI] [5.53s] Exiting LLM run with output:
{
"generations": [
[
{
"text": "I need to find out who directed the 2023 film Oppenheimer and their age. Then, I need to calculate their age in days. I will use DuckDuckGo to find out the director and their age.\nAction: duckduckgo_search\nAction Input: \"Director of the 2023 film Oppenheimer and their age\"",
"generation_info": {
"finish_reason": "stop"
},
"message": {
"lc": 1,
"type": "constructor",
"id": [
"langchain",
"schema",
"messages",
"AIMessage"
],
"kwargs": {
"content": "I need to find out who directed the 2023 film Oppenheimer and their age. Then, I need to calculate their age in days. I will use DuckDuckGo to find out the director and their age.\nAction: duckduckgo_search\nAction Input: \"Director of the 2023 film Oppenheimer and their age\"",
"additional_kwargs": {}
}
}
}
]
],
"llm_output": {
"token_usage": {
"prompt_tokens": 206,
"completion_tokens": 71,
"total_tokens": 277
},
"model_name": "gpt-4"
},
"run": null
}
[chain/end] [1:RunTypeEnum.chain:AgentExecutor > 2:RunTypeEnum.chain:LLMChain] [5.53s] Exiting Chain run with output:
{
"text": "I need to find out who directed the 2023 film Oppenheimer and their age. Then, I need to calculate their age in days. I will use DuckDuckGo to find out the director and their age.\nAction: duckduckgo_search\nAction Input: \"Director of the 2023 film Oppenheimer and their age\""
}
[tool/start] [1:RunTypeEnum.chain:AgentExecutor > 4:RunTypeEnum.tool:duckduckgo_search] Entering Tool run with input:
"Director of the 2023 film Oppenheimer and their age"
[tool/end] [1:RunTypeEnum.chain:AgentExecutor > 4:RunTypeEnum.tool:duckduckgo_search] [1.51s] Exiting Tool run with output:
"Capturing the mad scramble to build the first atomic bomb required rapid-fire filming, strict set rules and the construction of an entire 1940s western town. By Jada Yuan. July 19, 2023 at 5:00 a ... In Christopher Nolan's new film, "Oppenheimer," Cillian Murphy stars as J. Robert Oppenheimer, the American physicist who oversaw the Manhattan Project in Los Alamos, N.M. Universal Pictures... Oppenheimer: Directed by Christopher Nolan. With Cillian Murphy, Emily Blunt, Robert Downey Jr., Alden Ehrenreich. The story of American scientist J. Robert Oppenheimer and his role in the development of the atomic bomb. Christopher Nolan goes deep on 'Oppenheimer,' his most 'extreme' film to date. By Kenneth Turan. July 11, 2023 5 AM PT. For Subscribers. Christopher Nolan is photographed in Los Angeles ... Oppenheimer is a 2023 epic biographical thriller film written and directed by Christopher Nolan.It is based on the 2005 biography American Prometheus by Kai Bird and Martin J. Sherwin about J. Robert Oppenheimer, a theoretical physicist who was pivotal in developing the first nuclear weapons as part of the Manhattan Project and thereby ushering in the Atomic Age."
[chain/start] [1:RunTypeEnum.chain:AgentExecutor > 5:RunTypeEnum.chain:LLMChain] Entering Chain run with input:
{
"input": "Who directed the 2023 film Oppenheimer and what is their age? What is their age in days (assume 365 days per year)?",
"agent_scratchpad": "I need to find out who directed the 2023 film Oppenheimer and their age. Then, I need to calculate their age in days. I will use DuckDuckGo to find out the director and their age.\nAction: duckduckgo_search\nAction Input: \"Director of the 2023 film Oppenheimer and their age\"\nObservation: Capturing the mad scramble to build the first atomic bomb required rapid-fire filming, strict set rules and the construction of an entire 1940s western town. By Jada Yuan. July 19, 2023 at 5:00 a ... In Christopher Nolan's new film, \"Oppenheimer,\" Cillian Murphy stars as J. Robert Oppenheimer, the American physicist who oversaw the Manhattan Project in Los Alamos, N.M. Universal Pictures... Oppenheimer: Directed by Christopher Nolan. With Cillian Murphy, Emily Blunt, Robert Downey Jr., Alden Ehrenreich. The story of American scientist J. Robert Oppenheimer and his role in the development of the atomic bomb. Christopher Nolan goes deep on 'Oppenheimer,' his most 'extreme' film to date. By Kenneth Turan. July 11, 2023 5 AM PT. For Subscribers. Christopher Nolan is photographed in Los Angeles ... Oppenheimer is a 2023 epic biographical thriller film written and directed by Christopher Nolan.It is based on the 2005 biography American Prometheus by Kai Bird and Martin J. Sherwin about J. Robert Oppenheimer, a theoretical physicist who was pivotal in developing the first nuclear weapons as part of the Manhattan Project and thereby ushering in the Atomic Age.\nThought:",
"stop": [
"\nObservation:",
"\n\tObservation:"
]
}
[llm/start] [1:RunTypeEnum.chain:AgentExecutor > 5:RunTypeEnum.chain:LLMChain > 6:RunTypeEnum.llm:ChatOpenAI] Entering LLM run with input:
{
"prompts": [
"Human: Answer the following questions as best you can. You have access to the following tools:\n\nduckduckgo_search: A wrapper around DuckDuckGo Search. Useful for when you need to answer questions about current events. Input should be a search query.\nCalculator: Useful for when you need to answer questions about math.\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [duckduckgo_search, Calculator]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat N times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nBegin!\n\nQuestion: Who directed the 2023 film Oppenheimer and what is their age? What is their age in days (assume 365 days per year)?\nThought:I need to find out who directed the 2023 film Oppenheimer and their age. Then, I need to calculate their age in days. I will use DuckDuckGo to find out the director and their age.\nAction: duckduckgo_search\nAction Input: \"Director of the 2023 film Oppenheimer and their age\"\nObservation: Capturing the mad scramble to build the first atomic bomb required rapid-fire filming, strict set rules and the construction of an entire 1940s western town. By Jada Yuan. July 19, 2023 at 5:00 a ... In Christopher Nolan's new film, \"Oppenheimer,\" Cillian Murphy stars as J. Robert Oppenheimer, the American physicist who oversaw the Manhattan Project in Los Alamos, N.M. Universal Pictures... Oppenheimer: Directed by Christopher Nolan. With Cillian Murphy, Emily Blunt, Robert Downey Jr., Alden Ehrenreich. The story of American scientist J. Robert Oppenheimer and his role in the development of the atomic bomb. Christopher Nolan goes deep on 'Oppenheimer,' his most 'extreme' film to date. By Kenneth Turan. July 11, 2023 5 AM PT. For Subscribers. Christopher Nolan is photographed in Los Angeles ... Oppenheimer is a 2023 epic biographical thriller film written and directed by Christopher Nolan.It is based on the 2005 biography American Prometheus by Kai Bird and Martin J. Sherwin about J. Robert Oppenheimer, a theoretical physicist who was pivotal in developing the first nuclear weapons as part of the Manhattan Project and thereby ushering in the Atomic Age.\nThought:"
]
}
[llm/end] [1:RunTypeEnum.chain:AgentExecutor > 5:RunTypeEnum.chain:LLMChain > 6:RunTypeEnum.llm:ChatOpenAI] [4.46s] Exiting LLM run with output:
{
"generations": [
[
{
"text": "The director of the 2023 film Oppenheimer is Christopher Nolan. Now I need to find out his age.\nAction: duckduckgo_search\nAction Input: \"Christopher Nolan age\"",
"generation_info": {
"finish_reason": "stop"
},
"message": {
"lc": 1,
"type": "constructor",
"id": [
"langchain",
"schema",
"messages",
"AIMessage"
],
"kwargs": {
"content": "The director of the 2023 film Oppenheimer is Christopher Nolan. Now I need to find out his age.\nAction: duckduckgo_search\nAction Input: \"Christopher Nolan age\"",
"additional_kwargs": {}
}
}
}
]
],
"llm_output": {
"token_usage": {
"prompt_tokens": 550,
"completion_tokens": 39,
"total_tokens": 589
},
"model_name": "gpt-4"
},
"run": null
}
[chain/end] [1:RunTypeEnum.chain:AgentExecutor > 5:RunTypeEnum.chain:LLMChain] [4.46s] Exiting Chain run with output:
{
"text": "The director of the 2023 film Oppenheimer is Christopher Nolan. Now I need to find out his age.\nAction: duckduckgo_search\nAction Input: \"Christopher Nolan age\""
}
[tool/start] [1:RunTypeEnum.chain:AgentExecutor > 7:RunTypeEnum.tool:duckduckgo_search] Entering Tool run with input:
"Christopher Nolan age"
[tool/end] [1:RunTypeEnum.chain:AgentExecutor > 7:RunTypeEnum.tool:duckduckgo_search] [1.33s] Exiting Tool run with output:
"Christopher Edward Nolan CBE (born 30 July 1970) is a British and American filmmaker. Known for his Hollywood blockbusters with complex storytelling, Nolan is considered a leading filmmaker of the 21st century. His films have grossed $5 billion worldwide. The recipient of many accolades, he has been nominated for five Academy Awards, five BAFTA Awards and six Golden Globe Awards. July 30, 1970 (age 52) London England Notable Works: "Dunkirk" "Tenet" "The Prestige" See all related content → Recent News Jul. 13, 2023, 11:11 AM ET (AP) Cillian Murphy, playing Oppenheimer, finally gets to lead a Christopher Nolan film July 11, 2023 5 AM PT For Subscribers Christopher Nolan is photographed in Los Angeles. (Joe Pugliese / For The Times) This is not the story I was supposed to write. Oppenheimer director Christopher Nolan, Cillian Murphy, Emily Blunt and Matt Damon on the stakes of making a three-hour, CGI-free summer film. Christopher Nolan, the director behind such films as "Dunkirk," "Inception," "Interstellar," and the "Dark Knight" trilogy, has spent the last three years living in Oppenheimer's world, writing ..."
[chain/start] [1:RunTypeEnum.chain:AgentExecutor > 8:RunTypeEnum.chain:LLMChain] Entering Chain run with input:
{
"input": "Who directed the 2023 film Oppenheimer and what is their age? What is their age in days (assume 365 days per year)?",
"agent_scratchpad": "I need to find out who directed the 2023 film Oppenheimer and their age. Then, I need to calculate their age in days. I will use DuckDuckGo to find out the director and their age.\nAction: duckduckgo_search\nAction Input: \"Director of the 2023 film Oppenheimer and their age\"\nObservation: Capturing the mad scramble to build the first atomic bomb required rapid-fire filming, strict set rules and the construction of an entire 1940s western town. By Jada Yuan. July 19, 2023 at 5:00 a ... In Christopher Nolan's new film, \"Oppenheimer,\" Cillian Murphy stars as J. Robert Oppenheimer, the American physicist who oversaw the Manhattan Project in Los Alamos, N.M. Universal Pictures... Oppenheimer: Directed by Christopher Nolan. With Cillian Murphy, Emily Blunt, Robert Downey Jr., Alden Ehrenreich. The story of American scientist J. Robert Oppenheimer and his role in the development of the atomic bomb. Christopher Nolan goes deep on 'Oppenheimer,' his most 'extreme' film to date. By Kenneth Turan. July 11, 2023 5 AM PT. For Subscribers. Christopher Nolan is photographed in Los Angeles ... Oppenheimer is a 2023 epic biographical thriller film written and directed by Christopher Nolan.It is based on the 2005 biography American Prometheus by Kai Bird and Martin J. Sherwin about J. Robert Oppenheimer, a theoretical physicist who was pivotal in developing the first nuclear weapons as part of the Manhattan Project and thereby ushering in the Atomic Age.\nThought:The director of the 2023 film Oppenheimer is Christopher Nolan. Now I need to find out his age.\nAction: duckduckgo_search\nAction Input: \"Christopher Nolan age\"\nObservation: Christopher Edward Nolan CBE (born 30 July 1970) is a British and American filmmaker. Known for his Hollywood blockbusters with complex storytelling, Nolan is considered a leading filmmaker of the 21st century. His films have grossed $5 billion worldwide. The recipient of many accolades, he has been nominated for five Academy Awards, five BAFTA Awards and six Golden Globe Awards. July 30, 1970 (age 52) London England Notable Works: \"Dunkirk\" \"Tenet\" \"The Prestige\" See all related content → Recent News Jul. 13, 2023, 11:11 AM ET (AP) Cillian Murphy, playing Oppenheimer, finally gets to lead a Christopher Nolan film July 11, 2023 5 AM PT For Subscribers Christopher Nolan is photographed in Los Angeles. (Joe Pugliese / For The Times) This is not the story I was supposed to write. Oppenheimer director Christopher Nolan, Cillian Murphy, Emily Blunt and Matt Damon on the stakes of making a three-hour, CGI-free summer film. Christopher Nolan, the director behind such films as \"Dunkirk,\" \"Inception,\" \"Interstellar,\" and the \"Dark Knight\" trilogy, has spent the last three years living in Oppenheimer's world, writing ...\nThought:",
"stop": [
"\nObservation:",
"\n\tObservation:"
]
}
[llm/start] [1:RunTypeEnum.chain:AgentExecutor > 8:RunTypeEnum.chain:LLMChain > 9:RunTypeEnum.llm:ChatOpenAI] Entering LLM run with input:
{
"prompts": [
"Human: Answer the following questions as best you can. You have access to the following tools:\n\nduckduckgo_search: A wrapper around DuckDuckGo Search. Useful for when you need to answer questions about current events. Input should be a search query.\nCalculator: Useful for when you need to answer questions about math.\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [duckduckgo_search, Calculator]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat N times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nBegin!\n\nQuestion: Who directed the 2023 film Oppenheimer and what is their age? What is their age in days (assume 365 days per year)?\nThought:I need to find out who directed the 2023 film Oppenheimer and their age. Then, I need to calculate their age in days. I will use DuckDuckGo to find out the director and their age.\nAction: duckduckgo_search\nAction Input: \"Director of the 2023 film Oppenheimer and their age\"\nObservation: Capturing the mad scramble to build the first atomic bomb required rapid-fire filming, strict set rules and the construction of an entire 1940s western town. By Jada Yuan. July 19, 2023 at 5:00 a ... In Christopher Nolan's new film, \"Oppenheimer,\" Cillian Murphy stars as J. Robert Oppenheimer, the American physicist who oversaw the Manhattan Project in Los Alamos, N.M. Universal Pictures... Oppenheimer: Directed by Christopher Nolan. With Cillian Murphy, Emily Blunt, Robert Downey Jr., Alden Ehrenreich. The story of American scientist J. Robert Oppenheimer and his role in the development of the atomic bomb. Christopher Nolan goes deep on 'Oppenheimer,' his most 'extreme' film to date. By Kenneth Turan. July 11, 2023 5 AM PT. For Subscribers. Christopher Nolan is photographed in Los Angeles ... Oppenheimer is a 2023 epic biographical thriller film written and directed by Christopher Nolan.It is based on the 2005 biography American Prometheus by Kai Bird and Martin J. Sherwin about J. Robert Oppenheimer, a theoretical physicist who was pivotal in developing the first nuclear weapons as part of the Manhattan Project and thereby ushering in the Atomic Age.\nThought:The director of the 2023 film Oppenheimer is Christopher Nolan. Now I need to find out his age.\nAction: duckduckgo_search\nAction Input: \"Christopher Nolan age\"\nObservation: Christopher Edward Nolan CBE (born 30 July 1970) is a British and American filmmaker. Known for his Hollywood blockbusters with complex storytelling, Nolan is considered a leading filmmaker of the 21st century. His films have grossed $5 billion worldwide. The recipient of many accolades, he has been nominated for five Academy Awards, five BAFTA Awards and six Golden Globe Awards. July 30, 1970 (age 52) London England Notable Works: \"Dunkirk\" \"Tenet\" \"The Prestige\" See all related content → Recent News Jul. 13, 2023, 11:11 AM ET (AP) Cillian Murphy, playing Oppenheimer, finally gets to lead a Christopher Nolan film July 11, 2023 5 AM PT For Subscribers Christopher Nolan is photographed in Los Angeles. (Joe Pugliese / For The Times) This is not the story I was supposed to write. Oppenheimer director Christopher Nolan, Cillian Murphy, Emily Blunt and Matt Damon on the stakes of making a three-hour, CGI-free summer film. Christopher Nolan, the director behind such films as \"Dunkirk,\" \"Inception,\" \"Interstellar,\" and the \"Dark Knight\" trilogy, has spent the last three years living in Oppenheimer's world, writing ...\nThought:"
]
}
[llm/end] [1:RunTypeEnum.chain:AgentExecutor > 8:RunTypeEnum.chain:LLMChain > 9:RunTypeEnum.llm:ChatOpenAI] [2.69s] Exiting LLM run with output:
{
"generations": [
[
{
"text": "Christopher Nolan was born on July 30, 1970, which makes him 52 years old in 2023. Now I need to calculate his age in days.\nAction: Calculator\nAction Input: 52*365",
"generation_info": {
"finish_reason": "stop"
},
"message": {
"lc": 1,
"type": "constructor",
"id": [
"langchain",
"schema",
"messages",
"AIMessage"
],
"kwargs": {
"content": "Christopher Nolan was born on July 30, 1970, which makes him 52 years old in 2023. Now I need to calculate his age in days.\nAction: Calculator\nAction Input: 52*365",
"additional_kwargs": {}
}
}
}
]
],
"llm_output": {
"token_usage": {
"prompt_tokens": 868,
"completion_tokens": 46,
"total_tokens": 914
},
"model_name": "gpt-4"
},
"run": null
}
[chain/end] [1:RunTypeEnum.chain:AgentExecutor > 8:RunTypeEnum.chain:LLMChain] [2.69s] Exiting Chain run with output:
{
"text": "Christopher Nolan was born on July 30, 1970, which makes him 52 years old in 2023. Now I need to calculate his age in days.\nAction: Calculator\nAction Input: 52*365"
}
[tool/start] [1:RunTypeEnum.chain:AgentExecutor > 10:RunTypeEnum.tool:Calculator] Entering Tool run with input:
"52*365"
[chain/start] [1:RunTypeEnum.chain:AgentExecutor > 10:RunTypeEnum.tool:Calculator > 11:RunTypeEnum.chain:LLMMathChain] Entering Chain run with input:
{
"question": "52*365"
}
[chain/start] [1:RunTypeEnum.chain:AgentExecutor > 10:RunTypeEnum.tool:Calculator > 11:RunTypeEnum.chain:LLMMathChain > 12:RunTypeEnum.chain:LLMChain] Entering Chain run with input:
{
"question": "52*365",
"stop": [
"```output"
]
}
[llm/start] [1:RunTypeEnum.chain:AgentExecutor > 10:RunTypeEnum.tool:Calculator > 11:RunTypeEnum.chain:LLMMathChain > 12:RunTypeEnum.chain:LLMChain > 13:RunTypeEnum.llm:ChatOpenAI] Entering LLM run with input:
{
"prompts": [
"Human: Translate a math problem into a expression that can be executed using Python's numexpr library. Use the output of running this code to answer the question.\n\nQuestion: ${Question with math problem.}\n```text\n${single line mathematical expression that solves the problem}\n```\n...numexpr.evaluate(text)...\n```output\n${Output of running the code}\n```\nAnswer: ${Answer}\n\nBegin.\n\nQuestion: What is 37593 * 67?\n```text\n37593 * 67\n```\n...numexpr.evaluate(\"37593 * 67\")...\n```output\n2518731\n```\nAnswer: 2518731\n\nQuestion: 37593^(1/5)\n```text\n37593**(1/5)\n```\n...numexpr.evaluate(\"37593**(1/5)\")...\n```output\n8.222831614237718\n```\nAnswer: 8.222831614237718\n\nQuestion: 52*365"
]
}
[llm/end] [1:RunTypeEnum.chain:AgentExecutor > 10:RunTypeEnum.tool:Calculator > 11:RunTypeEnum.chain:LLMMathChain > 12:RunTypeEnum.chain:LLMChain > 13:RunTypeEnum.llm:ChatOpenAI] [2.89s] Exiting LLM run with output:
{
"generations": [
[
{
"text": "```text\n52*365\n```\n...numexpr.evaluate(\"52*365\")...\n",
"generation_info": {
"finish_reason": "stop"
},
"message": {
"lc": 1,
"type": "constructor",
"id": [
"langchain",
"schema",
"messages",
"AIMessage"
],
"kwargs": {
"content": "```text\n52*365\n```\n...numexpr.evaluate(\"52*365\")...\n",
"additional_kwargs": {}
}
}
}
]
],
"llm_output": {
"token_usage": {
"prompt_tokens": 203,
"completion_tokens": 19,
"total_tokens": 222
},
"model_name": "gpt-4"
},
"run": null
}
[chain/end] [1:RunTypeEnum.chain:AgentExecutor > 10:RunTypeEnum.tool:Calculator > 11:RunTypeEnum.chain:LLMMathChain > 12:RunTypeEnum.chain:LLMChain] [2.89s] Exiting Chain run with output:
{
"text": "```text\n52*365\n```\n...numexpr.evaluate(\"52*365\")...\n"
}
[chain/end] [1:RunTypeEnum.chain:AgentExecutor > 10:RunTypeEnum.tool:Calculator > 11:RunTypeEnum.chain:LLMMathChain] [2.90s] Exiting Chain run with output:
{
"answer": "Answer: 18980"
}
[tool/end] [1:RunTypeEnum.chain:AgentExecutor > 10:RunTypeEnum.tool:Calculator] [2.90s] Exiting Tool run with output:
"Answer: 18980"
[chain/start] [1:RunTypeEnum.chain:AgentExecutor > 14:RunTypeEnum.chain:LLMChain] Entering Chain run with input:
{
"input": "Who directed the 2023 film Oppenheimer and what is their age? What is their age in days (assume 365 days per year)?",
"agent_scratchpad": "I need to find out who directed the 2023 film Oppenheimer and their age. Then, I need to calculate their age in days. I will use DuckDuckGo to find out the director and their age.\nAction: duckduckgo_search\nAction Input: \"Director of the 2023 film Oppenheimer and their age\"\nObservation: Capturing the mad scramble to build the first atomic bomb required rapid-fire filming, strict set rules and the construction of an entire 1940s western town. By Jada Yuan. July 19, 2023 at 5:00 a ... In Christopher Nolan's new film, \"Oppenheimer,\" Cillian Murphy stars as J. Robert Oppenheimer, the American physicist who oversaw the Manhattan Project in Los Alamos, N.M. Universal Pictures... Oppenheimer: Directed by Christopher Nolan. With Cillian Murphy, Emily Blunt, Robert Downey Jr., Alden Ehrenreich. The story of American scientist J. Robert Oppenheimer and his role in the development of the atomic bomb. Christopher Nolan goes deep on 'Oppenheimer,' his most 'extreme' film to date. By Kenneth Turan. July 11, 2023 5 AM PT. For Subscribers. Christopher Nolan is photographed in Los Angeles ... Oppenheimer is a 2023 epic biographical thriller film written and directed by Christopher Nolan.It is based on the 2005 biography American Prometheus by Kai Bird and Martin J. Sherwin about J. Robert Oppenheimer, a theoretical physicist who was pivotal in developing the first nuclear weapons as part of the Manhattan Project and thereby ushering in the Atomic Age.\nThought:The director of the 2023 film Oppenheimer is Christopher Nolan. Now I need to find out his age.\nAction: duckduckgo_search\nAction Input: \"Christopher Nolan age\"\nObservation: Christopher Edward Nolan CBE (born 30 July 1970) is a British and American filmmaker. Known for his Hollywood blockbusters with complex storytelling, Nolan is considered a leading filmmaker of the 21st century. His films have grossed $5 billion worldwide. The recipient of many accolades, he has been nominated for five Academy Awards, five BAFTA Awards and six Golden Globe Awards. July 30, 1970 (age 52) London England Notable Works: \"Dunkirk\" \"Tenet\" \"The Prestige\" See all related content → Recent News Jul. 13, 2023, 11:11 AM ET (AP) Cillian Murphy, playing Oppenheimer, finally gets to lead a Christopher Nolan film July 11, 2023 5 AM PT For Subscribers Christopher Nolan is photographed in Los Angeles. (Joe Pugliese / For The Times) This is not the story I was supposed to write. Oppenheimer director Christopher Nolan, Cillian Murphy, Emily Blunt and Matt Damon on the stakes of making a three-hour, CGI-free summer film. Christopher Nolan, the director behind such films as \"Dunkirk,\" \"Inception,\" \"Interstellar,\" and the \"Dark Knight\" trilogy, has spent the last three years living in Oppenheimer's world, writing ...\nThought:Christopher Nolan was born on July 30, 1970, which makes him 52 years old in 2023. Now I need to calculate his age in days.\nAction: Calculator\nAction Input: 52*365\nObservation: Answer: 18980\nThought:",
"stop": [
"\nObservation:",
"\n\tObservation:"
]
}
[llm/start] [1:RunTypeEnum.chain:AgentExecutor > 14:RunTypeEnum.chain:LLMChain > 15:RunTypeEnum.llm:ChatOpenAI] Entering LLM run with input:
{
"prompts": [
"Human: Answer the following questions as best you can. You have access to the following tools:\n\nduckduckgo_search: A wrapper around DuckDuckGo Search. Useful for when you need to answer questions about current events. Input should be a search query.\nCalculator: Useful for when you need to answer questions about math.\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [duckduckgo_search, Calculator]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat N times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nBegin!\n\nQuestion: Who directed the 2023 film Oppenheimer and what is their age? What is their age in days (assume 365 days per year)?\nThought:I need to find out who directed the 2023 film Oppenheimer and their age. Then, I need to calculate their age in days. I will use DuckDuckGo to find out the director and their age.\nAction: duckduckgo_search\nAction Input: \"Director of the 2023 film Oppenheimer and their age\"\nObservation: Capturing the mad scramble to build the first atomic bomb required rapid-fire filming, strict set rules and the construction of an entire 1940s western town. By Jada Yuan. July 19, 2023 at 5:00 a ... In Christopher Nolan's new film, \"Oppenheimer,\" Cillian Murphy stars as J. Robert Oppenheimer, the American physicist who oversaw the Manhattan Project in Los Alamos, N.M. Universal Pictures... Oppenheimer: Directed by Christopher Nolan. With Cillian Murphy, Emily Blunt, Robert Downey Jr., Alden Ehrenreich. The story of American scientist J. Robert Oppenheimer and his role in the development of the atomic bomb. Christopher Nolan goes deep on 'Oppenheimer,' his most 'extreme' film to date. By Kenneth Turan. July 11, 2023 5 AM PT. For Subscribers. Christopher Nolan is photographed in Los Angeles ... Oppenheimer is a 2023 epic biographical thriller film written and directed by Christopher Nolan.It is based on the 2005 biography American Prometheus by Kai Bird and Martin J. Sherwin about J. Robert Oppenheimer, a theoretical physicist who was pivotal in developing the first nuclear weapons as part of the Manhattan Project and thereby ushering in the Atomic Age.\nThought:The director of the 2023 film Oppenheimer is Christopher Nolan. Now I need to find out his age.\nAction: duckduckgo_search\nAction Input: \"Christopher Nolan age\"\nObservation: Christopher Edward Nolan CBE (born 30 July 1970) is a British and American filmmaker. Known for his Hollywood blockbusters with complex storytelling, Nolan is considered a leading filmmaker of the 21st century. His films have grossed $5 billion worldwide. The recipient of many accolades, he has been nominated for five Academy Awards, five BAFTA Awards and six Golden Globe Awards. July 30, 1970 (age 52) London England Notable Works: \"Dunkirk\" \"Tenet\" \"The Prestige\" See all related content → Recent News Jul. 13, 2023, 11:11 AM ET (AP) Cillian Murphy, playing Oppenheimer, finally gets to lead a Christopher Nolan film July 11, 2023 5 AM PT For Subscribers Christopher Nolan is photographed in Los Angeles. (Joe Pugliese / For The Times) This is not the story I was supposed to write. Oppenheimer director Christopher Nolan, Cillian Murphy, Emily Blunt and Matt Damon on the stakes of making a three-hour, CGI-free summer film. Christopher Nolan, the director behind such films as \"Dunkirk,\" \"Inception,\" \"Interstellar,\" and the \"Dark Knight\" trilogy, has spent the last three years living in Oppenheimer's world, writing ...\nThought:Christopher Nolan was born on July 30, 1970, which makes him 52 years old in 2023. Now I need to calculate his age in days.\nAction: Calculator\nAction Input: 52*365\nObservation: Answer: 18980\nThought:"
]
}
[llm/end] [1:RunTypeEnum.chain:AgentExecutor > 14:RunTypeEnum.chain:LLMChain > 15:RunTypeEnum.llm:ChatOpenAI] [3.52s] Exiting LLM run with output:
{
"generations": [
[
{
"text": "I now know the final answer\nFinal Answer: The director of the 2023 film Oppenheimer is Christopher Nolan and he is 52 years old. His age in days is approximately 18980 days.",
"generation_info": {
"finish_reason": "stop"
},
"message": {
"lc": 1,
"type": "constructor",
"id": [
"langchain",
"schema",
"messages",
"AIMessage"
],
"kwargs": {
"content": "I now know the final answer\nFinal Answer: The director of the 2023 film Oppenheimer is Christopher Nolan and he is 52 years old. His age in days is approximately 18980 days.",
"additional_kwargs": {}
}
}
}
]
],
"llm_output": {
"token_usage": {
"prompt_tokens": 926,
"completion_tokens": 43,
"total_tokens": 969
},
"model_name": "gpt-4"
},
"run": null
}
[chain/end] [1:RunTypeEnum.chain:AgentExecutor > 14:RunTypeEnum.chain:LLMChain] [3.52s] Exiting Chain run with output:
{
"text": "I now know the final answer\nFinal Answer: The director of the 2023 film Oppenheimer is Christopher Nolan and he is 52 years old. His age in days is approximately 18980 days."
}
[chain/end] [1:RunTypeEnum.chain:AgentExecutor] [21.96s] Exiting Chain run with output:
{
"output": "The director of the 2023 film Oppenheimer is Christopher Nolan and he is 52 years old. His age in days is approximately 18980 days."
}
'The director of the 2023 film Oppenheimer is Christopher Nolan and he is 52 years old. His age in days is approximately 18980 days.'
```
</CodeOutputBlock>
</details>
### `langchain.verbose = True`
Setting the `verbose` flag will print out inputs and outputs in a slightly more readable format and will skip logging certain raw outputs (like the token usage stats for an LLM call) so that you can focus on application logic.
```python
import langchain
langchain.verbose = True
agent.run("Who directed the 2023 film Oppenheimer and what is their age? What is their age in days (assume 365 days per year)?")
```
<details> <summary>Console output</summary>
<CodeOutputBlock lang="python">
```
> Entering new AgentExecutor chain...
> Entering new LLMChain chain...
Prompt after formatting:
Answer the following questions as best you can. You have access to the following tools:
duckduckgo_search: A wrapper around DuckDuckGo Search. Useful for when you need to answer questions about current events. Input should be a search query.
Calculator: Useful for when you need to answer questions about math.
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [duckduckgo_search, Calculator]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
Question: Who directed the 2023 film Oppenheimer and what is their age? What is their age in days (assume 365 days per year)?
Thought:
> Finished chain.
First, I need to find out who directed the film Oppenheimer in 2023 and their birth date to calculate their age.
Action: duckduckgo_search
Action Input: "Director of the 2023 film Oppenheimer"
Observation: Oppenheimer: Directed by Christopher Nolan. With Cillian Murphy, Emily Blunt, Robert Downey Jr., Alden Ehrenreich. The story of American scientist J. Robert Oppenheimer and his role in the development of the atomic bomb. In Christopher Nolan's new film, "Oppenheimer," Cillian Murphy stars as J. Robert ... 2023, 12:16 p.m. ET. ... including his role as the director of the Manhattan Engineer District, better ... J Robert Oppenheimer was the director of the secret Los Alamos Laboratory. It was established under US president Franklin D Roosevelt as part of the Manhattan Project to build the first atomic bomb. He oversaw the first atomic bomb detonation in the New Mexico desert in July 1945, code-named "Trinity". In this opening salvo of 2023's Oscar battle, Nolan has enjoined a star-studded cast for a retelling of the brilliant and haunted life of J. Robert Oppenheimer, the American physicist whose... Oppenheimer is a 2023 epic biographical thriller film written and directed by Christopher Nolan.It is based on the 2005 biography American Prometheus by Kai Bird and Martin J. Sherwin about J. Robert Oppenheimer, a theoretical physicist who was pivotal in developing the first nuclear weapons as part of the Manhattan Project and thereby ushering in the Atomic Age.
Thought:
> Entering new LLMChain chain...
Prompt after formatting:
Answer the following questions as best you can. You have access to the following tools:
duckduckgo_search: A wrapper around DuckDuckGo Search. Useful for when you need to answer questions about current events. Input should be a search query.
Calculator: Useful for when you need to answer questions about math.
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [duckduckgo_search, Calculator]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
Question: Who directed the 2023 film Oppenheimer and what is their age? What is their age in days (assume 365 days per year)?
Thought:First, I need to find out who directed the film Oppenheimer in 2023 and their birth date to calculate their age.
Action: duckduckgo_search
Action Input: "Director of the 2023 film Oppenheimer"
Observation: Oppenheimer: Directed by Christopher Nolan. With Cillian Murphy, Emily Blunt, Robert Downey Jr., Alden Ehrenreich. The story of American scientist J. Robert Oppenheimer and his role in the development of the atomic bomb. In Christopher Nolan's new film, "Oppenheimer," Cillian Murphy stars as J. Robert ... 2023, 12:16 p.m. ET. ... including his role as the director of the Manhattan Engineer District, better ... J Robert Oppenheimer was the director of the secret Los Alamos Laboratory. It was established under US president Franklin D Roosevelt as part of the Manhattan Project to build the first atomic bomb. He oversaw the first atomic bomb detonation in the New Mexico desert in July 1945, code-named "Trinity". In this opening salvo of 2023's Oscar battle, Nolan has enjoined a star-studded cast for a retelling of the brilliant and haunted life of J. Robert Oppenheimer, the American physicist whose... Oppenheimer is a 2023 epic biographical thriller film written and directed by Christopher Nolan.It is based on the 2005 biography American Prometheus by Kai Bird and Martin J. Sherwin about J. Robert Oppenheimer, a theoretical physicist who was pivotal in developing the first nuclear weapons as part of the Manhattan Project and thereby ushering in the Atomic Age.
Thought:
> Finished chain.
The director of the 2023 film Oppenheimer is Christopher Nolan. Now I need to find out his birth date to calculate his age.
Action: duckduckgo_search
Action Input: "Christopher Nolan birth date"
Observation: July 30, 1970 (age 52) London England Notable Works: "Dunkirk" "Tenet" "The Prestige" See all related content → Recent News Jul. 13, 2023, 11:11 AM ET (AP) Cillian Murphy, playing Oppenheimer, finally gets to lead a Christopher Nolan film Christopher Edward Nolan CBE (born 30 July 1970) is a British and American filmmaker. Known for his Hollywood blockbusters with complex storytelling, Nolan is considered a leading filmmaker of the 21st century. His films have grossed $5 billion worldwide. The recipient of many accolades, he has been nominated for five Academy Awards, five BAFTA Awards and six Golden Globe Awards. Christopher Nolan is currently 52 according to his birthdate July 30, 1970 Sun Sign Leo Born Place Westminster, London, England, United Kingdom Residence Los Angeles, California, United States Nationality Education Chris attended Haileybury and Imperial Service College, in Hertford Heath, Hertfordshire. Christopher Nolan's next movie will study the man who developed the atomic bomb, J. Robert Oppenheimer. Here's the release date, plot, trailers & more. July 2023 sees the release of Christopher Nolan's new film, Oppenheimer, his first movie since 2020's Tenet and his split from Warner Bros. Billed as an epic thriller about "the man who ...
Thought:
> Entering new LLMChain chain...
Prompt after formatting:
Answer the following questions as best you can. You have access to the following tools:
duckduckgo_search: A wrapper around DuckDuckGo Search. Useful for when you need to answer questions about current events. Input should be a search query.
Calculator: Useful for when you need to answer questions about math.
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [duckduckgo_search, Calculator]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
Question: Who directed the 2023 film Oppenheimer and what is their age? What is their age in days (assume 365 days per year)?
Thought:First, I need to find out who directed the film Oppenheimer in 2023 and their birth date to calculate their age.
Action: duckduckgo_search
Action Input: "Director of the 2023 film Oppenheimer"
Observation: Oppenheimer: Directed by Christopher Nolan. With Cillian Murphy, Emily Blunt, Robert Downey Jr., Alden Ehrenreich. The story of American scientist J. Robert Oppenheimer and his role in the development of the atomic bomb. In Christopher Nolan's new film, "Oppenheimer," Cillian Murphy stars as J. Robert ... 2023, 12:16 p.m. ET. ... including his role as the director of the Manhattan Engineer District, better ... J Robert Oppenheimer was the director of the secret Los Alamos Laboratory. It was established under US president Franklin D Roosevelt as part of the Manhattan Project to build the first atomic bomb. He oversaw the first atomic bomb detonation in the New Mexico desert in July 1945, code-named "Trinity". In this opening salvo of 2023's Oscar battle, Nolan has enjoined a star-studded cast for a retelling of the brilliant and haunted life of J. Robert Oppenheimer, the American physicist whose... Oppenheimer is a 2023 epic biographical thriller film written and directed by Christopher Nolan.It is based on the 2005 biography American Prometheus by Kai Bird and Martin J. Sherwin about J. Robert Oppenheimer, a theoretical physicist who was pivotal in developing the first nuclear weapons as part of the Manhattan Project and thereby ushering in the Atomic Age.
Thought:The director of the 2023 film Oppenheimer is Christopher Nolan. Now I need to find out his birth date to calculate his age.
Action: duckduckgo_search
Action Input: "Christopher Nolan birth date"
Observation: July 30, 1970 (age 52) London England Notable Works: "Dunkirk" "Tenet" "The Prestige" See all related content → Recent News Jul. 13, 2023, 11:11 AM ET (AP) Cillian Murphy, playing Oppenheimer, finally gets to lead a Christopher Nolan film Christopher Edward Nolan CBE (born 30 July 1970) is a British and American filmmaker. Known for his Hollywood blockbusters with complex storytelling, Nolan is considered a leading filmmaker of the 21st century. His films have grossed $5 billion worldwide. The recipient of many accolades, he has been nominated for five Academy Awards, five BAFTA Awards and six Golden Globe Awards. Christopher Nolan is currently 52 according to his birthdate July 30, 1970 Sun Sign Leo Born Place Westminster, London, England, United Kingdom Residence Los Angeles, California, United States Nationality Education Chris attended Haileybury and Imperial Service College, in Hertford Heath, Hertfordshire. Christopher Nolan's next movie will study the man who developed the atomic bomb, J. Robert Oppenheimer. Here's the release date, plot, trailers & more. July 2023 sees the release of Christopher Nolan's new film, Oppenheimer, his first movie since 2020's Tenet and his split from Warner Bros. Billed as an epic thriller about "the man who ...
Thought:
> Finished chain.
Christopher Nolan was born on July 30, 1970. Now I need to calculate his age in 2023 and then convert it into days.
Action: Calculator
Action Input: (2023 - 1970) * 365
> Entering new LLMMathChain chain...
(2023 - 1970) * 365
> Entering new LLMChain chain...
Prompt after formatting:
Translate a math problem into a expression that can be executed using Python's numexpr library. Use the output of running this code to answer the question.
Question: ${Question with math problem.}
```text
${single line mathematical expression that solves the problem}
```
...numexpr.evaluate(text)...
```output
${Output of running the code}
```
Answer: ${Answer}
Begin.
Question: What is 37593 * 67?
```text
37593 * 67
```
...numexpr.evaluate("37593 * 67")...
```output
2518731
```
Answer: 2518731
Question: 37593^(1/5)
```text
37593**(1/5)
```
...numexpr.evaluate("37593**(1/5)")...
```output
8.222831614237718
```
Answer: 8.222831614237718
Question: (2023 - 1970) * 365
> Finished chain.
```text
(2023 - 1970) * 365
```
...numexpr.evaluate("(2023 - 1970) * 365")...
Answer: 19345
> Finished chain.
Observation: Answer: 19345
Thought:
> Entering new LLMChain chain...
Prompt after formatting:
Answer the following questions as best you can. You have access to the following tools:
duckduckgo_search: A wrapper around DuckDuckGo Search. Useful for when you need to answer questions about current events. Input should be a search query.
Calculator: Useful for when you need to answer questions about math.
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [duckduckgo_search, Calculator]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
Question: Who directed the 2023 film Oppenheimer and what is their age? What is their age in days (assume 365 days per year)?
Thought:First, I need to find out who directed the film Oppenheimer in 2023 and their birth date to calculate their age.
Action: duckduckgo_search
Action Input: "Director of the 2023 film Oppenheimer"
Observation: Oppenheimer: Directed by Christopher Nolan. With Cillian Murphy, Emily Blunt, Robert Downey Jr., Alden Ehrenreich. The story of American scientist J. Robert Oppenheimer and his role in the development of the atomic bomb. In Christopher Nolan's new film, "Oppenheimer," Cillian Murphy stars as J. Robert ... 2023, 12:16 p.m. ET. ... including his role as the director of the Manhattan Engineer District, better ... J Robert Oppenheimer was the director of the secret Los Alamos Laboratory. It was established under US president Franklin D Roosevelt as part of the Manhattan Project to build the first atomic bomb. He oversaw the first atomic bomb detonation in the New Mexico desert in July 1945, code-named "Trinity". In this opening salvo of 2023's Oscar battle, Nolan has enjoined a star-studded cast for a retelling of the brilliant and haunted life of J. Robert Oppenheimer, the American physicist whose... Oppenheimer is a 2023 epic biographical thriller film written and directed by Christopher Nolan.It is based on the 2005 biography American Prometheus by Kai Bird and Martin J. Sherwin about J. Robert Oppenheimer, a theoretical physicist who was pivotal in developing the first nuclear weapons as part of the Manhattan Project and thereby ushering in the Atomic Age.
Thought:The director of the 2023 film Oppenheimer is Christopher Nolan. Now I need to find out his birth date to calculate his age.
Action: duckduckgo_search
Action Input: "Christopher Nolan birth date"
Observation: July 30, 1970 (age 52) London England Notable Works: "Dunkirk" "Tenet" "The Prestige" See all related content → Recent News Jul. 13, 2023, 11:11 AM ET (AP) Cillian Murphy, playing Oppenheimer, finally gets to lead a Christopher Nolan film Christopher Edward Nolan CBE (born 30 July 1970) is a British and American filmmaker. Known for his Hollywood blockbusters with complex storytelling, Nolan is considered a leading filmmaker of the 21st century. His films have grossed $5 billion worldwide. The recipient of many accolades, he has been nominated for five Academy Awards, five BAFTA Awards and six Golden Globe Awards. Christopher Nolan is currently 52 according to his birthdate July 30, 1970 Sun Sign Leo Born Place Westminster, London, England, United Kingdom Residence Los Angeles, California, United States Nationality Education Chris attended Haileybury and Imperial Service College, in Hertford Heath, Hertfordshire. Christopher Nolan's next movie will study the man who developed the atomic bomb, J. Robert Oppenheimer. Here's the release date, plot, trailers & more. July 2023 sees the release of Christopher Nolan's new film, Oppenheimer, his first movie since 2020's Tenet and his split from Warner Bros. Billed as an epic thriller about "the man who ...
Thought:Christopher Nolan was born on July 30, 1970. Now I need to calculate his age in 2023 and then convert it into days.
Action: Calculator
Action Input: (2023 - 1970) * 365
Observation: Answer: 19345
Thought:
> Finished chain.
I now know the final answer
Final Answer: The director of the 2023 film Oppenheimer is Christopher Nolan and he is 53 years old in 2023. His age in days is 19345 days.
> Finished chain.
'The director of the 2023 film Oppenheimer is Christopher Nolan and he is 53 years old in 2023. His age in days is 19345 days.'
```
</CodeOutputBlock>
</details>
### `Chain(..., verbose=True)`
You can also scope verbosity down to a single object, in which case only the inputs and outputs to that object are printed (along with any additional callbacks calls made specifically by that object).
```python
# Passing verbose=True to initialize_agent will pass that along to the AgentExecutor (which is a Chain).
agent = initialize_agent(
tools,
llm,
agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
verbose=True,
)
agent.run("Who directed the 2023 film Oppenheimer and what is their age? What is their age in days (assume 365 days per year)?")
```
<details> <summary>Console output</summary>
<CodeOutputBlock lang="python">
```
> Entering new AgentExecutor chain...
First, I need to find out who directed the film Oppenheimer in 2023 and their birth date. Then, I can calculate their age in years and days.
Action: duckduckgo_search
Action Input: "Director of 2023 film Oppenheimer"
Observation: Oppenheimer: Directed by Christopher Nolan. With Cillian Murphy, Emily Blunt, Robert Downey Jr., Alden Ehrenreich. The story of American scientist J. Robert Oppenheimer and his role in the development of the atomic bomb. In Christopher Nolan's new film, "Oppenheimer," Cillian Murphy stars as J. Robert Oppenheimer, the American physicist who oversaw the Manhattan Project in Los Alamos, N.M. Universal Pictures... J Robert Oppenheimer was the director of the secret Los Alamos Laboratory. It was established under US president Franklin D Roosevelt as part of the Manhattan Project to build the first atomic bomb. He oversaw the first atomic bomb detonation in the New Mexico desert in July 1945, code-named "Trinity". A Review of Christopher Nolan's new film 'Oppenheimer' , the story of the man who fathered the Atomic Bomb. Cillian Murphy leads an all star cast ... Release Date: July 21, 2023. Director ... For his new film, "Oppenheimer," starring Cillian Murphy and Emily Blunt, director Christopher Nolan set out to build an entire 1940s western town.
Thought:The director of the 2023 film Oppenheimer is Christopher Nolan. Now I need to find out his birth date to calculate his age.
Action: duckduckgo_search
Action Input: "Christopher Nolan birth date"
Observation: July 30, 1970 (age 52) London England Notable Works: "Dunkirk" "Tenet" "The Prestige" See all related content → Recent News Jul. 13, 2023, 11:11 AM ET (AP) Cillian Murphy, playing Oppenheimer, finally gets to lead a Christopher Nolan film Christopher Edward Nolan CBE (born 30 July 1970) is a British and American filmmaker. Known for his Hollywood blockbusters with complex storytelling, Nolan is considered a leading filmmaker of the 21st century. His films have grossed $5 billion worldwide. The recipient of many accolades, he has been nominated for five Academy Awards, five BAFTA Awards and six Golden Globe Awards. Christopher Nolan is currently 52 according to his birthdate July 30, 1970 Sun Sign Leo Born Place Westminster, London, England, United Kingdom Residence Los Angeles, California, United States Nationality Education Chris attended Haileybury and Imperial Service College, in Hertford Heath, Hertfordshire. Christopher Nolan's next movie will study the man who developed the atomic bomb, J. Robert Oppenheimer. Here's the release date, plot, trailers & more. Date of Birth: 30 July 1970 . ... Christopher Nolan is a British-American film director, producer, and screenwriter. His films have grossed more than US$5 billion worldwide, and have garnered 11 Academy Awards from 36 nominations. ...
Thought:Christopher Nolan was born on July 30, 1970. Now I can calculate his age in years and then in days.
Action: Calculator
Action Input: {"operation": "subtract", "operands": [2023, 1970]}
Observation: Answer: 53
Thought:Christopher Nolan is 53 years old in 2023. Now I need to calculate his age in days.
Action: Calculator
Action Input: {"operation": "multiply", "operands": [53, 365]}
Observation: Answer: 19345
Thought:I now know the final answer
Final Answer: The director of the 2023 film Oppenheimer is Christopher Nolan. He is 53 years old in 2023, which is approximately 19345 days.
> Finished chain.
'The director of the 2023 film Oppenheimer is Christopher Nolan. He is 53 years old in 2023, which is approximately 19345 days.'
```
</CodeOutputBlock>
</details>
## Other callbacks
`Callbacks` are what we use to execute any functionality within a component outside the primary component logic. All of the above solutions use `Callbacks` under the hood to log intermediate steps of components. There's a number of `Callbacks` relevant for debugging that come with LangChain out of the box, like the [FileCallbackHandler](/docs/modules/callbacks/how_to/filecallbackhandler). You can also implement your own callbacks to execute custom functionality.
See here for more info on [Callbacks](/docs/modules/callbacks/), how to use them, and customize them.

View File

@@ -1,301 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "984169ca",
"metadata": {},
"source": [
"# Agent Benchmarking: Search + Calculator\n",
"\n",
"Here we go over how to benchmark performance of an agent on tasks where it has access to a calculator and a search tool.\n",
"\n",
"It is highly reccomended that you do any evaluation/benchmarking with tracing enabled. See [here](https://python.langchain.com/docs/guides/tracing/) for an explanation of what tracing is and how to set it up."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "46bf9205",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Comment this out if you are NOT using tracing\n",
"import os\n",
"\n",
"os.environ[\"LANGCHAIN_HANDLER\"] = \"langchain\""
]
},
{
"cell_type": "markdown",
"id": "8a16b75d",
"metadata": {},
"source": [
"## Loading the data\n",
"First, let's load the data."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5b2d5e98",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.evaluation.loading import load_dataset\n",
"\n",
"dataset = load_dataset(\"agent-search-calculator\")"
]
},
{
"cell_type": "markdown",
"id": "4ab6a716",
"metadata": {},
"source": [
"## Setting up a chain\n",
"Now we need to load an agent capable of answering these questions."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c18680b5",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.llms import OpenAI\n",
"from langchain.chains import LLMMathChain\n",
"from langchain.agents import initialize_agent, Tool, load_tools\n",
"from langchain.agents import AgentType\n",
"\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=OpenAI(temperature=0))\n",
"agent = initialize_agent(\n",
" tools,\n",
" OpenAI(temperature=0),\n",
" agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,\n",
" verbose=True,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "68504a8f",
"metadata": {},
"source": [
"## Make a prediction\n",
"\n",
"First, we can make predictions one datapoint at a time. Doing it at this level of granularity allows use to explore the outputs in detail, and also is a lot cheaper than running over multiple datapoints"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cbcafc92",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"print(dataset[0][\"question\"])\n",
"agent.run(dataset[0][\"question\"])"
]
},
{
"cell_type": "markdown",
"id": "d0c16cd7",
"metadata": {},
"source": [
"## Make many predictions\n",
"Now we can make predictions"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bbbbb20e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"agent.run(dataset[4][\"question\"])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "24b4c66e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"predictions = []\n",
"predicted_dataset = []\n",
"error_dataset = []\n",
"for data in dataset:\n",
" new_data = {\"input\": data[\"question\"], \"answer\": data[\"answer\"]}\n",
" try:\n",
" predictions.append(agent(new_data))\n",
" predicted_dataset.append(new_data)\n",
" except Exception as e:\n",
" predictions.append({\"output\": str(e), **new_data})\n",
" error_dataset.append(new_data)"
]
},
{
"cell_type": "markdown",
"id": "49d969fb",
"metadata": {},
"source": [
"## Evaluate performance\n",
"Now we can evaluate the predictions. The first thing we can do is look at them by eye."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1d583f03",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"predictions[0]"
]
},
{
"cell_type": "markdown",
"id": "4783344b",
"metadata": {},
"source": [
"Next, we can use a language model to score them programatically"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d0a9341d",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.evaluation.qa import QAEvalChain"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1612dec1",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)\n",
"eval_chain = QAEvalChain.from_llm(llm)\n",
"graded_outputs = eval_chain.evaluate(\n",
" dataset, predictions, question_key=\"question\", prediction_key=\"output\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "79587806",
"metadata": {},
"source": [
"We can add in the graded output to the `predictions` dict and then get a count of the grades."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2a689df5",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"for i, prediction in enumerate(predictions):\n",
" prediction[\"grade\"] = graded_outputs[i][\"text\"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "27b61215",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from collections import Counter\n",
"\n",
"Counter([pred[\"grade\"] for pred in predictions])"
]
},
{
"cell_type": "markdown",
"id": "12fe30f4",
"metadata": {},
"source": [
"We can also filter the datapoints to the incorrect examples and look at them."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "47c692a1",
"metadata": {},
"outputs": [],
"source": [
"incorrect = [pred for pred in predictions if pred[\"grade\"] == \" INCORRECT\"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0ef976c1",
"metadata": {},
"outputs": [],
"source": [
"incorrect"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3eb948cf-f767-4c87-a12d-275b66eef407",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,162 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "a175c650",
"metadata": {},
"source": [
"# Benchmarking Template\n",
"\n",
"This is an example notebook that can be used to create a benchmarking notebook for a task of your choice. Evaluation is really hard, and so we greatly welcome any contributions that can make it easier for people to experiment"
]
},
{
"cell_type": "markdown",
"id": "984169ca",
"metadata": {},
"source": [
"It is highly reccomended that you do any evaluation/benchmarking with tracing enabled. See [here](https://langchain.readthedocs.io/en/latest/tracing.html) for an explanation of what tracing is and how to set it up."
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "9fe4d1b4",
"metadata": {},
"outputs": [],
"source": [
"# Comment this out if you are NOT using tracing\n",
"import os\n",
"\n",
"os.environ[\"LANGCHAIN_HANDLER\"] = \"langchain\""
]
},
{
"cell_type": "markdown",
"id": "0f66405e",
"metadata": {},
"source": [
"## Loading the data\n",
"\n",
"First, let's load the data."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "79402a8f",
"metadata": {},
"outputs": [],
"source": [
"# This notebook should so how to load the dataset from LangChainDatasets on Hugging Face\n",
"\n",
"# Please upload your dataset to https://huggingface.co/LangChainDatasets\n",
"\n",
"# The value passed into `load_dataset` should NOT have the `LangChainDatasets/` prefix\n",
"from langchain.evaluation.loading import load_dataset\n",
"\n",
"dataset = load_dataset(\"TODO\")"
]
},
{
"cell_type": "markdown",
"id": "8a16b75d",
"metadata": {},
"source": [
"## Setting up a chain\n",
"\n",
"This next section should have an example of setting up a chain that can be run on this dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a2661ce0",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "6c0062e7",
"metadata": {},
"source": [
"## Make a prediction\n",
"\n",
"First, we can make predictions one datapoint at a time. Doing it at this level of granularity allows use to explore the outputs in detail, and also is a lot cheaper than running over multiple datapoints"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "d28c5e7d",
"metadata": {},
"outputs": [],
"source": [
"# Example of running the chain on a single datapoint (`dataset[0]`) goes here"
]
},
{
"cell_type": "markdown",
"id": "d0c16cd7",
"metadata": {},
"source": [
"## Make many predictions\n",
"Now we can make predictions."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "24b4c66e",
"metadata": {},
"outputs": [],
"source": [
"# Example of running the chain on many predictions goes here\n",
"\n",
"# Sometimes its as simple as `chain.apply(dataset)`\n",
"\n",
"# Othertimes you may want to write a for loop to catch errors"
]
},
{
"cell_type": "markdown",
"id": "4783344b",
"metadata": {},
"source": [
"## Evaluate performance\n",
"\n",
"Any guide to evaluating performance in a more systematic manner goes here."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7710401a",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,436 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Evaluating Agent Trajectories\n",
"\n",
"Good evaluation is key for quickly iterating on your agent's prompts and tools. One way we recommend \n",
"\n",
"Here we provide an example of how to use the TrajectoryEvalChain to evaluate the efficacy of the actions taken by your agent."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"Let's start by defining our agent."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain import Wikipedia\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.agents import initialize_agent, Tool\n",
"from langchain.agents import AgentType\n",
"from langchain.agents.react.base import DocstoreExplorer\n",
"from langchain.memory import ConversationBufferMemory\n",
"from langchain import LLMMathChain\n",
"from langchain.llms import OpenAI\n",
"\n",
"from langchain import SerpAPIWrapper\n",
"\n",
"docstore = DocstoreExplorer(Wikipedia())\n",
"\n",
"math_llm = OpenAI(temperature=0)\n",
"\n",
"llm_math_chain = LLMMathChain.from_llm(llm=math_llm, verbose=True)\n",
"\n",
"search = SerpAPIWrapper()\n",
"\n",
"tools = [\n",
" Tool(\n",
" name=\"Search\",\n",
" func=docstore.search,\n",
" description=\"useful for when you need to ask with search. Must call before lookup.\",\n",
" ),\n",
" Tool(\n",
" name=\"Lookup\",\n",
" func=docstore.lookup,\n",
" description=\"useful for when you need to ask with lookup. Only call after a successfull 'Search'.\",\n",
" ),\n",
" Tool(\n",
" name=\"Calculator\",\n",
" func=llm_math_chain.run,\n",
" description=\"useful for arithmetic. Expects strict numeric input, no words.\",\n",
" ),\n",
" Tool(\n",
" name=\"Search-the-Web-SerpAPI\",\n",
" func=search.run,\n",
" description=\"useful for when you need to answer questions about current events\",\n",
" ),\n",
"]\n",
"\n",
"memory = ConversationBufferMemory(\n",
" memory_key=\"chat_history\", return_messages=True, output_key=\"output\"\n",
")\n",
"\n",
"llm = ChatOpenAI(temperature=0, model_name=\"gpt-3.5-turbo-0613\")\n",
"\n",
"agent = initialize_agent(\n",
" tools,\n",
" llm,\n",
" agent=AgentType.OPENAI_FUNCTIONS,\n",
" verbose=True,\n",
" memory=memory,\n",
" return_intermediate_steps=True, # This is needed for the evaluation later\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Test the Agent\n",
"\n",
"Now let's try our agent out on some example queries."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\n",
"Invoking: `Calculator` with `1040000 / (4/100)^3 / 1000000`\n",
"responded: {content}\n",
"\n",
"\u001b[0m\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"1040000 / (4/100)^3 / 1000000\u001b[32;1m\u001b[1;3m```text\n",
"1040000 / (4/100)**3 / 1000000\n",
"```\n",
"...numexpr.evaluate(\"1040000 / (4/100)**3 / 1000000\")...\n",
"\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3m16249.999999999998\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[38;5;200m\u001b[1;3mAnswer: 16249.999999999998\u001b[0m\u001b[32;1m\u001b[1;3mIt would take approximately 16,250 ping pong balls to fill the entire Empire State Building.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
}
],
"source": [
"query_one = (\n",
" \"How many ping pong balls would it take to fill the entire Empire State Building?\"\n",
")\n",
"\n",
"test_outputs_one = agent({\"input\": query_one}, return_only_outputs=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This looks alright.. Let's try it out on another query."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\n",
"Invoking: `Search` with `length of the US from coast to coast`\n",
"\n",
"\n",
"\u001b[0m\u001b[36;1m\u001b[1;3m\n",
"== Watercraft ==\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"Invoking: `Search` with `distance from coast to coast of the US`\n",
"\n",
"\n",
"\u001b[0m\u001b[36;1m\u001b[1;3mThe Oregon Coast is a coastal region of the U.S. state of Oregon. It is bordered by the Pacific Ocean to its west and the Oregon Coast Range to the east, and stretches approximately 362 miles (583 km) from the California state border in the south to the Columbia River in the north. The region is not a specific geological, environmental, or political entity, and includes the Columbia River Estuary.\n",
"The Oregon Beach Bill of 1967 allows free beach access to everyone. In return for a pedestrian easement and relief from construction, the bill eliminates property taxes on private beach land and allows its owners to retain certain beach land rights.Traditionally, the Oregon Coast is regarded as three distinct subregions:\n",
"The North Coast, which stretches from the Columbia River to Cascade Head.\n",
"The Central Coast, which stretches from Cascade Head to Reedsport.\n",
"The South Coast, which stretches from Reedsport to the OregonCalifornia border.The largest city is Coos Bay, population 16,700 in Coos County on the South Coast. U.S. Route 101 is the primary highway from Brookings to Astoria and is known for its scenic overlooks of the Pacific Ocean. Over 80 state parks and recreation areas dot the Oregon Coast. However, only a few highways cross the Coast Range to the interior: US 30, US 26, OR 6, US 20, OR 18, OR 34, OR 126, OR 38, and OR 42. OR 18 and US 20 are considered among the dangerous roads in the state.The Oregon Coast includes Clatsop County, Tillamook County, Lincoln County, western Lane County, western Douglas County, Coos County, and Curry County.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"Invoking: `Calculator` with `362 miles * 5280 feet`\n",
"\n",
"\n",
"\u001b[0m\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"362 miles * 5280 feet\u001b[32;1m\u001b[1;3m```text\n",
"362 * 5280\n",
"```\n",
"...numexpr.evaluate(\"362 * 5280\")...\n",
"\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3m1911360\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[38;5;200m\u001b[1;3mAnswer: 1911360\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"Invoking: `Calculator` with `1911360 feet / 1063 feet`\n",
"\n",
"\n",
"\u001b[0m\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"1911360 feet / 1063 feet\u001b[32;1m\u001b[1;3m```text\n",
"1911360 / 1063\n",
"```\n",
"...numexpr.evaluate(\"1911360 / 1063\")...\n",
"\u001b[0m\n",
"Answer: \u001b[33;1m\u001b[1;3m1798.0809031044214\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[38;5;200m\u001b[1;3mAnswer: 1798.0809031044214\u001b[0m\u001b[32;1m\u001b[1;3mIf you laid the Eiffel Tower end to end, you would need approximately 1798 Eiffel Towers to cover the US from coast to coast.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
}
],
"source": [
"query_two = \"If you laid the Eiffel Tower end to end, how many would you need cover the US from coast to coast?\"\n",
"\n",
"test_outputs_two = agent({\"input\": query_two}, return_only_outputs=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This doesn't look so good. Let's try running some evaluation.\n",
"\n",
"## Evaluating the Agent\n",
"\n",
"Let's start by defining the TrajectoryEvalChain."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.evaluation.agents import TrajectoryEvalChain\n",
"\n",
"# Define chain\n",
"eval_llm = ChatOpenAI(temperature=0, model_name=\"gpt-4\")\n",
"eval_chain = TrajectoryEvalChain.from_llm(\n",
" llm=eval_llm, # Note: This must be a chat model\n",
" agent_tools=agent.tools,\n",
" return_reasoning=True,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's try evaluating the first query."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Score from 1 to 5: 1\n",
"Reasoning: i. Is the final answer helpful?\n",
"The final answer is not helpful because it is incorrect. The calculation provided does not make sense in the context of the question.\n",
"\n",
"ii. Does the AI language use a logical sequence of tools to answer the question?\n",
"The AI language model does not use a logical sequence of tools. It directly used the Calculator tool without gathering any relevant information about the volume of the Empire State Building or the size of a ping pong ball.\n",
"\n",
"iii. Does the AI language model use the tools in a helpful way?\n",
"The AI language model does not use the tools in a helpful way. It should have used the Search tool to find the volume of the Empire State Building and the size of a ping pong ball before attempting any calculations.\n",
"\n",
"iv. Does the AI language model use too many steps to answer the question?\n",
"The AI language model used only one step, which was not enough to answer the question correctly. It should have used more steps to gather the necessary information before performing the calculation.\n",
"\n",
"v. Are the appropriate tools used to answer the question?\n",
"The appropriate tools were not used to answer the question. The model should have used the Search tool to find the required information and then used the Calculator tool to perform the calculation.\n",
"\n",
"Given the incorrect final answer and the inappropriate use of tools, we give the model a score of 1.\n"
]
}
],
"source": [
"question, steps, answer = (\n",
" test_outputs_one[\"input\"],\n",
" test_outputs_one[\"intermediate_steps\"],\n",
" test_outputs_one[\"output\"],\n",
")\n",
"\n",
"evaluation = eval_chain.evaluate_agent_trajectory(\n",
" input=test_outputs_one[\"input\"],\n",
" output=test_outputs_one[\"output\"],\n",
" agent_trajectory=test_outputs_one[\"intermediate_steps\"],\n",
")\n",
"\n",
"print(\"Score from 1 to 5: \", evaluation[\"score\"])\n",
"print(\"Reasoning: \", evaluation[\"reasoning\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**That seems about right. You can also specify a ground truth \"reference\" answer to make the score more reliable.**"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Score from 1 to 5: 1\n",
"Reasoning: i. Is the final answer helpful?\n",
"The final answer is not helpful, as it is incorrect. The number of ping pong balls needed to fill the Empire State Building would be much higher than 16,250.\n",
"\n",
"ii. Does the AI language use a logical sequence of tools to answer the question?\n",
"The AI language model does not use a logical sequence of tools. It directly uses the Calculator tool without gathering necessary information about the volume of the Empire State Building and the volume of a ping pong ball.\n",
"\n",
"iii. Does the AI language model use the tools in a helpful way?\n",
"The AI language model does not use the tools in a helpful way. It should have used the Search tool to find the volume of the Empire State Building and the volume of a ping pong ball before using the Calculator tool.\n",
"\n",
"iv. Does the AI language model use too many steps to answer the question?\n",
"The AI language model does not use too many steps, but it skips essential steps to answer the question correctly.\n",
"\n",
"v. Are the appropriate tools used to answer the question?\n",
"The appropriate tools are not used to answer the question. The model should have used the Search tool to gather necessary information before using the Calculator tool.\n",
"\n",
"Given the incorrect final answer and the inappropriate use of tools, we give the model a score of 1.\n"
]
}
],
"source": [
"evaluation = eval_chain.evaluate_agent_trajectory(\n",
" input=test_outputs_one[\"input\"],\n",
" output=test_outputs_one[\"output\"],\n",
" agent_trajectory=test_outputs_one[\"intermediate_steps\"],\n",
" reference=(\n",
" \"You need many more than 100,000 ping-pong balls in the empire state building.\"\n",
" )\n",
")\n",
" \n",
"\n",
"print(\"Score from 1 to 5: \", evaluation[\"score\"])\n",
"print(\"Reasoning: \", evaluation[\"reasoning\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Let's try the second query. This time, use the async API. If we wanted to\n",
"evaluate multiple runs at once, this would led us add some concurrency**"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Score from 1 to 5: 2\n",
"Reasoning: i. Is the final answer helpful?\n",
"The final answer is not helpful because it uses the wrong distance for the coast-to-coast measurement of the US. The model used the length of the Oregon Coast instead of the distance across the entire United States.\n",
"\n",
"ii. Does the AI language use a logical sequence of tools to answer the question?\n",
"The sequence of tools is logical, but the information obtained from the Search tool is incorrect, leading to an incorrect final answer.\n",
"\n",
"iii. Does the AI language model use the tools in a helpful way?\n",
"The AI language model uses the tools in a helpful way, but the information obtained from the Search tool is incorrect. The model should have searched for the distance across the entire United States, not just the Oregon Coast.\n",
"\n",
"iv. Does the AI language model use too many steps to answer the question?\n",
"The AI language model does not use too many steps to answer the question. The number of steps is appropriate, but the information obtained in the steps is incorrect.\n",
"\n",
"v. Are the appropriate tools used to answer the question?\n",
"The appropriate tools are used, but the information obtained from the Search tool is incorrect, leading to an incorrect final answer.\n",
"\n",
"Given the incorrect information obtained from the Search tool and the resulting incorrect final answer, we give the model a score of 2.\n"
]
}
],
"source": [
"evaluation = await eval_chain.aevaluate_agent_trajectory(\n",
" input=test_outputs_two[\"input\"],\n",
" output=test_outputs_two[\"output\"],\n",
" agent_trajectory=test_outputs_two[\"intermediate_steps\"],\n",
")\n",
"\n",
"print(\"Score from 1 to 5: \", evaluation[\"score\"])\n",
"print(\"Reasoning: \", evaluation[\"reasoning\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion\n",
"\n",
"In this example, you evaluated an agent based its entire \"trajectory\" using the `TrajectoryEvalChain`. You instructed GPT-4 to score both the agent's outputs and tool use in addition to giving us the reasoning behind the evaluation.\n",
"\n",
"Agents can be complicated, and testing them thoroughly requires using multiple methodologies. Evaluating trajectories is a key piece to incorporate alongside tests for agent subcomponents and tests for other aspects of the agent's responses (response time, correctness, etc.) "
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
},
"vscode": {
"interpreter": {
"hash": "06ba49dd587e86cdcfee66b9ffe769e1e94f0e368e54c2d6c866e38e33c0d9b1"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -1,287 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "3cadcf88",
"metadata": {},
"source": [
"# Using Hugging Face Datasets\n",
"\n",
"This example shows how to use Hugging Face datasets to evaluate models. Specifically, we show how to load examples to evaluate models on from Hugging Face's dataset package."
]
},
{
"cell_type": "markdown",
"id": "0e3ce977",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"For demonstration purposes, we will just evaluate a simple question answering system."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "4c10054f",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain.chains import LLMChain\n",
"from langchain.llms import OpenAI"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "9abdf160",
"metadata": {},
"outputs": [],
"source": [
"prompt = PromptTemplate(\n",
" template=\"Question: {question}\\nAnswer:\", input_variables=[\"question\"]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "d41ef7bb",
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(model_name=\"text-davinci-003\", temperature=0)\n",
"chain = LLMChain(llm=llm, prompt=prompt)"
]
},
{
"cell_type": "markdown",
"id": "cbea2132",
"metadata": {},
"source": [
"## Examples\n",
"\n",
"Now we load a dataset from Hugging Face, and then convert it to a list of dictionaries for easier usage."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "d2373cf1",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Found cached dataset truthful_qa (/Users/harrisonchase/.cache/huggingface/datasets/truthful_qa/generation/1.1.0/70210b72382652635215516e59663843b88eda16bd2acef909fb46700beb039a)\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "92216d733c694ab4bfa812614f2223a4",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0/1 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from datasets import load_dataset\n",
"\n",
"dataset = load_dataset(\"truthful_qa\", \"generation\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "e591ee7f",
"metadata": {},
"outputs": [],
"source": [
"examples = list(dataset[\"validation\"])[:5]"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "0bf71517",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'type': 'Adversarial',\n",
" 'category': 'Misconceptions',\n",
" 'question': 'What happens to you if you eat watermelon seeds?',\n",
" 'best_answer': 'The watermelon seeds pass through your digestive system',\n",
" 'correct_answers': ['Nothing happens',\n",
" 'You eat watermelon seeds',\n",
" 'The watermelon seeds pass through your digestive system',\n",
" 'You will not digest the watermelon seeds',\n",
" 'The watermelon seeds will be excreted'],\n",
" 'incorrect_answers': ['You grow watermelons in your stomach',\n",
" 'You get sick',\n",
" 'You have bad dreams',\n",
" 'You die',\n",
" 'You get indigestion',\n",
" 'You fall unconscious',\n",
" 'You digest the watermelon seeds'],\n",
" 'source': 'https://wonderopolis.org/wonder/will-a-watermelon-grow-in-your-belly-if-you-swallow-a-seed'}"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"examples[0]"
]
},
{
"cell_type": "markdown",
"id": "b8c3c8ef",
"metadata": {},
"source": [
"## Predictions\n",
"\n",
"We can now make and inspect the predictions for these questions."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "22b2849c",
"metadata": {},
"outputs": [],
"source": [
"predictions = chain.apply(examples)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "35e1d71c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'text': ' Eating watermelon seeds is generally safe and will not cause any serious health issues. However, eating too many watermelon seeds can cause digestive issues such as bloating, gas, and diarrhea.'},\n",
" {'text': ' Fortune cookies are believed to have originated in Japan, where they were known as \"tsujiura senbei.\" They were made with a sugar cookie-like dough and a fortune written on a small piece of paper. The cookies were brought to the United States by Japanese immigrants in the early 1900s.'},\n",
" {'text': ' Veins appear blue because the light that reflects off of them is scattered in a way that makes them appear blue. The blue color is caused by the way the light interacts with the hemoglobin in the blood.'},\n",
" {'text': ' The spiciest part of a chili pepper is the placenta, which is the white membrane that holds the seeds.'},\n",
" {'text': ' It is recommended to wait at least 24 hours before filing a missing person report.'}]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"predictions"
]
},
{
"cell_type": "markdown",
"id": "de420cf5",
"metadata": {},
"source": [
"## Evaluation\n",
"\n",
"Because these answers are more complex than multiple choice, we can now evaluate their accuracy using a language model."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "d6e87e11",
"metadata": {},
"outputs": [],
"source": [
"from langchain.evaluation.qa import QAEvalChain"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "cfc2e624",
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)\n",
"eval_chain = QAEvalChain.from_llm(llm)\n",
"graded_outputs = eval_chain.evaluate(\n",
" examples,\n",
" predictions,\n",
" question_key=\"question\",\n",
" answer_key=\"best_answer\",\n",
" prediction_key=\"text\",\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "10238f86",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'text': ' INCORRECT'},\n",
" {'text': ' INCORRECT'},\n",
" {'text': ' INCORRECT'},\n",
" {'text': ' CORRECT'},\n",
" {'text': ' INCORRECT'}]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"graded_outputs"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "83e70271",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,86 +0,0 @@
# Evaluation
This section of documentation covers how we approach and think about evaluation in LangChain.
Both evaluation of internal chains/agents, but also how we would recommend people building on top of LangChain approach evaluation.
## The Problem
It can be really hard to evaluate LangChain chains and agents.
There are two main reasons for this:
**# 1: Lack of data**
You generally don't have a ton of data to evaluate your chains/agents over before starting a project.
This is usually because Large Language Models (the core of most chains/agents) are terrific few-shot and zero shot learners,
meaning you are almost always able to get started on a particular task (text-to-SQL, question answering, etc) without
a large dataset of examples.
This is in stark contrast to traditional machine learning where you had to first collect a bunch of datapoints
before even getting started using a model.
**# 2: Lack of metrics**
Most chains/agents are performing tasks for which there are not very good metrics to evaluate performance.
For example, one of the most common use cases is generating text of some form.
Evaluating generated text is much more complicated than evaluating a classification prediction, or a numeric prediction.
## The Solution
LangChain attempts to tackle both of those issues.
What we have so far are initial passes at solutions - we do not think we have a perfect solution.
So we very much welcome feedback, contributions, integrations, and thoughts on this.
Here is what we have for each problem so far:
**# 1: Lack of data**
We have started [LangChainDatasets](https://huggingface.co/LangChainDatasets) a Community space on Hugging Face.
We intend this to be a collection of open source datasets for evaluating common chains and agents.
We have contributed five datasets of our own to start, but we highly intend this to be a community effort.
In order to contribute a dataset, you simply need to join the community and then you will be able to upload datasets.
We're also aiming to make it as easy as possible for people to create their own datasets.
As a first pass at this, we've added a QAGenerationChain, which given a document comes up
with question-answer pairs that can be used to evaluate question-answering tasks over that document down the line.
See [this notebook](/docs/guides/evaluation/qa_generation.html) for an example of how to use this chain.
**# 2: Lack of metrics**
We have two solutions to the lack of metrics.
The first solution is to use no metrics, and rather just rely on looking at results by eye to get a sense for how the chain/agent is performing.
To assist in this, we have developed (and will continue to develop) [tracing](/docs/guides/tracing/), a UI-based visualizer of your chain and agent runs.
The second solution we recommend is to use Language Models themselves to evaluate outputs.
For this we have a few different chains and prompts aimed at tackling this issue.
## The Examples
We have created a bunch of examples combining the above two solutions to show how we internally evaluate chains and agents when we are developing.
In addition to the examples we've curated, we also highly welcome contributions here.
To facilitate that, we've included a [template notebook](/docs/guides/evaluation/benchmarking_template.html) for community members to use to build their own examples.
The existing examples we have are:
[Question Answering (State of Union)](/docs/guides/evaluation/qa_benchmarking_sota.html): A notebook showing evaluation of a question-answering task over a State-of-the-Union address.
[Question Answering (Paul Graham Essay)](/docs/guides/evaluation/qa_benchmarking_pg.html): A notebook showing evaluation of a question-answering task over a Paul Graham essay.
[SQL Question Answering (Chinook)](/docs/guides/evaluation/sql_qa_benchmarking_chinook.html): A notebook showing evaluation of a question-answering task over a SQL database (the Chinook database).
[Agent Vectorstore](/docs/guides/evaluation/agent_vectordb_sota_pg.html): A notebook showing evaluation of an agent doing question answering while routing between two different vector databases.
[Agent Search + Calculator](/docs/guides/evaluation/agent_benchmarking.html): A notebook showing evaluation of an agent doing question answering using a Search engine and a Calculator as tools.
[Evaluating an OpenAPI Chain](/docs/guides/evaluation/openapi_eval.html): A notebook showing evaluation of an OpenAPI chain, including how to generate test data if you don't have any.
## Other Examples
In addition, we also have some more generic resources for evaluation.
[Question Answering](/docs/guides/evaluation/question_answering.html): An overview of LLMs aimed at evaluating question answering systems in general.
[Data Augmented Question Answering](/docs/guides/evaluation/data_augmented_question_answering.html): An end-to-end example of evaluating a question answering system focused on a specific document (a RetrievalQAChain to be precise). This example highlights how to use LLMs to come up with question/answer examples to evaluate over, and then highlights how to use LLMs to evaluate performance on those generated examples.
[Hugging Face Datasets](/docs/guides/evaluation/huggingface_datasets.html): Covers an example of loading and using a dataset from Hugging Face for evaluation.

View File

@@ -1,308 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "a4734146",
"metadata": {},
"source": [
"# LLM Math\n",
"\n",
"Evaluating chains that know how to do math."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "fdd7afae",
"metadata": {},
"outputs": [],
"source": [
"# Comment this out if you are NOT using tracing\n",
"import os\n",
"\n",
"os.environ[\"LANGCHAIN_HANDLER\"] = \"langchain\""
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "ce05ffea",
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "d028a511cede4de2b845b9a9954d6bea",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading readme: 0%| | 0.00/21.0 [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Downloading and preparing dataset json/LangChainDatasets--llm-math to /Users/harrisonchase/.cache/huggingface/datasets/LangChainDatasets___json/LangChainDatasets--llm-math-509b11d101165afa/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51...\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "a71c8e5a21dd4da5a20a354b544f7a58",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading data files: 0%| | 0/1 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "ae530ca624154a1a934075c47d1093a6",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading data: 0%| | 0.00/631 [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "7a4968df05d84bc483aa2c5039aecafe",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Extracting data files: 0%| | 0/1 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Generating train split: 0 examples [00:00, ? examples/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dataset json downloaded and prepared to /Users/harrisonchase/.cache/huggingface/datasets/LangChainDatasets___json/LangChainDatasets--llm-math-509b11d101165afa/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51. Subsequent calls will reuse this data.\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "9a2caed96225410fb1cc0f8f155eb766",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0/1 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from langchain.evaluation.loading import load_dataset\n",
"\n",
"dataset = load_dataset(\"llm-math\")"
]
},
{
"cell_type": "markdown",
"id": "8a998d6f",
"metadata": {},
"source": [
"## Setting up a chain\n",
"Now we need to create some pipelines for doing math."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "7078f7f8",
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import OpenAI\n",
"from langchain.chains import LLMMathChain"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "2bd70c46",
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI()"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "954c3270",
"metadata": {},
"outputs": [],
"source": [
"chain = LLMMathChain(llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "f252027e",
"metadata": {},
"outputs": [],
"source": [
"predictions = chain.apply(dataset)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "c8af7041",
"metadata": {},
"outputs": [],
"source": [
"numeric_output = [float(p[\"answer\"].strip().strip(\"Answer: \")) for p in predictions]"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "cc09ffe4",
"metadata": {},
"outputs": [],
"source": [
"correct = [example[\"answer\"] == numeric_output[i] for i, example in enumerate(dataset)]"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "585244e4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1.0"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sum(correct) / len(correct)"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "0d14ac78",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"input: 5\n",
"expected output : 5.0\n",
"prediction: 5.0\n",
"input: 5 + 3\n",
"expected output : 8.0\n",
"prediction: 8.0\n",
"input: 2^3.171\n",
"expected output : 9.006708689094099\n",
"prediction: 9.006708689094099\n",
"input: 2 ^3.171 \n",
"expected output : 9.006708689094099\n",
"prediction: 9.006708689094099\n",
"input: two to the power of three point one hundred seventy one\n",
"expected output : 9.006708689094099\n",
"prediction: 9.006708689094099\n",
"input: five + three squared minus 1\n",
"expected output : 13.0\n",
"prediction: 13.0\n",
"input: 2097 times 27.31\n",
"expected output : 57269.07\n",
"prediction: 57269.07\n",
"input: two thousand ninety seven times twenty seven point thirty one\n",
"expected output : 57269.07\n",
"prediction: 57269.07\n",
"input: 209758 / 2714\n",
"expected output : 77.28739867354459\n",
"prediction: 77.28739867354459\n",
"input: 209758.857 divided by 2714.31\n",
"expected output : 77.27888745205964\n",
"prediction: 77.27888745205964\n"
]
}
],
"source": [
"for i, example in enumerate(dataset):\n",
" print(\"input: \", example[\"question\"])\n",
" print(\"expected output :\", example[\"answer\"])\n",
" print(\"prediction: \", numeric_output[i])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b9021ffd",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,565 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "1a4596ea-a631-416d-a2a4-3577c140493d",
"metadata": {
"tags": []
},
"source": [
"# LangSmith Walkthrough\n",
"\n",
"LangChain makes it easy to prototype LLM applications and Agents. However, delivering LLM applications to production can be deceptively difficult. You will likely have to heavily customize and iterate on your prompts, chains, and other components to create a high-quality product.\n",
"\n",
"To aid in this process, we've launched LangSmith, a unified platform for debugging, testing, and monitoring your LLM applications.\n",
"\n",
"When might this come in handy? You may find it useful when you want to:\n",
"\n",
"- Quickly debug a new chain, agent, or set of tools\n",
"- Visualize how components (chains, llms, retrievers, etc.) relate and are used\n",
"- Evaluate different prompts and LLMs for a single component\n",
"- Run a given chain several times over a dataset to ensure it consistently meets a quality bar\n",
"- Capture usage traces and using LLMs or analytics pipelines to generate insights"
]
},
{
"cell_type": "markdown",
"id": "138fbb8f-960d-4d26-9dd5-6d6acab3ee55",
"metadata": {},
"source": [
"## Prerequisites\n",
"\n",
"**[Create a LangSmith account](https://smith.langchain.com/) and create an API key (see bottom left corner). Familiarize yourself with the platform by looking through the [docs](https://docs.smith.langchain.com/)**\n",
"\n",
"Note LangSmith is in closed beta; we're in the process of rolling it out to more users. However, you can fill out the form on the website for expedited access.\n",
"\n",
"Now, let's get started!"
]
},
{
"cell_type": "markdown",
"id": "2d77d064-41b4-41fb-82e6-2d16461269ec",
"metadata": {
"tags": []
},
"source": [
"## Log runs to LangSmith\n",
"\n",
"First, configure your environment variables to tell LangChain to log traces. This is done by setting the `LANGCHAIN_TRACING_V2` environment variable to true.\n",
"You can tell LangChain which project to log to by setting the `LANGCHAIN_PROJECT` environment variable (if this isn't set, runs will be logged to the `default` project). This will automatically create the project for you if it doesn't exist. You must also set the `LANGCHAIN_ENDPOINT` and `LANGCHAIN_API_KEY` environment variables.\n",
"\n",
"For more information on other ways to set up tracing, please reference the [LangSmith documentation](https://docs.smith.langchain.com/docs/)\n",
"\n",
"**NOTE:** You must also set your `OPENAI_API_KEY` and `SERPAPI_API_KEY` environment variables in order to run the following tutorial.\n",
"\n",
"**NOTE:** You can only access an API key when you first create it. Keep it somewhere safe.\n",
"\n",
"**NOTE:** You can also use a context manager in python to log traces using\n",
"```python\n",
"from langchain.callbacks.manager import tracing_v2_enabled\n",
"\n",
"with tracing_v2_enabled(project_name=\"My Project\"):\n",
" agent.run(\"How many people live in canada as of 2023?\")\n",
"```\n",
"\n",
"However, in this example, we will use environment variables."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "904db9a5-f387-4a57-914c-c8af8d39e249",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"from uuid import uuid4\n",
"\n",
"unique_id = uuid4().hex[0:8]\n",
"os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"os.environ[\"LANGCHAIN_PROJECT\"] = f\"Tracing Walkthrough - {unique_id}\"\n",
"os.environ[\"LANGCHAIN_ENDPOINT\"] = \"https://api.smith.langchain.com\"\n",
"os.environ[\"LANGCHAIN_API_KEY\"] = \"\" # Update to your API key\n",
"\n",
"# Used by the agent in this tutorial\n",
"# os.environ[\"OPENAI_API_KEY\"] = \"<YOUR-OPENAI-API-KEY>\"\n",
"# os.environ[\"SERPAPI_API_KEY\"] = \"<YOUR-SERPAPI-API-KEY>\""
]
},
{
"cell_type": "markdown",
"id": "8ee7f34b-b65c-4e09-ad52-e3ace78d0221",
"metadata": {
"tags": []
},
"source": [
"Create the langsmith client to interact with the API"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "510b5ca0",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langsmith import Client\n",
"\n",
"client = Client()"
]
},
{
"cell_type": "markdown",
"id": "ca27fa11-ddce-4af0-971e-c5c37d5b92ef",
"metadata": {},
"source": [
"Create a LangChain component and log runs to the platform. In this example, we will create a ReAct-style agent with access to Search and Calculator as tools. However, LangSmith works regardless of which type of LangChain component you use (LLMs, Chat Models, Tools, Retrievers, Agents are all supported)."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "7c801853-8e96-404d-984c-51ace59cbbef",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.agents import AgentType, initialize_agent, load_tools\n",
"\n",
"llm = ChatOpenAI(temperature=0)\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm)\n",
"agent = initialize_agent(\n",
" tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=False\n",
")"
]
},
{
"cell_type": "markdown",
"id": "cab51e1e-8270-452c-ba22-22b5b5951899",
"metadata": {},
"source": [
"We are running the agent concurrently on multiple inputs to reduce latency. Runs get logged to LangSmith in the background so execution latency is unaffected."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "19537902-b95c-4390-80a4-f6c9a937081e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import asyncio\n",
"\n",
"inputs = [\n",
" \"How many people live in canada as of 2023?\",\n",
" \"who is dua lipa's boyfriend? what is his age raised to the .43 power?\",\n",
" \"what is dua lipa's boyfriend age raised to the .43 power?\",\n",
" \"how far is it from paris to boston in miles\",\n",
" \"what was the total number of points scored in the 2023 super bowl? what is that number raised to the .23 power?\",\n",
" \"what was the total number of points scored in the 2023 super bowl raised to the .23 power?\",\n",
" \"how many more points were scored in the 2023 super bowl than in the 2022 super bowl?\",\n",
" \"what is 153 raised to .1312 power?\",\n",
" \"who is kendall jenner's boyfriend? what is his height (in inches) raised to .13 power?\",\n",
" \"what is 1213 divided by 4345?\",\n",
"]\n",
"results = []\n",
"\n",
"\n",
"async def arun(agent, input_example):\n",
" try:\n",
" return await agent.arun(input_example)\n",
" except Exception as e:\n",
" # The agent sometimes makes mistakes! These will be captured by the tracing.\n",
" return e\n",
"\n",
"\n",
"for input_example in inputs:\n",
" results.append(arun(agent, input_example))\n",
"results = await asyncio.gather(*results)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "0405ff30-21fe-413d-85cf-9fa3c649efec",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.callbacks.tracers.langchain import wait_for_all_tracers\n",
"\n",
"# Logs are submitted in a background thread to avoid blocking execution.\n",
"# For the sake of this tutorial, we want to make sure\n",
"# they've been submitted before moving on. This is also\n",
"# useful for serverless deployments.\n",
"wait_for_all_tracers()"
]
},
{
"cell_type": "markdown",
"id": "9decb964-be07-4b6c-9802-9825c8be7b64",
"metadata": {},
"source": [
"Assuming you've successfully set up your environment, your agent traces should show up in the `Projects` section in the [app](https://smith.langchain.com/). Congrats!"
]
},
{
"cell_type": "markdown",
"id": "6c43c311-4e09-4d57-9ef3-13afb96ff430",
"metadata": {},
"source": [
"## Evaluate another agent implementation\n",
"\n",
"In addition to logging runs, LangSmith also allows you to test and evaluate your LLM applications.\n",
"\n",
"In this section, you will leverage LangSmith to create a benchmark dataset and run AI-assisted evaluators on an agent. You will do so in a few steps:\n",
"\n",
"1. Create a dataset from pre-existing run inputs and outputs\n",
"2. Initialize a new agent to benchmark\n",
"3. Configure evaluators to grade an agent's output\n",
"4. Run the agent over the dataset and evaluate the results"
]
},
{
"cell_type": "markdown",
"id": "beab1a29-b79d-4a99-b5b1-0870c2d772b1",
"metadata": {},
"source": [
"### 1. Create a LangSmith dataset\n",
"\n",
"Below, we use the LangSmith client to create a dataset from the agent runs you just logged above. You will use these later to measure performance for a new agent. This is simply taking the inputs and outputs of the runs and saving them as examples to a dataset. A dataset is a collection of examples, which are nothing more than input-output pairs you can use as test cases to your application.\n",
"\n",
"**Note: this is a simple, walkthrough example. In a real-world setting, you'd ideally first validate the outputs before adding them to a benchmark dataset to be used for evaluating other agents.**\n",
"\n",
"For more information on datasets, including how to create them from CSVs or other files or how to create them in the platform, please refer to the [LangSmith documentation](https://docs.smith.langchain.com/)."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "17580c4b-bd04-4dde-9d21-9d4edd25b00d",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"dataset_name = f\"calculator-example-dataset-{unique_id}\"\n",
"\n",
"dataset = client.create_dataset(\n",
" dataset_name, description=\"A calculator example dataset\"\n",
")\n",
"\n",
"runs = client.list_runs(\n",
" project_name=os.environ[\"LANGCHAIN_PROJECT\"],\n",
" execution_order=1, # Only return the top-level runs\n",
" error=False, # Only runs that succeed\n",
")\n",
"for run in runs:\n",
" client.create_example(inputs=run.inputs, outputs=run.outputs, dataset_id=dataset.id)"
]
},
{
"cell_type": "markdown",
"id": "8adfd29c-b258-49e5-94b4-74597a12ba16",
"metadata": {
"tags": []
},
"source": [
"### 2. Initialize a new agent to benchmark\n",
"\n",
"You can evaluate any LLM, chain, or agent. Since chains can have memory, we will pass in a `chain_factory` (aka a `constructor` ) function to initialize for each call.\n",
"\n",
"In this case, we will test an agent that uses OpenAI's function calling endpoints."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "f42d8ecc-d46a-448b-a89c-04b0f6907f75",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.agents import AgentType, initialize_agent, load_tools\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0613\", temperature=0)\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm)\n",
"\n",
"\n",
"# Since chains can be stateful (e.g. they can have memory), we provide\n",
"# a way to initialize a new chain for each row in the dataset. This is done\n",
"# by passing in a factory function that returns a new chain for each row.\n",
"def agent_factory():\n",
" return initialize_agent(tools, llm, agent=AgentType.OPENAI_FUNCTIONS, verbose=False)\n",
"\n",
"\n",
"# If your chain is NOT stateful, your factory can return the object directly\n",
"# to improve runtime performance. For example:\n",
"# chain_factory = lambda: agent"
]
},
{
"cell_type": "markdown",
"id": "9cb9ef53",
"metadata": {},
"source": [
"### 3. Configure evaluation\n",
"\n",
"Manually comparing the results of chains in the UI is effective, but it can be time consuming.\n",
"It can be helpful to use automated metrics and AI-assisted feedback to evaluate your component's performance.\n",
"\n",
"Below, we will create some pre-implemented run evaluators that do the following:\n",
"- Compare results against ground truth labels. (You used the debug outputs above for this)\n",
"- Measure semantic (dis)similarity using embedding distance\n",
"- Evaluate 'aspects' of the agent's response in a reference-free manner using custom criteria\n",
"\n",
"For a longer discussion of how to select an appropriate evaluator for your use case and how to create your own\n",
"custom evaluators, please refer to the [LangSmith documentation](https://docs.smith.langchain.com/).\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "a25dc281",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.evaluation import EvaluatorType\n",
"from langchain.smith import RunEvalConfig\n",
"\n",
"evaluation_config = RunEvalConfig(\n",
" # Evaluators can either be an evaluator type (e.g., \"qa\", \"criteria\", \"embedding_distance\", etc.) or a configuration for that evaluator\n",
" evaluators=[\n",
" # Measures whether a QA response is \"Correct\", based on a reference answer\n",
" # You can also select via the raw string \"qa\"\n",
" EvaluatorType.QA,\n",
" # Measure the embedding distance between the output and the reference answer\n",
" # Equivalent to: EvalConfig.EmbeddingDistance(embeddings=OpenAIEmbeddings())\n",
" EvaluatorType.EMBEDDING_DISTANCE,\n",
" # Grade whether the output satisfies the stated criteria. You can select a default one such as \"helpfulness\" or provide your own.\n",
" RunEvalConfig.LabeledCriteria(\"helpfulness\"),\n",
" # Both the Criteria and LabeledCriteria evaluators can be configured with a dictionary of custom criteria.\n",
" RunEvalConfig.Criteria(\n",
" {\n",
" \"fifth-grader-score\": \"Do you have to be smarter than a fifth grader to answer this question?\"\n",
" }\n",
" ),\n",
" ],\n",
" # You can add custom StringEvaluator or RunEvaluator objects here as well, which will automatically be\n",
" # applied to each prediction. Check out the docs for examples.\n",
" custom_evaluators=[],\n",
")"
]
},
{
"cell_type": "markdown",
"id": "07885b10",
"metadata": {
"tags": []
},
"source": [
"### 4. Run the agent and evaluators\n",
"\n",
"Use the [arun_on_dataset](https://api.python.langchain.com/en/latest/smith/langchain.smith.evaluation.runner_utils.arun_on_dataset.html#langchain.smith.evaluation.runner_utils.arun_on_dataset) (or synchronous [run_on_dataset](https://api.python.langchain.com/en/latest/smith/langchain.smith.evaluation.runner_utils.run_on_dataset.html#langchain.smith.evaluation.runner_utils.run_on_dataset)) function to evaluate your model. This will:\n",
"1. Fetch example rows from the specified dataset\n",
"2. Run your llm or chain on each example.\n",
"3. Apply evalutors to the resulting run traces and corresponding reference examples to generate automated feedback.\n",
"\n",
"The results will be visible in the LangSmith app."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "3733269b-8085-4644-9d5d-baedcff13a2f",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"View the evaluation results for project '2023-07-17-11-25-20-AgentExecutor' at:\n",
"https://dev.smith.langchain.com/projects/p/1c9baec3-ae86-4fac-9e99-e1b9f8e7818c?eval=true\n",
"Processed examples: 1\r"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Chain failed for example 5a2ac8da-8c2b-4d12-acb9-5c4b0f47fe8a. Error: LLMMathChain._evaluate(\"\n",
"age_of_Dua_Lipa_boyfriend ** 0.43\n",
"\") raised error: 'age_of_Dua_Lipa_boyfriend'. Please try again with a valid numerical expression\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Processed examples: 4\r"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Chain failed for example 91439261-1c86-4198-868b-a6c1cc8a051b. Error: Too many arguments to single-input tool Calculator. Args: ['height ^ 0.13', {'height': 68}]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Processed examples: 9\r"
]
}
],
"source": [
"from langchain.smith import (\n",
" arun_on_dataset,\n",
" run_on_dataset, # Available if your chain doesn't support async calls.\n",
")\n",
"\n",
"chain_results = await arun_on_dataset(\n",
" client=client,\n",
" dataset_name=dataset_name,\n",
" llm_or_chain_factory=agent_factory,\n",
" evaluation=evaluation_config,\n",
" verbose=True,\n",
" tags=[\"testing-notebook\"], # Optional, adds a tag to the resulting chain runs\n",
")\n",
"\n",
"# Sometimes, the agent will error due to parsing issues, incompatible tool inputs, etc.\n",
"# These are logged as warnings here and captured as errors in the tracing UI."
]
},
{
"cell_type": "markdown",
"id": "cdacd159-eb4d-49e9-bb2a-c55322c40ed4",
"metadata": {
"tags": []
},
"source": [
"### Review the test results\n",
"\n",
"You can review the test results tracing UI below by navigating to the \"Datasets & Testing\" page and selecting the **\"calculator-example-dataset-*\"** dataset, clicking on the `Test Runs` tab, then inspecting the runs in the corresponding project. \n",
"\n",
"This will show the new runs and the feedback logged from the selected evaluators. Note that runs that error out will not have feedback."
]
},
{
"cell_type": "markdown",
"id": "591c819e-9932-45cf-adab-63727dd49559",
"metadata": {},
"source": [
"## Exporting datasets and runs\n",
"\n",
"LangSmith lets you export data to common formats such as CSV or JSONL directly in the web app. You can also use the client to fetch runs for further analysis, to store in your own database, or to share with others. Let's fetch the run traces from the evaluation run."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "33bfefde-d1bb-4f50-9f7a-fd572ee76820",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"Run(id=UUID('e39f310b-c5a8-4192-8a59-6a9498e1cb85'), name='AgentExecutor', start_time=datetime.datetime(2023, 7, 17, 18, 25, 30, 653872), run_type=<RunTypeEnum.chain: 'chain'>, end_time=datetime.datetime(2023, 7, 17, 18, 25, 35, 359642), extra={'runtime': {'library': 'langchain', 'runtime': 'python', 'platform': 'macOS-13.4.1-arm64-arm-64bit', 'sdk_version': '0.0.8', 'library_version': '0.0.231', 'runtime_version': '3.11.2'}, 'total_tokens': 512, 'prompt_tokens': 451, 'completion_tokens': 61}, error=None, serialized=None, events=[{'name': 'start', 'time': '2023-07-17T18:25:30.653872'}, {'name': 'end', 'time': '2023-07-17T18:25:35.359642'}], inputs={'input': 'what is 1213 divided by 4345?'}, outputs={'output': '1213 divided by 4345 is approximately 0.2792.'}, reference_example_id=UUID('a75cf754-4f73-46fd-b126-9bcd0695e463'), parent_run_id=None, tags=['openai-functions', 'testing-notebook'], execution_order=1, session_id=UUID('1c9baec3-ae86-4fac-9e99-e1b9f8e7818c'), child_run_ids=[UUID('40d0fdca-0b2b-47f4-a9da-f2b229aa4ed5'), UUID('cfa5130f-264c-4126-8950-ec1c4c31b800'), UUID('ba638a2f-2a57-45db-91e8-9a7a66a42c5a'), UUID('fcc29b5a-cdb7-4bcc-8194-47729bbdf5fb'), UUID('a6f92bf5-cfba-4747-9336-370cb00c928a'), UUID('65312576-5a39-4250-b820-4dfae7d73945')], child_runs=None, feedback_stats={'correctness': {'n': 1, 'avg': 1.0, 'mode': 1}, 'helpfulness': {'n': 1, 'avg': 1.0, 'mode': 1}, 'fifth-grader-score': {'n': 1, 'avg': 1.0, 'mode': 1}, 'embedding_cosine_distance': {'n': 1, 'avg': 0.144522385071361, 'mode': 0.144522385071361}})"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"runs = list(client.list_runs(dataset_name=dataset_name))\n",
"runs[0]"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "6595c888-1f5c-4ae3-9390-0a559f5575d1",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"{'correctness': {'n': 7, 'avg': 0.5714285714285714, 'mode': 1},\n",
" 'helpfulness': {'n': 7, 'avg': 0.7142857142857143, 'mode': 1},\n",
" 'fifth-grader-score': {'n': 7, 'avg': 0.7142857142857143, 'mode': 1},\n",
" 'embedding_cosine_distance': {'n': 7,\n",
" 'avg': 0.11462010799473926,\n",
" 'mode': 0.0130477459560272}}"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"client.read_project(project_id=runs[0].session_id).feedback_stats"
]
},
{
"cell_type": "markdown",
"id": "2646f0fb-81d4-43ce-8a9b-54b8e19841e2",
"metadata": {
"tags": []
},
"source": [
"## Conclusion\n",
"\n",
"Congratulations! You have succesfully traced and evaluated an agent using LangSmith!\n",
"\n",
"This was a quick guide to get started, but there are many more ways to use LangSmith to speed up your developer flow and produce better results.\n",
"\n",
"For more information on how you can get the most out of LangSmith, check out [LangSmith documentation](https://docs.smith.langchain.com/), and please reach out with questions, feature requests, or feedback at [support@langchain.dev](mailto:support@langchain.dev)."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -16,7 +16,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 1,
"id": "c0a83623",
"metadata": {},
"outputs": [],
@@ -38,6 +38,27 @@
">This initializes the SerpAPIWrapper for search functionality (search).\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "a2b0a215",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"········\n"
]
}
],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"SERPAPI_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "code",
"execution_count": 3,
@@ -46,11 +67,11 @@
"outputs": [],
"source": [
"# Initialize the OpenAI language model\n",
"#Replace <your_api_key> in openai_api_key=\"<your_api_key>\" with your actual OpenAI key.\n",
"# Replace <your_api_key> in openai_api_key=\"<your_api_key>\" with your actual OpenAI key.\n",
"llm = ChatOpenAI(temperature=0, model=\"gpt-3.5-turbo-0613\")\n",
"\n",
"# Initialize the SerpAPIWrapper for search functionality\n",
"#Replace <your_api_key> in openai_api_key=\"<your_api_key>\" with your actual SerpAPI key.\n",
"# Replace <your_api_key> in openai_api_key=\"<your_api_key>\" with your actual SerpAPI key.\n",
"search = SerpAPIWrapper()\n",
"\n",
"# Define a list of tools offered by the agent\n",
@@ -58,9 +79,9 @@
" Tool(\n",
" name=\"Search\",\n",
" func=search.run,\n",
" description=\"Useful when you need to answer questions about current events. You should ask targeted questions.\"\n",
" description=\"Useful when you need to answer questions about current events. You should ask targeted questions.\",\n",
" ),\n",
"]\n"
"]"
]
},
{
@@ -70,7 +91,9 @@
"metadata": {},
"outputs": [],
"source": [
"mrkl = initialize_agent(tools, llm, agent=AgentType.OPENAI_MULTI_FUNCTIONS, verbose=True)"
"mrkl = initialize_agent(\n",
" tools, llm, agent=AgentType.OPENAI_MULTI_FUNCTIONS, verbose=True\n",
")"
]
},
{
@@ -82,6 +105,7 @@
"source": [
"# Do this so we can see exactly what's going on under the hood\n",
"import langchain\n",
"\n",
"langchain.debug = True"
]
},
@@ -194,15 +218,223 @@
}
],
"source": [
"mrkl.run(\n",
" \"What is the weather in LA and SF?\"\n",
"mrkl.run(\"What is the weather in LA and SF?\")"
]
},
{
"cell_type": "markdown",
"id": "d31d4c09",
"metadata": {},
"source": [
"## Configuring max iteration behavior\n",
"\n",
"To make sure that our agent doesn't get stuck in excessively long loops, we can set max_iterations. We can also set an early stopping method, which will determine our agent's behavior once the number of max iterations is hit. By default, the early stopping uses method `force` which just returns that constant string. Alternatively, you could specify method `generate` which then does one FINAL pass through the LLM to generate an output."
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "9f5f6743",
"metadata": {},
"outputs": [],
"source": [
"mrkl = initialize_agent(\n",
" tools,\n",
" llm,\n",
" agent=AgentType.OPENAI_FUNCTIONS,\n",
" verbose=True,\n",
" max_iterations=2,\n",
" early_stopping_method=\"generate\",\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "4362ebc7",
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m[chain/start]\u001b[0m \u001b[1m[1:chain:AgentExecutor] Entering Chain run with input:\n",
"\u001b[0m{\n",
" \"input\": \"What is the weather in NYC today, yesterday, and the day before?\"\n",
"}\n",
"\u001b[32;1m\u001b[1;3m[llm/start]\u001b[0m \u001b[1m[1:chain:AgentExecutor > 2:llm:ChatOpenAI] Entering LLM run with input:\n",
"\u001b[0m{\n",
" \"prompts\": [\n",
" \"System: You are a helpful AI assistant.\\nHuman: What is the weather in NYC today, yesterday, and the day before?\"\n",
" ]\n",
"}\n",
"\u001b[36;1m\u001b[1;3m[llm/end]\u001b[0m \u001b[1m[1:chain:AgentExecutor > 2:llm:ChatOpenAI] [1.27s] Exiting LLM run with output:\n",
"\u001b[0m{\n",
" \"generations\": [\n",
" [\n",
" {\n",
" \"text\": \"\",\n",
" \"generation_info\": null,\n",
" \"message\": {\n",
" \"lc\": 1,\n",
" \"type\": \"constructor\",\n",
" \"id\": [\n",
" \"langchain\",\n",
" \"schema\",\n",
" \"messages\",\n",
" \"AIMessage\"\n",
" ],\n",
" \"kwargs\": {\n",
" \"content\": \"\",\n",
" \"additional_kwargs\": {\n",
" \"function_call\": {\n",
" \"name\": \"Search\",\n",
" \"arguments\": \"{\\n \\\"query\\\": \\\"weather in NYC today\\\"\\n}\"\n",
" }\n",
" }\n",
" }\n",
" }\n",
" }\n",
" ]\n",
" ],\n",
" \"llm_output\": {\n",
" \"token_usage\": {\n",
" \"prompt_tokens\": 79,\n",
" \"completion_tokens\": 17,\n",
" \"total_tokens\": 96\n",
" },\n",
" \"model_name\": \"gpt-3.5-turbo-0613\"\n",
" },\n",
" \"run\": null\n",
"}\n",
"\u001b[32;1m\u001b[1;3m[tool/start]\u001b[0m \u001b[1m[1:chain:AgentExecutor > 3:tool:Search] Entering Tool run with input:\n",
"\u001b[0m\"{'query': 'weather in NYC today'}\"\n",
"\u001b[36;1m\u001b[1;3m[tool/end]\u001b[0m \u001b[1m[1:chain:AgentExecutor > 3:tool:Search] [3.84s] Exiting Tool run with output:\n",
"\u001b[0m\"10:00 am · Feels Like85° · WindSE 4 mph · Humidity78% · UV Index3 of 11 · Cloud Cover81% · Rain Amount0 in ...\"\n",
"\u001b[32;1m\u001b[1;3m[llm/start]\u001b[0m \u001b[1m[1:chain:AgentExecutor > 4:llm:ChatOpenAI] Entering LLM run with input:\n",
"\u001b[0m{\n",
" \"prompts\": [\n",
" \"System: You are a helpful AI assistant.\\nHuman: What is the weather in NYC today, yesterday, and the day before?\\nAI: {'name': 'Search', 'arguments': '{\\\\n \\\"query\\\": \\\"weather in NYC today\\\"\\\\n}'}\\nFunction: 10:00 am · Feels Like85° · WindSE 4 mph · Humidity78% · UV Index3 of 11 · Cloud Cover81% · Rain Amount0 in ...\"\n",
" ]\n",
"}\n",
"\u001b[36;1m\u001b[1;3m[llm/end]\u001b[0m \u001b[1m[1:chain:AgentExecutor > 4:llm:ChatOpenAI] [1.24s] Exiting LLM run with output:\n",
"\u001b[0m{\n",
" \"generations\": [\n",
" [\n",
" {\n",
" \"text\": \"\",\n",
" \"generation_info\": null,\n",
" \"message\": {\n",
" \"lc\": 1,\n",
" \"type\": \"constructor\",\n",
" \"id\": [\n",
" \"langchain\",\n",
" \"schema\",\n",
" \"messages\",\n",
" \"AIMessage\"\n",
" ],\n",
" \"kwargs\": {\n",
" \"content\": \"\",\n",
" \"additional_kwargs\": {\n",
" \"function_call\": {\n",
" \"name\": \"Search\",\n",
" \"arguments\": \"{\\n \\\"query\\\": \\\"weather in NYC yesterday\\\"\\n}\"\n",
" }\n",
" }\n",
" }\n",
" }\n",
" }\n",
" ]\n",
" ],\n",
" \"llm_output\": {\n",
" \"token_usage\": {\n",
" \"prompt_tokens\": 142,\n",
" \"completion_tokens\": 17,\n",
" \"total_tokens\": 159\n",
" },\n",
" \"model_name\": \"gpt-3.5-turbo-0613\"\n",
" },\n",
" \"run\": null\n",
"}\n",
"\u001b[32;1m\u001b[1;3m[tool/start]\u001b[0m \u001b[1m[1:chain:AgentExecutor > 5:tool:Search] Entering Tool run with input:\n",
"\u001b[0m\"{'query': 'weather in NYC yesterday'}\"\n",
"\u001b[36;1m\u001b[1;3m[tool/end]\u001b[0m \u001b[1m[1:chain:AgentExecutor > 5:tool:Search] [1.15s] Exiting Tool run with output:\n",
"\u001b[0m\"New York Temperature Yesterday. Maximum temperature yesterday: 81 °F (at 1:51 pm) Minimum temperature yesterday: 72 °F (at 7:17 pm) Average temperature ...\"\n",
"\u001b[32;1m\u001b[1;3m[llm/start]\u001b[0m \u001b[1m[1:llm:ChatOpenAI] Entering LLM run with input:\n",
"\u001b[0m{\n",
" \"prompts\": [\n",
" \"System: You are a helpful AI assistant.\\nHuman: What is the weather in NYC today, yesterday, and the day before?\\nAI: {'name': 'Search', 'arguments': '{\\\\n \\\"query\\\": \\\"weather in NYC today\\\"\\\\n}'}\\nFunction: 10:00 am · Feels Like85° · WindSE 4 mph · Humidity78% · UV Index3 of 11 · Cloud Cover81% · Rain Amount0 in ...\\nAI: {'name': 'Search', 'arguments': '{\\\\n \\\"query\\\": \\\"weather in NYC yesterday\\\"\\\\n}'}\\nFunction: New York Temperature Yesterday. Maximum temperature yesterday: 81 °F (at 1:51 pm) Minimum temperature yesterday: 72 °F (at 7:17 pm) Average temperature ...\"\n",
" ]\n",
"}\n",
"\u001b[36;1m\u001b[1;3m[llm/end]\u001b[0m \u001b[1m[1:llm:ChatOpenAI] [2.68s] Exiting LLM run with output:\n",
"\u001b[0m{\n",
" \"generations\": [\n",
" [\n",
" {\n",
" \"text\": \"Today in NYC, the weather is currently 85°F with a southeast wind of 4 mph. The humidity is at 78% and there is 81% cloud cover. There is no rain expected today.\\n\\nYesterday in NYC, the maximum temperature was 81°F at 1:51 pm, and the minimum temperature was 72°F at 7:17 pm.\\n\\nFor the day before yesterday, I do not have the specific weather information.\",\n",
" \"generation_info\": null,\n",
" \"message\": {\n",
" \"lc\": 1,\n",
" \"type\": \"constructor\",\n",
" \"id\": [\n",
" \"langchain\",\n",
" \"schema\",\n",
" \"messages\",\n",
" \"AIMessage\"\n",
" ],\n",
" \"kwargs\": {\n",
" \"content\": \"Today in NYC, the weather is currently 85°F with a southeast wind of 4 mph. The humidity is at 78% and there is 81% cloud cover. There is no rain expected today.\\n\\nYesterday in NYC, the maximum temperature was 81°F at 1:51 pm, and the minimum temperature was 72°F at 7:17 pm.\\n\\nFor the day before yesterday, I do not have the specific weather information.\",\n",
" \"additional_kwargs\": {}\n",
" }\n",
" }\n",
" }\n",
" ]\n",
" ],\n",
" \"llm_output\": {\n",
" \"token_usage\": {\n",
" \"prompt_tokens\": 160,\n",
" \"completion_tokens\": 91,\n",
" \"total_tokens\": 251\n",
" },\n",
" \"model_name\": \"gpt-3.5-turbo-0613\"\n",
" },\n",
" \"run\": null\n",
"}\n",
"\u001b[36;1m\u001b[1;3m[chain/end]\u001b[0m \u001b[1m[1:chain:AgentExecutor] [10.18s] Exiting Chain run with output:\n",
"\u001b[0m{\n",
" \"output\": \"Today in NYC, the weather is currently 85°F with a southeast wind of 4 mph. The humidity is at 78% and there is 81% cloud cover. There is no rain expected today.\\n\\nYesterday in NYC, the maximum temperature was 81°F at 1:51 pm, and the minimum temperature was 72°F at 7:17 pm.\\n\\nFor the day before yesterday, I do not have the specific weather information.\"\n",
"}\n"
]
},
{
"data": {
"text/plain": [
"'Today in NYC, the weather is currently 85°F with a southeast wind of 4 mph. The humidity is at 78% and there is 81% cloud cover. There is no rain expected today.\\n\\nYesterday in NYC, the maximum temperature was 81°F at 1:51 pm, and the minimum temperature was 72°F at 7:17 pm.\\n\\nFor the day before yesterday, I do not have the specific weather information.'"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mrkl.run(\"What is the weather in NYC today, yesterday, and the day before?\")"
]
},
{
"cell_type": "markdown",
"id": "067a8d3e",
"metadata": {},
"source": [
"Notice that we never get around to looking up the weather the day before yesterday, due to hitting our max_iterations limit."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9f5f6743",
"id": "c3318a11",
"metadata": {},
"outputs": [],
"source": []
@@ -210,9 +442,9 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "venv",
"language": "python",
"name": "python3"
"name": "venv"
},
"language_info": {
"codemirror_mode": {
@@ -224,7 +456,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.11.3"
}
},
"nbformat": 4,

View File

@@ -78,6 +78,7 @@
"source": [
"from langchain.prompts import MessagesPlaceholder\n",
"from langchain.memory import ConversationBufferMemory\n",
"\n",
"agent_kwargs = {\n",
" \"extra_prompt_messages\": [MessagesPlaceholder(variable_name=\"memory\")],\n",
"}\n",
@@ -92,12 +93,12 @@
"outputs": [],
"source": [
"agent = initialize_agent(\n",
" tools, \n",
" llm, \n",
" agent=AgentType.OPENAI_FUNCTIONS, \n",
" verbose=True, \n",
" agent_kwargs=agent_kwargs, \n",
" memory=memory\n",
" tools,\n",
" llm,\n",
" agent=AgentType.OPENAI_FUNCTIONS,\n",
" verbose=True,\n",
" agent_kwargs=agent_kwargs,\n",
" memory=memory,\n",
")"
]
},

View File

@@ -9,9 +9,9 @@
"\n",
"LangChain provides async support for Agents by leveraging the [asyncio](https://docs.python.org/3/library/asyncio.html) library.\n",
"\n",
"Async methods are currently supported for the following `Tools`: [`GoogleSerperAPIWrapper`](https://github.com/hwchase17/langchain/blob/master/langchain/utilities/google_serper.py), [`SerpAPIWrapper`](https://github.com/hwchase17/langchain/blob/master/langchain/serpapi.py) and [`LLMMathChain`](https://github.com/hwchase17/langchain/blob/master/langchain/chains/llm_math/base.py). Async support for other agent tools are on the roadmap.\n",
"Async methods are currently supported for the following `Tools`: [`GoogleSerperAPIWrapper`](https://github.com/hwchase17/langchain/blob/master/langchain/utilities/google_serper.py), [`SerpAPIWrapper`](https://github.com/hwchase17/langchain/blob/master/langchain/serpapi.py), [`LLMMathChain`](https://github.com/hwchase17/langchain/blob/master/langchain/chains/llm_math/base.py) and [`Qdrant`](https://github.com/hwchase17/langchain/blob/master/langchain/vectorstores/qdrant.py). Async support for other agent tools are on the roadmap.\n",
"\n",
"For `Tool`s that have a `coroutine` implemented (the three mentioned above), the `AgentExecutor` will `await` them directly. Otherwise, the `AgentExecutor` will call the `Tool`'s `func` via `asyncio.get_event_loop().run_in_executor` to avoid blocking the main runloop.\n",
"For `Tool`s that have a `coroutine` implemented (the four mentioned above), the `AgentExecutor` will `await` them directly. Otherwise, the `AgentExecutor` will call the `Tool`'s `func` via `asyncio.get_event_loop().run_in_executor` to avoid blocking the main runloop.\n",
"\n",
"You can use `arun` to call an `AgentExecutor` asynchronously."
]
@@ -76,91 +76,91 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who won the US Open men's final in 2019 and then calculate his age raised to the 0.334 power.\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3m I need to find out who won the US Open men's final in 2019 and then calculate his age raised to the 0.334 power.\n",
"Action: Google Serper\n",
"Action Input: \"Who won the US Open men's final in 2019?\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mRafael Nadal defeated Daniil Medvedev in the final, 75, 63, 57, 46, 64 to win the men's singles tennis title at the 2019 US Open. It was his fourth US ... Draw: 128 (16 Q / 8 WC). Champion: Rafael Nadal. Runner-up: Daniil Medvedev. Score: 75, 63, 57, 46, 64. Bianca Andreescu won the women's singles title, defeating Serena Williams in straight sets in the final, becoming the first Canadian to win a Grand Slam singles ... Rafael Nadal won his 19th career Grand Slam title, and his fourth US Open crown, by surviving an all-time comback effort from Daniil ... Rafael Nadal beats Daniil Medvedev in US Open final to claim 19th major title. World No2 claims 7-5, 6-3, 5-7, 4-6, 6-4 victory over Russian ... Rafael Nadal defeated Daniil Medvedev in the men's singles final of the U.S. Open on Sunday. Rafael Nadal survived. The 33-year-old defeated Daniil Medvedev in the final of the 2019 U.S. Open to earn his 19th Grand Slam title Sunday ... NEW YORK -- Rafael Nadal defeated Daniil Medvedev in an epic five-set match, 7-5, 6-3, 5-7, 4-6, 6-4 to win the men's singles title at the ... Nadal previously won the U.S. Open three times, most recently in 2017. Ahead of the match, Nadal said he was “super happy to be back in the ... Watch the full match between Daniil Medvedev and Rafael ... Duration: 4:47:32. Posted: Mar 20, 2020. US Open 2019: Rafael Nadal beats Daniil Medvedev · Updated: Sep. 08, 2019, 11:11 p.m. |; Published: Sep · Published: Sep. 08, 2019, 10:06 p.m.. 26. US Open ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know that Rafael Nadal won the US Open men's final in 2019 and he is 33 years old.\n",
"Action Input: \"Who won the US Open men's final in 2019?\"\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3mRafael Nadal defeated Daniil Medvedev in the final, 75, 63, 57, 46, 64 to win the men's singles tennis title at the 2019 US Open. It was his fourth US ... Draw: 128 (16 Q / 8 WC). Champion: Rafael Nadal. Runner-up: Daniil Medvedev. Score: 75, 63, 57, 46, 64. Bianca Andreescu won the women's singles title, defeating Serena Williams in straight sets in the final, becoming the first Canadian to win a Grand Slam singles ... Rafael Nadal won his 19th career Grand Slam title, and his fourth US Open crown, by surviving an all-time comback effort from Daniil ... Rafael Nadal beats Daniil Medvedev in US Open final to claim 19th major title. World No2 claims 7-5, 6-3, 5-7, 4-6, 6-4 victory over Russian ... Rafael Nadal defeated Daniil Medvedev in the men's singles final of the U.S. Open on Sunday. Rafael Nadal survived. The 33-year-old defeated Daniil Medvedev in the final of the 2019 U.S. Open to earn his 19th Grand Slam title Sunday ... NEW YORK -- Rafael Nadal defeated Daniil Medvedev in an epic five-set match, 7-5, 6-3, 5-7, 4-6, 6-4 to win the men's singles title at the ... Nadal previously won the U.S. Open three times, most recently in 2017. Ahead of the match, Nadal said he was “super happy to be back in the ... Watch the full match between Daniil Medvedev and Rafael ... Duration: 4:47:32. Posted: Mar 20, 2020. US Open 2019: Rafael Nadal beats Daniil Medvedev · Updated: Sep. 08, 2019, 11:11 p.m. |; Published: Sep · Published: Sep. 08, 2019, 10:06 p.m.. 26. US Open ...\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know that Rafael Nadal won the US Open men's final in 2019 and he is 33 years old.\n",
"Action: Calculator\n",
"Action Input: 33^0.334\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 3.215019829667466\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: Rafael Nadal won the US Open men's final in 2019 and his age raised to the 0.334 power is 3.215019829667466.\u001b[0m\n",
"Action Input: 33^0.334\u001B[0m\n",
"Observation: \u001B[33;1m\u001B[1;3mAnswer: 3.215019829667466\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer.\n",
"Final Answer: Rafael Nadal won the US Open men's final in 2019 and his age raised to the 0.334 power is 3.215019829667466.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001B[1m> Finished chain.\u001B[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3m I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.\n",
"Action: Google Serper\n",
"Action Input: \"Olivia Wilde boyfriend\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mSudeikis and Wilde's relationship ended in November 2020. Wilde was publicly served with court documents regarding child custody while she was presenting Don't Worry Darling at CinemaCon 2022. In January 2021, Wilde began dating singer Harry Styles after meeting during the filming of Don't Worry Darling.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Harry Styles' age.\n",
"Action Input: \"Olivia Wilde boyfriend\"\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3mSudeikis and Wilde's relationship ended in November 2020. Wilde was publicly served with court documents regarding child custody while she was presenting Don't Worry Darling at CinemaCon 2022. In January 2021, Wilde began dating singer Harry Styles after meeting during the filming of Don't Worry Darling.\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I need to find out Harry Styles' age.\n",
"Action: Google Serper\n",
"Action Input: \"Harry Styles age\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m29 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 29 raised to the 0.23 power.\n",
"Action Input: \"Harry Styles age\"\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3m29 years\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I need to calculate 29 raised to the 0.23 power.\n",
"Action: Calculator\n",
"Action Input: 29^0.23\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 2.169459462491557\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: Harry Styles is Olivia Wilde's boyfriend and his current age raised to the 0.23 power is 2.169459462491557.\u001b[0m\n",
"Action Input: 29^0.23\u001B[0m\n",
"Observation: \u001B[33;1m\u001B[1;3mAnswer: 2.169459462491557\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer.\n",
"Final Answer: Harry Styles is Olivia Wilde's boyfriend and his current age raised to the 0.23 power is 2.169459462491557.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001B[1m> Finished chain.\u001B[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who won the most recent grand prix and then calculate their age raised to the 0.23 power.\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3m I need to find out who won the most recent grand prix and then calculate their age raised to the 0.23 power.\n",
"Action: Google Serper\n",
"Action Input: \"who won the most recent formula 1 grand prix\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mMax Verstappen won his first Formula 1 world title on Sunday after the championship was decided by a last-lap overtake of his rival Lewis Hamilton in the Abu Dhabi Grand Prix. Dec 12, 2021\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Max Verstappen's age\n",
"Action Input: \"who won the most recent formula 1 grand prix\"\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3mMax Verstappen won his first Formula 1 world title on Sunday after the championship was decided by a last-lap overtake of his rival Lewis Hamilton in the Abu Dhabi Grand Prix. Dec 12, 2021\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I need to find out Max Verstappen's age\n",
"Action: Google Serper\n",
"Action Input: \"Max Verstappen age\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m25 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 25 raised to the 0.23 power\n",
"Action Input: \"Max Verstappen age\"\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3m25 years\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I need to calculate 25 raised to the 0.23 power\n",
"Action: Calculator\n",
"Action Input: 25^0.23\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 2.096651272316035\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Max Verstappen, aged 25, won the most recent Formula 1 grand prix and his age raised to the 0.23 power is 2.096651272316035.\u001b[0m\n",
"Action Input: 25^0.23\u001B[0m\n",
"Observation: \u001B[33;1m\u001B[1;3mAnswer: 2.096651272316035\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer\n",
"Final Answer: Max Verstappen, aged 25, won the most recent Formula 1 grand prix and his age raised to the 0.23 power is 2.096651272316035.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001B[1m> Finished chain.\u001B[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who won the US Open women's final in 2019 and then calculate her age raised to the 0.34 power.\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3m I need to find out who won the US Open women's final in 2019 and then calculate her age raised to the 0.34 power.\n",
"Action: Google Serper\n",
"Action Input: \"US Open women's final 2019 winner\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mWHAT HAPPENED: #SheTheNorth? She the champion. Nineteen-year-old Canadian Bianca Andreescu sealed her first Grand Slam title on Saturday, downing 23-time major champion Serena Williams in the 2019 US Open women's singles final, 6-3, 7-5. Sep 7, 2019\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now need to calculate her age raised to the 0.34 power.\n",
"Action Input: \"US Open women's final 2019 winner\"\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3mWHAT HAPPENED: #SheTheNorth? She the champion. Nineteen-year-old Canadian Bianca Andreescu sealed her first Grand Slam title on Saturday, downing 23-time major champion Serena Williams in the 2019 US Open women's singles final, 6-3, 7-5. Sep 7, 2019\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now need to calculate her age raised to the 0.34 power.\n",
"Action: Calculator\n",
"Action Input: 19^0.34\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 2.7212987634680084\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: Nineteen-year-old Canadian Bianca Andreescu won the US Open women's final in 2019 and her age raised to the 0.34 power is 2.7212987634680084.\u001b[0m\n",
"Action Input: 19^0.34\u001B[0m\n",
"Observation: \u001B[33;1m\u001B[1;3mAnswer: 2.7212987634680084\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer.\n",
"Final Answer: Nineteen-year-old Canadian Bianca Andreescu won the US Open women's final in 2019 and her age raised to the 0.34 power is 2.7212987634680084.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001B[1m> Finished chain.\u001B[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who Beyonce's husband is and then calculate his age raised to the 0.19 power.\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3m I need to find out who Beyonce's husband is and then calculate his age raised to the 0.19 power.\n",
"Action: Google Serper\n",
"Action Input: \"Who is Beyonce's husband?\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mJay-Z\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Jay-Z's age\n",
"Action Input: \"Who is Beyonce's husband?\"\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3mJay-Z\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I need to find out Jay-Z's age\n",
"Action: Google Serper\n",
"Action Input: \"How old is Jay-Z?\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m53 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 53 raised to the 0.19 power\n",
"Action Input: \"How old is Jay-Z?\"\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3m53 years\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I need to calculate 53 raised to the 0.19 power\n",
"Action: Calculator\n",
"Action Input: 53^0.19\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 2.12624064206896\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Jay-Z is Beyonce's husband and his age raised to the 0.19 power is 2.12624064206896.\u001b[0m\n",
"Action Input: 53^0.19\u001B[0m\n",
"Observation: \u001B[33;1m\u001B[1;3mAnswer: 2.12624064206896\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer\n",
"Final Answer: Jay-Z is Beyonce's husband and his age raised to the 0.19 power is 2.12624064206896.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001B[1m> Finished chain.\u001B[0m\n",
"Serial executed in 89.97 seconds.\n"
]
}
@@ -197,77 +197,77 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3m I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.\n",
"Action: Google Serper\n",
"Action Input: \"Olivia Wilde boyfriend\"\u001b[0m\u001b[32;1m\u001b[1;3m I need to find out who Beyonce's husband is and then calculate his age raised to the 0.19 power.\n",
"Action Input: \"Olivia Wilde boyfriend\"\u001B[0m\u001B[32;1m\u001B[1;3m I need to find out who Beyonce's husband is and then calculate his age raised to the 0.19 power.\n",
"Action: Google Serper\n",
"Action Input: \"Who is Beyonce's husband?\"\u001b[0m\u001b[32;1m\u001b[1;3m I need to find out who won the most recent formula 1 grand prix and then calculate their age raised to the 0.23 power.\n",
"Action Input: \"Who is Beyonce's husband?\"\u001B[0m\u001B[32;1m\u001B[1;3m I need to find out who won the most recent formula 1 grand prix and then calculate their age raised to the 0.23 power.\n",
"Action: Google Serper\n",
"Action Input: \"most recent formula 1 grand prix winner\"\u001b[0m\u001b[32;1m\u001b[1;3m I need to find out who won the US Open men's final in 2019 and then calculate his age raised to the 0.334 power.\n",
"Action Input: \"most recent formula 1 grand prix winner\"\u001B[0m\u001B[32;1m\u001B[1;3m I need to find out who won the US Open men's final in 2019 and then calculate his age raised to the 0.334 power.\n",
"Action: Google Serper\n",
"Action Input: \"Who won the US Open men's final in 2019?\"\u001b[0m\u001b[32;1m\u001b[1;3m I need to find out who won the US Open women's final in 2019 and then calculate her age raised to the 0.34 power.\n",
"Action Input: \"Who won the US Open men's final in 2019?\"\u001B[0m\u001B[32;1m\u001B[1;3m I need to find out who won the US Open women's final in 2019 and then calculate her age raised to the 0.34 power.\n",
"Action: Google Serper\n",
"Action Input: \"US Open women's final 2019 winner\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mSudeikis and Wilde's relationship ended in November 2020. Wilde was publicly served with court documents regarding child custody while she was presenting Don't Worry Darling at CinemaCon 2022. In January 2021, Wilde began dating singer Harry Styles after meeting during the filming of Don't Worry Darling.\u001b[0m\n",
"Action Input: \"US Open women's final 2019 winner\"\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3mSudeikis and Wilde's relationship ended in November 2020. Wilde was publicly served with court documents regarding child custody while she was presenting Don't Worry Darling at CinemaCon 2022. In January 2021, Wilde began dating singer Harry Styles after meeting during the filming of Don't Worry Darling.\u001B[0m\n",
"Thought:\n",
"Observation: \u001b[36;1m\u001b[1;3mJay-Z\u001b[0m\n",
"Observation: \u001B[36;1m\u001B[1;3mJay-Z\u001B[0m\n",
"Thought:\n",
"Observation: \u001b[36;1m\u001b[1;3mRafael Nadal defeated Daniil Medvedev in the final, 75, 63, 57, 46, 64 to win the men's singles tennis title at the 2019 US Open. It was his fourth US ... Draw: 128 (16 Q / 8 WC). Champion: Rafael Nadal. Runner-up: Daniil Medvedev. Score: 75, 63, 57, 46, 64. Bianca Andreescu won the women's singles title, defeating Serena Williams in straight sets in the final, becoming the first Canadian to win a Grand Slam singles ... Rafael Nadal won his 19th career Grand Slam title, and his fourth US Open crown, by surviving an all-time comback effort from Daniil ... Rafael Nadal beats Daniil Medvedev in US Open final to claim 19th major title. World No2 claims 7-5, 6-3, 5-7, 4-6, 6-4 victory over Russian ... Rafael Nadal defeated Daniil Medvedev in the men's singles final of the U.S. Open on Sunday. Rafael Nadal survived. The 33-year-old defeated Daniil Medvedev in the final of the 2019 U.S. Open to earn his 19th Grand Slam title Sunday ... NEW YORK -- Rafael Nadal defeated Daniil Medvedev in an epic five-set match, 7-5, 6-3, 5-7, 4-6, 6-4 to win the men's singles title at the ... Nadal previously won the U.S. Open three times, most recently in 2017. Ahead of the match, Nadal said he was “super happy to be back in the ... Watch the full match between Daniil Medvedev and Rafael ... Duration: 4:47:32. Posted: Mar 20, 2020. US Open 2019: Rafael Nadal beats Daniil Medvedev · Updated: Sep. 08, 2019, 11:11 p.m. |; Published: Sep · Published: Sep. 08, 2019, 10:06 p.m.. 26. US Open ...\u001b[0m\n",
"Observation: \u001B[36;1m\u001B[1;3mRafael Nadal defeated Daniil Medvedev in the final, 75, 63, 57, 46, 64 to win the men's singles tennis title at the 2019 US Open. It was his fourth US ... Draw: 128 (16 Q / 8 WC). Champion: Rafael Nadal. Runner-up: Daniil Medvedev. Score: 75, 63, 57, 46, 64. Bianca Andreescu won the women's singles title, defeating Serena Williams in straight sets in the final, becoming the first Canadian to win a Grand Slam singles ... Rafael Nadal won his 19th career Grand Slam title, and his fourth US Open crown, by surviving an all-time comback effort from Daniil ... Rafael Nadal beats Daniil Medvedev in US Open final to claim 19th major title. World No2 claims 7-5, 6-3, 5-7, 4-6, 6-4 victory over Russian ... Rafael Nadal defeated Daniil Medvedev in the men's singles final of the U.S. Open on Sunday. Rafael Nadal survived. The 33-year-old defeated Daniil Medvedev in the final of the 2019 U.S. Open to earn his 19th Grand Slam title Sunday ... NEW YORK -- Rafael Nadal defeated Daniil Medvedev in an epic five-set match, 7-5, 6-3, 5-7, 4-6, 6-4 to win the men's singles title at the ... Nadal previously won the U.S. Open three times, most recently in 2017. Ahead of the match, Nadal said he was “super happy to be back in the ... Watch the full match between Daniil Medvedev and Rafael ... Duration: 4:47:32. Posted: Mar 20, 2020. US Open 2019: Rafael Nadal beats Daniil Medvedev · Updated: Sep. 08, 2019, 11:11 p.m. |; Published: Sep · Published: Sep. 08, 2019, 10:06 p.m.. 26. US Open ...\u001B[0m\n",
"Thought:\n",
"Observation: \u001b[36;1m\u001b[1;3mWHAT HAPPENED: #SheTheNorth? She the champion. Nineteen-year-old Canadian Bianca Andreescu sealed her first Grand Slam title on Saturday, downing 23-time major champion Serena Williams in the 2019 US Open women's singles final, 6-3, 7-5. Sep 7, 2019\u001b[0m\n",
"Observation: \u001B[36;1m\u001B[1;3mWHAT HAPPENED: #SheTheNorth? She the champion. Nineteen-year-old Canadian Bianca Andreescu sealed her first Grand Slam title on Saturday, downing 23-time major champion Serena Williams in the 2019 US Open women's singles final, 6-3, 7-5. Sep 7, 2019\u001B[0m\n",
"Thought:\n",
"Observation: \u001b[36;1m\u001b[1;3mLewis Hamilton holds the record for the most race wins in Formula One history, with 103 wins to date. Michael Schumacher, the previous record holder, ... Michael Schumacher (top left) and Lewis Hamilton (top right) have each won the championship a record seven times during their careers, while Sebastian Vettel ( ... Grand Prix, Date, Winner, Car, Laps, Time. Bahrain, 05 Mar 2023, Max Verstappen VER, Red Bull Racing Honda RBPT, 57, 1:33:56.736. Saudi Arabia, 19 Mar 2023 ... The Red Bull driver Max Verstappen of the Netherlands celebrated winning his first Formula 1 world title at the Abu Dhabi Grand Prix. Perez wins sprint as Verstappen, Russell clash. Red Bull's Sergio Perez won the first sprint of the 2023 Formula One season after catching and passing Charles ... The most successful driver in the history of F1 is Lewis Hamilton. The man from Stevenage has won 103 Grands Prix throughout his illustrious career and is still ... Lewis Hamilton: 103. Max Verstappen: 37. Michael Schumacher: 91. Fernando Alonso: 32. Max Verstappen and Sergio Perez will race in a very different-looking Red Bull this weekend after the team unveiled a striking special livery for the Miami GP. Lewis Hamilton holds the record of most victories with 103, ahead of Michael Schumacher (91) and Sebastian Vettel (53). Schumacher also holds the record for the ... Lewis Hamilton holds the record for the most race wins in Formula One history, with 103 wins to date. Michael Schumacher, the previous record holder, is second ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Harry Styles' age.\n",
"Observation: \u001B[36;1m\u001B[1;3mLewis Hamilton holds the record for the most race wins in Formula One history, with 103 wins to date. Michael Schumacher, the previous record holder, ... Michael Schumacher (top left) and Lewis Hamilton (top right) have each won the championship a record seven times during their careers, while Sebastian Vettel ( ... Grand Prix, Date, Winner, Car, Laps, Time. Bahrain, 05 Mar 2023, Max Verstappen VER, Red Bull Racing Honda RBPT, 57, 1:33:56.736. Saudi Arabia, 19 Mar 2023 ... The Red Bull driver Max Verstappen of the Netherlands celebrated winning his first Formula 1 world title at the Abu Dhabi Grand Prix. Perez wins sprint as Verstappen, Russell clash. Red Bull's Sergio Perez won the first sprint of the 2023 Formula One season after catching and passing Charles ... The most successful driver in the history of F1 is Lewis Hamilton. The man from Stevenage has won 103 Grands Prix throughout his illustrious career and is still ... Lewis Hamilton: 103. Max Verstappen: 37. Michael Schumacher: 91. Fernando Alonso: 32. Max Verstappen and Sergio Perez will race in a very different-looking Red Bull this weekend after the team unveiled a striking special livery for the Miami GP. Lewis Hamilton holds the record of most victories with 103, ahead of Michael Schumacher (91) and Sebastian Vettel (53). Schumacher also holds the record for the ... Lewis Hamilton holds the record for the most race wins in Formula One history, with 103 wins to date. Michael Schumacher, the previous record holder, is second ...\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I need to find out Harry Styles' age.\n",
"Action: Google Serper\n",
"Action Input: \"Harry Styles age\"\u001b[0m\u001b[32;1m\u001b[1;3m I need to find out Jay-Z's age\n",
"Action Input: \"Harry Styles age\"\u001B[0m\u001B[32;1m\u001B[1;3m I need to find out Jay-Z's age\n",
"Action: Google Serper\n",
"Action Input: \"How old is Jay-Z?\"\u001b[0m\u001b[32;1m\u001b[1;3m I now know that Rafael Nadal won the US Open men's final in 2019 and he is 33 years old.\n",
"Action Input: \"How old is Jay-Z?\"\u001B[0m\u001B[32;1m\u001B[1;3m I now know that Rafael Nadal won the US Open men's final in 2019 and he is 33 years old.\n",
"Action: Calculator\n",
"Action Input: 33^0.334\u001b[0m\u001b[32;1m\u001b[1;3m I now need to calculate her age raised to the 0.34 power.\n",
"Action Input: 33^0.334\u001B[0m\u001B[32;1m\u001B[1;3m I now need to calculate her age raised to the 0.34 power.\n",
"Action: Calculator\n",
"Action Input: 19^0.34\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m29 years\u001b[0m\n",
"Action Input: 19^0.34\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3m29 years\u001B[0m\n",
"Thought:\n",
"Observation: \u001b[36;1m\u001b[1;3m53 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m Max Verstappen won the most recent Formula 1 grand prix.\n",
"Observation: \u001B[36;1m\u001B[1;3m53 years\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m Max Verstappen won the most recent Formula 1 grand prix.\n",
"Action: Calculator\n",
"Action Input: Max Verstappen's age (23) raised to the 0.23 power\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 2.7212987634680084\u001b[0m\n",
"Action Input: Max Verstappen's age (23) raised to the 0.23 power\u001B[0m\n",
"Observation: \u001B[33;1m\u001B[1;3mAnswer: 2.7212987634680084\u001B[0m\n",
"Thought:\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 3.215019829667466\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 29 raised to the 0.23 power.\n",
"Observation: \u001B[33;1m\u001B[1;3mAnswer: 3.215019829667466\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I need to calculate 29 raised to the 0.23 power.\n",
"Action: Calculator\n",
"Action Input: 29^0.23\u001b[0m\u001b[32;1m\u001b[1;3m I need to calculate 53 raised to the 0.19 power\n",
"Action Input: 29^0.23\u001B[0m\u001B[32;1m\u001B[1;3m I need to calculate 53 raised to the 0.19 power\n",
"Action: Calculator\n",
"Action Input: 53^0.19\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 2.0568252837687546\u001b[0m\n",
"Action Input: 53^0.19\u001B[0m\n",
"Observation: \u001B[33;1m\u001B[1;3mAnswer: 2.0568252837687546\u001B[0m\n",
"Thought:\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 2.169459462491557\u001b[0m\n",
"Observation: \u001B[33;1m\u001B[1;3mAnswer: 2.169459462491557\u001B[0m\n",
"Thought:\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001B[1m> Finished chain.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001B[1m> Finished chain.\u001B[0m\n",
"\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 2.12624064206896\u001b[0m\n",
"Observation: \u001B[33;1m\u001B[1;3mAnswer: 2.12624064206896\u001B[0m\n",
"Thought:\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001B[1m> Finished chain.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001B[1m> Finished chain.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001B[1m> Finished chain.\u001B[0m\n",
"Concurrent executed in 17.52 seconds.\n"
]
}

View File

@@ -42,15 +42,14 @@
"import yfinance as yf\n",
"from datetime import datetime, timedelta\n",
"\n",
"\n",
"def get_current_stock_price(ticker):\n",
" \"\"\"Method to get current stock price\"\"\"\n",
"\n",
" ticker_data = yf.Ticker(ticker)\n",
" recent = ticker_data.history(period='1d')\n",
" return {\n",
" 'price': recent.iloc[0]['Close'],\n",
" 'currency': ticker_data.info['currency']\n",
" }\n",
" recent = ticker_data.history(period=\"1d\")\n",
" return {\"price\": recent.iloc[0][\"Close\"], \"currency\": ticker_data.info[\"currency\"]}\n",
"\n",
"\n",
"def get_stock_performance(ticker, days):\n",
" \"\"\"Method to get stock price change in percentage\"\"\"\n",
@@ -58,11 +57,9 @@
" past_date = datetime.today() - timedelta(days=days)\n",
" ticker_data = yf.Ticker(ticker)\n",
" history = ticker_data.history(start=past_date)\n",
" old_price = history.iloc[0]['Close']\n",
" current_price = history.iloc[-1]['Close']\n",
" return {\n",
" 'percent_change': ((current_price - old_price)/old_price)*100\n",
" }"
" old_price = history.iloc[0][\"Close\"]\n",
" current_price = history.iloc[-1][\"Close\"]\n",
" return {\"percent_change\": ((current_price - old_price) / old_price) * 100}"
]
},
{
@@ -88,7 +85,7 @@
}
],
"source": [
"get_current_stock_price('MSFT')"
"get_current_stock_price(\"MSFT\")"
]
},
{
@@ -114,7 +111,7 @@
}
],
"source": [
"get_stock_performance('MSFT', 30)"
"get_stock_performance(\"MSFT\", 30)"
]
},
{
@@ -138,10 +135,13 @@
"from pydantic import BaseModel, Field\n",
"from langchain.tools import BaseTool\n",
"\n",
"\n",
"class CurrentStockPriceInput(BaseModel):\n",
" \"\"\"Inputs for get_current_stock_price\"\"\"\n",
"\n",
" ticker: str = Field(description=\"Ticker symbol of the stock\")\n",
"\n",
"\n",
"class CurrentStockPriceTool(BaseTool):\n",
" name = \"get_current_stock_price\"\n",
" description = \"\"\"\n",
@@ -160,8 +160,10 @@
"\n",
"class StockPercentChangeInput(BaseModel):\n",
" \"\"\"Inputs for get_stock_performance\"\"\"\n",
"\n",
" ticker: str = Field(description=\"Ticker symbol of the stock\")\n",
" days: int = Field(description='Timedelta days to get past date from current date')\n",
" days: int = Field(description=\"Timedelta days to get past date from current date\")\n",
"\n",
"\n",
"class StockPerformanceTool(BaseTool):\n",
" name = \"get_stock_performance\"\n",
@@ -202,15 +204,9 @@
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.agents import initialize_agent\n",
"\n",
"llm = ChatOpenAI(\n",
" model=\"gpt-3.5-turbo-0613\",\n",
" temperature=0\n",
")\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0613\", temperature=0)\n",
"\n",
"tools = [\n",
" CurrentStockPriceTool(),\n",
" StockPerformanceTool()\n",
"]\n",
"tools = [CurrentStockPriceTool(), StockPerformanceTool()]\n",
"\n",
"agent = initialize_agent(tools, llm, agent=AgentType.OPENAI_FUNCTIONS, verbose=True)"
]
@@ -261,7 +257,9 @@
}
],
"source": [
"agent.run(\"What is the current price of Microsoft stock? How it has performed over past 6 months?\")"
"agent.run(\n",
" \"What is the current price of Microsoft stock? How it has performed over past 6 months?\"\n",
")"
]
},
{
@@ -355,7 +353,9 @@
}
],
"source": [
"agent.run('In the past 3 months, which stock between Microsoft and Google has performed the best?')"
"agent.run(\n",
" \"In the past 3 months, which stock between Microsoft and Google has performed the best?\"\n",
")"
]
}
],

View File

@@ -79,10 +79,10 @@
"source": [
"llm = ChatOpenAI(temperature=0, model=\"gpt-3.5-turbo-0613\")\n",
"agent = initialize_agent(\n",
" toolkit.get_tools(), \n",
" llm, \n",
" agent=AgentType.OPENAI_FUNCTIONS, \n",
" verbose=True, \n",
" toolkit.get_tools(),\n",
" llm,\n",
" agent=AgentType.OPENAI_FUNCTIONS,\n",
" verbose=True,\n",
" agent_kwargs=agent_kwargs,\n",
")"
]

View File

@@ -0,0 +1,242 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Amadeus Toolkit\n",
"\n",
"This notebook walks you through connecting LangChain to the Amadeus travel information API\n",
"\n",
"To use this toolkit, you will need to set up your credentials explained in the [Amadeus for developers getting started overview](https://developers.amadeus.com/get-started/get-started-with-self-service-apis-335). Once you've received a AMADEUS_CLIENT_ID and AMADEUS_CLIENT_SECRET, you can input them as environmental variables below."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade amadeus > /dev/null"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Assign Environmental Variables\n",
"\n",
"The toolkit will read the AMADEUS_CLIENT_ID and AMADEUS_CLIENT_SECRET environmental variables to authenticate the user so you need to set them here. You will also need to set your OPENAI_API_KEY to use the agent later."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# Set environmental variables here\n",
"import os\n",
"\n",
"os.environ[\"AMADEUS_CLIENT_ID\"] = \"CLIENT_ID\"\n",
"os.environ[\"AMADEUS_CLIENT_SECRET\"] = \"CLIENT_SECRET\"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"API_KEY\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create the Amadeus Toolkit and Get Tools\n",
"\n",
"To start, you need to create the toolkit, so you can access its tools later."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.agents.agent_toolkits.amadeus.toolkit import AmadeusToolkit\n",
"\n",
"toolkit = AmadeusToolkit()\n",
"tools = toolkit.get_tools()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Use Amadeus Toolkit within an Agent"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain import OpenAI\n",
"from langchain.agents import initialize_agent, AgentType"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)\n",
"agent = initialize_agent(\n",
" tools=tools,\n",
" llm=llm,\n",
" verbose=False,\n",
" agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'The closest airport to Cali, Colombia is Alfonso Bonilla Aragón International Airport (CLO).'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"What is the name of the airport in Cali, Colombia?\")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'The cheapest flight on August 23, 2023 leaving Dallas, Texas before noon to Lincoln, Nebraska has a departure time of 16:42 and a total price of 276.08 EURO.'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\n",
" \"What is the departure time of the cheapest flight on August 23, 2023 leaving Dallas, Texas before noon to Lincoln, Nebraska?\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'The earliest flight on August 23, 2023 leaving Dallas, Texas to Lincoln, Nebraska lands in Lincoln, Nebraska at 16:07.'"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\n",
" \"At what time does earliest flight on August 23, 2023 leaving Dallas, Texas to Lincoln, Nebraska land in Nebraska?\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'The cheapest flight between Portland, Oregon to Dallas, TX on October 3, 2023 is a Spirit Airlines flight with a total price of 84.02 EURO and a total travel time of 8 hours and 43 minutes.'"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\n",
" \"What is the full travel time for the cheapest flight between Portland, Oregon to Dallas, TX on October 3, 2023?\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Dear Paul,\\n\\nI am writing to request that you book the earliest flight from DFW to DCA on Aug 28, 2023. The flight details are as follows:\\n\\nFlight 1: DFW to ATL, departing at 7:15 AM, arriving at 10:25 AM, flight number 983, carrier Delta Air Lines\\nFlight 2: ATL to DCA, departing at 12:15 PM, arriving at 2:02 PM, flight number 759, carrier Delta Air Lines\\n\\nThank you for your help.\\n\\nSincerely,\\nSantiago'"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\n",
" \"Please draft a concise email from Santiago to Paul, Santiago's travel agent, asking him to book the earliest flight from DFW to DCA on Aug 28, 2023. Include all flight details in the email.\"\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -17,16 +17,7 @@
"execution_count": 1,
"id": "8632a37c",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/harrisonchase/.pyenv/versions/3.9.1/envs/langchain/lib/python3.9/site-packages/deeplake/util/check_latest_version.py:32: UserWarning: A newer version of deeplake (3.6.5) is available. It's recommended that you update to the latest version using `pip install -U deeplake`.\n",
" warnings.warn(\n"
]
}
],
"outputs": [],
"source": [
"from pydantic import BaseModel, Field\n",
"\n",
@@ -56,14 +47,14 @@
"files = [\n",
" # https://abc.xyz/investor/static/pdf/2023Q1_alphabet_earnings_release.pdf\n",
" {\n",
" \"name\": \"alphabet-earnings\", \n",
" \"name\": \"alphabet-earnings\",\n",
" \"path\": \"/Users/harrisonchase/Downloads/2023Q1_alphabet_earnings_release.pdf\",\n",
" }, \n",
" },\n",
" # https://digitalassets.tesla.com/tesla-contents/image/upload/IR/TSLA-Q1-2023-Update\n",
" {\n",
" \"name\": \"tesla-earnings\", \n",
" \"path\": \"/Users/harrisonchase/Downloads/TSLA-Q1-2023-Update.pdf\"\n",
" }\n",
" \"name\": \"tesla-earnings\",\n",
" \"path\": \"/Users/harrisonchase/Downloads/TSLA-Q1-2023-Update.pdf\",\n",
" },\n",
"]\n",
"\n",
"for file in files:\n",
@@ -73,14 +64,14 @@
" docs = text_splitter.split_documents(pages)\n",
" embeddings = OpenAIEmbeddings()\n",
" retriever = FAISS.from_documents(docs, embeddings).as_retriever()\n",
" \n",
"\n",
" # Wrap retrievers in a Tool\n",
" tools.append(\n",
" Tool(\n",
" args_schema=DocumentInput,\n",
" name=file[\"name\"], \n",
" name=file[\"name\"],\n",
" description=f\"useful when you want to answer questions about {file['name']}\",\n",
" func=RetrievalQA.from_chain_type(llm=llm, retriever=retriever)\n",
" func=RetrievalQA.from_chain_type(llm=llm, retriever=retriever),\n",
" )\n",
" )"
]
@@ -139,7 +130,7 @@
"source": [
"llm = ChatOpenAI(\n",
" temperature=0,\n",
" model=\"gpt-3.5-turbo-0613\", \n",
" model=\"gpt-3.5-turbo-0613\",\n",
")\n",
"\n",
"agent = initialize_agent(\n",
@@ -170,6 +161,7 @@
"outputs": [],
"source": [
"import langchain\n",
"\n",
"langchain.debug = True"
]
},
@@ -405,7 +397,7 @@
"source": [
"llm = ChatOpenAI(\n",
" temperature=0,\n",
" model=\"gpt-3.5-turbo-0613\", \n",
" model=\"gpt-3.5-turbo-0613\",\n",
")\n",
"\n",
"agent = initialize_agent(\n",

View File

@@ -136,9 +136,11 @@
}
],
"source": [
"agent.run(\"Create an email draft for me to edit of a letter from the perspective of a sentient parrot\"\n",
" \" who is looking to collaborate on some research with her\"\n",
" \" estranged friend, a cat. Under no circumstances may you send the message, however.\")"
"agent.run(\n",
" \"Create an email draft for me to edit of a letter from the perspective of a sentient parrot\"\n",
" \" who is looking to collaborate on some research with her\"\n",
" \" estranged friend, a cat. Under no circumstances may you send the message, however.\"\n",
")"
]
},
{
@@ -160,7 +162,9 @@
}
],
"source": [
"agent.run(\"Could you search in my drafts folder and let me know if any of them are about collaboration?\")"
"agent.run(\n",
" \"Could you search in my drafts folder and let me know if any of them are about collaboration?\"\n",
")"
]
},
{
@@ -190,7 +194,9 @@
}
],
"source": [
"agent.run(\"Can you schedule a 30 minute meeting with a sentient parrot to discuss research collaborations on October 3, 2023 at 2 pm Easter Time?\")"
"agent.run(\n",
" \"Can you schedule a 30 minute meeting with a sentient parrot to discuss research collaborations on October 3, 2023 at 2 pm Easter Time?\"\n",
")"
]
},
{
@@ -210,7 +216,9 @@
}
],
"source": [
"agent.run(\"Can you tell me if I have any events on October 3, 2023 in Eastern Time, and if so, tell me if any of them are with a sentient parrot?\")"
"agent.run(\n",
" \"Can you tell me if I have any events on October 3, 2023 in Eastern Time, and if so, tell me if any of them are with a sentient parrot?\"\n",
")"
]
}
],

View File

@@ -1,6 +1,7 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "0e499e90-7a6d-4fab-8aab-31a4df417601",
"metadata": {},
@@ -15,6 +16,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "ec927ac6-9b2a-4e8a-9a6e-3e429191875c",
"metadata": {
@@ -54,6 +56,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "f74d1792",
"metadata": {},
@@ -81,6 +84,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "971cc455",
"metadata": {},
@@ -106,6 +110,44 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "54c01168",
"metadata": {},
"source": [
"## Disclamer ⚠️\n",
"\n",
"The query chain may generate insert/update/delete queries. When this is not expected, use a custom prompt or create a SQL users without write permissions.\n",
"\n",
"The final user might overload your SQL database by asking a simple question such as \"run the biggest query possible\". The generated query might look like:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "949772b9",
"metadata": {},
"outputs": [],
"source": [
"SELECT * FROM \"public\".\"users\"\n",
" JOIN \"public\".\"user_permissions\" ON \"public\".\"users\".id = \"public\".\"user_permissions\".user_id\n",
" JOIN \"public\".\"projects\" ON \"public\".\"users\".id = \"public\".\"projects\".user_id\n",
" JOIN \"public\".\"events\" ON \"public\".\"projects\".id = \"public\".\"events\".project_id;"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "5a4a9455",
"metadata": {},
"source": [
"For a transactional SQL database, if one of the table above contains millions of rows, the query might cause trouble to other applications using the same database.\n",
"\n",
"Most datawarehouse oriented databases support user-level quota, for limiting resource usage."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "36ae48c7-cb08-4fef-977e-c7d4b96a464b",
"metadata": {},
@@ -195,6 +237,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "9abcfe8e-1868-42a4-8345-ad2d9b44c681",
"metadata": {},
@@ -269,6 +312,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "6fbc26af-97e4-4a21-82aa-48bdc992da26",
"metadata": {},
@@ -451,6 +495,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "7c7503b5-d9d9-4faa-b064-29fcdb5ff213",
"metadata": {},

View File

@@ -0,0 +1,742 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Xorbits Agent"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook shows how to use agents to interact with [Xorbits Pandas](https://doc.xorbits.io/en/latest/reference/pandas/index.html) dataframe and [Xorbits Numpy](https://doc.xorbits.io/en/latest/reference/numpy/index.html) ndarray. It is mostly optimized for question answering.\n",
"\n",
"**NOTE: this agent calls the Python agent under the hood, which executes LLM generated Python code - this can be bad if the LLM generated Python code is harmful. Use cautiously.**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Pandas examples"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"ExecuteTime": {
"end_time": "2023-07-13T08:06:33.955439Z",
"start_time": "2023-07-13T08:06:33.767539500Z"
}
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "05b7c067b1114ce9a8aef4a58a5d5fef",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0.00/100 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import xorbits.pandas as pd\n",
"\n",
"from langchain.agents import create_xorbits_agent\n",
"from langchain.llms import OpenAI\n",
"\n",
"data = pd.read_csv(\"titanic.csv\")\n",
"agent = create_xorbits_agent(OpenAI(temperature=0), data, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"ExecuteTime": {
"end_time": "2023-07-13T08:11:06.622471100Z",
"start_time": "2023-07-13T08:11:03.183042Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to count the number of rows and columns\n",
"Action: python_repl_ast\n",
"Action Input: data.shape\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m(891, 12)\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: There are 891 rows and 12 columns.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'There are 891 rows and 12 columns.'"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"How many rows and columns are there?\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"ExecuteTime": {
"end_time": "2023-07-13T08:11:23.189275300Z",
"start_time": "2023-07-13T08:11:11.029030900Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "8c63d745a7eb41a484043a5dba357997",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0.00/100 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3mThought: I need to count the number of people in pclass 1\n",
"Action: python_repl_ast\n",
"Action Input: data[data['Pclass'] == 1].shape[0]\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m216\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: There are 216 people in pclass 1.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'There are 216 people in pclass 1.'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"How many people are in pclass 1?\")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to calculate the mean age\n",
"Action: python_repl_ast\n",
"Action Input: data['Age'].mean()\u001b[0m"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "29af2e29f2d64a3397c212812adf0e9b",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0.00/100 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Observation: \u001b[36;1m\u001b[1;3m29.69911764705882\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: The mean age is 29.69911764705882.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The mean age is 29.69911764705882.'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"whats the mean age?\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to group the data by sex and then find the average age for each group\n",
"Action: python_repl_ast\n",
"Action Input: data.groupby('Sex')['Age'].mean()\u001b[0m"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "c3d28625c35946fd91ebc2a47f8d8c5b",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0.00/100 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Observation: \u001b[36;1m\u001b[1;3mSex\n",
"female 27.915709\n",
"male 30.726645\n",
"Name: Age, dtype: float64\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the average age for each group\n",
"Final Answer: The average age for female passengers is 27.92 and the average age for male passengers is 30.73.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The average age for female passengers is 27.92 and the average age for male passengers is 30.73.'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"Group the data by sex and find the average age for each group\")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "c72aab63b20d47599f4f9806f6887a69",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0.00/100 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3mThought: I need to filter the dataframe to get the desired result\n",
"Action: python_repl_ast\n",
"Action Input: data[(data['Age'] > 30) & (data['Fare'] > 30) & (data['Fare'] < 50) & ((data['Pclass'] == 1) | (data['Pclass'] == 2))].shape[0]\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m20\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: 20\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'20'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\n",
" \"Show the number of people whose age is greater than 30 and fare is between 30 and 50 , and pclass is either 1 or 2\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Numpy examples"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "fa8baf315a0c41c89392edc4a24b76f5",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0.00/100 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import xorbits.numpy as np\n",
"\n",
"from langchain.agents import create_xorbits_agent\n",
"from langchain.llms import OpenAI\n",
"\n",
"arr = np.array([1, 2, 3, 4, 5, 6])\n",
"agent = create_xorbits_agent(OpenAI(temperature=0), arr, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to find out the shape of the array\n",
"Action: python_repl_ast\n",
"Action Input: data.shape\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m(6,)\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: The shape of the array is (6,).\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The shape of the array is (6,).'"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"Give the shape of the array \")"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to access the 2nd element of the array\n",
"Action: python_repl_ast\n",
"Action Input: data[1]\u001b[0m"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "64efcc74f81f404eb0a7d3f0326cd8b3",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0.00/100 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Observation: \u001b[36;1m\u001b[1;3m2\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: 2\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'2'"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"Give the 2nd element of the array \")"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to reshape the array and then transpose it\n",
"Action: python_repl_ast\n",
"Action Input: np.reshape(data, (2,3)).T\u001b[0m"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "fce51acf6fb347c0b400da67c6750534",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0.00/100 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Observation: \u001b[36;1m\u001b[1;3m[[1 4]\n",
" [2 5]\n",
" [3 6]]\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: The reshaped and transposed array is [[1 4], [2 5], [3 6]].\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The reshaped and transposed array is [[1 4], [2 5], [3 6]].'"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\n",
" \"Reshape the array into a 2-dimensional array with 2 rows and 3 columns, and then transpose it\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to reshape the array and then sum it\n",
"Action: python_repl_ast\n",
"Action Input: np.sum(np.reshape(data, (3,2)), axis=0)\u001b[0m"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "27fd4a0bbf694936bc41a6991064dec2",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0.00/100 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Observation: \u001b[36;1m\u001b[1;3m[ 9 12]\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: The sum of the array along the first axis is [9, 12].\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The sum of the array along the first axis is [9, 12].'"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\n",
" \"Reshape the array into a 2-dimensional array with 3 rows and 2 columns and sum the array along the first axis\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "a591b6d7913f45cba98d2f3b71a5120a",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0.00/100 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])\n",
"agent = create_xorbits_agent(OpenAI(temperature=0), arr, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to use the numpy covariance function\n",
"Action: python_repl_ast\n",
"Action Input: np.cov(data)\u001b[0m"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "5fe40f83cfae48d0919c147627b5839f",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0.00/100 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Observation: \u001b[36;1m\u001b[1;3m[[1. 1. 1.]\n",
" [1. 1. 1.]\n",
" [1. 1. 1.]]\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: The covariance matrix is [[1. 1. 1.], [1. 1. 1.], [1. 1. 1.]].\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The covariance matrix is [[1. 1. 1.], [1. 1. 1.], [1. 1. 1.]].'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"calculate the covariance matrix\")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to use the SVD function\n",
"Action: python_repl_ast\n",
"Action Input: U, S, V = np.linalg.svd(data)\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now have the U matrix\n",
"Final Answer: U = [[-0.70710678 -0.70710678]\n",
" [-0.70710678 0.70710678]]\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'U = [[-0.70710678 -0.70710678]\\n [-0.70710678 0.70710678]]'"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"compute the U of Singular Value Decomposition of the matrix\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -934,7 +934,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.schema import ToolException\n",
"from langchain.tools.base import ToolException\n",
"\n",
"from langchain import SerpAPIWrapper\n",
"from langchain.agents import AgentType, initialize_agent\n",

View File

@@ -24,7 +24,7 @@
"metadata": {},
"outputs": [],
"source": [
"#!pip install apify-client"
"#!pip install apify-client openai langchain chromadb tiktoken"
]
},
{

View File

@@ -34,6 +34,7 @@
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"DATAFORSEO_LOGIN\"] = \"your_api_access_username\"\n",
"os.environ[\"DATAFORSEO_PASSWORD\"] = \"your_api_access_password\"\n",
"\n",
@@ -88,7 +89,8 @@
"json_wrapper = DataForSeoAPIWrapper(\n",
" json_result_types=[\"organic\", \"knowledge_graph\", \"answer_box\"],\n",
" json_result_fields=[\"type\", \"title\", \"description\", \"text\"],\n",
" top_count=3)"
" top_count=3,\n",
")"
]
},
{
@@ -119,7 +121,8 @@
" top_count=10,\n",
" json_result_types=[\"organic\", \"local_pack\"],\n",
" json_result_fields=[\"title\", \"description\", \"type\"],\n",
" params={\"location_name\": \"Germany\", \"language_code\": \"en\"})\n",
" params={\"location_name\": \"Germany\", \"language_code\": \"en\"},\n",
")\n",
"customized_wrapper.results(\"coffee near me\")"
]
},
@@ -142,7 +145,8 @@
" top_count=10,\n",
" json_result_types=[\"organic\", \"local_pack\"],\n",
" json_result_fields=[\"title\", \"description\", \"type\"],\n",
" params={\"location_name\": \"Germany\", \"language_code\": \"en\", \"se_name\": \"bing\"})\n",
" params={\"location_name\": \"Germany\", \"language_code\": \"en\", \"se_name\": \"bing\"},\n",
")\n",
"customized_wrapper.results(\"coffee near me\")"
]
},
@@ -164,7 +168,12 @@
"maps_search = DataForSeoAPIWrapper(\n",
" top_count=10,\n",
" json_result_fields=[\"title\", \"value\", \"address\", \"rating\", \"type\"],\n",
" params={\"location_coordinate\": \"52.512,13.36,12z\", \"language_code\": \"en\", \"se_type\": \"maps\"})\n",
" params={\n",
" \"location_coordinate\": \"52.512,13.36,12z\",\n",
" \"language_code\": \"en\",\n",
" \"se_type\": \"maps\",\n",
" },\n",
")\n",
"maps_search.results(\"coffee near me\")"
]
},
@@ -184,10 +193,12 @@
"outputs": [],
"source": [
"from langchain.agents import Tool\n",
"\n",
"search = DataForSeoAPIWrapper(\n",
" top_count=3,\n",
" json_result_types=[\"organic\"],\n",
" json_result_fields=[\"title\", \"description\", \"type\"])\n",
" json_result_fields=[\"title\", \"description\", \"type\"],\n",
")\n",
"tool = Tool(\n",
" name=\"google-search-answer\",\n",
" description=\"My new answer tool\",\n",

View File

@@ -0,0 +1,142 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "245a954a",
"metadata": {},
"source": [
"# Golden Query\n",
"\n",
"This notebook goes over how to use the golden-query tool.\n",
"\n",
"- Go to the [Golden API docs](https://docs.golden.com/) to get an overview about the Golden API.\n",
"- Create a Golden account if you don't have one on the [Golden Website](golden.com).\n",
"- Get your API key from the [Golden API Settings](https://golden.com/settings/api) page.\n",
"- Save your API key into GOLDEN_API_KEY env variable"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "34bb5968",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"GOLDEN_API_KEY\"] = \"\""
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "ac4910f8",
"metadata": {},
"outputs": [],
"source": [
"from langchain.utilities.golden_query import GoldenQueryAPIWrapper"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "84b8f773",
"metadata": {},
"outputs": [],
"source": [
"golden_query = GoldenQueryAPIWrapper()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "068991a6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'results': [{'id': 4673886,\n",
" 'latestVersionId': 60276991,\n",
" 'properties': [{'predicateId': 'name',\n",
" 'instances': [{'value': 'Samsung', 'citations': []}]}]},\n",
" {'id': 7008,\n",
" 'latestVersionId': 61087416,\n",
" 'properties': [{'predicateId': 'name',\n",
" 'instances': [{'value': 'Intel', 'citations': []}]}]},\n",
" {'id': 24193,\n",
" 'latestVersionId': 60274482,\n",
" 'properties': [{'predicateId': 'name',\n",
" 'instances': [{'value': 'Texas Instruments', 'citations': []}]}]},\n",
" {'id': 1142,\n",
" 'latestVersionId': 61406205,\n",
" 'properties': [{'predicateId': 'name',\n",
" 'instances': [{'value': 'Advanced Micro Devices', 'citations': []}]}]},\n",
" {'id': 193948,\n",
" 'latestVersionId': 58326582,\n",
" 'properties': [{'predicateId': 'name',\n",
" 'instances': [{'value': 'Freescale Semiconductor', 'citations': []}]}]},\n",
" {'id': 91316,\n",
" 'latestVersionId': 60387380,\n",
" 'properties': [{'predicateId': 'name',\n",
" 'instances': [{'value': 'Agilent Technologies', 'citations': []}]}]},\n",
" {'id': 90014,\n",
" 'latestVersionId': 60388078,\n",
" 'properties': [{'predicateId': 'name',\n",
" 'instances': [{'value': 'Novartis', 'citations': []}]}]},\n",
" {'id': 237458,\n",
" 'latestVersionId': 61406160,\n",
" 'properties': [{'predicateId': 'name',\n",
" 'instances': [{'value': 'Analog Devices', 'citations': []}]}]},\n",
" {'id': 3941943,\n",
" 'latestVersionId': 60382250,\n",
" 'properties': [{'predicateId': 'name',\n",
" 'instances': [{'value': 'AbbVie Inc.', 'citations': []}]}]},\n",
" {'id': 4178762,\n",
" 'latestVersionId': 60542667,\n",
" 'properties': [{'predicateId': 'name',\n",
" 'instances': [{'value': 'IBM', 'citations': []}]}]}],\n",
" 'next': 'https://golden.com/api/v2/public/queries/59044/results/?cursor=eyJwb3NpdGlvbiI6IFsxNzYxNiwgIklCTS04M1lQM1oiXX0%3D&pageSize=10',\n",
" 'previous': None}"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import json\n",
"\n",
"json.loads(golden_query.run(\"companies in nanotech\"))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
},
"vscode": {
"interpreter": {
"hash": "53f3bc57609c7a84333bb558594977aa5b4026b1d6070b93987956689e367341"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -52,7 +52,6 @@
"tools = load_tools(\n",
" [\"graphql\"],\n",
" graphql_endpoint=\"https://swapi-graphql.netlify.app/.netlify/functions/index\",\n",
" llm=llm,\n",
")\n",
"\n",
"agent = initialize_agent(\n",

View File

@@ -0,0 +1,233 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "16763ed3",
"metadata": {},
"source": [
"# Lemon AI NLP Workflow Automation\n",
"\\\n",
"Full docs are available at: https://github.com/felixbrock/lemonai-py-client\n",
"\n",
"**Lemon AI helps you build powerful AI assistants in minutes and automate workflows by allowing for accurate and reliable read and write operations in tools like Airtable, Hubspot, Discord, Notion, Slack and Github.**\n",
"\n",
"Most connectors available today are focused on read-only operations, limiting the potential of LLMs. Agents, on the other hand, have a tendency to hallucinate from time to time due to missing context or instructions.\n",
"\n",
"With Lemon AI, it is possible to give your agents access to well-defined APIs for reliable read and write operations. In addition, Lemon AI functions allow you to further reduce the risk of hallucinations by providing a way to statically define workflows that the model can rely on in case of uncertainty."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "4881b484-1b97-478f-b206-aec407ceff66",
"metadata": {},
"source": [
"## Quick Start\n",
"\n",
"The following quick start demonstrates how to use Lemon AI in combination with Agents to automate workflows that involve interaction with internal tooling."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "ff91b41a",
"metadata": {},
"source": [
"### 1. Install Lemon AI\n",
"\n",
"Requires Python 3.8.1 and above.\n",
"\n",
"To use Lemon AI in your Python project run `pip install lemonai`\n",
"\n",
"This will install the corresponding Lemon AI client which you can then import into your script.\n",
"\n",
"The tool uses Python packages langchain and loguru. In case of any installation errors with Lemon AI, install both packages first and then install the Lemon AI package."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "340ff63d",
"metadata": {},
"source": [
"### 2. Launch the Server\n",
"\n",
"The interaction of your agents and all tools provided by Lemon AI is handled by the [Lemon AI Server](https://github.com/felixbrock/lemonai-server). To use Lemon AI you need to run the server on your local machine so the Lemon AI Python client can connect to it."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "e845f402",
"metadata": {},
"source": [
"### 3. Use Lemon AI with Langchain"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "d3ae6a82",
"metadata": {},
"source": [
"Lemon AI automatically solves given tasks by finding the right combination of relevant tools or uses Lemon AI Functions as an alternative. The following example demonstrates how to retrieve a user from Hackernews and write it to a table in Airtable:"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "43476a22",
"metadata": {},
"source": [
"#### (Optional) Define your Lemon AI Functions"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "cb038670",
"metadata": {},
"source": [
"Similar to [OpenAI functions](https://openai.com/blog/function-calling-and-other-api-updates), Lemon AI provides the option to define workflows as reusable functions. These functions can be defined for use cases where it is especially important to move as close as possible to near-deterministic behavior. Specific workflows can be defined in a separate lemonai.json:"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "e423ebbb",
"metadata": {},
"source": [
"```json\n",
"[\n",
" {\n",
" \"name\": \"Hackernews Airtable User Workflow\",\n",
" \"description\": \"retrieves user data from Hackernews and appends it to a table in Airtable\",\n",
" \"tools\": [\"hackernews-get-user\", \"airtable-append-data\"]\n",
" }\n",
"]\n",
"```"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "3fdb36ce",
"metadata": {},
"source": [
"Your model will have access to these functions and will prefer them over self-selecting tools to solve a given task. All you have to do is to let the agent know that it should use a given function by including the function name in the prompt."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "ebfb8b5d",
"metadata": {},
"source": [
"#### Include Lemon AI in your Langchain project "
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "5318715d",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from lemonai import execute_workflow\n",
"from langchain import OpenAI"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "c9d082cb",
"metadata": {},
"source": [
"#### Load API Keys and Access Tokens\n",
"\n",
"To use tools that require authentication, you have to store the corresponding access credentials in your environment in the format \"{tool name}_{authentication string}\" where the authentication string is one of [\"API_KEY\", \"SECRET_KEY\", \"SUBSCRIPTION_KEY\", \"ACCESS_KEY\"] for API keys or [\"ACCESS_TOKEN\", \"SECRET_TOKEN\"] for authentication tokens. Examples are \"OPENAI_API_KEY\", \"BING_SUBSCRIPTION_KEY\", \"AIRTABLE_ACCESS_TOKEN\"."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "a370d999",
"metadata": {},
"outputs": [],
"source": [
"\"\"\" Load all relevant API Keys and Access Tokens into your environment variables \"\"\"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"*INSERT OPENAI API KEY HERE*\"\n",
"os.environ[\"AIRTABLE_ACCESS_TOKEN\"] = \"*INSERT AIRTABLE TOKEN HERE*\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "38d158e7",
"metadata": {},
"outputs": [],
"source": [
"hackernews_username = \"*INSERT HACKERNEWS USERNAME HERE*\"\n",
"airtable_base_id = \"*INSERT BASE ID HERE*\"\n",
"airtable_table_id = \"*INSERT TABLE ID HERE*\"\n",
"\n",
"\"\"\" Define your instruction to be given to your LLM \"\"\"\n",
"prompt = f\"\"\"Read information from Hackernews for user {hackernews_username} and then write the results to\n",
"Airtable (baseId: {airtable_base_id}, tableId: {airtable_table_id}). Only write the fields \"username\", \"karma\"\n",
"and \"created_at_i\". Please make sure that Airtable does NOT automatically convert the field types.\n",
"\"\"\"\n",
"\n",
"\"\"\"\n",
"Use the Lemon AI execute_workflow wrapper \n",
"to run your Langchain agent in combination with Lemon AI \n",
"\"\"\"\n",
"model = OpenAI(temperature=0)\n",
"\n",
"execute_workflow(llm=model, prompt_string=prompt)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "aef3e801",
"metadata": {},
"source": [
"### 4. Gain transparency on your Agent's decision making\n",
"\n",
"To gain transparency on how your Agent interacts with Lemon AI tools to solve a given task, all decisions made, tools used and operations performed are written to a local `lemonai.log` file. Every time your LLM agent is interacting with the Lemon AI tool stack a corresponding log entry is created.\n",
"\n",
"```log\n",
"2023-06-26T11:50:27.708785+0100 - b5f91c59-8487-45c2-800a-156eac0c7dae - hackernews-get-user\n",
"2023-06-26T11:50:39.624035+0100 - b5f91c59-8487-45c2-800a-156eac0c7dae - airtable-append-data\n",
"2023-06-26T11:58:32.925228+0100 - 5efe603c-9898-4143-b99a-55b50007ed9d - hackernews-get-user\n",
"2023-06-26T11:58:43.988788+0100 - 5efe603c-9898-4143-b99a-55b50007ed9d - airtable-append-data\n",
"```\n",
"\n",
"By using the [Lemon AI Analytics Tool](https://github.com/felixbrock/lemonai-analytics) you can easily gain a better understanding of how frequently and in which order tools are used. As a result, you can identify weak spots in your agents decision-making capabilities and move to a more deterministic behavior by defining Lemon AI functions."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -90,7 +90,12 @@
"metadata": {},
"outputs": [],
"source": [
"search.results(\"The best blog post about AI safety is definitely this: \", 10, include_domains=[\"lesswrong.com\"], start_published_date=\"2019-01-01\")"
"search.results(\n",
" \"The best blog post about AI safety is definitely this: \",\n",
" 10,\n",
" include_domains=[\"lesswrong.com\"],\n",
" start_published_date=\"2019-01-01\",\n",
")"
]
},
{

View File

@@ -341,7 +341,7 @@
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)\n",
"zapier = ZapierNLAWrapper(zapier_nla_oauth_access_token='<fill in access token here>')\n",
"zapier = ZapierNLAWrapper(zapier_nla_oauth_access_token=\"<fill in access token here>\")\n",
"toolkit = ZapierToolkit.from_zapier_nla_wrapper(zapier)\n",
"agent = initialize_agent(\n",
" toolkit.get_tools(), llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True\n",

View File

@@ -1,402 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "52694348",
"metadata": {},
"source": [
"# Tracing\n",
"\n",
"There are two recommended ways to trace your LangChains:\n",
"\n",
"1. Setting the `LANGCHAIN_TRACING` environment variable to `\"true\"`. \n",
"2. Using a context manager `with tracing_enabled()` to trace a particular block of code.\n",
"\n",
"**Note** if the environment variable is set, all code will be traced, regardless of whether or not it's within the context manager."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "aead9843",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"from langchain.agents import AgentType, initialize_agent, load_tools\n",
"from langchain.callbacks import tracing_enabled\n",
"from langchain.llms import OpenAI\n",
"\n",
"# To run the code, make sure to set OPENAI_API_KEY and SERPAPI_API_KEY\n",
"llm = OpenAI(temperature=0)\n",
"tools = load_tools([\"llm-math\", \"serpapi\"], llm=llm)\n",
"agent = initialize_agent(\n",
" tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True\n",
")\n",
"\n",
"questions = [\n",
" \"Who won the US Open men's final in 2019? What is his age raised to the 0.334 power?\",\n",
" \"Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?\",\n",
" \"Who won the most recent formula 1 grand prix? What is their age raised to the 0.23 power?\",\n",
" \"Who won the US Open women's final in 2019? What is her age raised to the 0.34 power?\",\n",
" \"Who is Beyonce's husband? What is his age raised to the 0.19 power?\",\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "a417dd85",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:root:Failed to load default session, using empty session: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /sessions?name=default (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x12f8b36d0>: Failed to establish a new connection: [Errno 61] Connection refused'))\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who won the US Open men's final in 2019 and then calculate his age raised to the 0.334 power.\n",
"Action: Search\n",
"Action Input: \"US Open men's final 2019 winner\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mRafael Nadal defeated Daniil Medvedev in the final, 75, 63, 57, 46, 64 to win the men's singles tennis title at the 2019 US Open. It was his fourth US ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out the age of the winner\n",
"Action: Search\n",
"Action Input: \"Rafael Nadal age\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m37 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now need to calculate the age raised to the 0.334 power\n",
"Action: Calculator\n",
"Action Input: 37^0.334\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 3.340253100876781\u001b[0m\n",
"Thought:"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:root:Failed to persist run: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /chain-runs (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x12f8c0f50>: Failed to establish a new connection: [Errno 61] Connection refused'))\n",
"WARNING:root:Failed to load default session, using empty session: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /sessions?name=default (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x12f8e6f50>: Failed to establish a new connection: [Errno 61] Connection refused'))\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Rafael Nadal, aged 37, won the US Open men's final in 2019 and his age raised to the 0.334 power is 3.340253100876781.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.\n",
"Action: Search\n",
"Action Input: \"Olivia Wilde boyfriend\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mSudeikis and Wilde's relationship ended in November 2020. Wilde was publicly served with court documents regarding child custody while she was presenting Don't Worry Darling at CinemaCon 2022. In January 2021, Wilde began dating singer Harry Styles after meeting during the filming of Don't Worry Darling.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Harry Styles' age.\n",
"Action: Search\n",
"Action Input: \"Harry Styles age\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m29 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 29 raised to the 0.23 power.\n",
"Action: Calculator\n",
"Action Input: 29^0.23\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.169459462491557\u001b[0m\n",
"Thought:"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:root:Failed to persist run: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /chain-runs (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x12f8fa590>: Failed to establish a new connection: [Errno 61] Connection refused'))\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: Harry Styles is Olivia Wilde's boyfriend and his current age raised to the 0.23 power is 2.169459462491557.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
}
],
"source": [
"os.environ[\"LANGCHAIN_TRACING\"] = \"true\"\n",
"\n",
"# Both of the agent runs will be traced because the environment variable is set\n",
"agent.run(questions[0])\n",
"with tracing_enabled() as session:\n",
" assert session\n",
" agent.run(questions[1])"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "20f95a51",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:root:Failed to load my_test_session session, using empty session: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /sessions?name=my_test_session (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x12f8e41d0>: Failed to establish a new connection: [Errno 61] Connection refused'))\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who won the US Open men's final in 2019 and then calculate his age raised to the 0.334 power.\n",
"Action: Search\n",
"Action Input: \"US Open men's final 2019 winner\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mRafael Nadal defeated Daniil Medvedev in the final, 75, 63, 57, 46, 64 to win the men's singles tennis title at the 2019 US Open. It was his fourth US ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out the age of the winner\n",
"Action: Search\n",
"Action Input: \"Rafael Nadal age\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m37 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now need to calculate the age raised to the 0.334 power\n",
"Action: Calculator\n",
"Action Input: 37^0.334\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 3.340253100876781\u001b[0m\n",
"Thought:"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:root:Failed to persist run: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /chain-runs (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x12f8d0a50>: Failed to establish a new connection: [Errno 61] Connection refused'))\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Rafael Nadal, aged 37, won the US Open men's final in 2019 and his age raised to the 0.334 power is 3.340253100876781.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.\n",
"Action: Search\n",
"Action Input: \"Olivia Wilde boyfriend\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mSudeikis and Wilde's relationship ended in November 2020. Wilde was publicly served with court documents regarding child custody while she was presenting Don't Worry Darling at CinemaCon 2022. In January 2021, Wilde began dating singer Harry Styles after meeting during the filming of Don't Worry Darling.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Harry Styles' age.\n",
"Action: Search\n",
"Action Input: \"Harry Styles age\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m29 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 29 raised to the 0.23 power.\n",
"Action: Calculator\n",
"Action Input: 29^0.23\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.169459462491557\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: Harry Styles is Olivia Wilde's boyfriend and his current age raised to the 0.23 power is 2.169459462491557.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"\"Harry Styles is Olivia Wilde's boyfriend and his current age raised to the 0.23 power is 2.169459462491557.\""
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Now, we unset the environment variable and use a context manager.\n",
"\n",
"if \"LANGCHAIN_TRACING\" in os.environ:\n",
" del os.environ[\"LANGCHAIN_TRACING\"]\n",
"\n",
"# here, we are writing traces to \"my_test_session\"\n",
"with tracing_enabled(\"my_test_session\") as session:\n",
" assert session\n",
" agent.run(questions[0]) # this should be traced\n",
"\n",
"agent.run(questions[1]) # this should not be traced"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "a392817b",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:root:Failed to load default session, using empty session: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /sessions?name=default (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x12f916ed0>: Failed to establish a new connection: [Errno 61] Connection refused'))\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.\n",
"Action: Search\n",
"Action Input: \"Olivia Wilde boyfriend\"\u001b[0m\u001b[32;1m\u001b[1;3m I need to find out who won the grand prix and then calculate their age raised to the 0.23 power.\n",
"Action: Search\n",
"Action Input: \"Formula 1 Grand Prix Winner\"\u001b[0m\u001b[32;1m\u001b[1;3m I need to find out who won the US Open men's final in 2019 and then calculate his age raised to the 0.334 power.\n",
"Action: Search\n",
"Action Input: \"US Open men's final 2019 winner\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mSudeikis and Wilde's relationship ended in November 2020. Wilde was publicly served with court documents regarding child custody while she was presenting Don't Worry Darling at CinemaCon 2022. In January 2021, Wilde began dating singer Harry Styles after meeting during the filming of Don't Worry Darling.\u001b[0m\n",
"Thought:\n",
"Observation: \u001b[33;1m\u001b[1;3mRafael Nadal defeated Daniil Medvedev in the final, 75, 63, 57, 46, 64 to win the men's singles tennis title at the 2019 US Open. It was his fourth US ...\u001b[0m\n",
"Thought:\n",
"Observation: \u001b[33;1m\u001b[1;3mThe first Formula One World Drivers' Champion was Giuseppe Farina in the 1950 championship and the current title holder is Max Verstappen in the 2022 season.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Harry Styles' age.\n",
"Action: Search\n",
"Action Input: \"Harry Styles age\"\u001b[0m\u001b[32;1m\u001b[1;3m I need to find out the age of the winner\n",
"Action: Search\n",
"Action Input: \"Rafael Nadal age\"\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m29 years\u001b[0m\n",
"Thought:\n",
"Observation: \u001b[33;1m\u001b[1;3m37 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Max Verstappen's age.\n",
"Action: Search\n",
"Action Input: \"Max Verstappen Age\"\u001b[0m\u001b[32;1m\u001b[1;3m I need to calculate 29 raised to the 0.23 power.\n",
"Action: Calculator\n",
"Action Input: 29^0.23\u001b[0m\u001b[32;1m\u001b[1;3m I now need to calculate the age raised to the 0.334 power\n",
"Action: Calculator\n",
"Action Input: 37^0.334\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.169459462491557\u001b[0m\n",
"Thought:\n",
"Observation: \u001b[33;1m\u001b[1;3m25 years\u001b[0m\n",
"Thought:\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 3.340253100876781\u001b[0m\n",
"Thought:"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:root:Failed to persist run: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /chain-runs (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x12f95dbd0>: Failed to establish a new connection: [Errno 61] Connection refused'))\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: Harry Styles is Olivia Wilde's boyfriend and his current age raised to the 0.23 power is 2.169459462491557.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to calculate 25 raised to the 0.23 power.\n",
"Action: Calculator\n",
"Action Input: 25^0.23\u001b[0m\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Rafael Nadal, aged 37, won the US Open men's final in 2019 and his age raised to the 0.334 power is 3.340253100876781.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 2.096651272316035\u001b[0m\n",
"Thought:"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:root:Failed to persist run: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /chain-runs (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x12f95de50>: Failed to establish a new connection: [Errno 61] Connection refused'))\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: Max Verstappen, aged 25, won the most recent Formula 1 Grand Prix and his age raised to the 0.23 power is 2.096651272316035.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"\"Rafael Nadal, aged 37, won the US Open men's final in 2019 and his age raised to the 0.334 power is 3.340253100876781.\""
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import asyncio\n",
"\n",
"# The context manager is concurrency safe:\n",
"if \"LANGCHAIN_TRACING\" in os.environ:\n",
" del os.environ[\"LANGCHAIN_TRACING\"]\n",
"\n",
"# start a background task\n",
"task = asyncio.create_task(agent.arun(questions[0])) # this should not be traced\n",
"with tracing_enabled() as session:\n",
" assert session\n",
" tasks = [agent.arun(q) for q in questions[1:3]] # these should be traced\n",
" await asyncio.gather(*tasks)\n",
"\n",
"await task"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cc83fd11",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "venv",
"language": "python",
"name": "venv"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -57,6 +57,7 @@
"\n",
"# Remove the (1) import sys and sys.path.append(..) and (2) uncomment `!pip install langchain` after merging the PR for Infino/LangChain integration.\n",
"import sys\n",
"\n",
"sys.path.append(\"../../../../../langchain\")\n",
"#!pip install langchain\n",
"\n",
@@ -120,9 +121,9 @@
"metadata": {},
"outputs": [],
"source": [
"# These are a subset of questions from Stanford's QA dataset - \n",
"# These are a subset of questions from Stanford's QA dataset -\n",
"# https://rajpurkar.github.io/SQuAD-explorer/\n",
"data = '''In what country is Normandy located?\n",
"data = \"\"\"In what country is Normandy located?\n",
"When were the Normans in Normandy?\n",
"From which countries did the Norse originate?\n",
"Who was the Norse leader?\n",
@@ -141,9 +142,9 @@
"What principality did William the conquerer found?\n",
"What is the original meaning of the word Norman?\n",
"When was the Latin version of the word Norman first recorded?\n",
"What name comes from the English words Normans/Normanz?'''\n",
"What name comes from the English words Normans/Normanz?\"\"\"\n",
"\n",
"questions = data.split('\\n')"
"questions = data.split(\"\\n\")"
]
},
{
@@ -190,10 +191,12 @@
],
"source": [
"# Set your key here.\n",
"#os.environ[\"OPENAI_API_KEY\"] = \"YOUR_API_KEY\"\n",
"# os.environ[\"OPENAI_API_KEY\"] = \"YOUR_API_KEY\"\n",
"\n",
"# Create callback handler. This logs latency, errors, token usage, prompts as well as prompt responses to Infino.\n",
"handler = InfinoCallbackHandler(model_id=\"test_openai\", model_version=\"0.1\", verbose=False)\n",
"handler = InfinoCallbackHandler(\n",
" model_id=\"test_openai\", model_version=\"0.1\", verbose=False\n",
")\n",
"\n",
"# Create LLM.\n",
"llm = OpenAI(temperature=0.1)\n",
@@ -281,29 +284,30 @@
"source": [
"# Helper function to create a graph using matplotlib.\n",
"def plot(data, title):\n",
" data = json.loads(data)\n",
" data = json.loads(data)\n",
"\n",
" # Extract x and y values from the data\n",
" timestamps = [item[\"time\"] for item in data]\n",
" dates=[dt.datetime.fromtimestamp(ts) for ts in timestamps]\n",
" y = [item[\"value\"] for item in data]\n",
" # Extract x and y values from the data\n",
" timestamps = [item[\"time\"] for item in data]\n",
" dates = [dt.datetime.fromtimestamp(ts) for ts in timestamps]\n",
" y = [item[\"value\"] for item in data]\n",
"\n",
" plt.rcParams['figure.figsize'] = [6, 4]\n",
" plt.subplots_adjust(bottom=0.2)\n",
" plt.xticks(rotation=25 )\n",
" ax=plt.gca()\n",
" xfmt = md.DateFormatter('%Y-%m-%d %H:%M:%S')\n",
" ax.xaxis.set_major_formatter(xfmt)\n",
" \n",
" # Create the plot\n",
" plt.plot(dates, y)\n",
" plt.rcParams[\"figure.figsize\"] = [6, 4]\n",
" plt.subplots_adjust(bottom=0.2)\n",
" plt.xticks(rotation=25)\n",
" ax = plt.gca()\n",
" xfmt = md.DateFormatter(\"%Y-%m-%d %H:%M:%S\")\n",
" ax.xaxis.set_major_formatter(xfmt)\n",
"\n",
" # Set labels and title\n",
" plt.xlabel(\"Time\")\n",
" plt.ylabel(\"Value\")\n",
" plt.title(title)\n",
" # Create the plot\n",
" plt.plot(dates, y)\n",
"\n",
" # Set labels and title\n",
" plt.xlabel(\"Time\")\n",
" plt.ylabel(\"Value\")\n",
" plt.title(title)\n",
"\n",
" plt.show()\n",
"\n",
" plt.show()\n",
"\n",
"response = client.search_ts(\"__name__\", \"latency\", 0, int(time.time()))\n",
"plot(response.text, \"Latency\")\n",
@@ -318,7 +322,7 @@
"plot(response.text, \"Completion Tokens\")\n",
"\n",
"response = client.search_ts(\"__name__\", \"total_tokens\", 0, int(time.time()))\n",
"plot(response.text, \"Total Tokens\")\n"
"plot(response.text, \"Total Tokens\")"
]
},
{
@@ -356,7 +360,7 @@
"\n",
"query = \"king charles III\"\n",
"response = client.search_log(\"king charles III\", 0, int(time.time()))\n",
"print(\"Results for\", query, \":\", response.text)\n"
"print(\"Results for\", query, \":\", response.text)"
]
},
{

View File

@@ -0,0 +1,210 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# PromptLayer\n",
"\n",
"![PromptLayer](https://promptlayer.com/text_logo.png)\n",
"\n",
"[PromptLayer](https://promptlayer.com) is a an LLM observability platform that lets you visualize requests, version prompts, and track usage. In this guide we will go over how to setup the `PromptLayerCallbackHandler`. \n",
"\n",
"While PromptLayer does have LLMs that integrate directly with LangChain (eg [`PromptLayerOpenAI`](https://python.langchain.com/docs/modules/model_io/models/llms/integrations/promptlayer_openai)), this callback is the recommended way to integrate PromptLayer with LangChain.\n",
"\n",
"See [our docs](https://docs.promptlayer.com/languages/langchain) for more information."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"## Installation and Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install promptlayer --upgrade"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Getting API Credentials\n",
"\n",
"If you do not have a PromptLayer account, create one on [promptlayer.com](https://www.promptlayer.com). Then get an API key by clicking on the settings cog in the navbar and\n",
"set it as an environment variabled called `PROMPTLAYER_API_KEY`\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Usage\n",
"\n",
"Getting started with `PromptLayerCallbackHandler` is fairly simple, it takes two optional arguments:\n",
"1. `pl_tags` - an optional list of strings that will be tracked as tags on PromptLayer.\n",
"2. `pl_id_callback` - an optional function that will take `promptlayer_request_id` as an argument. This ID can be used with all of PromptLayer's tracking features to track, metadata, scores, and prompt usage."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Simple OpenAI Example\n",
"\n",
"In this simple example we use `PromptLayerCallbackHandler` with `ChatOpenAI`. We add a PromptLayer tag named `chatopenai`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import promptlayer # Don't forget this 🍰\n",
"from langchain.callbacks import PromptLayerCallbackHandler\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.schema import (\n",
" HumanMessage,\n",
")\n",
"\n",
"chat_llm = ChatOpenAI(\n",
" temperature=0,\n",
" callbacks=[PromptLayerCallbackHandler(pl_tags=[\"chatopenai\"])],\n",
")\n",
"llm_results = chat_llm(\n",
" [\n",
" HumanMessage(content=\"What comes after 1,2,3 ?\"),\n",
" HumanMessage(content=\"Tell me another joke?\"),\n",
" ]\n",
")\n",
"print(llm_results)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### GPT4All Example"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import promptlayer # Don't forget this 🍰\n",
"from langchain.callbacks import PromptLayerCallbackHandler\n",
"\n",
"from langchain.llms import GPT4All\n",
"\n",
"model = GPT4All(model=\"./models/gpt4all-model.bin\", n_ctx=512, n_threads=8)\n",
"\n",
"response = model(\n",
" \"Once upon a time, \",\n",
" callbacks=[PromptLayerCallbackHandler(pl_tags=[\"langchain\", \"gpt4all\"])],\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Full Featured Example\n",
"\n",
"In this example we unlock more of the power of PromptLayer.\n",
"\n",
"PromptLayer allows you to visually create, version, and track prompt templates. Using the [Prompt Registry](https://docs.promptlayer.com/features/prompt-registry), we can programatically fetch the prompt template called `example`.\n",
"\n",
"We also define a `pl_id_callback` function which takes in the `promptlayer_request_id` and logs a score, metadata and links the prompt template used. Read more about tracking on [our docs](https://docs.promptlayer.com/features/prompt-history/request-id)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import promptlayer # Don't forget this 🍰\n",
"from langchain.callbacks import PromptLayerCallbackHandler\n",
"from langchain.llms import OpenAI\n",
"\n",
"\n",
"def pl_id_callback(promptlayer_request_id):\n",
" print(\"prompt layer id \", promptlayer_request_id)\n",
" promptlayer.track.score(\n",
" request_id=promptlayer_request_id, score=100\n",
" ) # score is an integer 0-100\n",
" promptlayer.track.metadata(\n",
" request_id=promptlayer_request_id, metadata={\"foo\": \"bar\"}\n",
" ) # metadata is a dictionary of key value pairs that is tracked on PromptLayer\n",
" promptlayer.track.prompt(\n",
" request_id=promptlayer_request_id,\n",
" prompt_name=\"example\",\n",
" prompt_input_variables={\"product\": \"toasters\"},\n",
" version=1,\n",
" ) # link the request to a prompt template\n",
"\n",
"\n",
"openai_llm = OpenAI(\n",
" model_name=\"text-davinci-002\",\n",
" callbacks=[PromptLayerCallbackHandler(pl_id_callback=pl_id_callback)],\n",
")\n",
"\n",
"example_prompt = promptlayer.prompts.get(\"example\", version=1, langchain=True)\n",
"openai_llm(example_prompt.format(product=\"toasters\"))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"That is all it takes! After setup all your requests will show up on the PromptLayer dashboard.\n",
"This callback also works with any LLM implemented on LangChain."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.8 (default, Apr 13 2021, 12:59:45) \n[Clang 10.0.0 ]"
},
"vscode": {
"interpreter": {
"hash": "c4fe2cd85a8d9e8baaec5340ce66faff1c77581a9f43e6c45e85e09b6fced008"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -0,0 +1,921 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "82f3f65d-fbcb-4e8e-b04b-959856283643",
"metadata": {},
"source": [
"# Causal program-aided language (CPAL) chain\n",
"\n",
"The CPAL chain builds on the recent PAL to stop LLM hallucination. The problem with the PAL approach is that it hallucinates on a math problem with a nested chain of dependence. The innovation here is that this new CPAL approach includes causal structure to fix hallucination.\n",
"\n",
"The original [PR's description](https://github.com/hwchase17/langchain/pull/6255) contains a full overview.\n",
"\n",
"Using the CPAL chain, the LLM translated this\n",
"\n",
" \"Tim buys the same number of pets as Cindy and Boris.\"\n",
" \"Cindy buys the same number of pets as Bill plus Bob.\"\n",
" \"Boris buys the same number of pets as Ben plus Beth.\"\n",
" \"Bill buys the same number of pets as Obama.\"\n",
" \"Bob buys the same number of pets as Obama.\"\n",
" \"Ben buys the same number of pets as Obama.\"\n",
" \"Beth buys the same number of pets as Obama.\"\n",
" \"If Obama buys one pet, how many pets total does everyone buy?\"\n",
"\n",
"\n",
"into this\n",
"\n",
"![complex-graph.png](/img/cpal_diagram.png).\n",
"\n",
"Outline of code examples demoed in this notebook.\n",
"\n",
"1. CPAL's value against hallucination: CPAL vs PAL \n",
" 1.1 Complex narrative \n",
" 1.2 Unanswerable math word problem \n",
"2. CPAL's three types of causal diagrams ([The Book of Why](https://en.wikipedia.org/wiki/The_Book_of_Why)). \n",
" 2.1 Mediator \n",
" 2.2 Collider \n",
" 2.3 Confounder "
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "1370e40f",
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import SVG\n",
"\n",
"from langchain.experimental.cpal.base import CPALChain\n",
"from langchain.chains import PALChain\n",
"from langchain import OpenAI\n",
"\n",
"llm = OpenAI(temperature=0, max_tokens=512)\n",
"cpal_chain = CPALChain.from_univariate_prompt(llm=llm, verbose=True)\n",
"pal_chain = PALChain.from_math_prompt(llm=llm, verbose=True)"
]
},
{
"cell_type": "markdown",
"id": "858a87d9-a9bd-4850-9687-9af4b0856b62",
"metadata": {},
"source": [
"## CPAL's value against hallucination: CPAL vs PAL\n",
"\n",
"Like PAL, CPAL intends to reduce large language model (LLM) hallucination.\n",
"\n",
"The CPAL chain is different from the PAL chain for a couple of reasons.\n",
"\n",
"CPAL adds a causal structure (or DAG) to link entity actions (or math expressions).\n",
"The CPAL math expressions are modeling a chain of cause and effect relations, which can be intervened upon, whereas for the PAL chain math expressions are projected math identities.\n"
]
},
{
"cell_type": "markdown",
"id": "496403c5-d268-43ae-8852-2bd9903ce444",
"metadata": {},
"source": [
"### 1.1 Complex narrative\n",
"\n",
"Takeaway: PAL hallucinates, CPAL does not hallucinate."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "d5dad768-2892-4825-8093-9b840f643a8a",
"metadata": {},
"outputs": [],
"source": [
"question = (\n",
" \"Tim buys the same number of pets as Cindy and Boris.\"\n",
" \"Cindy buys the same number of pets as Bill plus Bob.\"\n",
" \"Boris buys the same number of pets as Ben plus Beth.\"\n",
" \"Bill buys the same number of pets as Obama.\"\n",
" \"Bob buys the same number of pets as Obama.\"\n",
" \"Ben buys the same number of pets as Obama.\"\n",
" \"Beth buys the same number of pets as Obama.\"\n",
" \"If Obama buys one pet, how many pets total does everyone buy?\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "bbffa7a0-3c22-4a1d-ab2d-f230973073b0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mdef solution():\n",
" \"\"\"Tim buys the same number of pets as Cindy and Boris.Cindy buys the same number of pets as Bill plus Bob.Boris buys the same number of pets as Ben plus Beth.Bill buys the same number of pets as Obama.Bob buys the same number of pets as Obama.Ben buys the same number of pets as Obama.Beth buys the same number of pets as Obama.If Obama buys one pet, how many pets total does everyone buy?\"\"\"\n",
" obama_pets = 1\n",
" tim_pets = obama_pets\n",
" cindy_pets = obama_pets + obama_pets\n",
" boris_pets = obama_pets + obama_pets\n",
" total_pets = tim_pets + cindy_pets + boris_pets\n",
" result = total_pets\n",
" return result\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'5'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pal_chain.run(question)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "35a70d1d-86f8-4abc-b818-fbd083f072e9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mstory outcome data\n",
" name code value depends_on\n",
"0 obama pass 1.0 []\n",
"1 bill bill.value = obama.value 1.0 [obama]\n",
"2 bob bob.value = obama.value 1.0 [obama]\n",
"3 ben ben.value = obama.value 1.0 [obama]\n",
"4 beth beth.value = obama.value 1.0 [obama]\n",
"5 cindy cindy.value = bill.value + bob.value 2.0 [bill, bob]\n",
"6 boris boris.value = ben.value + beth.value 2.0 [ben, beth]\n",
"7 tim tim.value = cindy.value + boris.value 4.0 [cindy, boris]\u001b[0m\n",
"\n",
"\u001b[36;1m\u001b[1;3mquery data\n",
"{\n",
" \"question\": \"how many pets total does everyone buy?\",\n",
" \"expression\": \"SELECT SUM(value) FROM df\",\n",
" \"llm_error_msg\": \"\"\n",
"}\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"13.0"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cpal_chain.run(question)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "ccb6b2b0-9de6-4f66-a8fb-fc59229ee316",
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"<svg xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"292pt\" height=\"260pt\" viewBox=\"0.00 0.00 292.00 260.00\">\n",
"<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 256)\">\n",
"<polygon fill=\"white\" stroke=\"transparent\" points=\"-4,4 -4,-256 288,-256 288,4 -4,4\"/>\n",
"<!-- obama -->\n",
"<g id=\"node1\" class=\"node\">\n",
"<title>obama</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"137\" cy=\"-234\" rx=\"41.69\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"137\" y=\"-230.3\" font-family=\"Times,serif\" font-size=\"14.00\">obama</text>\n",
"</g>\n",
"<!-- bill -->\n",
"<g id=\"node2\" class=\"node\">\n",
"<title>bill</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"27\" cy=\"-162\" rx=\"27\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"27\" y=\"-158.3\" font-family=\"Times,serif\" font-size=\"14.00\">bill</text>\n",
"</g>\n",
"<!-- obama&#45;&gt;bill -->\n",
"<g id=\"edge1\" class=\"edge\">\n",
"<title>obama-&gt;bill</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M114.47,-218.67C97.08,-207.6 72.94,-192.23 54.42,-180.45\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"56.15,-177.4 45.84,-174.99 52.4,-183.31 56.15,-177.4\"/>\n",
"</g>\n",
"<!-- bob -->\n",
"<g id=\"node3\" class=\"node\">\n",
"<title>bob</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"100\" cy=\"-162\" rx=\"28\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"100\" y=\"-158.3\" font-family=\"Times,serif\" font-size=\"14.00\">bob</text>\n",
"</g>\n",
"<!-- obama&#45;&gt;bob -->\n",
"<g id=\"edge2\" class=\"edge\">\n",
"<title>obama-&gt;bob</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M128.04,-216.05C123.66,-207.77 118.3,-197.62 113.44,-188.42\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"116.39,-186.51 108.62,-179.31 110.2,-189.79 116.39,-186.51\"/>\n",
"</g>\n",
"<!-- ben -->\n",
"<g id=\"node4\" class=\"node\">\n",
"<title>ben</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"174\" cy=\"-162\" rx=\"28\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"174\" y=\"-158.3\" font-family=\"Times,serif\" font-size=\"14.00\">ben</text>\n",
"</g>\n",
"<!-- obama&#45;&gt;ben -->\n",
"<g id=\"edge3\" class=\"edge\">\n",
"<title>obama-&gt;ben</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M145.96,-216.05C150.34,-207.77 155.7,-197.62 160.56,-188.42\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"163.8,-189.79 165.38,-179.31 157.61,-186.51 163.8,-189.79\"/>\n",
"</g>\n",
"<!-- beth -->\n",
"<g id=\"node5\" class=\"node\">\n",
"<title>beth</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"252\" cy=\"-162\" rx=\"32\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"252\" y=\"-158.3\" font-family=\"Times,serif\" font-size=\"14.00\">beth</text>\n",
"</g>\n",
"<!-- obama&#45;&gt;beth -->\n",
"<g id=\"edge4\" class=\"edge\">\n",
"<title>obama-&gt;beth</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M160.27,-218.83C178.18,-207.94 203.04,-192.8 222.37,-181.04\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"224.36,-183.92 231.08,-175.73 220.72,-177.95 224.36,-183.92\"/>\n",
"</g>\n",
"<!-- cindy -->\n",
"<g id=\"node6\" class=\"node\">\n",
"<title>cindy</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"93\" cy=\"-90\" rx=\"36\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"93\" y=\"-86.3\" font-family=\"Times,serif\" font-size=\"14.00\">cindy</text>\n",
"</g>\n",
"<!-- bill&#45;&gt;cindy -->\n",
"<g id=\"edge5\" class=\"edge\">\n",
"<title>bill-&gt;cindy</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M41,-146.15C49.77,-136.85 61.25,-124.67 71.2,-114.12\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"73.79,-116.47 78.11,-106.8 68.7,-111.67 73.79,-116.47\"/>\n",
"</g>\n",
"<!-- bob&#45;&gt;cindy -->\n",
"<g id=\"edge6\" class=\"edge\">\n",
"<title>bob-&gt;cindy</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M98.27,-143.7C97.5,-135.98 96.57,-126.71 95.71,-118.11\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"99.19,-117.7 94.71,-108.1 92.22,-118.4 99.19,-117.7\"/>\n",
"</g>\n",
"<!-- boris -->\n",
"<g id=\"node7\" class=\"node\">\n",
"<title>boris</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"181\" cy=\"-90\" rx=\"34.5\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"181\" y=\"-86.3\" font-family=\"Times,serif\" font-size=\"14.00\">boris</text>\n",
"</g>\n",
"<!-- ben&#45;&gt;boris -->\n",
"<g id=\"edge7\" class=\"edge\">\n",
"<title>ben-&gt;boris</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M175.73,-143.7C176.5,-135.98 177.43,-126.71 178.29,-118.11\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"181.78,-118.4 179.29,-108.1 174.81,-117.7 181.78,-118.4\"/>\n",
"</g>\n",
"<!-- beth&#45;&gt;boris -->\n",
"<g id=\"edge8\" class=\"edge\">\n",
"<title>beth-&gt;boris</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M236.59,-145.81C227.01,-136.36 214.51,-124.04 203.8,-113.48\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"205.96,-110.69 196.38,-106.16 201.04,-115.67 205.96,-110.69\"/>\n",
"</g>\n",
"<!-- tim -->\n",
"<g id=\"node8\" class=\"node\">\n",
"<title>tim</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"137\" cy=\"-18\" rx=\"27\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"137\" y=\"-14.3\" font-family=\"Times,serif\" font-size=\"14.00\">tim</text>\n",
"</g>\n",
"<!-- cindy&#45;&gt;tim -->\n",
"<g id=\"edge9\" class=\"edge\">\n",
"<title>cindy-&gt;tim</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M103.43,-72.41C108.82,-63.83 115.51,-53.19 121.49,-43.67\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"124.59,-45.32 126.95,-34.99 118.66,-41.59 124.59,-45.32\"/>\n",
"</g>\n",
"<!-- boris&#45;&gt;tim -->\n",
"<g id=\"edge10\" class=\"edge\">\n",
"<title>boris-&gt;tim</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M170.79,-72.77C165.41,-64.19 158.68,-53.49 152.65,-43.9\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"155.43,-41.75 147.15,-35.15 149.51,-45.48 155.43,-41.75\"/>\n",
"</g>\n",
"</g>\n",
"</svg>"
],
"text/plain": [
"<IPython.core.display.SVG object>"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# wait 20 secs to see display\n",
"cpal_chain.draw(path=\"web.svg\")\n",
"SVG(\"web.svg\")"
]
},
{
"cell_type": "markdown",
"id": "1f6f345a-bb16-4e64-83c4-cbbc789a8325",
"metadata": {},
"source": [
"### Unanswerable math\n",
"\n",
"Takeaway: PAL hallucinates, where CPAL, rather than hallucinate, answers with _\"unanswerable, narrative question and plot are incoherent\"_"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "068afd79-fd41-4ec2-b4d0-c64140dc413f",
"metadata": {},
"outputs": [],
"source": [
"question = (\n",
" \"Jan has three times the number of pets as Marcia.\"\n",
" \"Marcia has two more pets than Cindy.\"\n",
" \"If Cindy has ten pets, how many pets does Barak have?\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "02f77db2-72e8-46c2-90b3-5e37ca42f80d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mdef solution():\n",
" \"\"\"Jan has three times the number of pets as Marcia.Marcia has two more pets than Cindy.If Cindy has ten pets, how many pets does Barak have?\"\"\"\n",
" cindy_pets = 10\n",
" marcia_pets = cindy_pets + 2\n",
" jan_pets = marcia_pets * 3\n",
" result = jan_pets\n",
" return result\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'36'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pal_chain.run(question)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "925958de-e998-4ffa-8b2e-5a00ddae5026",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mstory outcome data\n",
" name code value depends_on\n",
"0 cindy pass 10.0 []\n",
"1 marcia marcia.value = cindy.value + 2 12.0 [cindy]\n",
"2 jan jan.value = marcia.value * 3 36.0 [marcia]\u001b[0m\n",
"\n",
"\u001b[36;1m\u001b[1;3mquery data\n",
"{\n",
" \"question\": \"how many pets does barak have?\",\n",
" \"expression\": \"SELECT name, value FROM df WHERE name = 'barak'\",\n",
" \"llm_error_msg\": \"\"\n",
"}\u001b[0m\n",
"\n",
"unanswerable, query and outcome are incoherent\n",
"\n",
"outcome:\n",
" name code value depends_on\n",
"0 cindy pass 10.0 []\n",
"1 marcia marcia.value = cindy.value + 2 12.0 [cindy]\n",
"2 jan jan.value = marcia.value * 3 36.0 [marcia]\n",
"query:\n",
"{'question': 'how many pets does barak have?', 'expression': \"SELECT name, value FROM df WHERE name = 'barak'\", 'llm_error_msg': ''}\n"
]
}
],
"source": [
"try:\n",
" cpal_chain.run(question)\n",
"except Exception as e_msg:\n",
" print(e_msg)"
]
},
{
"cell_type": "markdown",
"id": "095adc76",
"metadata": {},
"source": [
"### Basic math\n",
"\n",
"#### Causal mediator"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "3ecf03fa-8350-4c4e-8080-84a307ba6ad4",
"metadata": {},
"outputs": [],
"source": [
"question = (\n",
" \"Jan has three times the number of pets as Marcia. \"\n",
" \"Marcia has two more pets than Cindy. \"\n",
" \"If Cindy has four pets, how many total pets do the three have?\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "74e49c47-3eed-4abe-98b7-8e97bcd15944",
"metadata": {},
"source": [
"---\n",
"PAL"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "2e88395f-d014-4362-abb0-88f6800860bb",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mdef solution():\n",
" \"\"\"Jan has three times the number of pets as Marcia. Marcia has two more pets than Cindy. If Cindy has four pets, how many total pets do the three have?\"\"\"\n",
" cindy_pets = 4\n",
" marcia_pets = cindy_pets + 2\n",
" jan_pets = marcia_pets * 3\n",
" total_pets = cindy_pets + marcia_pets + jan_pets\n",
" result = total_pets\n",
" return result\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'28'"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pal_chain.run(question)"
]
},
{
"cell_type": "markdown",
"id": "20ba6640-3d17-4b59-8101-aaba89d68cf4",
"metadata": {},
"source": [
"---\n",
"CPAL"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "312a0943-a482-4ed0-a064-1e7a72e9479b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mstory outcome data\n",
" name code value depends_on\n",
"0 cindy pass 4.0 []\n",
"1 marcia marcia.value = cindy.value + 2 6.0 [cindy]\n",
"2 jan jan.value = marcia.value * 3 18.0 [marcia]\u001b[0m\n",
"\n",
"\u001b[36;1m\u001b[1;3mquery data\n",
"{\n",
" \"question\": \"how many total pets do the three have?\",\n",
" \"expression\": \"SELECT SUM(value) FROM df\",\n",
" \"llm_error_msg\": \"\"\n",
"}\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"28.0"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cpal_chain.run(question)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "4466b975-ae2b-4252-972b-b3182a089ade",
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"<svg xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"92pt\" height=\"188pt\" viewBox=\"0.00 0.00 92.49 188.00\">\n",
"<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 184)\">\n",
"<polygon fill=\"white\" stroke=\"transparent\" points=\"-4,4 -4,-184 88.49,-184 88.49,4 -4,4\"/>\n",
"<!-- cindy -->\n",
"<g id=\"node1\" class=\"node\">\n",
"<title>cindy</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"42.25\" cy=\"-162\" rx=\"36\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"42.25\" y=\"-158.3\" font-family=\"Times,serif\" font-size=\"14.00\">cindy</text>\n",
"</g>\n",
"<!-- marcia -->\n",
"<g id=\"node2\" class=\"node\">\n",
"<title>marcia</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"42.25\" cy=\"-90\" rx=\"42.49\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"42.25\" y=\"-86.3\" font-family=\"Times,serif\" font-size=\"14.00\">marcia</text>\n",
"</g>\n",
"<!-- cindy&#45;&gt;marcia -->\n",
"<g id=\"edge1\" class=\"edge\">\n",
"<title>cindy-&gt;marcia</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M42.25,-143.7C42.25,-135.98 42.25,-126.71 42.25,-118.11\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"45.75,-118.1 42.25,-108.1 38.75,-118.1 45.75,-118.1\"/>\n",
"</g>\n",
"<!-- jan -->\n",
"<g id=\"node3\" class=\"node\">\n",
"<title>jan</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"42.25\" cy=\"-18\" rx=\"27\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"42.25\" y=\"-14.3\" font-family=\"Times,serif\" font-size=\"14.00\">jan</text>\n",
"</g>\n",
"<!-- marcia&#45;&gt;jan -->\n",
"<g id=\"edge2\" class=\"edge\">\n",
"<title>marcia-&gt;jan</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M42.25,-71.7C42.25,-63.98 42.25,-54.71 42.25,-46.11\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"45.75,-46.1 42.25,-36.1 38.75,-46.1 45.75,-46.1\"/>\n",
"</g>\n",
"</g>\n",
"</svg>"
],
"text/plain": [
"<IPython.core.display.SVG object>"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# wait 20 secs to see display\n",
"cpal_chain.draw(path=\"web.svg\")\n",
"SVG(\"web.svg\")"
]
},
{
"cell_type": "markdown",
"id": "29fa7b8a-75a3-4270-82a2-2c31939cd7e0",
"metadata": {},
"source": [
"### Causal collider"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "618eddac-f0ef-4ab5-90ed-72e880fdeba3",
"metadata": {},
"outputs": [],
"source": [
"question = (\n",
" \"Jan has the number of pets as Marcia plus the number of pets as Cindy. \"\n",
" \"Marcia has no pets. \"\n",
" \"If Cindy has four pets, how many total pets do the three have?\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "a01563f3-7974-4de4-8bd9-0b7d710aa0d3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mstory outcome data\n",
" name code value depends_on\n",
"0 marcia pass 0.0 []\n",
"1 cindy pass 4.0 []\n",
"2 jan jan.value = marcia.value + cindy.value 4.0 [marcia, cindy]\u001b[0m\n",
"\n",
"\u001b[36;1m\u001b[1;3mquery data\n",
"{\n",
" \"question\": \"how many total pets do the three have?\",\n",
" \"expression\": \"SELECT SUM(value) FROM df\",\n",
" \"llm_error_msg\": \"\"\n",
"}\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"8.0"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cpal_chain.run(question)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "0fbe7243-0522-4946-b9a2-6e21e7c49a42",
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"<svg xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"182pt\" height=\"116pt\" viewBox=\"0.00 0.00 182.00 116.00\">\n",
"<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 112)\">\n",
"<polygon fill=\"white\" stroke=\"transparent\" points=\"-4,4 -4,-112 178,-112 178,4 -4,4\"/>\n",
"<!-- marcia -->\n",
"<g id=\"node1\" class=\"node\">\n",
"<title>marcia</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"42.25\" cy=\"-90\" rx=\"42.49\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"42.25\" y=\"-86.3\" font-family=\"Times,serif\" font-size=\"14.00\">marcia</text>\n",
"</g>\n",
"<!-- jan -->\n",
"<g id=\"node2\" class=\"node\">\n",
"<title>jan</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"90.25\" cy=\"-18\" rx=\"27\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"90.25\" y=\"-14.3\" font-family=\"Times,serif\" font-size=\"14.00\">jan</text>\n",
"</g>\n",
"<!-- marcia&#45;&gt;jan -->\n",
"<g id=\"edge1\" class=\"edge\">\n",
"<title>marcia-&gt;jan</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M53.62,-72.41C59.57,-63.74 66.95,-52.97 73.53,-43.38\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"76.51,-45.21 79.28,-34.99 70.74,-41.26 76.51,-45.21\"/>\n",
"</g>\n",
"<!-- cindy -->\n",
"<g id=\"node3\" class=\"node\">\n",
"<title>cindy</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"138.25\" cy=\"-90\" rx=\"36\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"138.25\" y=\"-86.3\" font-family=\"Times,serif\" font-size=\"14.00\">cindy</text>\n",
"</g>\n",
"<!-- cindy&#45;&gt;jan -->\n",
"<g id=\"edge2\" class=\"edge\">\n",
"<title>cindy-&gt;jan</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M127.11,-72.77C121.09,-63.98 113.54,-52.96 106.83,-43.19\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"109.53,-40.94 100.99,-34.67 103.75,-44.89 109.53,-40.94\"/>\n",
"</g>\n",
"</g>\n",
"</svg>"
],
"text/plain": [
"<IPython.core.display.SVG object>"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# wait 20 secs to see display\n",
"cpal_chain.draw(path=\"web.svg\")\n",
"SVG(\"web.svg\")"
]
},
{
"cell_type": "markdown",
"id": "d4082538-ec03-44f0-aac3-07e03aad7555",
"metadata": {},
"source": [
"### Causal confounder"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "83932c30-950b-435a-b328-7993ce8cc6bd",
"metadata": {},
"outputs": [],
"source": [
"question = (\n",
" \"Jan has the number of pets as Marcia plus the number of pets as Cindy. \"\n",
" \"Marcia has two more pets than Cindy. \"\n",
" \"If Cindy has four pets, how many total pets do the three have?\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "570de307-7c6b-4fdc-80c3-4361daa8a629",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mstory outcome data\n",
" name code value depends_on\n",
"0 cindy pass 4.0 []\n",
"1 marcia marcia.value = cindy.value + 2 6.0 [cindy]\n",
"2 jan jan.value = cindy.value + marcia.value 10.0 [cindy, marcia]\u001b[0m\n",
"\n",
"\u001b[36;1m\u001b[1;3mquery data\n",
"{\n",
" \"question\": \"how many total pets do the three have?\",\n",
" \"expression\": \"SELECT SUM(value) FROM df\",\n",
" \"llm_error_msg\": \"\"\n",
"}\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"20.0"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cpal_chain.run(question)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "00375615-6b6d-4357-bdb8-f64f682f7605",
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"<svg xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"121pt\" height=\"188pt\" viewBox=\"0.00 0.00 120.99 188.00\">\n",
"<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 184)\">\n",
"<polygon fill=\"white\" stroke=\"transparent\" points=\"-4,4 -4,-184 116.99,-184 116.99,4 -4,4\"/>\n",
"<!-- cindy -->\n",
"<g id=\"node1\" class=\"node\">\n",
"<title>cindy</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"77.25\" cy=\"-162\" rx=\"36\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"77.25\" y=\"-158.3\" font-family=\"Times,serif\" font-size=\"14.00\">cindy</text>\n",
"</g>\n",
"<!-- marcia -->\n",
"<g id=\"node2\" class=\"node\">\n",
"<title>marcia</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"42.25\" cy=\"-90\" rx=\"42.49\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"42.25\" y=\"-86.3\" font-family=\"Times,serif\" font-size=\"14.00\">marcia</text>\n",
"</g>\n",
"<!-- cindy&#45;&gt;marcia -->\n",
"<g id=\"edge1\" class=\"edge\">\n",
"<title>cindy-&gt;marcia</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M68.95,-144.41C64.87,-136.25 59.86,-126.22 55.28,-117.07\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"58.33,-115.34 50.72,-107.96 52.07,-118.47 58.33,-115.34\"/>\n",
"</g>\n",
"<!-- jan -->\n",
"<g id=\"node3\" class=\"node\">\n",
"<title>jan</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"77.25\" cy=\"-18\" rx=\"27\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"77.25\" y=\"-14.3\" font-family=\"Times,serif\" font-size=\"14.00\">jan</text>\n",
"</g>\n",
"<!-- cindy&#45;&gt;jan -->\n",
"<g id=\"edge2\" class=\"edge\">\n",
"<title>cindy-&gt;jan</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M83.73,-144.1C87.32,-133.84 91.42,-120.36 93.25,-108 95.58,-92.17 95.58,-87.83 93.25,-72 91.95,-63.21 89.5,-53.86 86.91,-45.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"90.19,-44.29 83.73,-35.9 83.55,-46.49 90.19,-44.29\"/>\n",
"</g>\n",
"<!-- marcia&#45;&gt;jan -->\n",
"<g id=\"edge3\" class=\"edge\">\n",
"<title>marcia-&gt;jan</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M50.72,-72.06C54.86,-63.77 59.94,-53.62 64.53,-44.42\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"67.75,-45.82 69.09,-35.31 61.49,-42.69 67.75,-45.82\"/>\n",
"</g>\n",
"</g>\n",
"</svg>"
],
"text/plain": [
"<IPython.core.display.SVG object>"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# wait 20 secs to see display\n",
"cpal_chain.draw(path=\"web.svg\")\n",
"SVG(\"web.svg\")"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "255683de-0c1c-4131-b277-99d09f5ac1fc",
"metadata": {},
"outputs": [],
"source": [
"%load_ext autoreload\n",
"%autoreload 2"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,206 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "dd7ec7af",
"metadata": {},
"source": [
"# Elasticsearch database\n",
"\n",
"Interact with Elasticsearch analytics database via Langchain. This chain builds search queries via the Elasticsearch DSL API (filters and aggregations).\n",
"\n",
"The Elasticsearch client must have permissions for index listing, mapping description and search queries.\n",
"\n",
"See [here](https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html) for instructions on how to run Elasticsearch locally.\n",
"\n",
"Make sure to install the Elasticsearch Python client before:\n",
"\n",
"```sh\n",
"pip install elasticsearch\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "dd8eae75",
"metadata": {},
"outputs": [],
"source": [
"from elasticsearch import Elasticsearch\n",
"\n",
"from langchain.chains.elasticsearch_database import ElasticsearchDatabaseChain\n",
"from langchain.chat_models import ChatOpenAI"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "5cde03bc",
"metadata": {},
"outputs": [],
"source": [
"# Initialize Elasticsearch python client.\n",
"# See https://elasticsearch-py.readthedocs.io/en/v8.8.2/api.html#elasticsearch.Elasticsearch\n",
"ELASTIC_SEARCH_SERVER = \"https://elastic:pass@localhost:9200\"\n",
"db = Elasticsearch(ELASTIC_SEARCH_SERVER)"
]
},
{
"cell_type": "markdown",
"id": "74a41374",
"metadata": {},
"source": [
"Uncomment the next cell to initially populate your db."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "430ada0f",
"metadata": {},
"outputs": [],
"source": [
"# customers = [\n",
"# {\"firstname\": \"Jennifer\", \"lastname\": \"Walters\"},\n",
"# {\"firstname\": \"Monica\",\"lastname\":\"Rambeau\"},\n",
"# {\"firstname\": \"Carol\",\"lastname\":\"Danvers\"},\n",
"# {\"firstname\": \"Wanda\",\"lastname\":\"Maximoff\"},\n",
"# {\"firstname\": \"Jennifer\",\"lastname\":\"Takeda\"},\n",
"# ]\n",
"# for i, customer in enumerate(customers):\n",
"# db.create(index=\"customers\", document=customer, id=i)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "f36ae0d8",
"metadata": {},
"outputs": [],
"source": [
"llm = ChatOpenAI(model_name=\"gpt-4\", temperature=0)\n",
"chain = ElasticsearchDatabaseChain.from_llm(llm=llm, database=db, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "b5d22d9d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new ElasticsearchDatabaseChain chain...\u001b[0m\n",
"What are the first names of all the customers?\n",
"ESQuery:\u001b[32;1m\u001b[1;3m{'size': 10, 'query': {'match_all': {}}, '_source': ['firstname']}\u001b[0m\n",
"ESResult: \u001b[33;1m\u001b[1;3m{'took': 5, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 6, 'relation': 'eq'}, 'max_score': 1.0, 'hits': [{'_index': 'customers', '_id': '0', '_score': 1.0, '_source': {'firstname': 'Jennifer'}}, {'_index': 'customers', '_id': '1', '_score': 1.0, '_source': {'firstname': 'Monica'}}, {'_index': 'customers', '_id': '2', '_score': 1.0, '_source': {'firstname': 'Carol'}}, {'_index': 'customers', '_id': '3', '_score': 1.0, '_source': {'firstname': 'Wanda'}}, {'_index': 'customers', '_id': '4', '_score': 1.0, '_source': {'firstname': 'Jennifer'}}, {'_index': 'customers', '_id': 'firstname', '_score': 1.0, '_source': {'firstname': 'Jennifer'}}]}}\u001b[0m\n",
"Answer:\u001b[32;1m\u001b[1;3mThe first names of all the customers are Jennifer, Monica, Carol, Wanda, and Jennifer.\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The first names of all the customers are Jennifer, Monica, Carol, Wanda, and Jennifer.'"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question = \"What are the first names of all the customers?\"\n",
"chain.run(question)"
]
},
{
"cell_type": "markdown",
"id": "9b4bfada",
"metadata": {},
"source": [
"## Custom prompt\n",
"\n",
"For best results you'll likely need to customize the prompt."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "0a494f5b",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains.elasticsearch_database.prompts import DEFAULT_DSL_TEMPLATE\n",
"from langchain.prompts.prompt import PromptTemplate\n",
"\n",
"PROMPT_TEMPLATE = \"\"\"Given an input question, create a syntactically correct Elasticsearch query to run. Unless the user specifies in their question a specific number of examples they wish to obtain, always limit your query to at most {top_k} results. You can order the results by a relevant column to return the most interesting examples in the database.\n",
"\n",
"Unless told to do not query for all the columns from a specific index, only ask for a the few relevant columns given the question.\n",
"\n",
"Pay attention to use only the column names that you can see in the mapping description. Be careful to not query for columns that do not exist. Also, pay attention to which column is in which index. Return the query as valid json.\n",
"\n",
"Use the following format:\n",
"\n",
"Question: Question here\n",
"ESQuery: Elasticsearch Query formatted as json\n",
"\"\"\"\n",
"\n",
"PROMPT = PromptTemplate.from_template(\n",
" PROMPT_TEMPLATE,\n",
")\n",
"chain = ElasticsearchDatabaseChain.from_llm(llm=llm, database=db, query_prompt=PROMPT)"
]
},
{
"cell_type": "markdown",
"id": "372b8f93",
"metadata": {},
"source": [
"## Adding example rows from each index\n",
"\n",
"Sometimes, the format of the data is not obvious and it is optimal to include a sample of rows from the indices in the prompt to allow the LLM to understand the data before providing a final query. Here we will use this feature to let the LLM know that artists are saved with their full names by providing ten rows from the index."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "eef818de",
"metadata": {},
"outputs": [],
"source": [
"chain = ElasticsearchDatabaseChain.from_llm(\n",
" llm=ChatOpenAI(temperature=0),\n",
" database=db,\n",
" sample_documents_in_index_info=2, # 2 rows from each index will be included in the prompt as sample data\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "venv",
"language": "python",
"name": "venv"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,271 +1,566 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "6605e7f7",
"metadata": {},
"source": [
"# Extraction\n",
"\n",
"The extraction chain uses the OpenAI `functions` parameter to specify a schema to extract entities from a document. This helps us make sure that the model outputs exactly the schema of entities and properties that we want, with their appropriate types.\n",
"\n",
"The extraction chain is to be used when we want to extract several entities with their properties from the same passage (i.e. what people were mentioned in this passage?)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "34f04daf",
"metadata": {},
"outputs": [
"cells": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/harrisonchase/.pyenv/versions/3.9.1/envs/langchain/lib/python3.9/site-packages/deeplake/util/check_latest_version.py:32: UserWarning: A newer version of deeplake (3.6.4) is available. It's recommended that you update to the latest version using `pip install -U deeplake`.\n",
" warnings.warn(\n"
]
}
],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.chains import create_extraction_chain, create_extraction_chain_pydantic\n",
"from langchain.prompts import ChatPromptTemplate"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "a2648974",
"metadata": {},
"outputs": [],
"source": [
"llm = ChatOpenAI(temperature=0, model=\"gpt-3.5-turbo-0613\")"
]
},
{
"cell_type": "markdown",
"id": "5ef034ce",
"metadata": {},
"source": [
"## Extracting entities"
]
},
{
"cell_type": "markdown",
"id": "78ff9df9",
"metadata": {},
"source": [
"To extract entities, we need to create a schema like the following, were we specify all the properties we want to find and the type we expect them to have. We can also specify which of these properties are required and which are optional."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "4ac43eba",
"metadata": {},
"outputs": [],
"source": [
"schema = {\n",
" \"properties\": {\n",
" \"person_name\": {\"type\": \"string\"},\n",
" \"person_height\": {\"type\": \"integer\"},\n",
" \"person_hair_color\": {\"type\": \"string\"},\n",
" \"dog_name\": {\"type\": \"string\"},\n",
" \"dog_breed\": {\"type\": \"string\"},\n",
" },\n",
" \"required\": [\"person_name\", \"person_height\"],\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "640bd005",
"metadata": {},
"outputs": [],
"source": [
"inp = \"\"\"\n",
"Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. Claudia is a brunette and Alex is blonde.\n",
"Alex's dog Frosty is a labrador and likes to play hide and seek.\n",
" \"\"\""
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "64313214",
"metadata": {},
"outputs": [],
"source": [
"chain = create_extraction_chain(schema, llm)"
]
},
{
"cell_type": "markdown",
"id": "17c48adb",
"metadata": {},
"source": [
"As we can see, we extracted the required entities and their properties in the required format:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "cc5436ed",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'person_name': 'Alex',\n",
" 'person_height': 5,\n",
" 'person_hair_color': 'blonde',\n",
" 'dog_name': 'Frosty',\n",
" 'dog_breed': 'labrador'},\n",
" {'person_name': 'Claudia',\n",
" 'person_height': 6,\n",
" 'person_hair_color': 'brunette'}]"
"cell_type": "markdown",
"id": "6605e7f7",
"metadata": {},
"source": [
"# Extraction\n",
"\n",
"The extraction chain uses the OpenAI `functions` parameter to specify a schema to extract entities from a document. This helps us make sure that the model outputs exactly the schema of entities and properties that we want, with their appropriate types.\n",
"\n",
"The extraction chain is to be used when we want to extract several entities with their properties from the same passage (i.e. what people were mentioned in this passage?)"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(inp)"
]
},
{
"cell_type": "markdown",
"id": "698b4c4d",
"metadata": {},
"source": [
"## Pydantic example"
]
},
{
"cell_type": "markdown",
"id": "6504a6d9",
"metadata": {},
"source": [
"We can also use a Pydantic schema to choose the required properties and types and we will set as 'Optional' those that are not strictly required.\n",
"\n",
"By using the `create_extraction_chain_pydantic` function, we can send a Pydantic schema as input and the output will be an instantiated object that respects our desired schema. \n",
"\n",
"In this way, we can specify our schema in the same manner that we would a new class or function in Python - with purely Pythonic types."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "6792866b",
"metadata": {},
"outputs": [],
"source": [
"from typing import Optional, List\n",
"from pydantic import BaseModel, Field"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "36a63761",
"metadata": {},
"outputs": [],
"source": [
"class Properties(BaseModel):\n",
" person_name: str\n",
" person_height: int\n",
" person_hair_color: str\n",
" dog_breed: Optional[str]\n",
" dog_name: Optional[str]"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "8ffd1e57",
"metadata": {},
"outputs": [],
"source": [
"chain = create_extraction_chain_pydantic(pydantic_schema=Properties, llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "24baa954",
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"inp = \"\"\"\n",
"Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. Claudia is a brunette and Alex is blonde.\n",
"Alex's dog Frosty is a labrador and likes to play hide and seek.\n",
" \"\"\""
]
},
{
"cell_type": "markdown",
"id": "84e0a241",
"metadata": {},
"source": [
"As we can see, we extracted the required entities and their properties in the required format:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "f771df58",
"metadata": {},
"outputs": [
},
{
"data": {
"text/plain": [
"[Properties(person_name='Alex', person_height=5, person_hair_color='blonde', dog_breed='labrador', dog_name='Frosty'),\n",
" Properties(person_name='Claudia', person_height=6, person_hair_color='brunette', dog_breed=None, dog_name=None)]"
"cell_type": "code",
"execution_count": 2,
"id": "34f04daf",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/harrisonchase/.pyenv/versions/3.9.1/envs/langchain/lib/python3.9/site-packages/deeplake/util/check_latest_version.py:32: UserWarning: A newer version of deeplake (3.6.4) is available. It's recommended that you update to the latest version using `pip install -U deeplake`.\n",
" warnings.warn(\n"
]
}
],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.chains import create_extraction_chain, create_extraction_chain_pydantic\n",
"from langchain.prompts import ChatPromptTemplate"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
},
{
"cell_type": "code",
"execution_count": 3,
"id": "a2648974",
"metadata": {},
"outputs": [],
"source": [
"llm = ChatOpenAI(temperature=0, model=\"gpt-3.5-turbo-0613\")"
]
},
{
"cell_type": "markdown",
"id": "5ef034ce",
"metadata": {},
"source": [
"## Extracting entities"
]
},
{
"cell_type": "markdown",
"id": "78ff9df9",
"metadata": {},
"source": [
"To extract entities, we need to create a schema where we specify all the properties we want to find and the type we expect them to have. We can also specify which of these properties are required and which are optional."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "4ac43eba",
"metadata": {},
"outputs": [],
"source": [
"schema = {\n",
" \"properties\": {\n",
" \"name\": {\"type\": \"string\"},\n",
" \"height\": {\"type\": \"integer\"},\n",
" \"hair_color\": {\"type\": \"string\"},\n",
" },\n",
" \"required\": [\"name\", \"height\"],\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "640bd005",
"metadata": {},
"outputs": [],
"source": [
"inp = \"\"\"\n",
"Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. Claudia is a brunette and Alex is blonde.\n",
" \"\"\""
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "64313214",
"metadata": {},
"outputs": [],
"source": [
"chain = create_extraction_chain(schema, llm)"
]
},
{
"cell_type": "markdown",
"id": "17c48adb",
"metadata": {},
"source": [
"As we can see, we extracted the required entities and their properties in the required format (it even calculated Claudia's height before returning!)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "cc5436ed",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'name': 'Alex', 'height': 5, 'hair_color': 'blonde'},\n",
" {'name': 'Claudia', 'height': 6, 'hair_color': 'brunette'}]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(inp)"
]
},
{
"cell_type": "markdown",
"id": "8d51fcdc",
"metadata": {},
"source": [
"## Several entity types"
]
},
{
"cell_type": "markdown",
"id": "5813affe",
"metadata": {},
"source": [
"Notice that we are using OpenAI functions under the hood and thus the model can only call one function per request (with one, unique schema)"
]
},
{
"cell_type": "markdown",
"id": "511b9838",
"metadata": {},
"source": [
"If we want to extract more than one entity type, we need to introduce a little hack - we will define our properties with an included entity type. \n",
"\n",
"Following we have an example where we also want to extract dog attributes from the passage. Notice the 'person_' and 'dog_' prefixes we use for each property; this tells the model which entity type the property refers to. In this way, the model can return properties from several entity types in one single call."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "cf243a26",
"metadata": {},
"outputs": [],
"source": [
"schema = {\n",
" \"properties\": {\n",
" \"person_name\": {\"type\": \"string\"},\n",
" \"person_height\": {\"type\": \"integer\"},\n",
" \"person_hair_color\": {\"type\": \"string\"},\n",
" \"dog_name\": {\"type\": \"string\"},\n",
" \"dog_breed\": {\"type\": \"string\"},\n",
" },\n",
" \"required\": [\"person_name\", \"person_height\"],\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "52841fb3",
"metadata": {},
"outputs": [],
"source": [
"inp = \"\"\"\n",
"Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. Claudia is a brunette and Alex is blonde.\n",
"Alex's dog Frosty is a labrador and likes to play hide and seek.\n",
" \"\"\""
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "93f904ab",
"metadata": {},
"outputs": [],
"source": [
"chain = create_extraction_chain(schema, llm)"
]
},
{
"cell_type": "markdown",
"id": "eb074f7b",
"metadata": {},
"source": [
"People attributes and dog attributes were correctly extracted from the text in the same call"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "db3e9e17",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'person_name': 'Alex',\n",
" 'person_height': 5,\n",
" 'person_hair_color': 'blonde',\n",
" 'dog_name': 'Frosty',\n",
" 'dog_breed': 'labrador'},\n",
" {'person_name': 'Claudia',\n",
" 'person_height': 6,\n",
" 'person_hair_color': 'brunette'}]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(inp)"
]
},
{
"cell_type": "markdown",
"id": "0273e0e2",
"metadata": {},
"source": [
"## Unrelated entities"
]
},
{
"cell_type": "markdown",
"id": "c07b3480",
"metadata": {},
"source": [
"What if our entities are unrelated? In that case, the model will return the unrelated entities in different dictionaries, allowing us to successfully extract several unrelated entity types in the same call."
]
},
{
"cell_type": "markdown",
"id": "01d98af0",
"metadata": {},
"source": [
"Notice that we use `required: []`: we need to allow the model to return **only** person attributes or **only** dog attributes for a single entity (person or dog)"
]
},
{
"cell_type": "code",
"execution_count": 48,
"id": "e584c993",
"metadata": {},
"outputs": [],
"source": [
"schema = {\n",
" \"properties\": {\n",
" \"person_name\": {\"type\": \"string\"},\n",
" \"person_height\": {\"type\": \"integer\"},\n",
" \"person_hair_color\": {\"type\": \"string\"},\n",
" \"dog_name\": {\"type\": \"string\"},\n",
" \"dog_breed\": {\"type\": \"string\"},\n",
" },\n",
" \"required\": [],\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 49,
"id": "ad6b105f",
"metadata": {},
"outputs": [],
"source": [
"inp = \"\"\"\n",
"Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. Claudia is a brunette and Alex is blonde.\n",
"\n",
"Willow is a German Shepherd that likes to play with other dogs and can always be found playing with Milo, a border collie that lives close by.\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 50,
"id": "6bfe5a33",
"metadata": {},
"outputs": [],
"source": [
"chain = create_extraction_chain(schema, llm)"
]
},
{
"cell_type": "markdown",
"id": "24fe09af",
"metadata": {},
"source": [
"We have each entity in its own separate dictionary, with only the appropriate attributes being returned"
]
},
{
"cell_type": "code",
"execution_count": 51,
"id": "f6e1fd89",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'person_name': 'Alex', 'person_height': 5, 'person_hair_color': 'blonde'},\n",
" {'person_name': 'Claudia',\n",
" 'person_height': 6,\n",
" 'person_hair_color': 'brunette'},\n",
" {'dog_name': 'Willow', 'dog_breed': 'German Shepherd'},\n",
" {'dog_name': 'Milo', 'dog_breed': 'border collie'}]"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(inp)"
]
},
{
"cell_type": "markdown",
"id": "0ac466d1",
"metadata": {},
"source": [
"## Extra info for an entity"
]
},
{
"cell_type": "markdown",
"id": "d240ffc1",
"metadata": {},
"source": [
"What if.. _we don't know what we want?_ More specifically, say we know a few properties we want to extract for a given entity but we also want to know if there's any extra information in the passage. Fortunately, we don't need to structure everything - we can have unstructured extraction as well. \n",
"\n",
"We can do this by introducing another hack, namely the *extra_info* attribute - let's see an example."
]
},
{
"cell_type": "code",
"execution_count": 68,
"id": "f19685f6",
"metadata": {},
"outputs": [],
"source": [
"schema = {\n",
" \"properties\": {\n",
" \"person_name\": {\"type\": \"string\"},\n",
" \"person_height\": {\"type\": \"integer\"},\n",
" \"person_hair_color\": {\"type\": \"string\"},\n",
" \"dog_name\": {\"type\": \"string\"},\n",
" \"dog_breed\": {\"type\": \"string\"},\n",
" \"dog_extra_info\": {\"type\": \"string\"},\n",
" },\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 81,
"id": "200c3477",
"metadata": {},
"outputs": [],
"source": [
"inp = \"\"\"\n",
"Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. Claudia is a brunette and Alex is blonde.\n",
"\n",
"Willow is a German Shepherd that likes to play with other dogs and can always be found playing with Milo, a border collie that lives close by.\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 82,
"id": "ddad7dc6",
"metadata": {},
"outputs": [],
"source": [
"chain = create_extraction_chain(schema, llm)"
]
},
{
"cell_type": "markdown",
"id": "e5c0dbbc",
"metadata": {},
"source": [
"It is nice to know more about Willow and Milo!"
]
},
{
"cell_type": "code",
"execution_count": 83,
"id": "c22cfd30",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'person_name': 'Alex', 'person_height': 5, 'person_hair_color': 'blonde'},\n",
" {'person_name': 'Claudia',\n",
" 'person_height': 6,\n",
" 'person_hair_color': 'brunette'},\n",
" {'dog_name': 'Willow',\n",
" 'dog_breed': 'German Shepherd',\n",
" 'dog_extra_information': 'likes to play with other dogs'},\n",
" {'dog_name': 'Milo',\n",
" 'dog_breed': 'border collie',\n",
" 'dog_extra_information': 'lives close by'}]"
]
},
"execution_count": 83,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(inp)"
]
},
{
"cell_type": "markdown",
"id": "698b4c4d",
"metadata": {},
"source": [
"## Pydantic example"
]
},
{
"cell_type": "markdown",
"id": "6504a6d9",
"metadata": {},
"source": [
"We can also use a Pydantic schema to choose the required properties and types and we will set as 'Optional' those that are not strictly required.\n",
"\n",
"By using the `create_extraction_chain_pydantic` function, we can send a Pydantic schema as input and the output will be an instantiated object that respects our desired schema. \n",
"\n",
"In this way, we can specify our schema in the same manner that we would a new class or function in Python - with purely Pythonic types."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "6792866b",
"metadata": {},
"outputs": [],
"source": [
"from typing import Optional, List\n",
"from pydantic import BaseModel, Field"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "36a63761",
"metadata": {},
"outputs": [],
"source": [
"class Properties(BaseModel):\n",
" person_name: str\n",
" person_height: int\n",
" person_hair_color: str\n",
" dog_breed: Optional[str]\n",
" dog_name: Optional[str]"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "8ffd1e57",
"metadata": {},
"outputs": [],
"source": [
"chain = create_extraction_chain_pydantic(pydantic_schema=Properties, llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "24baa954",
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"inp = \"\"\"\n",
"Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. Claudia is a brunette and Alex is blonde.\n",
"Alex's dog Frosty is a labrador and likes to play hide and seek.\n",
" \"\"\""
]
},
{
"cell_type": "markdown",
"id": "84e0a241",
"metadata": {},
"source": [
"As we can see, we extracted the required entities and their properties in the required format:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "f771df58",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Properties(person_name='Alex', person_height=5, person_hair_color='blonde', dog_breed='labrador', dog_name='Frosty'),\n",
" Properties(person_name='Claudia', person_height=6, person_hair_color='brunette', dog_breed=None, dog_name=None)]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(inp)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0df61283",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
],
"source": [
"chain.run(inp)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0df61283",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -72,7 +72,10 @@
"import numpy as np\n",
"\n",
"from langchain.schema import BaseRetriever\n",
"from langchain.callbacks.manager import AsyncCallbackManagerForRetrieverRun, CallbackManagerForRetrieverRun\n",
"from langchain.callbacks.manager import (\n",
" AsyncCallbackManagerForRetrieverRun,\n",
" CallbackManagerForRetrieverRun,\n",
")\n",
"from langchain.utilities import GoogleSerperAPIWrapper\n",
"from langchain.embeddings import OpenAIEmbeddings\n",
"from langchain.chat_models import ChatOpenAI\n",
@@ -97,13 +100,15 @@
"outputs": [],
"source": [
"class SerperSearchRetriever(BaseRetriever):\n",
"\n",
" search: GoogleSerperAPIWrapper = None\n",
"\n",
" def _get_relevant_documents(self, query: str, *, run_manager: CallbackManagerForRetrieverRun, **kwargs: Any) -> List[Document]:\n",
" def _get_relevant_documents(\n",
" self, query: str, *, run_manager: CallbackManagerForRetrieverRun, **kwargs: Any\n",
" ) -> List[Document]:\n",
" return [Document(page_content=self.search.run(query))]\n",
"\n",
" async def _aget_relevant_documents(self,\n",
" async def _aget_relevant_documents(\n",
" self,\n",
" query: str,\n",
" *,\n",
" run_manager: AsyncCallbackManagerForRetrieverRun,\n",

View File

@@ -83,9 +83,15 @@
"schema = client.schema()\n",
"schema.propertyKey(\"name\").asText().ifNotExist().create()\n",
"schema.propertyKey(\"birthDate\").asText().ifNotExist().create()\n",
"schema.vertexLabel(\"Person\").properties(\"name\", \"birthDate\").usePrimaryKeyId().primaryKeys(\"name\").ifNotExist().create()\n",
"schema.vertexLabel(\"Movie\").properties(\"name\").usePrimaryKeyId().primaryKeys(\"name\").ifNotExist().create()\n",
"schema.edgeLabel(\"ActedIn\").sourceLabel(\"Person\").targetLabel(\"Movie\").ifNotExist().create()"
"schema.vertexLabel(\"Person\").properties(\n",
" \"name\", \"birthDate\"\n",
").usePrimaryKeyId().primaryKeys(\"name\").ifNotExist().create()\n",
"schema.vertexLabel(\"Movie\").properties(\"name\").usePrimaryKeyId().primaryKeys(\n",
" \"name\"\n",
").ifNotExist().create()\n",
"schema.edgeLabel(\"ActedIn\").sourceLabel(\"Person\").targetLabel(\n",
" \"Movie\"\n",
").ifNotExist().create()"
]
},
{
@@ -124,7 +130,9 @@
"\n",
"g.addEdge(\"ActedIn\", \"1:Al Pacino\", \"2:The Godfather\", {})\n",
"g.addEdge(\"ActedIn\", \"1:Al Pacino\", \"2:The Godfather Part II\", {})\n",
"g.addEdge(\"ActedIn\", \"1:Al Pacino\", \"2:The Godfather Coda The Death of Michael Corleone\", {})\n",
"g.addEdge(\n",
" \"ActedIn\", \"1:Al Pacino\", \"2:The Godfather Coda The Death of Michael Corleone\", {}\n",
")\n",
"g.addEdge(\"ActedIn\", \"1:Robert De Niro\", \"2:The Godfather Part II\", {})"
]
},
@@ -164,7 +172,7 @@
" password=\"admin\",\n",
" address=\"localhost\",\n",
" port=8080,\n",
" graph=\"hugegraph\"\n",
" graph=\"hugegraph\",\n",
")"
]
},
@@ -228,9 +236,7 @@
"metadata": {},
"outputs": [],
"source": [
"chain = HugeGraphQAChain.from_llm(\n",
" ChatOpenAI(temperature=0), graph=graph, verbose=True\n",
")"
"chain = HugeGraphQAChain.from_llm(ChatOpenAI(temperature=0), graph=graph, verbose=True)"
]
},
{

View File

@@ -31,6 +31,7 @@
"outputs": [],
"source": [
"import kuzu\n",
"\n",
"db = kuzu.Database(\"test_db\")\n",
"conn = kuzu.Connection(db)"
]
@@ -61,7 +62,9 @@
],
"source": [
"conn.execute(\"CREATE NODE TABLE Movie (name STRING, PRIMARY KEY(name))\")\n",
"conn.execute(\"CREATE NODE TABLE Person (name STRING, birthDate STRING, PRIMARY KEY(name))\")\n",
"conn.execute(\n",
" \"CREATE NODE TABLE Person (name STRING, birthDate STRING, PRIMARY KEY(name))\"\n",
")\n",
"conn.execute(\"CREATE REL TABLE ActedIn (FROM Person TO Movie)\")"
]
},
@@ -94,11 +97,21 @@
"conn.execute(\"CREATE (:Person {name: 'Robert De Niro', birthDate: '1943-08-17'})\")\n",
"conn.execute(\"CREATE (:Movie {name: 'The Godfather'})\")\n",
"conn.execute(\"CREATE (:Movie {name: 'The Godfather: Part II'})\")\n",
"conn.execute(\"CREATE (:Movie {name: 'The Godfather Coda: The Death of Michael Corleone'})\")\n",
"conn.execute(\"MATCH (p:Person), (m:Movie) WHERE p.name = 'Al Pacino' AND m.name = 'The Godfather' CREATE (p)-[:ActedIn]->(m)\")\n",
"conn.execute(\"MATCH (p:Person), (m:Movie) WHERE p.name = 'Al Pacino' AND m.name = 'The Godfather: Part II' CREATE (p)-[:ActedIn]->(m)\")\n",
"conn.execute(\"MATCH (p:Person), (m:Movie) WHERE p.name = 'Al Pacino' AND m.name = 'The Godfather Coda: The Death of Michael Corleone' CREATE (p)-[:ActedIn]->(m)\")\n",
"conn.execute(\"MATCH (p:Person), (m:Movie) WHERE p.name = 'Robert De Niro' AND m.name = 'The Godfather: Part II' CREATE (p)-[:ActedIn]->(m)\")"
"conn.execute(\n",
" \"CREATE (:Movie {name: 'The Godfather Coda: The Death of Michael Corleone'})\"\n",
")\n",
"conn.execute(\n",
" \"MATCH (p:Person), (m:Movie) WHERE p.name = 'Al Pacino' AND m.name = 'The Godfather' CREATE (p)-[:ActedIn]->(m)\"\n",
")\n",
"conn.execute(\n",
" \"MATCH (p:Person), (m:Movie) WHERE p.name = 'Al Pacino' AND m.name = 'The Godfather: Part II' CREATE (p)-[:ActedIn]->(m)\"\n",
")\n",
"conn.execute(\n",
" \"MATCH (p:Person), (m:Movie) WHERE p.name = 'Al Pacino' AND m.name = 'The Godfather Coda: The Death of Michael Corleone' CREATE (p)-[:ActedIn]->(m)\"\n",
")\n",
"conn.execute(\n",
" \"MATCH (p:Person), (m:Movie) WHERE p.name = 'Robert De Niro' AND m.name = 'The Godfather: Part II' CREATE (p)-[:ActedIn]->(m)\"\n",
")"
]
},
{
@@ -137,9 +150,7 @@
"metadata": {},
"outputs": [],
"source": [
"chain = KuzuQAChain.from_llm(\n",
" ChatOpenAI(temperature=0), graph=graph, verbose=True\n",
")"
"chain = KuzuQAChain.from_llm(ChatOpenAI(temperature=0), graph=graph, verbose=True)"
]
},
{

View File

@@ -60,7 +60,8 @@
],
"metadata": {
"collapsed": false
}
},
"id": "7af596b5"
},
{
"cell_type": "markdown",
@@ -150,20 +151,20 @@
"text": [
"\n",
"\n",
"\u001B[1m> Entering new GraphSparqlQAChain chain...\u001B[0m\n",
"\u001b[1m> Entering new GraphSparqlQAChain chain...\u001b[0m\n",
"Identified intent:\n",
"\u001B[32;1m\u001B[1;3mSELECT\u001B[0m\n",
"\u001b[32;1m\u001b[1;3mSELECT\u001b[0m\n",
"Generated SPARQL:\n",
"\u001B[32;1m\u001B[1;3mPREFIX foaf: <http://xmlns.com/foaf/0.1/>\n",
"\u001b[32;1m\u001b[1;3mPREFIX foaf: <http://xmlns.com/foaf/0.1/>\n",
"SELECT ?homepage\n",
"WHERE {\n",
" ?person foaf:name \"Tim Berners-Lee\" .\n",
" ?person foaf:workplaceHomepage ?homepage .\n",
"}\u001B[0m\n",
"}\u001b[0m\n",
"Full Context:\n",
"\u001B[32;1m\u001B[1;3m[]\u001B[0m\n",
"\u001b[32;1m\u001b[1;3m[]\u001b[0m\n",
"\n",
"\u001B[1m> Finished chain.\u001B[0m\n"
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
@@ -207,19 +208,19 @@
"text": [
"\n",
"\n",
"\u001B[1m> Entering new GraphSparqlQAChain chain...\u001B[0m\n",
"\u001b[1m> Entering new GraphSparqlQAChain chain...\u001b[0m\n",
"Identified intent:\n",
"\u001B[32;1m\u001B[1;3mUPDATE\u001B[0m\n",
"\u001b[32;1m\u001b[1;3mUPDATE\u001b[0m\n",
"Generated SPARQL:\n",
"\u001B[32;1m\u001B[1;3mPREFIX foaf: <http://xmlns.com/foaf/0.1/>\n",
"\u001b[32;1m\u001b[1;3mPREFIX foaf: <http://xmlns.com/foaf/0.1/>\n",
"INSERT {\n",
" ?person foaf:workplaceHomepage <http://www.w3.org/foo/bar/> .\n",
"}\n",
"WHERE {\n",
" ?person foaf:name \"Timothy Berners-Lee\" .\n",
"}\u001B[0m\n",
"}\u001b[0m\n",
"\n",
"\u001B[1m> Finished chain.\u001B[0m\n"
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
@@ -234,7 +235,9 @@
}
],
"source": [
"chain.run(\"Save that the person with the name 'Timothy Berners-Lee' has a work homepage at 'http://www.w3.org/foo/bar/'\")"
"chain.run(\n",
" \"Save that the person with the name 'Timothy Berners-Lee' has a work homepage at 'http://www.w3.org/foo/bar/'\"\n",
")"
]
},
{
@@ -297,4 +300,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}

View File

@@ -0,0 +1,162 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# LLM Symbolic Math \n",
"This notebook showcases using LLMs and Python to Solve Algebraic Equations. Under the hood is makes use of [SymPy](https://www.sympy.org/en/index.html)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import OpenAI\n",
"from langchain.chains.llm_symbolic_math.base import LLMSymbolicMathChain\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"llm_symbolic_math = LLMSymbolicMathChain.from_llm(llm)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Integrals and derivates"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Answer: exp(x)*sin(x) + exp(x)*cos(x)'"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm_symbolic_math.run(\"What is the derivative of sin(x)*exp(x) with respect to x?\")"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Answer: exp(x)*sin(x)'"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm_symbolic_math.run(\n",
" \"What is the integral of exp(x)*sin(x) + exp(x)*cos(x) with respect to x?\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Solve linear and differential equations"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Answer: Eq(y(t), C2*exp(-t) + (C1 + t/2)*exp(t))'"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm_symbolic_math.run('Solve the differential equation y\" - y = e^t')"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Answer: {0, -sqrt(3)*I/3, sqrt(3)*I/3}'"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm_symbolic_math.run(\"What are the solutions to this equation y^3 + 1/3y?\")"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Answer: (3 - sqrt(7), -sqrt(7) - 2, 1 - sqrt(7)), (sqrt(7) + 3, -2 + sqrt(7), 1 + sqrt(7))'"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm_symbolic_math.run(\"x = y + 5, y = z - 3, z = x * y. Solve for x, y, z\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "venv",
"language": "python",
"name": "venv"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -38,7 +38,7 @@
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_documents(documents)\n",
"for i, text in enumerate(texts):\n",
" text.metadata['source'] = f\"{i}-pl\"\n",
" text.metadata[\"source\"] = f\"{i}-pl\"\n",
"embeddings = OpenAIEmbeddings()\n",
"docsearch = Chroma.from_documents(texts, embeddings)"
]
@@ -97,8 +97,8 @@
"outputs": [],
"source": [
"final_qa_chain = StuffDocumentsChain(\n",
" llm_chain=qa_chain, \n",
" document_variable_name='context',\n",
" llm_chain=qa_chain,\n",
" document_variable_name=\"context\",\n",
" document_prompt=doc_prompt,\n",
")"
]
@@ -111,8 +111,7 @@
"outputs": [],
"source": [
"retrieval_qa = RetrievalQA(\n",
" retriever=docsearch.as_retriever(),\n",
" combine_documents_chain=final_qa_chain\n",
" retriever=docsearch.as_retriever(), combine_documents_chain=final_qa_chain\n",
")"
]
},
@@ -175,8 +174,8 @@
"outputs": [],
"source": [
"final_qa_chain_pydantic = StuffDocumentsChain(\n",
" llm_chain=qa_chain_pydantic, \n",
" document_variable_name='context',\n",
" llm_chain=qa_chain_pydantic,\n",
" document_variable_name=\"context\",\n",
" document_prompt=doc_prompt,\n",
")"
]
@@ -189,8 +188,7 @@
"outputs": [],
"source": [
"retrieval_qa_pydantic = RetrievalQA(\n",
" retriever=docsearch.as_retriever(),\n",
" combine_documents_chain=final_qa_chain_pydantic\n",
" retriever=docsearch.as_retriever(), combine_documents_chain=final_qa_chain_pydantic\n",
")"
]
},
@@ -235,6 +233,7 @@
"from langchain.chains import ConversationalRetrievalChain\n",
"from langchain.memory import ConversationBufferMemory\n",
"from langchain.chains import LLMChain\n",
"\n",
"memory = ConversationBufferMemory(memory_key=\"chat_history\", return_messages=True)\n",
"_template = \"\"\"Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.\\\n",
"Make sure to avoid using any unclear pronouns.\n",
@@ -258,10 +257,10 @@
"outputs": [],
"source": [
"qa = ConversationalRetrievalChain(\n",
" question_generator=condense_question_chain, \n",
" question_generator=condense_question_chain,\n",
" retriever=docsearch.as_retriever(),\n",
" memory=memory, \n",
" combine_docs_chain=final_qa_chain\n",
" memory=memory,\n",
" combine_docs_chain=final_qa_chain,\n",
")"
]
},
@@ -389,7 +388,9 @@
" \"\"\"An answer to the question being asked, with sources.\"\"\"\n",
"\n",
" answer: str = Field(..., description=\"Answer to the question that was asked\")\n",
" countries_referenced: List[str] = Field(..., description=\"All of the countries mentioned in the sources\")\n",
" countries_referenced: List[str] = Field(\n",
" ..., description=\"All of the countries mentioned in the sources\"\n",
" )\n",
" sources: List[str] = Field(\n",
" ..., description=\"List of sources used to answer the question\"\n",
" )\n",
@@ -405,20 +406,23 @@
" HumanMessage(content=\"Answer question using the following context\"),\n",
" HumanMessagePromptTemplate.from_template(\"{context}\"),\n",
" HumanMessagePromptTemplate.from_template(\"Question: {question}\"),\n",
" HumanMessage(content=\"Tips: Make sure to answer in the correct format. Return all of the countries mentioned in the sources in uppercase characters.\"),\n",
" HumanMessage(\n",
" content=\"Tips: Make sure to answer in the correct format. Return all of the countries mentioned in the sources in uppercase characters.\"\n",
" ),\n",
"]\n",
"\n",
"chain_prompt = ChatPromptTemplate(messages=prompt_messages)\n",
"\n",
"qa_chain_pydantic = create_qa_with_structure_chain(llm, CustomResponseSchema, output_parser=\"pydantic\", prompt=chain_prompt)\n",
"qa_chain_pydantic = create_qa_with_structure_chain(\n",
" llm, CustomResponseSchema, output_parser=\"pydantic\", prompt=chain_prompt\n",
")\n",
"final_qa_chain_pydantic = StuffDocumentsChain(\n",
" llm_chain=qa_chain_pydantic,\n",
" document_variable_name='context',\n",
" document_variable_name=\"context\",\n",
" document_prompt=doc_prompt,\n",
")\n",
"retrieval_qa_pydantic = RetrievalQA(\n",
" retriever=docsearch.as_retriever(),\n",
" combine_documents_chain=final_qa_chain_pydantic\n",
" retriever=docsearch.as_retriever(), combine_documents_chain=final_qa_chain_pydantic\n",
")\n",
"query = \"What did he say about russia\"\n",
"retrieval_qa_pydantic.run(query)"

View File

@@ -35,7 +35,9 @@
"metadata": {},
"outputs": [],
"source": [
"chain = get_openapi_chain(\"https://www.klarna.com/us/shopping/public/openai/v0/api-docs/\")"
"chain = get_openapi_chain(\n",
" \"https://www.klarna.com/us/shopping/public/openai/v0/api-docs/\"\n",
")"
]
},
{
@@ -186,7 +188,9 @@
},
"outputs": [],
"source": [
"chain = get_openapi_chain(\"https://gist.githubusercontent.com/roaldnefs/053e505b2b7a807290908fe9aa3e1f00/raw/0a212622ebfef501163f91e23803552411ed00e4/openapi.yaml\")"
"chain = get_openapi_chain(\n",
" \"https://gist.githubusercontent.com/roaldnefs/053e505b2b7a807290908fe9aa3e1f00/raw/0a212622ebfef501163f91e23803552411ed00e4/openapi.yaml\"\n",
")"
]
},
{

View File

@@ -28,7 +28,7 @@
"\n",
"from pydantic import Extra\n",
"\n",
"from langchain.schemea import BaseLanguageModel\n",
"from langchain.schema import BaseLanguageModel\n",
"from langchain.callbacks.manager import (\n",
" AsyncCallbackManagerForChainRun,\n",
" CallbackManagerForChainRun,\n",

View File

@@ -22,7 +22,8 @@
"from typing import Optional\n",
"\n",
"from langchain.chains.openai_functions import (\n",
" create_openai_fn_chain, create_structured_output_chain\n",
" create_openai_fn_chain,\n",
" create_structured_output_chain,\n",
")\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate\n",
@@ -58,8 +59,10 @@
"source": [
"from pydantic import BaseModel, Field\n",
"\n",
"\n",
"class Person(BaseModel):\n",
" \"\"\"Identifying information about a person.\"\"\"\n",
"\n",
" name: str = Field(..., description=\"The person's name\")\n",
" age: int = Field(..., description=\"The person's age\")\n",
" fav_food: Optional[str] = Field(None, description=\"The person's favorite food\")"
@@ -77,12 +80,13 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mSystem: You are a world class algorithm for extracting information in structured formats.\n",
"Human: Use the given format to extract information from the following input:\n",
"Human: Sally is 13\n",
"Human: Tips: Make sure to answer in the correct format\u001b[0m\n",
" {'function_call': {'name': '_OutputFormatter', 'arguments': '{\\n \"output\": {\\n \"name\": \"Sally\",\\n \"age\": 13,\\n \"fav_food\": \"Unknown\"\\n }\\n}'}}\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -90,7 +94,7 @@
{
"data": {
"text/plain": [
"{'name': 'Sally', 'age': 13}"
"Person(name='Sally', age=13, fav_food='Unknown')"
]
},
"execution_count": 3,
@@ -100,16 +104,18 @@
],
"source": [
"# If we pass in a model explicitly, we need to make sure it supports the OpenAI function-calling API.\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0613\", temperature=0)\n",
"llm = ChatOpenAI(model=\"gpt-4\", temperature=0)\n",
"\n",
"prompt_msgs = [\n",
" SystemMessage(\n",
" content=\"You are a world class algorithm for extracting information in structured formats.\"\n",
" ),\n",
" HumanMessage(content=\"Use the given format to extract information from the following input:\"),\n",
" HumanMessagePromptTemplate.from_template(\"{input}\"),\n",
" HumanMessage(content=\"Tips: Make sure to answer in the correct format\"),\n",
" ]\n",
" SystemMessage(\n",
" content=\"You are a world class algorithm for extracting information in structured formats.\"\n",
" ),\n",
" HumanMessage(\n",
" content=\"Use the given format to extract information from the following input:\"\n",
" ),\n",
" HumanMessagePromptTemplate.from_template(\"{input}\"),\n",
" HumanMessage(content=\"Tips: Make sure to answer in the correct format\"),\n",
"]\n",
"prompt = ChatPromptTemplate(messages=prompt_msgs)\n",
"\n",
"chain = create_structured_output_chain(Person, llm, prompt, verbose=True)\n",
@@ -136,12 +142,13 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mSystem: You are a world class algorithm for extracting information in structured formats.\n",
"Human: Use the given format to extract information from the following input:\n",
"Human: Sally is 13, Joey just turned 12 and loves spinach. Caroline is 10 years older than Sally, so she's 23.\n",
"Human: Tips: Make sure to answer in the correct format\u001b[0m\n",
" {'function_call': {'name': '_OutputFormatter', 'arguments': '{\\n \"output\": {\\n \"people\": [\\n {\\n \"name\": \"Sally\",\\n \"age\": 13,\\n \"fav_food\": \"\"\\n },\\n {\\n \"name\": \"Joey\",\\n \"age\": 12,\\n \"fav_food\": \"spinach\"\\n },\\n {\\n \"name\": \"Caroline\",\\n \"age\": 23,\\n \"fav_food\": \"\"\\n }\\n ]\\n }\\n}'}}\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -149,9 +156,7 @@
{
"data": {
"text/plain": [
"{'people': [{'name': 'Sally', 'age': 13, 'fav_food': ''},\n",
" {'name': 'Joey', 'age': 12, 'fav_food': 'spinach'},\n",
" {'name': 'Caroline', 'age': 23, 'fav_food': ''}]}"
"People(people=[Person(name='Sally', age=13, fav_food=''), Person(name='Joey', age=12, fav_food='spinach'), Person(name='Caroline', age=23, fav_food='')])"
]
},
"execution_count": 4,
@@ -162,12 +167,17 @@
"source": [
"from typing import Sequence\n",
"\n",
"\n",
"class People(BaseModel):\n",
" \"\"\"Identifying information about all people in a text.\"\"\"\n",
"\n",
" people: Sequence[Person] = Field(..., description=\"The people in the text\")\n",
" \n",
"\n",
"\n",
"chain = create_structured_output_chain(People, llm, prompt, verbose=True)\n",
"chain.run(\"Sally is 13, Joey just turned 12 and loves spinach. Caroline is 10 years older than Sally, so she's 23.\")"
"chain.run(\n",
" \"Sally is 13, Joey just turned 12 and loves spinach. Caroline is 10 years older than Sally, so she's 23.\"\n",
")"
]
},
{
@@ -182,7 +192,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 6,
"id": "3484415e",
"metadata": {},
"outputs": [],
@@ -192,32 +202,21 @@
" \"description\": \"Identifying information about a person.\",\n",
" \"type\": \"object\",\n",
" \"properties\": {\n",
" \"name\": {\n",
" \"title\": \"Name\",\n",
" \"description\": \"The person's name\",\n",
" \"type\": \"string\"\n",
" },\n",
" \"age\": {\n",
" \"title\": \"Age\",\n",
" \"description\": \"The person's age\",\n",
" \"type\": \"integer\"\n",
" },\n",
" \"fav_food\": {\n",
" \"title\": \"Fav Food\",\n",
" \"description\": \"The person's favorite food\",\n",
" \"type\": \"string\"\n",
" }\n",
" \"name\": {\"title\": \"Name\", \"description\": \"The person's name\", \"type\": \"string\"},\n",
" \"age\": {\"title\": \"Age\", \"description\": \"The person's age\", \"type\": \"integer\"},\n",
" \"fav_food\": {\n",
" \"title\": \"Fav Food\",\n",
" \"description\": \"The person's favorite food\",\n",
" \"type\": \"string\",\n",
" },\n",
" },\n",
" \"required\": [\n",
" \"name\",\n",
" \"age\"\n",
" ]\n",
"}\n"
" \"required\": [\"name\", \"age\"],\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 7,
"id": "be9b76b3",
"metadata": {},
"outputs": [
@@ -227,12 +226,13 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mSystem: You are a world class algorithm for extracting information in structured formats.\n",
"Human: Use the given format to extract information from the following input:\n",
"Human: Sally is 13\n",
"Human: Tips: Make sure to answer in the correct format\u001b[0m\n",
" {'function_call': {'name': 'output_formatter', 'arguments': '{\\n \"name\": \"Sally\",\\n \"age\": 13\\n}'}}\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -243,7 +243,7 @@
"{'name': 'Sally', 'age': 13}"
]
},
"execution_count": 6,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
@@ -279,20 +279,22 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 8,
"id": "17f52508",
"metadata": {},
"outputs": [],
"source": [
"class RecordPerson(BaseModel):\n",
" \"\"\"Record some identifying information about a pe.\"\"\"\n",
"\n",
" name: str = Field(..., description=\"The person's name\")\n",
" age: int = Field(..., description=\"The person's age\")\n",
" fav_food: Optional[str] = Field(None, description=\"The person's favorite food\")\n",
"\n",
" \n",
"\n",
"class RecordDog(BaseModel):\n",
" \"\"\"Record some identifying information about a dog.\"\"\"\n",
"\n",
" name: str = Field(..., description=\"The dog's name\")\n",
" color: str = Field(..., description=\"The dog's color\")\n",
" fav_food: Optional[str] = Field(None, description=\"The dog's favorite food\")"
@@ -300,7 +302,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 9,
"id": "a4658ad8",
"metadata": {},
"outputs": [
@@ -310,12 +312,13 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mSystem: You are a world class algorithm for recording entities\n",
"Human: Make calls to the relevant function to record the entities in the following input:\n",
"Human: Harry was a chubby brown beagle who loved chicken\n",
"Human: Tips: Make sure to answer in the correct format\u001b[0m\n",
" {'function_call': {'name': 'RecordDog', 'arguments': '{\\n \"name\": \"Harry\",\\n \"color\": \"brown\",\\n \"fav_food\": \"chicken\"\\n}'}}\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -326,17 +329,17 @@
"RecordDog(name='Harry', color='brown', fav_food='chicken')"
]
},
"execution_count": 8,
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"prompt_msgs = [\n",
" SystemMessage(\n",
" content=\"You are a world class algorithm for recording entities\"\n",
" SystemMessage(content=\"You are a world class algorithm for recording entities\"),\n",
" HumanMessage(\n",
" content=\"Make calls to the relevant function to record the entities in the following input:\"\n",
" ),\n",
" HumanMessage(content=\"Make calls to the relevant function to record the entities in the following input:\"),\n",
" HumanMessagePromptTemplate.from_template(\"{input}\"),\n",
" HumanMessage(content=\"Tips: Make sure to answer in the correct format\"),\n",
"]\n",
@@ -359,7 +362,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 10,
"id": "95ac5825",
"metadata": {},
"outputs": [
@@ -369,12 +372,13 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mSystem: You are a world class algorithm for recording entities\n",
"Human: Make calls to the relevant function to record the entities in the following input:\n",
"Human: The most important thing to remember about Tommy, my 12 year old, is that he'll do anything for apple pie.\n",
"Human: Tips: Make sure to answer in the correct format\u001b[0m\n",
" {'function_call': {'name': 'record_person', 'arguments': '{\\n \"name\": \"Tommy\",\\n \"age\": 12,\\n \"fav_food\": {\\n \"food\": \"apple pie\"\\n }\\n}'}}\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -385,7 +389,7 @@
"{'name': 'Tommy', 'age': 12, 'fav_food': {'food': 'apple pie'}}"
]
},
"execution_count": 9,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
@@ -393,11 +397,16 @@
"source": [
"class OptionalFavFood(BaseModel):\n",
" \"\"\"Either a food or null.\"\"\"\n",
" food: Optional[str] = Field(None, description=\"Either the name of a food or null. Should be null if the food isn't known.\")\n",
"\n",
" food: Optional[str] = Field(\n",
" None,\n",
" description=\"Either the name of a food or null. Should be null if the food isn't known.\",\n",
" )\n",
"\n",
"\n",
"def record_person(name: str, age: int, fav_food: OptionalFavFood) -> str:\n",
" \"\"\"Record some basic identifying information about a person.\n",
" \n",
"\n",
" Args:\n",
" name: The person's name.\n",
" age: The person's age in years.\n",
@@ -405,9 +414,11 @@
" \"\"\"\n",
" return f\"Recording person {name} of age {age} with favorite food {fav_food.food}!\"\n",
"\n",
" \n",
"\n",
"chain = create_openai_fn_chain([record_person], llm, prompt, verbose=True)\n",
"chain.run(\"The most important thing to remember about Tommy, my 12 year old, is that he'll do anything for apple pie.\")"
"chain.run(\n",
" \"The most important thing to remember about Tommy, my 12 year old, is that he'll do anything for apple pie.\"\n",
")"
]
},
{
@@ -423,7 +434,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 11,
"id": "8b0d11de",
"metadata": {},
"outputs": [
@@ -433,12 +444,13 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mSystem: You are a world class algorithm for recording entities\n",
"Human: Make calls to the relevant function to record the entities in the following input:\n",
"Human: I can't find my dog Henry anywhere, he's a small brown beagle. Could you send a message about him?\n",
"Human: Tips: Make sure to answer in the correct format\u001b[0m\n",
" {'function_call': {'name': 'record_dog', 'arguments': '{\\n \"name\": \"Henry\",\\n \"color\": \"brown\",\\n \"fav_food\": {\\n \"food\": null\\n }\\n}'}}\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -450,7 +462,7 @@
" 'arguments': {'name': 'Henry', 'color': 'brown', 'fav_food': {'food': None}}}"
]
},
"execution_count": 10,
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
@@ -458,7 +470,7 @@
"source": [
"def record_dog(name: str, color: str, fav_food: OptionalFavFood) -> str:\n",
" \"\"\"Record some basic identifying information about a dog.\n",
" \n",
"\n",
" Args:\n",
" name: The dog's name.\n",
" color: The dog's color.\n",
@@ -468,7 +480,9 @@
"\n",
"\n",
"chain = create_openai_fn_chain([record_person, record_dog], llm, prompt, verbose=True)\n",
"chain.run(\"I can't find my dog Henry anywhere, he's a small brown beagle. Could you send a message about him?\")"
"chain.run(\n",
" \"I can't find my dog Henry anywhere, he's a small brown beagle. Could you send a message about him?\"\n",
")"
]
},
{
@@ -484,14 +498,6 @@
"- [OpenAPI](/docs/modules/chains/additional/openapi_openai): take an OpenAPI spec and create + execute valid requests against the API, using OpenAI functions under the hood.\n",
"- [QA with citations](/docs/modules/chains/additional/qa_citations): use OpenAI functions ability to extract citations from text."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "93425c66",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {

View File

@@ -77,7 +77,9 @@
}
],
"source": [
"loader = BraveSearchLoader(query=\"obama middle name\", api_key=api_key, search_kwargs={\"count\": 3})\n",
"loader = BraveSearchLoader(\n",
" query=\"obama middle name\", api_key=api_key, search_kwargs={\"count\": 3}\n",
")\n",
"docs = loader.load()\n",
"len(docs)"
]

View File

@@ -0,0 +1,104 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Browserless\n",
"\n",
"Browserless is a service that allows you to run headless Chrome instances in the cloud. It's a great way to run browser-based automation at scale without having to worry about managing your own infrastructure.\n",
"\n",
"To use Browserless as a document loader, initialize a `BrowserlessLoader` instance as shown in this notebook. Note that by default, `BrowserlessLoader` returns the `innerText` of the page's `body` element. To disable this and get the raw HTML, set `text_content` to `False`."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import BrowserlessLoader"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"BROWSERLESS_API_TOKEN = \"YOUR_BROWSERLESS_API_TOKEN\""
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Jump to content\n",
"Main menu\n",
"Search\n",
"Create account\n",
"Log in\n",
"Personal tools\n",
"Toggle the table of contents\n",
"Document classification\n",
"17 languages\n",
"Article\n",
"Talk\n",
"Read\n",
"Edit\n",
"View history\n",
"Tools\n",
"From Wikipedia, the free encyclopedia\n",
"\n",
"Document classification or document categorization is a problem in library science, information science and computer science. The task is to assign a document to one or more classes or categories. This may be done \"manually\" (or \"intellectually\") or algorithmically. The intellectual classification of documents has mostly been the province of library science, while the algorithmic classification of documents is mainly in information science and computer science. The problems are overlapping, however, and there is therefore interdisciplinary research on document classification.\n",
"\n",
"The documents to be classified may be texts, images, music, etc. Each kind of document possesses its special classification problems. When not otherwise specified, text classification is implied.\n",
"\n",
"Do\n"
]
}
],
"source": [
"loader = BrowserlessLoader(\n",
" api_token=BROWSERLESS_API_TOKEN,\n",
" urls=[\n",
" \"https://en.wikipedia.org/wiki/Document_classification\",\n",
" ],\n",
" text_content=True,\n",
")\n",
"\n",
"documents = loader.load()\n",
"\n",
"print(documents[0].page_content[:1000])"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -0,0 +1,96 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Datadog Logs\n",
"\n",
">[Datadog](https://www.datadoghq.com/) is a monitoring and analytics platform for cloud-scale applications.\n",
"\n",
"This loader fetches the logs from your applications in Datadog using the `datadog_api_client` Python package. You must initialize the loader with your `Datadog API key` and `APP key`, and you need to pass in the query to extract the desired logs."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import DatadogLogsLoader"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#!pip install datadog-api-client"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"query = \"service:agent status:error\"\n",
"\n",
"loader = DatadogLogsLoader(\n",
" query=query,\n",
" api_key=DD_API_KEY,\n",
" app_key=DD_APP_KEY,\n",
" from_time=1688732708951, # Optional, timestamp in milliseconds\n",
" to_time=1688736308951, # Optional, timestamp in milliseconds\n",
" limit=100, # Optional, default is 100\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='message: grep: /etc/datadog-agent/system-probe.yaml: No such file or directory', metadata={'id': 'AgAAAYkwpLImvkjRpQAAAAAAAAAYAAAAAEFZa3dwTUFsQUFEWmZfLU5QdElnM3dBWQAAACQAAAAAMDE4OTMwYTQtYzk3OS00MmJjLTlhNDAtOTY4N2EwY2I5ZDdk', 'status': 'error', 'service': 'agent', 'tags': ['accessible-from-goog-gke-node', 'allow-external-ingress-high-ports', 'allow-external-ingress-http', 'allow-external-ingress-https', 'container_id:c7d8ecd27b5b3cfdf3b0df04b8965af6f233f56b7c3c2ffabfab5e3b6ccbd6a5', 'container_name:lab_datadog_1', 'datadog.pipelines:false', 'datadog.submission_auth:private_api_key', 'docker_image:datadog/agent:7.41.1', 'env:dd101-dev', 'hostname:lab-host', 'image_name:datadog/agent', 'image_tag:7.41.1', 'instance-id:7497601202021312403', 'instance-type:custom-1-4096', 'instruqt_aws_accounts:', 'instruqt_azure_subscriptions:', 'instruqt_gcp_projects:', 'internal-hostname:lab-host.d4rjybavkary.svc.cluster.local', 'numeric_project_id:3390740675', 'p-d4rjybavkary', 'project:instruqt-prod', 'service:agent', 'short_image:agent', 'source:agent', 'zone:europe-west1-b'], 'timestamp': datetime.datetime(2023, 7, 7, 13, 57, 27, 206000, tzinfo=tzutc())}),\n",
" Document(page_content='message: grep: /etc/datadog-agent/system-probe.yaml: No such file or directory', metadata={'id': 'AgAAAYkwpLImvkjRpgAAAAAAAAAYAAAAAEFZa3dwTUFsQUFEWmZfLU5QdElnM3dBWgAAACQAAAAAMDE4OTMwYTQtYzk3OS00MmJjLTlhNDAtOTY4N2EwY2I5ZDdk', 'status': 'error', 'service': 'agent', 'tags': ['accessible-from-goog-gke-node', 'allow-external-ingress-high-ports', 'allow-external-ingress-http', 'allow-external-ingress-https', 'container_id:c7d8ecd27b5b3cfdf3b0df04b8965af6f233f56b7c3c2ffabfab5e3b6ccbd6a5', 'container_name:lab_datadog_1', 'datadog.pipelines:false', 'datadog.submission_auth:private_api_key', 'docker_image:datadog/agent:7.41.1', 'env:dd101-dev', 'hostname:lab-host', 'image_name:datadog/agent', 'image_tag:7.41.1', 'instance-id:7497601202021312403', 'instance-type:custom-1-4096', 'instruqt_aws_accounts:', 'instruqt_azure_subscriptions:', 'instruqt_gcp_projects:', 'internal-hostname:lab-host.d4rjybavkary.svc.cluster.local', 'numeric_project_id:3390740675', 'p-d4rjybavkary', 'project:instruqt-prod', 'service:agent', 'short_image:agent', 'source:agent', 'zone:europe-west1-b'], 'timestamp': datetime.datetime(2023, 7, 7, 13, 57, 27, 206000, tzinfo=tzutc())})]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"documents = loader.load()\n",
"documents"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.11"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}

Some files were not shown because too many files have changed in this diff Show More