Compare commits


116 Commits

Author SHA1 Message Date
William Fu-Hinthorn
d05f99930c silly mypy fun 2023-10-24 21:59:11 -07:00
William Fu-Hinthorn
19b07f6d61 update 2023-10-24 19:08:28 -07:00
William Fu-Hinthorn
5c9b679dde Runnable Traceable 2023-10-24 17:32:34 -07:00
William Fu-Hinthorn
1cd73e8dbd merge 2023-10-24 17:22:13 -07:00
William FH
276c6ba115 Check for ls project in run tree context (#12242)
If I go traceable -> runnable when the project is manually specified,
the runnable won't be logged. This makes sure the session/project is
threaded through appropriately.
2023-10-24 17:18:59 -07:00
Vasek Mlejnsky
1f8094938f Integrate E2B's data analysis/code interpreter (#12011)
This PR adds [E2B's](https://e2b.dev/) data analysis/code interpreter
sandbox as a tool

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Jakub Novak <jakub@e2b.dev>
2023-10-24 16:04:02 -07:00
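
A usage sketch of the tool described above, assuming the class is exported as `E2BDataAnalysisTool` and reads an `E2B_API_KEY` environment variable (both assumptions based on the PR description):

```python
import os

from langchain.tools import E2BDataAnalysisTool  # class name assumed

os.environ["E2B_API_KEY"] = "<your-e2b-api-key>"  # placeholder
tool = E2BDataAnalysisTool()
# Runs Python inside E2B's remote sandbox and returns the output.
print(tool.run("print(2 + 2)"))
```
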
Bagatur
d2cb95c39d Docs: add lcel to sequential chain (#12234) 2023-10-24 15:15:35 -07:00
Holt Skinner
e7e670805c docs: Google Cloud Documentation Cleanup (#12224)
- Move Document AI provider to the Google provider page
- Change Vertex AI Matching Engine to Vector Search
- Change references from GCP to Google Cloud
- Add Gmail chat loader to Google provider page
- Change Serper page title to "Serper - Google Search API" since it is
not a Google product.
2023-10-24 14:54:43 -07:00
Bagatur
286a29a49e bump 322 and 34 (#12228) 2023-10-24 13:52:17 -07:00
Bagatur
2008a6438c add experimental test release gha (#12229) 2023-10-24 13:49:16 -07:00
Eugene Yurtsev
583dc49477 Add type to Generation and sub-classes, handle root validator (#12220)
* Add a type literal for the generation and sub-classes for serialization purposes.
* Fix the root validator of ChatGeneration to raise ValueError instead of KeyError or AttributeError if initialized improperly.
* This change is done for langserve to make sure that llm related callbacks can be serialized/deserialized properly.
2023-10-24 16:21:00 -04:00
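
A minimal sketch of the validator pattern described above, using a toy model rather than the real `ChatGeneration` class:

```python
from typing import Literal

from langchain.pydantic_v1 import BaseModel, root_validator


class MyGeneration(BaseModel):
    """Toy stand-in for Generation; not the actual langchain class."""

    text: str
    # Serialization type tag, analogous to the new `type` literal.
    type: Literal["my_generation"] = "my_generation"

    @root_validator(pre=True)
    def validate_fields(cls, values: dict) -> dict:
        # Raise ValueError (not KeyError/AttributeError) on bad init so
        # serialization layers like langserve see one consistent error type.
        if "text" not in values:
            raise ValueError("MyGeneration requires a 'text' field")
        return values
```
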
William Fu-Hinthorn
d085f51c5d update 2023-10-24 13:17:50 -07:00
Eugene Yurtsev
81052ee18e Fix code block in runnable doc (#12221)
Fix code block syntax in runnable doc-string
2023-10-24 16:11:58 -04:00
Mikelarg
46e28b9613 Added GigaChat chat model support (#12201)
- **Description:** Added integration with
[GigaChat](https://developers.sber.ru/portal/products/gigachat) language
model.
- **Twitter handle:** @dvoshansky
2023-10-24 12:53:51 -07:00
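
A usage sketch, assuming the new chat model is exported as `GigaChat` and authenticates with a `credentials` parameter (check the integration docs for the exact names):

```python
from langchain.chat_models import GigaChat
from langchain.schema import HumanMessage

# `credentials` / `verify_ssl_certs` are assumptions; see the GigaChat docs.
chat = GigaChat(credentials="<your-gigachat-token>", verify_ssl_certs=False)
print(chat([HumanMessage(content="Hello!")]).content)
```
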
Dayuan Jiang
9c2c9c5274 fix typo in langchain/cookbook/stepback-qa.ipynb (#12204) 2023-10-24 12:51:51 -07:00
Bagatur
87af2360df mv old integration docs (#12217) 2023-10-24 12:38:16 -07:00
Bagatur
6e3f39963f Docs: consolidate top nav (#12219) 2023-10-24 12:28:08 -07:00
Anurag Wagh
d5c2ce7c2e [fix] create redis vector index before adding docs, add prefix to doc… (#11257)
Fix description:
For the Redis vector integration's add_texts method, there were two issues
that led to this bug:
1. The vector index was not being created, leading to a "no such index" error.
2. The `doc:index` prefix was also missing from Redis keys.

resolves #11197 
Maintainer: @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-24 10:51:25 -07:00
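
For context, the flow this fix makes work; a sketch assuming a local Redis Stack instance and an OpenAI key:

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores.redis import Redis

rds = Redis.from_texts(
    ["foo", "bar"],
    OpenAIEmbeddings(),
    redis_url="redis://localhost:6379",
    index_name="docs",
)
# Previously this path could hit a "no such index" error and write keys
# without the expected doc:{index_name} prefix.
rds.add_texts(["baz"])
```
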
Eugene Yurtsev
079d1f3b8e Expose handle_event and ahandle_events as public API (#12181)
Expose functionality to handle generic events.
2023-10-24 13:42:28 -04:00
William FH
67c4fd0ad0 Update deprecation (#12178)
in runner_utils
2023-10-24 10:37:28 -07:00
Nir Kopler
d3744175bf Finetuned OpenAI models cost calculation #11715 (#12190)
**Description:**
Add cost calculation for fine-tuned models (new and legacy). This is
required after OpenAI added new models for fine-tuning and separated the
I/O costs for fine-tuned models.
I also updated the relevant unit tests.
See https://platform.openai.com/docs/guides/fine-tuning for more
information.
issue: https://github.com/langchain-ai/langchain/issues/11715

  - **Issue:** 11715
  - **Twitter handle:** @nirkopler
2023-10-24 10:22:05 -07:00
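
The new pricing surfaces through the standard OpenAI callback; a sketch with a placeholder fine-tuned model id:

```python
from langchain.callbacks import get_openai_callback
from langchain.chat_models import ChatOpenAI

# Placeholder fine-tuned model id; input/output tokens are priced separately.
llm = ChatOpenAI(model_name="ft:gpt-3.5-turbo-0613:my-org::abc123")
with get_openai_callback() as cb:
    llm.predict("Hello")
    print(cb.total_cost)  # reflects the fine-tuned I/O rates
```
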
Spyros
a2840a2b42 fix vertexai codey models (#12173)
**Description:**

This PR fixes issue #12156 by checking for Codey models appropriately
before result parsing.


Maintainer: @hwchase17 , @agola11
2023-10-24 10:20:05 -07:00
Leonid Ganeline
386ea48432 updated integrations/providers/microsoft (#12177)
Added several missing tools, utilities, and toolkits to the `Microsoft` page.
2023-10-24 10:19:06 -07:00
Hech
d76f026d72 Fix flexible dimension and doc for DingoDB (#12187) 2023-10-24 10:16:19 -07:00
Erick Friis
95ae40ff90 Fix Anthropic Functions ainvoke (#12215)
Removes the custom `NotImplementedError` in experimental anthropic
functions, allowing it to fall back on the default `ainvoke` implementation.
2023-10-24 10:07:01 -07:00
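
With the override removed, the default async path applies; a sketch (the `llm=` constructor shape is an assumption, check the experimental module):

```python
import asyncio

from langchain.chat_models import ChatAnthropic
from langchain_experimental.llms.anthropic_functions import AnthropicFunctions


async def main() -> None:
    model = AnthropicFunctions(llm=ChatAnthropic(model="claude-2"))
    # No more custom NotImplementedError: the default `ainvoke` now works.
    print(await model.ainvoke("Hello"))


asyncio.run(main())
```
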
Iskren Ivov Chernev
d5d7ba582a Improvements to llm/deepinfra (#10846)
- replace `requests` package with `langchain.requests`
- add `_acall` support
- add `_stream` and `_astream`
- freshen up the documentation a bit
- update vendor doc
2023-10-24 09:54:23 -07:00
sudranga
f09f82541b Expose configuration options in GraphCypherQAChain (#12159)
Allows for passing arguments into the LLM chains used by the
GraphCypherQAChain. This addresses a user request to include memory in
the Cypher-creating chain. The prompt variables are kept as-is to be
backward compatible, but it would be a good idea to deprecate them and
use the **kwargs variables instead. Added a test case.

In general, I think it would be good for any chain to automatically pass
in a ReadOnlyMemory (of its input) to its subchains while allowing for
an override. But that would be a different change.
2023-10-24 09:52:55 -07:00
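
A sketch of what the change enables; `cypher_llm_kwargs` is an assumed keyword name for forwarding LLMChain arguments (such as memory) into the Cypher-generating chain:

```python
from langchain.chains import GraphCypherQAChain
from langchain.chains.graph_qa.prompts import CYPHER_GENERATION_PROMPT
from langchain.chat_models import ChatOpenAI
from langchain.graphs import Neo4jGraph
from langchain.memory import ConversationBufferMemory, ReadOnlySharedMemory

graph = Neo4jGraph(url="bolt://localhost:7687", username="neo4j", password="pw")
memory = ReadOnlySharedMemory(
    memory=ConversationBufferMemory(memory_key="history", input_key="question")
)

# `cypher_llm_kwargs` (assumed name) is forwarded to the Cypher LLMChain.
chain = GraphCypherQAChain.from_llm(
    ChatOpenAI(temperature=0),
    graph=graph,
    cypher_llm_kwargs={"prompt": CYPHER_GENERATION_PROMPT, "memory": memory},
)
```
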
Leonid Ganeline
11f13aed53 docstrings update (#12093)
Added missing docstrings. Added missing Args:, Returns:, and Raises: sections.
2023-10-24 09:34:10 -07:00
Johnny Oshika
ba20c14e28 Fix typo in stuff_prompt's system_template (#12063)
- **Description:** 

Add missing apostrophe in `user's` in stuff_prompt's system_template.
The first sentence in the system template went from:

> Use the following pieces of context to answer the users question.

to

> Use the following pieces of context to answer the user's question.

- **Issue:** 
- **Dependencies:** none
- **Tag maintainer:** @baskaryan
- **Twitter handle:** ojohnnyo
2023-10-24 09:21:28 -07:00
Bagatur
deb8168329 fix note callout (#12214) 2023-10-24 09:17:18 -07:00
Bagatur
8ba97cb408 separate compile integration tests (#12171)
Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>
2023-10-24 08:55:19 -07:00
Bagatur
44dae6936b Docs: Add LCEL to chains/foundational/llm (#12213) 2023-10-24 08:53:55 -07:00
Bagatur
922193475a Docs: Add LCEL to chains/foundational/transform (#12212) 2023-10-24 08:52:47 -07:00
Bagatur
55f0f8dae8 Docs: add LCEL to chains/foundational/router (#12211) 2023-10-24 08:51:12 -07:00
Holt Skinner
69d9eae5cd feat: Add Client Info to available Google Cloud Clients (#12168)
- This is used internally to gather aggregate usage metrics for the
LangChain integrations

- Note: This cannot be added to some of the Vertex AI integrations at
this time because the SDK doesn't allow overriding the
[`ClientInfo`](https://googleapis.dev/python/google-api-core/latest/client_info.html#module-google.api_core.client_info)

- Added to:
  - BigQuery
  - Google Cloud Storage
  - Document AI
  - Vertex AI Model Garden
  - Document AI Warehouse
  - Vertex AI Search
  - Vertex AI Matching Engine (Cloud Storage Client)
 
@baskaryan, @eyurtsev, @hwchase17

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-24 08:49:11 -07:00
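
For reference, the `google.api_core` mechanism involved; a sketch of attaching a user agent to a BigQuery client (the exact user-agent string LangChain sends is internal; this one is illustrative):

```python
from google.api_core.client_info import ClientInfo
from google.cloud import bigquery

# Attach a user agent so usage can be attributed to the integration.
client = bigquery.Client(client_info=ClientInfo(user_agent="langchain/0.0.320"))
```
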
Lukas Wolf
69f5f82804 Update extraction.py (#12207)
Description: Pass tags as argument to create_extraction_chain
Issue: create_extraction_chain does not pass tags to chain yet 

@baskaryan
2023-10-24 08:25:14 -07:00
Nuno Campos
34ffb94770 Remove GetLocal, PutLocal (#12133)
Do you agree?
2023-10-24 10:16:46 +01:00
Eric Hartford
8c150ad7f6 Add COBOL parser and splitter (#11674)
- **Description:** Add COBOL parser and splitter
  - **Issue:** n/a
  - **Dependencies:** n/a
  - **Tag maintainer:** @baskaryan 
  - **Twitter handle:** erhartford

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-23 15:44:31 -04:00
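
A usage sketch, assuming the new language is wired into `Language` like the existing ones:

```python
from langchain.text_splitter import Language, RecursiveCharacterTextSplitter

cobol_code = """IDENTIFICATION DIVISION.
PROGRAM-ID. HELLO.
PROCEDURE DIVISION.
    DISPLAY 'HELLO WORLD'.
    STOP RUN."""

splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.COBOL, chunk_size=60, chunk_overlap=0
)
print(splitter.create_documents([cobol_code]))
```
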
Ikko Eltociear Ashimine
bb137fd6e7 Fix typo in jsonformer_experimental.ipynb (#12099)
HuggingFace -> Hugging Face

2023-10-23 15:35:54 -04:00
Eugene Yurtsev
ace2234391 Update security.md (#11942)
Update security.md
2023-10-23 15:35:33 -04:00
John Mai
ebf749c40c Baichuan & Hunyuan set default api_base (#12059)
### Description
Baichuan & Hunyuan set default api_base env
2023-10-23 15:33:35 -04:00
Priyanshu Prajapati
283a3ecc9c Create CODE_OF_CONDUCT.md (#12105)
A CODE_OF_CONDUCT.md file was missing; it is generally present in good
repos that have a large community.

- **Description:** Added a `code_of_conduct.md` file to the repository
to establish community standards and guidelines for contributors.
- **Issue:** N/A
- **Dependencies:** N/A
- **Tag maintainer:** N/A

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-23 15:15:24 -04:00
Shilong Dai
99afc1b4f8 Fixed hardcoded "vector" and replaced with vector_query_field variable (#12126)
- **Description:** In the max_marginal_relevance_search function of the
ElasticsearchStore vector store, the name of the field corresponding to
the vector embedding of the document is hard coded in the delete
statement that drops the field from the document metadata. This results
in an exception if the vector embedding field is customized. This PR
changes the hard-coded "vector" into the vector_query_field variable.
  - **Issue:** None
  - **Dependencies:** None
  - **Tag maintainer:** @hwchase17

Co-authored-by: Shilong Dai <sdai@viperfish.net>
2023-10-23 15:08:55 -04:00
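
The customization that previously broke MMR search; a sketch assuming a running Elasticsearch instance:

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores.elasticsearch import ElasticsearchStore

store = ElasticsearchStore(
    index_name="docs",
    embedding=OpenAIEmbeddings(),
    es_url="http://localhost:9200",
    vector_query_field="my_embedding",  # customized vector field name
)
# The hard-coded "vector" is gone: the store now strips
# `vector_query_field` from the returned document metadata.
docs = store.max_marginal_relevance_search("example query", k=4)
```
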
Vikram Shitole
0d44746430 10634: Added the capability to inject boto3 client in SagemakerEndpointEmbeddings (#12146)
**Description: Allow to inject boto3 client for Cross account access
type of scenarios in using SagemakerEndpointEmbeddings and also updated
the documentation for same in the sample notebook**

**Issue:SagemakerEndpointEmbeddings cross account capability #10634
#10184**

Dependencies: None
Tag maintainer:
Twitter handle:lethargicoder

Co-authored-by: Vikram(VS) <vssht@amazon.com>
2023-10-23 15:08:26 -04:00
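
A cross-account usage sketch; the injected boto3 client is the new part (the endpoint name and region are placeholders, and the `client` parameter name is an assumption):

```python
import json

import boto3
from langchain.embeddings import SagemakerEndpointEmbeddings
from langchain.embeddings.sagemaker_endpoint import EmbeddingsContentHandler


class ContentHandler(EmbeddingsContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, prompts, model_kwargs):
        return json.dumps({"text_inputs": prompts, **model_kwargs}).encode("utf-8")

    def transform_output(self, output):
        return json.loads(output.read().decode("utf-8"))["embedding"]


# Build the client from cross-account credentials (e.g. an assumed role),
# then inject it instead of letting the embeddings class create its own.
client = boto3.client("sagemaker-runtime", region_name="us-east-1")
embeddings = SagemakerEndpointEmbeddings(
    endpoint_name="my-embedding-endpoint",  # placeholder
    client=client,  # injection is the new capability
    content_handler=ContentHandler(),
)
```
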
Deepanshu
ff79a99825 Fix Typo in CONTRIBUTING.md file (#12145)
Fix typo & add suitable pronoun in CONTRIBUTING.md file


Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-23 14:53:03 -04:00
aubin_mzt
66f8cb015d Add connection args for pgvector vector store (#11930)
- **Description:** sqlalchemy create_engine() does not take into account
connect_args, which are mandatory for managed PGSQL instances on cloud
providers (ssl_context, for example).
Also re-enabled create_vector_extension at post_init so the pgvector
class can be used seamlessly.
- **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17.

---------

Co-authored-by: Sami Bargaoui <bargaoui.sam@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-10-23 14:43:44 -04:00
NuODaniel
4d6243fa87 fix: doc string of default params in chat_models, llm qianfan (#12153)
- **Description:** a fix of the doc string in Qianfan
  - **Issue:** no
  - **Dependencies:** no
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** no
2023-10-23 14:03:18 -04:00
Predrag Gruevski
f82bdf4613 Update deprecated langchain imports with suggested new paths. (#12164)
Let's help our users find the proper import to use instead of the
deprecated top-level ones.
2023-10-23 13:52:08 -04:00
Bagatur
963ff93476 bump 321 (#12161) 2023-10-23 12:49:38 -04:00
Nuno Campos
d0505c0d47 Update default recursion_limit, update docs (#12134)
2023-10-23 16:29:17 +01:00
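
For reference, the recursion limit is a per-call setting in the runnable config; a minimal sketch:

```python
from langchain.schema.runnable import RunnableLambda

chain = RunnableLambda(lambda x: x + 1)
# The recursion limit guards self-invoking runnables; override it per call:
print(chain.invoke(1, config={"recursion_limit": 10}))
```
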
William FH
4f23aa677a Fix Pickle Error (#12141)
If non-pickleable objects (like locks) get passed to the tracing
callback, they'll fail in the deepcopy. Fall back to a shallow copy in
these instances.
2023-10-23 08:22:47 -07:00
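
The fallback pattern described, as a standalone sketch (not the tracer's exact code):

```python
import copy
import threading


def safe_copy(obj):
    # Locks and similar objects aren't deep-copyable; degrade gracefully.
    try:
        return copy.deepcopy(obj)
    except (TypeError, copy.Error):
        return copy.copy(obj)


payload = {"lock": threading.Lock(), "data": [1, 2, 3]}
print(safe_copy(payload))
```
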
Predrag Gruevski
95a1b598fe Update to actions/checkout@v4. (#11951)
We don't use any of the new functionality at the moment. Just making
sure we don't fall behind on versions and fail to benefit from new
patches. This is an easy upgrade, and it's always harder to upgrade
across multiple major versions at once.
2023-10-23 10:01:33 -04:00
William FH
7c4f340cc0 Include Parent Run ID (#12139)
If you set local callbacks
2023-10-22 17:19:11 -07:00
Sanyam Jain
3df0f03928 Improved readability of Docs (#12136)
Improved grammar and readability of the docs.
 
@hwchase17
2023-10-22 17:16:30 -07:00
omahs
f3cc9bba5b Fix typos (#12128)
Fix typos
2023-10-22 17:16:03 -07:00
Nuno Campos
1afdb40b48 Add optional config arg to RunnablePassthrough func arg (#12131)
2023-10-22 19:57:16 +01:00
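
A sketch of the new option, assuming the passthrough func may now take a second `config` argument:

```python
from langchain.schema.runnable import RunnableConfig, RunnablePassthrough


def log_input(value, config: RunnableConfig) -> None:
    # Side-effect functions can now see the per-call config (tags, etc.).
    print("tags:", config.get("tags"))


chain = RunnablePassthrough(log_input)
print(chain.invoke("hi", config={"tags": ["demo"]}))  # logs tags, returns "hi"
```
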
Nuno Campos
325fdde8b4 Fix bug where types were lost when calling with_config or bind (#12137)
2023-10-22 19:26:13 +01:00
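
The behavior being fixed, sketched: wrappers produced by `bind` and `with_config` should keep reporting the underlying runnable's schemas:

```python
from langchain.schema.runnable import RunnableLambda

add_one = RunnableLambda(lambda x: x + 1)

bound = add_one.bind()  # previously dropped the input/output types
configured = add_one.with_config(tags=["math"])

print(bound.input_schema.schema())
print(configured.output_schema.schema())
```
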
Nuno Campos
2719e49718 Add how-to guide on runnable generators (#12135)
2023-10-22 19:02:17 +01:00
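
In the spirit of that guide, a generator function used as a streaming step; when piped, generator functions are coerced into runnables (a minimal sketch):

```python
from typing import Iterator

from langchain.schema.runnable import RunnableLambda


def double_up(chunks: Iterator[str]) -> Iterator[str]:
    # Consume the input stream, yield a transformed output stream.
    for chunk in chunks:
        yield chunk * 2


chain = RunnableLambda(str) | double_up
print(chain.invoke("abc"))        # "abcabc"
print(list(chain.stream("abc")))  # streamed chunks
```
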
Nuno Campos
02dce74b97 Fix type hint for older py versions (#12132)
2023-10-22 18:01:09 +01:00
Nuno Campos
d0ce374731 Allow specifying custom input/output schemas for runnables with .with_types() (#12083)
2023-10-22 17:26:48 +01:00
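
Usage sketch for the new `.with_types()`:

```python
from langchain.schema.runnable import RunnableLambda

chain = RunnableLambda(lambda x: x * 2).with_types(input_type=int, output_type=int)
print(chain.input_schema.schema())  # now advertises an integer input
print(chain.invoke(21))             # 42
```
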
Harrison Chase
6fcba975d0 add rag fusion notebook (#12121) 2023-10-21 15:37:11 -07:00
Harrison Chase
dd0374560a fix up notebook (#12119) 2023-10-21 14:06:16 -07:00
Harrison Chase
ee69116761 move csv agent to langchain experimental (#12113) 2023-10-21 10:26:02 -07:00
Harrison Chase
03bf6ef473 add missing init files (#12114) 2023-10-21 10:25:50 -07:00
Harrison Chase
acb82cf25e add step back notebook (#11953) 2023-10-21 10:05:52 -07:00
Harrison Chase
9d9198de0b rewrite (#12111) 2023-10-21 09:31:10 -07:00
Bagatur
ef8b180d6d bump 320 (#12108) 2023-10-21 11:52:52 -04:00
Rotem Weiss
c4f8fefe74 Update Tavily API key link (#12109)
fix broken link to generate tavily api key
2023-10-21 11:44:57 -04:00
Rotem Weiss
78d186fb44 Add Tavily Search API as a Tool (#12103)
Adding Tavily Search API as a tool. I will be the maintainer and
assaf_elovic is the Twitter handle.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-21 11:23:21 -04:00
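
A usage sketch, assuming the names added here are `TavilySearchResults` and `TavilySearchAPIWrapper` and that a `TAVILY_API_KEY` is required:

```python
import os

from langchain.tools.tavily_search import TavilySearchResults
from langchain.utilities.tavily_search import TavilySearchAPIWrapper

os.environ["TAVILY_API_KEY"] = "<your-tavily-api-key>"  # placeholder
tool = TavilySearchResults(api_wrapper=TavilySearchAPIWrapper())
print(tool.run("What's new in LangChain?"))
```
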
Bagatur
85302a9ec1 Add CI check that integration tests compile (#12090) 2023-10-21 10:52:18 -04:00
verlocks
5dbe456aae Bug fix tongyi.py to be compatible with DashScope API (#11956)
The current ChatTongyi is not compatible with the DashScope API, which
causes an error when passing an API key to the chat model directly.
- **Description:** Update tongyi.py to be compatible with the DashScope API.
Specifically, update parameter name "dashscope_api_key" to "api_key".
  - **Issue:** None.
- **Dependencies:** Nothing new; Tongyi requires DashScope as
before.
2023-10-20 18:46:41 -04:00
Abhay Kaushik
39f65fb1c9 Fix typos in whatsapp.ipynb and telegram.ipynb (#12075)
- **Description:** 
    - Replace Telegram with WhatsApp in whatsapp.ipynb
    - Add # to mark the Telegram title as a heading in telegram.ipynb
 
  - **Issue:** None
  - **Dependencies:** None
2023-10-20 18:45:33 -04:00
Tomaz Bratanic
82f4c0589c Add neo4j graph environment variables (#12080) 2023-10-20 14:43:01 -07:00
Mohammad Mohtashim
d5400f6502 Google Scholar Search Tool using serpapi (#11513)
- **Description:** Implementing the Google Scholar Tool as requested in
PR #11505. The tool will be using the [serpapi python
package](https://serpapi.com/integrations/python#search-google-scholar).
The main idea of the tool will be to return the results from a Google
Scholar search given a query as an input to the tool.

- **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17
2023-10-20 17:35:55 -04:00
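
A usage sketch; the class names (`GoogleScholarQueryRun`, `GoogleScholarAPIWrapper`) and the `SERP_API_KEY` variable are assumptions following the usual tool/wrapper pattern:

```python
import os

from langchain.tools.google_scholar import GoogleScholarQueryRun
from langchain.utilities.google_scholar import GoogleScholarAPIWrapper

os.environ["SERP_API_KEY"] = "<your-serpapi-key>"  # placeholder
tool = GoogleScholarQueryRun(api_wrapper=GoogleScholarAPIWrapper())
print(tool.run("retrieval augmented generation"))
```
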
Ofer Mendelevitch
e542bf1b6b Minor update to doc/text in IPYNB example (#12089)
- **Description:** changed sign-up link in IPYNB example
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** @ofermend
2023-10-20 17:17:36 -04:00
Shreyas S
2e8637da2f Minor typo fix (#11804)
remove redundant a
langchain > LangChain
2023-10-20 17:11:53 -04:00
Shinya Maeda
89bc73c6c3 Fix superfluous Auto-fixing parser documents (#12062)
- **Description:** Fix the superfluous [Auto-fixing
parser](https://python.langchain.com/docs/modules/model_io/output_parsers/output_fixing_parser)
docs. Also switching to `langchain.pydantic_v1` from the direct
reference to `pydantic`.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** @dosuken123
2023-10-20 16:07:03 -04:00
Holt Skinner
f5be2d525a fix: Add _serving_config property to GoogleVertexAISearchRetriever (#12084)
- Fixes error:

```
ValueError: "GoogleVertexAISearchRetriever" object has no field "_serving_config"
```

Introduced in #11736

@baskaryan, @eyurtsev, @hwchase17 if you could review and merge quickly,
that would be appreciated :)
2023-10-20 15:16:42 -04:00
Nuno Campos
5fee61a207 Support runnable factories in .configurable_alts() (#12065)
2023-10-20 15:22:09 +01:00
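
What this enables, sketched: one alternative supplied eagerly, one as a zero-arg factory that is only constructed if selected:

```python
from langchain.chat_models import ChatAnthropic, ChatOpenAI
from langchain.schema.runnable import ConfigurableField

model = ChatOpenAI(temperature=0).configurable_alternatives(
    ConfigurableField(id="llm"),
    default_key="openai",
    # New: a factory (zero-arg callable) instead of a constructed runnable.
    anthropic=lambda: ChatAnthropic(model="claude-2"),
)
model.with_config(configurable={"llm": "anthropic"}).invoke("Hello")
```
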
Lance Martin
b01a443ee5 Update figures in multi-modal Cookbooks (#12060) 2023-10-19 19:51:36 -07:00
Jacob Lee
34ec2da701 Fix typo in google vertex ai palm notebook documentation (#12056) 2023-10-19 21:46:35 -04:00
Bagatur
56c279015e clear nb img output (#12055) 2023-10-19 15:28:54 -07:00
Bagatur
54a8d70eb5 Bagatur/mv singlestore doc (#12053) 2023-10-19 15:06:26 -07:00
Leonid Ganeline
52b103dd13 update interface notebook (#12042)
Added a use case that parallelizes over batches. Simplified the text.
2023-10-19 17:06:14 -04:00
Bagatur
8cabb4ee8e add cookbook table (#12043) 2023-10-19 14:05:24 -07:00
Zhitao Xu
a4c3a44712 Fix documentation typo in Clickhouse Class (#12047)
- **Description:** The return info in the documentation for
similarity_search_by_vector and similarity_search_with_relevance_scores
is wrong
2023-10-19 17:00:22 -04:00
William FH
25418b9b4d Always add run ID (#12046)
in eval callback handler.

Useful if you're using a custom run evaluator and don't want to thread
things through.
2023-10-19 12:38:07 -07:00
Eugene Yurtsev
44d7763580 Add zapier deprecation warning (#12045)
Add zapier deprecation
2023-10-19 15:27:56 -04:00
John Mai
4188f046ec Add Tencent Hunyuan chat model (#12022)
### Description:
The Tencent Hunyuan model, developed by Tencent, is a large language
model with robust Chinese text generation capabilities, adeptness in
logical reasoning within complex contexts, and reliable task execution
proficiency. For more information, see
[https://cloud.tencent.com/document/product/1729](https://cloud.tencent.com/document/product/1729)
2023-10-19 15:10:12 -04:00
Eugene Yurtsev
68599d98c2 More security notes (#12040)
Add more security notes
2023-10-19 14:49:09 -04:00
Bagatur
0006075b08 bump 319 (#12041) 2023-10-19 11:45:27 -07:00
John Mai
8eb40b5fe2 baichuan_secret_key use pydantic.types.SecretStr & Add Baichuan tests (#12031)
### Description
- `baichuan_secret_key` use pydantic.types.SecretStr
- Add Baichuan tests
2023-10-19 14:37:41 -04:00
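
The `SecretStr` pattern in isolation (a sketch, not the Baichuan class itself):

```python
from langchain.pydantic_v1 import BaseModel, SecretStr


class Settings(BaseModel):
    baichuan_secret_key: SecretStr


s = Settings(baichuan_secret_key="super-secret")
print(s.baichuan_secret_key)                     # **********
print(s.baichuan_secret_key.get_secret_value())  # super-secret
```
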
Nuno Campos
85bac75729 nc/runnable-dynamic-schemas-from-config (#12038)
2023-10-19 19:34:35 +01:00
Nuno Campos
85eaa4ccee Revert "nc/runnable-dynamic-schemas-from-config" (#12037)
This reverts commit a46eef64a7.

2023-10-19 19:27:02 +01:00
Nuno Campos
a46eef64a7 nc/runnable-dynamic-schemas-from-config 2023-10-19 19:17:48 +01:00
Nuno Campos
d392e030be Add default value (#12032)
2023-10-19 18:30:05 +01:00
Kenneth Choe
62efe1ffb9 support add_embeddings for elasticsearch (#11002)
- **Description:** Provide a way to use different text for embedding.
- For example, if you are ingesting stack-overflow Q&As for RAG, you
would want to embed the questions and return the answer(s) for the hits.
With this change, the consumer of langchain can implement that easily.
- I noticed the similar function is added on faiss.py with #1912 which
was for performance reason, but I see the same function can be used to
achieve what I thought. So instead of changing Document class to have
embedding_content, I mimicked the implementation of faiss.py.
- The test should provide some guidance on how to use it. It would be
more intuitive if I just pass texts and embedding_texts as separate
arguments, but I chose to use `zip`-ed object for the consistency with
faiss.py implementation.
      - I plan to make similar pull request for OpenSearch.
  - **Issue:** N/A
  - **Dependencies:** None other than the existing ones.

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-19 09:43:51 -07:00
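
A usage sketch mirroring the faiss-style API described above: embed one text, store another (assumes a running Elasticsearch instance):

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores.elasticsearch import ElasticsearchStore

questions = ["How do I reverse a list?", "What is a lambda?"]
answers = ["Use xs[::-1] or reversed(xs).", "An anonymous inline function."]

emb = OpenAIEmbeddings()
store = ElasticsearchStore(index_name="qa", embedding=emb, es_url="http://localhost:9200")

# Embed the questions, but store the answers as the returned text.
store.add_embeddings(list(zip(answers, emb.embed_documents(questions))))
```
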
Bagatur
76d3afaef0 bump 318 (#12030) 2023-10-19 09:33:39 -07:00
Dmitry Tyumentsev
5dd2161c4b add _acall method to YandexGPT (#12029)
- **Description:** Add async support for YandexGPT LLM model

Co-authored-by: Dmitry Tyumentsev <dmitry.tyumentsev@raftds.com>
2023-10-19 09:15:26 -07:00
Palau
720ecacb1c Add notebook for kay.ai press release data (#11575)
- **Description:** Adding a notebook for Press Release data from Kay.ai,
as discussed offline
  - **Tag maintainer:** @baskaryan @hwchase17 
- **Twitter handle:** https://twitter.com/kaydotai
https://twitter.com/vishalrohra_

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-19 08:06:56 -07:00
Peter Krenesky
8425f33363 Pydantic v2 support for OpenAPI Specs (#11936)
- **Description:** Adding Pydantic v2 support for OpenAPI Specs 

- **Issue:**
- OpenAPI spec support was disabled because `openapi-schema-pydantic`
doesn't support Pydantic v2:
     #9205
     
     - Caused errors in `get_openapi_chain`
   
    - This may be the cause of #9520.

- **Tag maintainer:** @eyurtsev
- **Twitter handle:** kreneskyp


The root cause was that `openapi-schema-pydantic` hasn't been updated in
some time but
[openapi-pydantic](https://github.com/mike-oakley/openapi-pydantic)
forked and updated the project.
2023-10-19 11:06:11 -04:00
volodymyr-memsql
4adabd33ac Add example of retriever usage with SingleStoreDB vector store (#12021)
Added a notebook with examples of the creation of a retriever from the
SingleStoreDB vector store, and further usage.

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
2023-10-19 09:48:35 -04:00
Joe McElroy
c9f1768cb9 Elasticsearch Query Retriever: Use match + fuzziness for LIKE (#12023)
Updated the elasticsearch self-query retriever to use the match clause
for the LIKE operator instead of the non-analyzed fuzzy search clause.

Other small updates include:
- fixing the stack inference integration test where the index's default
pipeline didn't use the inference pipeline created
- adding a user-agent to the old implementation to track usage
- improved the documentation for ElasticsearchStore filters
2023-10-19 09:47:21 -04:00
maks-operlejn-ds
84d250f781 Docs: QA Privacy Nit (#12025)
Resize image in docs for QA Privacy
2023-10-19 09:43:47 -04:00
Nuno Campos
7db6aabf65 Update chat model output type (#11833)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-19 00:55:15 -07:00
Simon Dai
ed62984cb2 update Weaviate to support multi tenancy (#11842)
- **Description:** update Weaviate to support multi tenancy
  - **Issue:** 9956
  - **Dependencies:** 
  - **Tag maintainer:** hwchase17
  - **Twitter handle:** dsx1986_
2023-10-19 00:49:30 -07:00
hiigao
f818ec49b8 Encapsulate alicloud pai-eas access method for chatmodels and llms (#11852)
### Description:
This PR provides EAS LLM service access methods by implementing the
`PaiEasEndpoint` and `PaiEasChatEndpoint` classes in the `langchain.llms`
and `langchain.chat_models` modules. Based on this PR, langchain users
can build up a chain to call a remote EAS LLM service and get the LLM
inference results.

### About EAS Service
EAS is an Alicloud product on the Alibaba Cloud Machine Learning Platform
for AI, which is short for AliCloud PAI. EAS provides model inference
deployment services for users. We build up LLM inference services on EAS
with general LLM docker images. Therefore, end users can quickly set up
their remote LLM instances to load the majority of the Hugging Face LLM
models, and serve as a backend for most LLM apps.

### Dependencies
This PR doesn't involve any new dependencies.

---------

Co-authored-by: 子洪 <gaoyihong.gyh@alibaba-inc.com>
2023-10-19 00:20:18 -07:00
Shinya Maeda
1da6d92369 fix: superfluous List Parser doc (#12014) 2023-10-19 00:14:38 -07:00
John Mai
a6b483dcbc Supported RetryOutputParser & RetryWithErrorOutputParser max_retries (#11903)
Description: Support `max_retries` in RetryOutputParser &
RetryWithErrorOutputParser.
- max_retries: Maximum number of retries when parsing.

Issue: None
Dependencies: None
Tag maintainer: @baskaryan 
Twitter handle:
2023-10-18 23:57:16 -07:00
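
Usage sketch, assuming `from_llm` forwards the new parameter:

```python
from langchain.chat_models import ChatOpenAI
from langchain.output_parsers import CommaSeparatedListOutputParser, RetryOutputParser

parser = RetryOutputParser.from_llm(
    parser=CommaSeparatedListOutputParser(),
    llm=ChatOpenAI(temperature=0),
    max_retries=2,  # how many re-asks before giving up
)
```
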
Hugues Chocart
008c7df80d [LLMonitorCallbackHandler] Refactor + add llmonitor-py dependency (#11948)
We now require users to have the pip package `llmonitor` installed. It
allows us to have cleaner code and avoid duplication between our library
and our code in LangChain.
2023-10-18 23:54:10 -07:00
Sian Cao
77fc2f7644 fix: impl missing embeddings method (#10823)
FAISS does not implement the embeddings method and uses embed_query to
embed texts, which is wrong for some embedding models.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-18 23:51:28 -07:00
Holt Skinner
2661dc94f3 feat: Google Vertex AI Search Retriever - Add support for Website Data Stores (#11736)
- Only works for Data stores with Advanced Website Indexing
-
https://cloud.google.com/generative-ai-app-builder/docs/about-advanced-features
- Minor restructuring - Follow up to #10513
- Remove outdated docs (readded in
https://github.com/langchain-ai/langchain/pull/11620)
  - Move legacy class into new py file to clean up the directory
- Shouldn't cause backwards compatibility issues as the import works the
same way for users
2023-10-18 23:41:48 -07:00
Shorthills AI
4b6fdd7bf0 Update modal.py (#11588)
feat: Raise KeyError when 'prompt' key is missing in JSON response

This commit updates the error handling in the code to raise a KeyError
when the 'prompt' key is not found in the JSON response. This change
makes the code more explicit about the nature of the error, helping to
improve clarity and debugging.

@baskaryan, @eyurtsev.
2023-10-18 23:40:37 -07:00
Surav Shrestha
2038c7fd5d fix typo in multi_language.ipynb (#12009)
exprience -> experience
2023-10-18 23:33:25 -07:00
William FH
dfb4baa3f9 Fix Fireworks Callbacks (#12003)
I may be missing something, but it seems like we inappropriately overrode
the `stream()` method, losing callbacks in the process. I don't think
overriding it here gained us any customization in this case.

See new trace:

https://smith.langchain.com/public/fbb82825-3a16-446b-8207-35622358db3b/r

and confirmed it streams.

Also fixes the stopwords issues from #12000
2023-10-18 23:33:09 -07:00
Lance Martin
12f8e87a0e LLaMA2 SQL cookbook clean (#12007) 2023-10-18 21:16:58 -07:00
257 changed files with 12996 additions and 4514 deletions

.github/CODE_OF_CONDUCT.md (new file)

@@ -0,0 +1,132 @@
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the overall
community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or advances of
any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email address,
without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
conduct@langchain.dev.
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series of
actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or permanent
ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within the
community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.1, available at
[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].
For answers to common questions about this code of conduct, see the FAQ at
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
[https://www.contributor-covenant.org/translations][translations].
[homepage]: https://www.contributor-covenant.org
[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations

.github/CONTRIBUTING.md

@@ -1,7 +1,7 @@
# Contributing to LangChain
Hi there! Thank you for even being interested in contributing to LangChain.
-As an open source project in a rapidly developing field, we are extremely open
+As an open-source project in a rapidly developing field, we are extremely open
to contributions, whether they be in the form of new features, improved infra, better documentation, or bug fixes.
## 🗺️ Guidelines
@@ -14,7 +14,7 @@ Please do not try to push directly to this repo unless you are a maintainer.
Please follow the checked-in pull request template when opening pull requests. Note related issues and tag relevant
maintainers.
-Pull requests cannot land without passing the formatting, linting and testing checks first. See [Testing](#testing) and
+Pull requests cannot land without passing the formatting, linting, and testing checks first. See [Testing](#testing) and
[Formatting and Linting](#formatting-and-linting) for how to run these checks locally.
It's essential that we maintain great documentation and testing. If you:
@@ -79,7 +79,7 @@ There are two separate projects in this repository:
- `langchain`: core langchain code, abstractions, and use cases
- `langchain.experimental`: see the [Experimental README](../libs/experimental/README.md) for more information.
-Each of these has their own development environment. Docs are run from the top-level makefile, but development
+Each of these has its own development environment. Docs are run from the top-level makefile, but development
is split across separate test & release flows.
For this quickstart, start with langchain core:

.github/workflows/_compile_integration_test.yml (new file)

@@ -0,0 +1,57 @@
name: compile-integration-test

on:
  workflow_call:
    inputs:
      working-directory:
        required: true
        type: string
        description: "From which folder this pipeline executes"

env:
  POETRY_VERSION: "1.6.1"

jobs:
  build:
    defaults:
      run:
        working-directory: ${{ inputs.working-directory }}
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version:
          - "3.8"
          - "3.9"
          - "3.10"
          - "3.11"
    name: Python ${{ matrix.python-version }}
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
        uses: "./.github/actions/poetry_setup"
        with:
          python-version: ${{ matrix.python-version }}
          poetry-version: ${{ env.POETRY_VERSION }}
          working-directory: ${{ inputs.working-directory }}
          cache-key: compile-integration
      - name: Install integration dependencies
        shell: bash
        run: poetry install --with=test_integration
      - name: Check integration tests compile
        shell: bash
        run: poetry run pytest -m compile tests/integration_tests
      - name: Ensure the tests did not create any additional files
        shell: bash
        run: |
          set -eu

          STATUS="$(git status)"
          echo "$STATUS"

          # grep will exit non-zero if the target message isn't found,
          # and `set -e` above will cause the step to fail.
          echo "$STATUS" | grep 'nothing to commit, working tree clean'


@@ -34,7 +34,7 @@ jobs:
- "3.8"
- "3.11"
steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
with:
# Fetch the last FETCH_DEPTH commits, so the mtime-changing script
# can accurately set the mtimes of files modified in the last FETCH_DEPTH commits.

.github/workflows/_pydantic_compatibility.yml

@@ -26,7 +26,7 @@ jobs:
- "3.11"
name: Pydantic v1/v2 compatibility - Python ${{ matrix.python-version }}
steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"


@@ -30,7 +30,7 @@ jobs:
run:
working-directory: ${{ inputs.working-directory }}
steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
- name: Set up Python + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"


@@ -26,7 +26,7 @@ jobs:
- "3.11"
name: Python ${{ matrix.python-version }}
steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
@@ -44,6 +44,14 @@ jobs:
shell: bash
run: make test
- name: Install integration dependencies
shell: bash
run: poetry install --with=test_integration
- name: Check integration tests compile
shell: bash
run: poetry run pytest -m compile tests/integration_tests
- name: Ensure the tests did not create any additional files
shell: bash
run: |

.github/workflows/_test_release.yml (new file)

@@ -0,0 +1,50 @@
name: test-release

on:
  workflow_call:
    inputs:
      working-directory:
        required: true
        type: string
        description: "From which folder this pipeline executes"

env:
  POETRY_VERSION: "1.6.1"

jobs:
  publish_to_test_pypi:
    runs-on: ubuntu-latest
    permissions:
      # This permission is used for trusted publishing:
      # https://blog.pypi.org/posts/2023-04-20-introducing-trusted-publishers/
      #
      # Trusted publishing has to also be configured on PyPI for each package:
      # https://docs.pypi.org/trusted-publishers/adding-a-publisher/
      id-token: write
    defaults:
      run:
        working-directory: ${{ inputs.working-directory }}
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python + Poetry ${{ env.POETRY_VERSION }}
        uses: "./.github/actions/poetry_setup"
        with:
          python-version: "3.10"
          poetry-version: ${{ env.POETRY_VERSION }}
          working-directory: ${{ inputs.working-directory }}
          cache-key: release
      - name: Build project for distribution
        run: poetry build
      - name: Check Version
        id: check-version
        run: |
          echo version=$(poetry version --short) >> $GITHUB_OUTPUT
      - name: Publish package to TestPyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          repository-url: https://test.pypi.org/legacy/
          packages-dir: ${{ inputs.working-directory }}/dist/
          verbose: true
          print-hash: true


@@ -17,7 +17,7 @@ jobs:
steps:
- name: Checkout
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
- name: Install Dependencies
run: |


@@ -44,6 +44,13 @@ jobs:
working-directory: libs/langchain
secrets: inherit
compile-integration-tests:
uses:
./.github/workflows/_compile_integration_test.yml
with:
working-directory: libs/langchain
secrets: inherit
pydantic-compatibility:
uses:
./.github/workflows/_pydantic_compatibility.yml
@@ -65,7 +72,7 @@ jobs:
- "3.11"
name: Python ${{ matrix.python-version }} extended tests
steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"


@@ -44,6 +44,13 @@ jobs:
working-directory: libs/experimental
secrets: inherit
compile-integration-tests:
uses:
./.github/workflows/_compile_integration_test.yml
with:
working-directory: libs/experimental
secrets: inherit
# It's possible that langchain-experimental works fine with the latest *published* langchain,
# but is broken with the langchain on `master`.
#
@@ -62,7 +69,7 @@ jobs:
- "3.11"
name: test with unpublished langchain - Python ${{ matrix.python-version }}
steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
@@ -97,7 +104,7 @@ jobs:
- "3.11"
name: Python ${{ matrix.python-version }} extended tests
steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"


@@ -0,0 +1,13 @@
---
name: Experimental Test Release
on:
workflow_dispatch: # Allows to trigger the workflow manually in GitHub UI
jobs:
release:
uses:
./.github/workflows/_test_release.yml
with:
working-directory: libs/experimental
secrets: inherit


@@ -0,0 +1,13 @@
---
name: Test Release
on:
workflow_dispatch: # Allows to trigger the workflow manually in GitHub UI
jobs:
release:
uses:
./.github/workflows/_test_release.yml
with:
working-directory: libs/langchain
secrets: inherit


@@ -24,7 +24,7 @@ jobs:
- "3.11"
name: Python ${{ matrix.python-version }}
steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: "./.github/actions/poetry_setup"


@@ -93,7 +93,7 @@ Memory refers to persisting state between calls of a chain/agent. LangChain prov
**🧐 Evaluation:**
-[BETA] Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.
+[BETA] Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is by using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.
For more information on these concepts, please see our [full documentation](https://python.langchain.com).


@@ -40,6 +40,7 @@ Notebook | Description
[openai_functions_retrieval_qa....](https://github.com/langchain-ai/langchain/tree/master/cookbook/openai_functions_retrieval_qa.ipynb) | Structure response output in a question answering system by incorporating openai functions into a retrieval pipeline.
[petting_zoo.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/petting_zoo.ipynb) | Create multi-agent simulations with simulated environments using the petting zoo library.
[plan_and_execute_agent.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/plan_and_execute_agent.ipynb) | Create plan-and-execute agents that accomplish objectives by planning tasks with a language model (llm) and executing them with a separate agent.
[press_releases.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/press_releases.ipynb) | Retrieve and query company press release data powered by [Kay.ai](https://kay.ai).
[program_aided_language_model.i...](https://github.com/langchain-ai/langchain/tree/master/cookbook/program_aided_language_model.ipynb) | Implement program-aided language models as described in the provided research paper.
[sales_agent_with_context.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/sales_agent_with_context.ipynb) | Implement a context-aware ai sales agent, salesgpt, that can have natural sales conversations, interact with other systems, and use a product knowledge base to discuss a company's offerings.
[self_query_hotel_search.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/self_query_hotel_search.ipynb) | Build a hotel room search feature with self-querying retrieval, using a specific hotel recommendation dataset.

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

cookbook/press_releases.ipynb (new file)

@@ -0,0 +1,152 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "62ee82e4-2ad8-498b-8438-fac388afe1a2",
"metadata": {},
"source": [
"Press Releases Data\n",
"=\n",
"\n",
"Press Releases data powered by [Kay.ai](https://kay.ai).\n",
"\n",
">Press releases are used by companies to announce something noteworthy, including product launches, financial performance reports, partnerships, and other significant news. They are widely used by analysts to track corporate strategy, operational updates and financial performance.\n",
"Kay.ai obtains press releases of all US public companies from a variety of sources, which include the company's official press room and partnerships with various data API providers. \n",
"This data is updated till Sept 30th for free access, if you want to access the real-time feed, reach out to us at hello@kay.ai or [tweet at us](https://twitter.com/vishalrohra_)"
]
},
{
"cell_type": "markdown",
"id": "8183d85d-365f-4672-a963-52b533547de0",
"metadata": {},
"source": [
"Setup\n",
"=\n",
"\n",
"First you will need to install the `kay` package. You will also need an API key: you can get one for free at [https://kay.ai](https://kay.ai/). Once you have an API key, you must set it as an environment variable `KAY_API_KEY`.\n",
"\n",
"In this example we're going to use the `KayAiRetriever`. Take a look at the [kay notebook](/docs/integrations/retrievers/kay) for more detailed information for the parmeters that it accepts."
]
},
{
"cell_type": "markdown",
"id": "02ec21c7-49fe-4844-b58a-bf064ad40b2a",
"metadata": {},
"source": [
"Examples\n",
"="
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "bf0395f7-6ebe-4136-8b0d-00b9dea3becd",
"metadata": {},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n",
" ········\n"
]
}
],
"source": [
"# Setup API keys for Kay and OpenAI\n",
"from getpass import getpass\n",
"KAY_API_KEY = getpass()\n",
"OPENAI_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "f7fcaf70-29a4-444b-8f07-9784f808c300",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"KAY_API_KEY\"] = KAY_API_KEY\n",
"os.environ[\"OPENAI_API_KEY\"] = OPENAI_API_KEY"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "ac00bf93-3635-4ffe-b9a6-a8b4f35c0c85",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains import ConversationalRetrievalChain\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.retrievers import KayAiRetriever\n",
"\n",
"model = ChatOpenAI(model_name=\"gpt-3.5-turbo\")\n",
"retriever = KayAiRetriever.create(dataset_id=\"company\", data_types=[\"PressRelease\"], num_contexts=6)\n",
"qa = ConversationalRetrievalChain.from_llm(model, retriever=retriever)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "8d9d927c-35b2-4a7b-8ea7-4d0350797941",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"-> **Question**: How is the healthcare industry adopting generative AI tools? \n",
"\n",
"**Answer**: The healthcare industry is adopting generative AI tools to improve various aspects of patient care and administrative tasks. Companies like HCA Healthcare Inc, Amazon Com Inc, and Mayo Clinic have collaborated with technology providers like Google Cloud, AWS, and Microsoft to implement generative AI solutions.\n",
"\n",
"HCA Healthcare is testing a nurse handoff tool that generates draft reports quickly and accurately, which nurses have shown interest in using. They are also exploring the use of Google's medically-tuned Med-PaLM 2 LLM to support caregivers in asking complex medical questions.\n",
"\n",
"Amazon Web Services (AWS) has introduced AWS HealthScribe, a generative AI-powered service that automatically creates clinical documentation. However, integrating multiple AI systems into a cohesive solution requires significant engineering resources, including access to AI experts, healthcare data, and compute capacity.\n",
"\n",
"Mayo Clinic is among the first healthcare organizations to deploy Microsoft 365 Copilot, a generative AI service that combines large language models with organizational data from Microsoft 365. This tool has the potential to automate tasks like form-filling, relieving administrative burdens on healthcare providers and allowing them to focus more on patient care.\n",
"\n",
"Overall, the healthcare industry is recognizing the potential benefits of generative AI tools in improving efficiency, automating tasks, and enhancing patient care. \n",
"\n"
]
}
],
"source": [
"# More sample questions in the Playground on https://kay.ai\n",
"questions = [\n",
" \"How is the healthcare industry adopting generative AI tools?\",\n",
" #\"What are some recent challenges faced by the renewable energy sector?\",\n",
"]\n",
"chat_history = []\n",
"\n",
"for question in questions:\n",
" result = qa({\"question\": question, \"chat_history\": chat_history})\n",
" chat_history.append((question, result[\"answer\"]))\n",
" print(f\"-> **Question**: {question} \\n\")\n",
" print(f\"**Answer**: {result['answer']} \\n\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
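As a quick sanity check of the retriever configured above, you can query it directly before wiring it into the chain; a minimal sketch, assuming the `KayAiRetriever` created in the notebook (the query and the `title` metadata key are illustrative):

```python
# Fetch raw press-release contexts without running the full QA chain.
docs = retriever.get_relevant_documents(
    "What were HCA Healthcare's latest announcements?"  # illustrative query
)
for doc in docs:
    # "title" is an assumed metadata key; fall back to an empty string if absent.
    print(doc.metadata.get("title", ""), "-", doc.page_content[:100])
```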

263
cookbook/rag_fusion.ipynb Normal file
View File

@@ -0,0 +1,263 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "993c2768",
"metadata": {},
"source": [
"# RAG Fusion\n",
"\n",
"Re-implemented from [this GitHub repo](https://github.com/Raudaschl/rag-fusion), all credit to original author\n",
"\n",
"> RAG-Fusion, a search methodology that aims to bridge the gap between traditional search paradigms and the multifaceted dimensions of human queries. Inspired by the capabilities of Retrieval Augmented Generation (RAG), this project goes a step further by employing multiple query generation and Reciprocal Rank Fusion to re-rank search results."
]
},
{
"cell_type": "markdown",
"id": "ebcc6791",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"For this example, we will use Pinecone and some fake data"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "661a1c36",
"metadata": {},
"outputs": [],
"source": [
"import pinecone\n",
"from langchain.vectorstores import Pinecone\n",
"from langchain.embeddings import OpenAIEmbeddings\n",
"\n",
"pinecone.init(api_key=\"...\",environment=\"...\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "48ef7e93",
"metadata": {},
"outputs": [],
"source": [
"all_documents = {\n",
" \"doc1\": \"Climate change and economic impact.\",\n",
" \"doc2\": \"Public health concerns due to climate change.\",\n",
" \"doc3\": \"Climate change: A social perspective.\",\n",
" \"doc4\": \"Technological solutions to climate change.\",\n",
" \"doc5\": \"Policy changes needed to combat climate change.\",\n",
" \"doc6\": \"Climate change and its impact on biodiversity.\",\n",
" \"doc7\": \"Climate change: The science and models.\",\n",
" \"doc8\": \"Global warming: A subset of climate change.\",\n",
" \"doc9\": \"How climate change affects daily weather.\",\n",
" \"doc10\": \"The history of climate change activism.\"\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fde89f0b",
"metadata": {},
"outputs": [],
"source": [
"vectorstore = Pinecone.from_texts(list(all_documents.values()), OpenAIEmbeddings(), index_name='rag-fusion')"
]
},
{
"cell_type": "markdown",
"id": "22ddd041",
"metadata": {},
"source": [
"## Define the Query Generator\n",
"\n",
"We will now define a chain to do the query generation"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "1d547524",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.prompts import ChatPromptTemplate\n",
"from langchain.schema.output_parser import StrOutputParser"
]
},
{
"cell_type": "code",
"execution_count": 68,
"id": "af9ab4db",
"metadata": {},
"outputs": [],
"source": [
"from langchain import hub\n",
"\n",
"prompt = hub.pull('langchain-ai/rag-fusion-query-generation')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "3628b552",
"metadata": {},
"outputs": [],
"source": [
"# prompt = ChatPromptTemplate.from_messages([\n",
"# (\"system\", \"You are a helpful assistant that generates multiple search queries based on a single input query.\"),\n",
"# (\"user\", \"Generate multiple search queries related to: {original_query}\"),\n",
"# (\"user\", \"OUTPUT (4 queries):\")\n",
"# ])"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "8d6cbb73",
"metadata": {},
"outputs": [],
"source": [
"generate_queries = prompt | ChatOpenAI(temperature=0) | StrOutputParser() | (lambda x: x.split(\"\\n\"))"
]
},
{
"cell_type": "markdown",
"id": "ee2824cd",
"metadata": {},
"source": [
"## Define the full chain\n",
"\n",
"We can now put it all together and define the full chain. This chain:\n",
" \n",
" 1. Generates a bunch of queries\n",
" 2. Looks up each query in the retriever\n",
" 3. Joins all the results together using reciprocal rank fusion\n",
" \n",
" \n",
"Note that it does NOT do a final generation step"
]
},
{
"cell_type": "code",
"execution_count": 50,
"id": "ca0bfec4",
"metadata": {},
"outputs": [],
"source": [
"original_query = \"impact of climate change\""
]
},
{
"cell_type": "code",
"execution_count": 75,
"id": "02437d65",
"metadata": {},
"outputs": [],
"source": [
"vectorstore = Pinecone.from_existing_index(\"rag-fusion\", OpenAIEmbeddings())\n",
"retriever = vectorstore.as_retriever()"
]
},
{
"cell_type": "code",
"execution_count": 76,
"id": "46a9a0e6",
"metadata": {},
"outputs": [],
"source": [
"from langchain.load import dumps, loads\n",
"def reciprocal_rank_fusion(results: list[list], k=60):\n",
" fused_scores = {}\n",
" for docs in results:\n",
" # Assumes the docs are returned in sorted order of relevance\n",
" for rank, doc in enumerate(docs):\n",
" doc_str = dumps(doc)\n",
" if doc_str not in fused_scores:\n",
" fused_scores[doc_str] = 0\n",
" previous_score = fused_scores[doc_str]\n",
" fused_scores[doc_str] += 1 / (rank + k)\n",
" \n",
" reranked_results = [(loads(doc), score) for doc, score in sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)]\n",
" return reranked_results "
]
},
{
"cell_type": "code",
"execution_count": 77,
"id": "3f9d4502",
"metadata": {},
"outputs": [],
"source": [
"chain = generate_queries | retriever.map() | reciprocal_rank_fusion"
]
},
{
"cell_type": "code",
"execution_count": 78,
"id": "d70c4fcd",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[(Document(page_content='Climate change and economic impact.'),\n",
" 0.06558258417063283),\n",
" (Document(page_content='Climate change: A social perspective.'),\n",
" 0.06400409626216078),\n",
" (Document(page_content='How climate change affects daily weather.'),\n",
" 0.04787506400409626),\n",
" (Document(page_content='Climate change and its impact on biodiversity.'),\n",
" 0.03306010928961749),\n",
" (Document(page_content='Public health concerns due to climate change.'),\n",
" 0.016666666666666666),\n",
" (Document(page_content='Technological solutions to climate change.'),\n",
" 0.016666666666666666),\n",
" (Document(page_content='Policy changes needed to combat climate change.'),\n",
" 0.01639344262295082)]"
]
},
"execution_count": 78,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke({\"original_query\": original_query})"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7866e551",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
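For intuition on the reciprocal rank fusion step above, here is a minimal, self-contained sketch that applies the same 1/(rank + k) scoring to two toy ranked lists of plain string IDs (the IDs are hypothetical; no vector store or serialization is needed):

```python
# Minimal reciprocal rank fusion over plain string IDs.
# Each inner list is assumed sorted by descending relevance, as in the chain above.
def rrf(result_lists, k=60):
    scores = {}
    for docs in result_lists:
        for rank, doc in enumerate(docs):
            # A document gains 1/(rank + k) from every list it appears in.
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (rank + k)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(rrf([["doc1", "doc3", "doc9"], ["doc3", "doc1", "doc6"]]))
# doc1 and doc3 each score 1/60 + 1/61 (top of one list, second in the other),
# while doc9 and doc6 each score only 1/62 from a single list.
```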

351
cookbook/rewrite.ipynb Normal file
View File

@@ -0,0 +1,351 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "260629f9",
"metadata": {},
"source": [
"# Rewrite-Retrieve-Read\n",
"\n",
"**Rewrite-Retrieve-Read** is a method proposed in the paper [Query Rewriting for Retrieval-Augmented Large Language Models](https://arxiv.org/pdf/2305.14283.pdf)\n",
"\n",
"> Because the original query can not be always optimal to retrieve for the LLM, especially in the real world... we first prompt an LLM to rewrite the queries, then conduct retrieval-augmented reading\n",
"\n",
"We show how you can easily do that with LangChain Expression Language"
]
},
{
"cell_type": "markdown",
"id": "eda93712",
"metadata": {},
"source": [
"## Baseline\n",
"\n",
"Baseline RAG (**Retrieve-and-read**) can be done like the following:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "1d2edbd2",
"metadata": {},
"outputs": [],
"source": [
"from operator import itemgetter\n",
"\n",
"from langchain.prompts import ChatPromptTemplate\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.schema.output_parser import StrOutputParser\n",
"from langchain.schema.runnable import RunnablePassthrough, RunnableLambda\n",
"from langchain.utilities import DuckDuckGoSearchAPIWrapper"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "86a46aa9",
"metadata": {},
"outputs": [],
"source": [
"template = \"\"\"Answer the users question based only on the following context:\n",
"\n",
"<context>\n",
"{context}\n",
"</context>\n",
"\n",
"Question: {question}\n",
"\"\"\"\n",
"prompt = ChatPromptTemplate.from_template(template)\n",
"\n",
"model = ChatOpenAI(temperature=0)\n",
"\n",
"search = DuckDuckGoSearchAPIWrapper()\n",
"\n",
"\n",
"def retriever(query):\n",
" return search.run(query)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "8566d48e",
"metadata": {},
"outputs": [],
"source": [
"chain = (\n",
" {\"context\": retriever, \"question\": RunnablePassthrough()} \n",
" | prompt \n",
" | model \n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "5c57f9ee",
"metadata": {},
"outputs": [],
"source": [
"simple_query = \"what is langchain?\""
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "37c5f962",
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"\"LangChain is a powerful and versatile Python library that enables developers and researchers to create, experiment with, and analyze language models and agents. It simplifies the development of language-based applications by providing a suite of features for artificial general intelligence. It can be used to build chatbots, perform document analysis and summarization, and streamline interaction with various large language model providers. LangChain's unique proposition is its ability to create logical links between one or more language models, known as Chains. It is an open-source library that offers a generic interface to foundation models and allows prompt management and integration with other components and tools.\""
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke(simple_query)"
]
},
{
"cell_type": "markdown",
"id": "23bdb9bd",
"metadata": {},
"source": [
"While this is fine for well formatted queries, it can break down for more complicated queries"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "8df6a814",
"metadata": {},
"outputs": [],
"source": [
"distracted_query = \"man that sam bankman fried trial was crazy! what is langchain?\""
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "16d7db64",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Based on the given context, there is no information provided about \"langchain.\"'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke(distracted_query)"
]
},
{
"cell_type": "markdown",
"id": "0b4f8b93",
"metadata": {},
"source": [
"This is because the retriever does a bad job with these \"distracted\" queries"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "3439d8dc",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Business She\\'s the star witness against Sam Bankman-Fried. Her testimony was explosive Gary Wang, who co-founded both FTX and Alameda Research, said Bankman-Fried directed him to change a... The Verge, following the trial\\'s Oct. 4 kickoff: \"Is Sam Bankman-Fried\\'s Defense Even Trying to Win?\". CBS Moneywatch, from Thursday: \"Sam Bankman-Fried\\'s Lawyer Struggles to Poke ... Sam Bankman-Fried, FTX\\'s founder, responded with a single word: \"Oof.\". Less than a year later, Mr. Bankman-Fried, 31, is on trial in federal court in Manhattan, fighting criminal charges ... July 19, 2023. A U.S. judge on Wednesday overruled objections by Sam Bankman-Fried\\'s lawyers and allowed jurors in the FTX founder\\'s fraud trial to see a profane message he sent to a reporter days ... Sam Bankman-Fried, who was once hailed as a virtuoso in cryptocurrency trading, is on trial over the collapse of FTX, the financial exchange he founded. Bankman-Fried is accused of...'"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"retriever(distracted_query)"
]
},
{
"cell_type": "markdown",
"id": "7eb748ac",
"metadata": {},
"source": [
"## Rewrite-Retrieve-Read Implementation\n",
"\n",
"The main part is a rewriter to rewrite the search query"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "88ae702e",
"metadata": {},
"outputs": [],
"source": [
"template = \"\"\"Provide a better search query for \\\n",
"web search engine to answer the given question, end \\\n",
"the queries with **. Question: \\\n",
"{x} Answer:\"\"\"\n",
"rewrite_prompt = ChatPromptTemplate.from_template(template)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "184e1bcb",
"metadata": {},
"outputs": [],
"source": [
"from langchain import hub\n",
"\n",
"rewrite_prompt = hub.pull(\"langchain-ai/rewrite\")"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "a4c23d40",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Provide a better search query for web search engine to answer the given question, end the queries with **. Question {x} Answer:\n"
]
}
],
"source": [
"print(rewrite_prompt.template)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "f55cd010",
"metadata": {},
"outputs": [],
"source": [
"# Parser to remove the `**`\n",
"\n",
"def _parse(text):\n",
" return text.strip(\"**\")"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "c9c34bef",
"metadata": {},
"outputs": [],
"source": [
"rewriter = rewrite_prompt | ChatOpenAI(temperature=0) | StrOutputParser() | _parse"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "fb17fb3d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'What is the definition and purpose of Langchain?'"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"rewriter.invoke({\"x\": distracted_query})"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "f83edb09",
"metadata": {},
"outputs": [],
"source": [
"rewrite_retrieve_read_chain = (\n",
" {\n",
" \"context\": {\"x\": RunnablePassthrough()} | rewriter | retriever,\n",
" \"question\": RunnablePassthrough()} \n",
" | prompt \n",
" | model \n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "43096322",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Based on the given context, LangChain is an open-source framework designed to simplify the creation of applications using large language models (LLMs). It enables LLM models to generate responses based on up-to-date online information and simplifies the organization of large volumes of data for easy access by LLMs. LangChain offers a standard interface for chains, integrations with other tools, and end-to-end chains for common applications. It is a robust library that streamlines interaction with various LLM providers. LangChain\\'s unique proposition is its ability to create logical links between one or more LLMs, known as Chains. It is an AI framework with features that simplify the development of language-based applications and offers a suite of features for artificial general intelligence. However, the context does not provide any information about the \"sam bankman fried trial\" mentioned in the question.'"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"rewrite_retrieve_read_chain.invoke(distracted_query)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "59874b4f",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

335
cookbook/stepback-qa.ipynb Normal file
View File

@@ -0,0 +1,335 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "83ef724e",
"metadata": {},
"source": [
"# Step-Back Prompting (Question-Answering)\n",
"\n",
"One prompting technique called \"Step-Back\" prompting can improve performance on complex questions by first asking a \"step back\" question. This can be combined with regular question-answering applications by then doing retrieval on both the original and step-back question.\n",
"\n",
"Read the paper [here](https://arxiv.org/abs/2310.06117)\n",
"\n",
"See an excellent blog post on this by Cobus Greyling [here](https://cobusgreyling.medium.com/a-new-prompt-engineering-technique-has-been-introduced-called-step-back-prompting-b00e8954cacb)\n",
"\n",
"In this cookbook we will replicate this technique. We modify the prompts used slightly to work better with chat models."
]
},
{
"cell_type": "code",
"execution_count": 85,
"id": "67b5cdac",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate\n",
"from langchain.schema.output_parser import StrOutputParser\n",
"from langchain.schema.runnable import RunnableLambda"
]
},
{
"cell_type": "code",
"execution_count": 86,
"id": "7e017c44",
"metadata": {},
"outputs": [],
"source": [
"# Few Shot Examples\n",
"examples = [\n",
" {\n",
" \"input\": \"Could the members of The Police perform lawful arrests?\",\n",
" \"output\": \"what can the members of The Police do?\"\n",
" },\n",
" {\n",
" \"input\": \"Jan Sindels was born in what country?\", \n",
" \"output\": \"what is Jan Sindels personal history?\"\n",
" },\n",
"]\n",
"# We now transform these to example messages\n",
"example_prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"human\", \"{input}\"),\n",
" (\"ai\", \"{output}\"),\n",
" ]\n",
")\n",
"few_shot_prompt = FewShotChatMessagePromptTemplate(\n",
" example_prompt=example_prompt,\n",
" examples=examples,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 87,
"id": "206415ee",
"metadata": {},
"outputs": [],
"source": [
"prompt = ChatPromptTemplate.from_messages([\n",
" (\"system\", \"\"\"You are an expert at world knowledge. Your task is to step back and paraphrase a question to a more generic step-back question, which is easier to answer. Here are a few examples:\"\"\"),\n",
" # Few shot examples\n",
" few_shot_prompt,\n",
" # New question\n",
" (\"user\", \"{question}\"),\n",
"])"
]
},
{
"cell_type": "code",
"execution_count": 88,
"id": "d643a85c",
"metadata": {},
"outputs": [],
"source": [
"question_gen = prompt | ChatOpenAI(temperature=0) | StrOutputParser()"
]
},
{
"cell_type": "code",
"execution_count": 182,
"id": "5ba21b2a",
"metadata": {},
"outputs": [],
"source": [
"question = \"was chatgpt around while trump was president?\""
]
},
{
"cell_type": "code",
"execution_count": 183,
"id": "5992c8ca",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'when was ChatGPT developed?'"
]
},
"execution_count": 183,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question_gen.invoke({\"question\": question})"
]
},
{
"cell_type": "code",
"execution_count": 190,
"id": "32667424",
"metadata": {},
"outputs": [],
"source": [
"from langchain.utilities import DuckDuckGoSearchAPIWrapper\n",
"\n",
"\n",
"search = DuckDuckGoSearchAPIWrapper(max_results=4)\n",
"\n",
"def retriever(query):\n",
" return search.run(query)"
]
},
{
"cell_type": "code",
"execution_count": 191,
"id": "ffc28c91",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'This includes content about former President Donald Trump. According to further tests, ChatGPT successfully wrote poems admiring all recent U.S. presidents, but failed when we entered a query for ... On Wednesday, a Twitter user posted screenshots of him asking OpenAI\\'s chatbot, ChatGPT, to write a positive poem about former President Donald Trump, to which the chatbot declined, citing it ... While impressive in many respects, ChatGPT also has some major flaws. ... [President\\'s Name],\" refused to write a poem about ex-President Trump, but wrote one about President Biden ... During the Trump administration, Altman gained new attention as a vocal critic of the president. It was against that backdrop that he was rumored to be considering a run for California governor.'"
]
},
"execution_count": 191,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"retriever(question)"
]
},
{
"cell_type": "code",
"execution_count": 192,
"id": "00c77443",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"Will Douglas Heaven March 3, 2023 Stephanie Arnett/MITTR | Envato When OpenAI launched ChatGPT, with zero fanfare, in late November 2022, the San Francisco-based artificial-intelligence company... ChatGPT, which stands for Chat Generative Pre-trained Transformer, is a large language model -based chatbot developed by OpenAI and launched on November 30, 2022, which enables users to refine and steer a conversation towards a desired length, format, style, level of detail, and language. ChatGPT is an artificial intelligence (AI) chatbot built on top of OpenAI's foundational large language models (LLMs) like GPT-4 and its predecessors. This chatbot has redefined the standards of... June 4, 2023 ⋅ 4 min read 124 SHARES 13K At the end of 2022, OpenAI introduced the world to ChatGPT. Since its launch, ChatGPT hasn't shown significant signs of slowing down in developing new...\""
]
},
"execution_count": 192,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"retriever(question_gen.invoke({\"question\": question}))"
]
},
{
"cell_type": "code",
"execution_count": 193,
"id": "b257bc06",
"metadata": {},
"outputs": [],
"source": [
"# response_prompt_template = \"\"\"You are an expert of world knowledge. I am going to ask you a question. Your response should be comprehensive and not contradicted with the following context if they are relevant. Otherwise, ignore them if they are not relevant.\n",
"\n",
"# {normal_context}\n",
"# {step_back_context}\n",
"\n",
"# Original Question: {question}\n",
"# Answer:\"\"\"\n",
"# response_prompt = ChatPromptTemplate.from_template(response_prompt_template)"
]
},
{
"cell_type": "code",
"execution_count": 203,
"id": "f48c65b2",
"metadata": {},
"outputs": [],
"source": [
"from langchain import hub\n",
"\n",
"response_prompt = hub.pull(\"langchain-ai/stepback-answer\")"
]
},
{
"cell_type": "code",
"execution_count": 204,
"id": "97a6d5ab",
"metadata": {},
"outputs": [],
"source": [
"chain = {\n",
" # Retrieve context using the normal question\n",
" \"normal_context\": RunnableLambda(lambda x: x['question']) | retriever,\n",
" # Retrieve context using the step-back question\n",
" \"step_back_context\": question_gen | retriever,\n",
" # Pass on the question\n",
" \"question\": lambda x: x[\"question\"]\n",
"} | response_prompt | ChatOpenAI(temperature=0) | StrOutputParser()"
]
},
{
"cell_type": "code",
"execution_count": 205,
"id": "ce554cb0",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"No, ChatGPT was not around while Donald Trump was president. ChatGPT was launched on November 30, 2022, which is after Donald Trump's presidency. The context provided mentions that during the Trump administration, Altman, the CEO of OpenAI, gained attention as a vocal critic of the president. This suggests that ChatGPT was not developed or available during that time.\""
]
},
"execution_count": 205,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke({\"question\": question})"
]
},
{
"cell_type": "markdown",
"id": "a9fb8dd2",
"metadata": {},
"source": [
"## Baseline"
]
},
{
"cell_type": "code",
"execution_count": 206,
"id": "00db8a15",
"metadata": {},
"outputs": [],
"source": [
"response_prompt_template = \"\"\"You are an expert of world knowledge. I am going to ask you a question. Your response should be comprehensive and not contradicted with the following context if they are relevant. Otherwise, ignore them if they are not relevant.\n",
"\n",
"{normal_context}\n",
"\n",
"Original Question: {question}\n",
"Answer:\"\"\"\n",
"response_prompt = ChatPromptTemplate.from_template(response_prompt_template)"
]
},
{
"cell_type": "code",
"execution_count": 207,
"id": "06335ebb",
"metadata": {},
"outputs": [],
"source": [
"chain = {\n",
" # Retrieve context using the normal question (only the first 3 results)\n",
" \"normal_context\": RunnableLambda(lambda x: x['question']) | retriever,\n",
" # Pass on the question\n",
" \"question\": lambda x: x[\"question\"]\n",
"} | response_prompt | ChatOpenAI(temperature=0) | StrOutputParser()"
]
},
{
"cell_type": "code",
"execution_count": 208,
"id": "15e0e741",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"Yes, ChatGPT was around while Donald Trump was president. However, it is important to note that the specific context you provided mentions that ChatGPT refused to write a positive poem about former President Donald Trump. This suggests that while ChatGPT was available during Trump's presidency, it may have had limitations or biases in its responses regarding him.\""
]
},
"execution_count": 208,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke({\"question\": question})"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e7b9e5d6",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -13,6 +13,7 @@ cp -r . ../_dist
cd ../_dist
poetry run python scripts/model_feat_table.py
poetry run nbdoc_build --srcdir docs
cp ../cookbook/README.md src/pages/cookbook.mdx
poetry run python scripts/generate_api_reference_links.py
yarn install
yarn start

File diff suppressed because one or more lines are too long

View File

@@ -107,7 +107,7 @@
"# Now let's try with fallbacks to Anthropic\n",
"with patch('openai.ChatCompletion.create', side_effect=RateLimitError()):\n",
" try:\n",
" print(llm.invoke(\"Why did the the chicken cross the road?\"))\n",
" print(llm.invoke(\"Why did the chicken cross the road?\"))\n",
" except:\n",
" print(\"Hit error\")"
]
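For context on the snippet above: `llm` is a model with fallbacks attached, so when the patched OpenAI call raises a `RateLimitError`, a backup model is tried instead. A minimal sketch of constructing such a chain, assuming the `with_fallbacks` runnable method shown elsewhere in these docs:

```python
from langchain.chat_models import ChatAnthropic, ChatOpenAI

# Disable retries so a RateLimitError surfaces immediately and the fallback fires.
openai_llm = ChatOpenAI(max_retries=0)
anthropic_llm = ChatAnthropic()
llm = openai_llm.with_fallbacks([anthropic_llm])
```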

View File

@@ -0,0 +1,119 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Custom generator functions\n",
"\n",
"You can use generator functions (ie. functions that use the `yield` keyword, and behave like iterators) in a LCEL pipeline.\n",
"\n",
"The signature of these generators should be `Iterator[Input] -> Iterator[Output]`. Or for async generators: `AsyncIterator[Input] -> AsyncIterator[Output]`.\n",
"\n",
"These are useful for:\n",
"- implementing a custom output parser\n",
"- modifying the output of a previous step, while preserving streaming capabilities\n",
"\n",
"Let's implement a custom output parser for comma-separated lists."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"lion, tiger, wolf, gorilla, panda\n"
]
}
],
"source": [
"from typing import Iterator, List\n",
"\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.prompts.chat import ChatPromptTemplate\n",
"from langchain.schema.output_parser import StrOutputParser\n",
"\n",
"\n",
"prompt = ChatPromptTemplate.from_template(\n",
" \"Write a comma-separated list of 5 animals similar to: {animal}\"\n",
")\n",
"model = ChatOpenAI(temperature=0.0)\n",
"\n",
"\n",
"str_chain = prompt | model | StrOutputParser()\n",
"\n",
"print(str_chain.invoke({\"animal\": \"bear\"}))\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# This is a custom parser that splits an iterator of llm tokens\n",
"# into a list of strings separated by commas\n",
"def split_into_list(input: Iterator[str]) -> Iterator[List[str]]:\n",
" # hold partial input until we get a comma\n",
" buffer = \"\"\n",
" for chunk in input:\n",
" # add current chunk to buffer\n",
" buffer += chunk\n",
" # while there are commas in the buffer\n",
" while \",\" in buffer:\n",
" # split buffer on comma\n",
" comma_index = buffer.index(\",\")\n",
" # yield everything before the comma\n",
" yield [buffer[:comma_index].strip()]\n",
" # save the rest for the next iteration\n",
" buffer = buffer[comma_index + 1 :]\n",
" # yield the last chunk\n",
" yield [buffer.strip()]\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['lion', 'tiger', 'wolf', 'gorilla', 'panda']\n"
]
}
],
"source": [
"list_chain = str_chain | split_into_list\n",
"\n",
"print(list_chain.invoke({\"animal\": \"bear\"}))\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
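The notebook above only shows the synchronous variant. A minimal sketch of the async counterpart, under the `AsyncIterator[Input] -> AsyncIterator[Output]` signature mentioned in the introduction, could look like this (the commented usage assumes the same `str_chain` as above):

```python
from typing import AsyncIterator, List


# Async counterpart of split_into_list: AsyncIterator[str] -> AsyncIterator[List[str]]
async def asplit_into_list(input: AsyncIterator[str]) -> AsyncIterator[List[str]]:
    buffer = ""
    async for chunk in input:
        buffer += chunk
        while "," in buffer:
            comma_index = buffer.index(",")
            # Emit each completed item as soon as its trailing comma arrives
            yield [buffer[:comma_index].strip()]
            buffer = buffer[comma_index + 1 :]
    # Emit whatever remains after the stream ends
    yield [buffer.strip()]


# Usage sketch (inside an async context):
# list_chain = str_chain | asplit_into_list
# print(await list_chain.ainvoke({"animal": "bear"}))
```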

View File

@@ -346,7 +346,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
"version": "3.9.1"
}
},
"nbformat": 4,

File diff suppressed because it is too large Load Diff

View File

@@ -109,7 +109,7 @@
"# Now let's try with fallbacks to Anthropic\n",
"with patch('openai.ChatCompletion.create', side_effect=RateLimitError()):\n",
" try:\n",
" print(llm.invoke(\"Why did the the chicken cross the road?\"))\n",
" print(llm.invoke(\"Why did the chicken cross the road?\"))\n",
" except:\n",
" print(\"Hit error\")"
]

View File

@@ -148,7 +148,7 @@
"\n",
"Inference speed is a challenge when running models locally (see above).\n",
"\n",
"To minimize latency, it is desiable to run models locally on GPU, which ships with many consumer laptops [e.g., Apple devices](https://www.apple.com/newsroom/2022/06/apple-unveils-m2-with-breakthrough-performance-and-capabilities/).\n",
"To minimize latency, it is desirable to run models locally on GPU, which ships with many consumer laptops [e.g., Apple devices](https://www.apple.com/newsroom/2022/06/apple-unveils-m2-with-breakthrough-performance-and-capabilities/).\n",
"\n",
"And even with GPU, the available GPU memory bandwidth (as noted above) is important.\n",
"\n",
@@ -254,7 +254,7 @@
"\n",
"`f16_kv`: whether the model should use half-precision for the key/value cache\n",
"* Value: True\n",
"* Meaning: The model will use half-precision, which can be more memory efficient; Metal only support True."
"* Meaning: The model will use half-precision, which can be more memory efficient; Metal only supports True."
]
},
{
@@ -291,7 +291,7 @@
"id": "f56f5168",
"metadata": {},
"source": [
"The console log will show the the below to indicate Metal was enabled properly from steps above:\n",
"The console log will show the below to indicate Metal was enabled properly from steps above:\n",
"```\n",
"ggml_metal_init: allocating\n",
"ggml_metal_init: using MPS\n",

View File

@@ -229,7 +229,7 @@
"- fasttext (recommended)\n",
"- langdetect\n",
"\n",
"From our exprience *fasttext* performs a bit better, but you should verify it on your use case."
"From our experience *fasttext* performs a bit better, but you should verify it on your use case."
]
},
{
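As an illustration of the two libraries compared above, here is a sketch assuming `langdetect` and `fasttext` are installed and that fasttext's pretrained `lid.176.ftz` language-identification model has been downloaded:

```python
# langdetect: pure Python, no model download required.
from langdetect import detect

print(detect("Ceci est un texte français."))  # -> 'fr'

# fasttext: generally more accurate, but needs the pretrained model file.
import fasttext

model = fasttext.load_model("lid.176.ftz")  # path to the downloaded model
labels, probs = model.predict("Ceci est un texte français.")
print(labels[0], probs[0])  # -> '__label__fr' plus a confidence score
```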

View File

@@ -21,7 +21,7 @@
"\n",
"In this notebook, we will look at building a basic system for question answering, based on private data. Before feeding the LLM with this data, we need to protect it so that it doesn't go to an external API (e.g. OpenAI, Anthropic). Then, after receiving the model output, we would like the data to be restored to its original form. Below you can observe an example flow of this QA system:\n",
"\n",
"<img src=\"/img/qa_privacy_protection.png\" width=\"800\"/>\n",
"<img src=\"/img/qa_privacy_protection.png\" width=\"900\"/>\n",
"\n",
"\n",
"In the following notebook, we will not go into the details of how the anonymizer works. If you are interested, please visit [this part of the documentation](https://python.langchain.com/docs/guides/privacy/presidio_data_anonymization/).\n",
@@ -839,6 +839,8 @@
"metadata": {},
"outputs": [],
"source": [
"documents = [Document(page_content=document_content)]\n",
"\n",
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)\n",
"chunks = text_splitter.split_documents(documents)\n",
"\n",

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,114 @@
{
"cells": [
{
"cell_type": "markdown",
"source": [
"# GigaChat\n",
"This notebook shows how to use LangChain with [GigaChat](https://developers.sber.ru/portal/products/gigachat).\n",
"To use you need to install ```gigachat``` python package."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# !pip install gigachat"
]
},
{
"cell_type": "markdown",
"source": [
"To get GigaChat credentials you need to [create account](https://developers.sber.ru/studio/login) and [get access to API](https://developers.sber.ru/docs/ru/gigachat/api/integration)\n",
"## Example"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 9,
"outputs": [],
"source": [
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ['GIGACHAT_CREDENTIALS'] = getpass()"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 10,
"outputs": [],
"source": [
"from langchain.chat_models import GigaChat\n",
"\n",
"chat = GigaChat(verify_ssl_certs=False)"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 31,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"What do you get when you cross a goat and a skunk? A smelly goat!\n"
]
}
],
"source": [
"from langchain.schema import SystemMessage, HumanMessage\n",
"\n",
"messages = [\n",
" SystemMessage(\n",
" content=\"You are a helpful AI that shares everything you know. Talk in English.\"\n",
" ),\n",
" HumanMessage(\n",
" content=\"Tell me a joke\"\n",
" ),\n",
"]\n",
"\n",
"print(chat(messages).content)"
],
"metadata": {
"collapsed": false
}
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 0
}

View File

@@ -5,9 +5,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# GCP Vertex AI \n",
"# Google Cloud Vertex AI \n",
"\n",
"Note: This is seperate from the Google PaLM integration. Google has chosen to offer an enterprise version of PaLM through GCP, and this supports the models made available through there. \n",
"Note: This is separate from the Google PaLM integration. Google has chosen to offer an enterprise version of PaLM through GCP, and this supports the models made available through there. \n",
"\n",
"By default, Google Cloud [does not use](https://cloud.google.com/vertex-ai/docs/generative-ai/data-governance#foundation_model_development) Customer Data to train its foundation models as part of Google Cloud`s AI/ML Privacy Commitment. More details about how Google processes data can also be found in [Google's Customer Data Processing Addendum (CDPA)](https://cloud.google.com/terms/data-processing-addendum).\n",
"\n",
@@ -31,7 +31,7 @@
},
"outputs": [],
"source": [
"#!pip install langchain google-cloud-aiplatform"
"#!pip install langchain google-cloud-aiplatform\n"
]
},
{
@@ -41,7 +41,7 @@
"outputs": [],
"source": [
"from langchain.chat_models import ChatVertexAI\n",
"from langchain.prompts import ChatPromptTemplate"
"from langchain.prompts import ChatPromptTemplate\n"
]
},
{
@@ -50,7 +50,7 @@
"metadata": {},
"outputs": [],
"source": [
"chat = ChatVertexAI()"
"chat = ChatVertexAI()\n"
]
},
{
@@ -64,7 +64,7 @@
"prompt = ChatPromptTemplate.from_messages(\n",
" [(\"system\", system), (\"human\", human)]\n",
")\n",
"messages = prompt.format_messages()"
"messages = prompt.format_messages()\n"
]
},
{
@@ -84,7 +84,7 @@
}
],
"source": [
"chat(messages)"
"chat(messages)\n"
]
},
{
@@ -104,7 +104,7 @@
"human = \"{text}\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [(\"system\", system), (\"human\", human)]\n",
")"
")\n"
]
},
{
@@ -127,7 +127,7 @@
"chain = prompt | chat\n",
"chain.invoke(\n",
" {\"input_language\": \"English\", \"output_language\": \"Japanese\", \"text\": \"I love programming\"}\n",
")"
")\n"
]
},
{
@@ -161,7 +161,7 @@
" model_name=\"codechat-bison\",\n",
" max_output_tokens=1000,\n",
" temperature=0.5\n",
")"
")\n"
]
},
{
@@ -189,7 +189,7 @@
],
"source": [
"# For simple string in string out usage, we can use the `predict` method:\n",
"print(chat.predict(\"Write a Python function to identify all prime numbers\"))"
"print(chat.predict(\"Write a Python function to identify all prime numbers\"))\n"
]
},
{
@@ -209,7 +209,7 @@
"source": [
"import asyncio\n",
"# import nest_asyncio\n",
"# nest_asyncio.apply()"
"# nest_asyncio.apply()\n"
]
},
{
@@ -237,7 +237,7 @@
" top_k=40,\n",
")\n",
"\n",
"asyncio.run(chat.agenerate([messages]))"
"asyncio.run(chat.agenerate([messages]))\n"
]
},
{
@@ -257,7 +257,7 @@
}
],
"source": [
"asyncio.run(chain.ainvoke({\"input_language\": \"English\", \"output_language\": \"Sanskrit\", \"text\": \"I love programming\"}))"
"asyncio.run(chain.ainvoke({\"input_language\": \"English\", \"output_language\": \"Sanskrit\", \"text\": \"I love programming\"}))\n"
]
},
{
@@ -275,7 +275,7 @@
"metadata": {},
"outputs": [],
"source": [
"import sys"
"import sys\n"
]
},
{
@@ -310,7 +310,7 @@
"messages = prompt.format_messages()\n",
"for chunk in chat.stream(messages):\n",
" sys.stdout.write(chunk.content)\n",
" sys.stdout.flush()"
" sys.stdout.flush()\n"
]
}
],

View File

@@ -0,0 +1,160 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Tencent Hunyuan\n",
"\n",
"Hunyuan chat model API by Tencent. For more information, see [https://cloud.tencent.com/document/product/1729](https://cloud.tencent.com/document/product/1729)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"ExecuteTime": {
"end_time": "2023-10-19T10:20:38.718834Z",
"start_time": "2023-10-19T10:20:38.264050Z"
}
},
"outputs": [],
"source": [
"from langchain.chat_models import ChatHunyuan\n",
"from langchain.schema import HumanMessage"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"ExecuteTime": {
"end_time": "2023-10-19T10:19:53.529876Z",
"start_time": "2023-10-19T10:19:53.526210Z"
}
},
"outputs": [],
"source": [
"chat = ChatHunyuan(\n",
" hunyuan_app_id='YOUR_APP_ID',\n",
" hunyuan_secret_id='YOUR_SECRET_ID',\n",
" hunyuan_secret_key='YOUR_SECRET_KEY',\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"ExecuteTime": {
"end_time": "2023-10-19T10:19:56.054289Z",
"start_time": "2023-10-19T10:19:53.531078Z"
}
},
"outputs": [
{
"data": {
"text/plain": "AIMessage(content=\"J'aime programmer.\")"
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chat([\n",
" HumanMessage(content='You are a helpful assistant that translates English to French.Translate this sentence from English to French. I love programming.')\n",
"])"
]
},
{
"cell_type": "markdown",
"source": [
"## For ChatHunyuan with Streaming"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 2,
"outputs": [],
"source": [
"chat = ChatHunyuan(\n",
" hunyuan_app_id='YOUR_APP_ID',\n",
" hunyuan_secret_id='YOUR_SECRET_ID',\n",
" hunyuan_secret_key='YOUR_SECRET_KEY',\n",
" streaming=True,\n",
")"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-10-19T10:20:41.507720Z",
"start_time": "2023-10-19T10:20:41.496456Z"
}
}
},
{
"cell_type": "code",
"execution_count": 3,
"outputs": [
{
"data": {
"text/plain": "AIMessageChunk(content=\"J'aime programmer.\")"
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chat([\n",
" HumanMessage(content='You are a helpful assistant that translates English to French.Translate this sentence from English to French. I love programming.')\n",
"])"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-10-19T10:20:46.275673Z",
"start_time": "2023-10-19T10:20:44.241097Z"
}
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"start_time": "2023-10-19T10:19:56.233477Z"
}
}
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -0,0 +1,121 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# AliCloud PAI EAS\n",
"Machine Learning Platform for AI of Alibaba Cloud is a machine learning or deep learning engineering platform intended for enterprises and developers. It provides easy-to-use, cost-effective, high-performance, and easy-to-scale plug-ins that can be applied to various industry scenarios. With over 140 built-in optimization algorithms, Machine Learning Platform for AI provides whole-process AI engineering capabilities including data labeling (PAI-iTAG), model building (PAI-Designer and PAI-DSW), model training (PAI-DLC), compilation optimization, and inference deployment (PAI-EAS). PAI-EAS supports different types of hardware resources, including CPUs and GPUs, and features high throughput and low latency. It allows you to deploy large-scale complex models with a few clicks and perform elastic scale-ins and scale-outs in real time. It also provides a comprehensive O&M and monitoring system."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup Eas Service\n",
"\n",
"One who want to use eas llms must set up eas service first. When the eas service is launched, eas_service_rul and eas_service token can be got. Users can refer to https://www.alibabacloud.com/help/en/pai/user-guide/service-deployment/ for more information. Try to set environment variables to init eas service url and token:\n",
"\n",
"```base\n",
"export EAS_SERVICE_URL=XXX\n",
"export EAS_SERVICE_TOKEN=XXX\n",
"```\n",
"or run as follow codes:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from langchain.chat_models.base import HumanMessage\n",
"from langchain.chat_models import PaiEasChatEndpoint\n",
"os.environ[\"EAS_SERVICE_URL\"] = \"Your_EAS_Service_URL\"\n",
"os.environ[\"EAS_SERVICE_TOKEN\"] = \"Your_EAS_Service_Token\"\n",
"chat = PaiEasChatEndpoint(\n",
" eas_service_url=os.environ[\"EAS_SERVICE_URL\"], \n",
" eas_service_token=os.environ[\"EAS_SERVICE_TOKEN\"]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Run Chat Model\n",
"You can use the default settings to call eas service as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"output = chat([HumanMessage(content=\"write a funny joke\")])\n",
"print(\"output:\", output)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Or, call eas service with new inference params:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"kwargs = {\"temperature\": 0.8, \"top_p\": 0.8, \"top_k\": 5}\n",
"output = chat([HumanMessage(content=\"write a funny joke\")], **kwargs)\n",
"print(\"output:\", output)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Or, run a stream call to get a stream response:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"\n",
"outputs = chat.stream([HumanMessage(content=\"hi\")], streaming=True)\n",
"for output in outputs:\n",
" print(\"stream output:\", output)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -52,9 +52,9 @@
"id": "8533ab63-d437-492a-aaec-ccca31167bf2",
"metadata": {},
"source": [
"## 1. Select dataset\n",
"## 1. Select a dataset\n",
"\n",
"This notebook fine-tunes a model directly on a selecting which runs to fine-tune on. You will often curate these from traced runs. You can learn more about LangSmith datasets in the docs [docs](https://docs.smith.langchain.com/evaluation/datasets).\n",
"This notebook fine-tunes a model directly on selecting which runs to fine-tune on. You will often curate these from traced runs. You can learn more about LangSmith datasets in the docs [docs](https://docs.smith.langchain.com/evaluation/datasets).\n",
"\n",
"For the sake of this tutorial, we will upload an existing dataset here that you can use."
]

View File

@@ -5,7 +5,7 @@
"id": "735455a6-f82e-4252-b545-27385ef883f4",
"metadata": {},
"source": [
" Telegram\n",
"# Telegram\n",
"\n",
"This notebook shows how to use the Telegram chat loader. This class helps map exported Telegram conversations to LangChain chat messages.\n",
"\n",

View File

@@ -7,7 +7,7 @@
"source": [
"# WhatsApp\n",
"\n",
"This notebook shows how to use the WhatsApp chat loader. This class helps map exported Telegram conversations to LangChain chat messages.\n",
"This notebook shows how to use the WhatsApp chat loader. This class helps map exported WhatsApp conversations to LangChain chat messages.\n",
"\n",
"The process has three steps:\n",
"1. Export the chat conversations to computer\n",

View File

@@ -49,7 +49,7 @@
"metadata": {},
"source": [
"`BibtexLoader` has these arguments:\n",
"- `file_path`: the path the the `.bib` bibtex file\n",
"- `file_path`: the path of the `.bib` bibtex file\n",
"- optional `max_docs`: default=None, i.e. not limit. Use it to limit number of retrieved documents.\n",
"- optional `max_content_chars`: default=4000. Use it to limit the number of characters in a single document.\n",
"- optional `load_extra_meta`: default=False. By default only the most important fields from the bibtex entries: `Published` (publication year), `Title`, `Authors`, `Summary`, `Journal`, `Keywords`, and `URL`. If True, it will also try to load return `entry_id`, `note`, `doi`, and `links` fields. \n",

View File

@@ -6,30 +6,7 @@
"source": [
"# DeepInfra\n",
"\n",
"`DeepInfra` provides [several LLMs](https://deepinfra.com/models).\n",
"\n",
"This notebook goes over how to use Langchain with [DeepInfra](https://deepinfra.com)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Imports"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"from langchain.llms import DeepInfra\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain.chains import LLMChain"
"[DeepInfra](https://deepinfra.com/?utm_source=langchain) is a serverless inference as a service that provides access to a [variety of LLMs](https://deepinfra.com/models?utm_source=langchain) and [embeddings models](https://deepinfra.com/models?type=embeddings&utm_source=langchain). This notebook goes over how to use LangChain with DeepInfra for language models."
]
},
{
@@ -45,7 +22,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 6,
"metadata": {
"tags": []
},
@@ -68,12 +45,14 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 7,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"DEEPINFRA_API_TOKEN\"] = DEEPINFRA_API_TOKEN"
]
},
@@ -87,11 +66,13 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
"llm = DeepInfra(model_id=\"databricks/dolly-v2-12b\")\n",
"from langchain.llms import DeepInfra\n",
"\n",
"llm = DeepInfra(model_id=\"meta-llama/Llama-2-70b-chat-hf\")\n",
"llm.model_kwargs = {\n",
" \"temperature\": 0.7,\n",
" \"repetition_penalty\": 1.2,\n",
@@ -100,6 +81,51 @@
"}"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'This is a question that has puzzled many people'"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# run inferences directly via wrapper\n",
"llm(\"Who let the dogs out?\")"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Will\n",
" Smith\n",
"."
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# run streaming inference\n",
"for chunk in llm.stream(\"Who let the dogs out?\"):\n",
" print(chunk)"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -110,10 +136,12 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"\n",
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
@@ -130,10 +158,12 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains import LLMChain\n",
"\n",
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
]
},
@@ -147,16 +177,16 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"Penguins live in the Southern hemisphere.\\nThe North pole is located in the Northern hemisphere.\\nSo, first you need to turn the penguin South.\\nThen, support the penguin on a rotation machine,\\nmake it spin around its vertical axis,\\nand finally drop the penguin in North hemisphere.\\nNow, you have a penguin in the north pole!\\n\\nStill didn't understand?\\nWell, you're a failure as a teacher.\""
"\"Penguins are found in Antarctica and the surrounding islands, which are located at the southernmost tip of the planet. The North Pole is located at the northernmost tip of the planet, and it would be a long journey for penguins to get there. In fact, penguins don't have the ability to fly or migrate over such long distances. So, no, penguins cannot reach the North Pole. \""
]
},
"execution_count": 8,
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
@@ -166,6 +196,13 @@
"\n",
"llm_chain.run(question)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -184,7 +221,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.11.5"
},
"vscode": {
"interpreter": {

View File

@@ -0,0 +1,113 @@
{
"cells": [
{
"cell_type": "markdown",
"source": [
"# GigaChat\n",
"This notebook shows how to use LangChain with [GigaChat](https://developers.sber.ru/portal/products/gigachat).\n",
"To use you need to install ```gigachat``` python package."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# !pip install gigachat"
]
},
{
"cell_type": "markdown",
"source": [
"To get GigaChat credentials you need to [create account](https://developers.sber.ru/studio/login) and [get access to API](https://developers.sber.ru/docs/ru/gigachat/api/integration)\n",
"## Example"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 1,
"outputs": [],
"source": [
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ['GIGACHAT_CREDENTIALS'] = getpass()"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 2,
"outputs": [],
"source": [
"from langchain.llms import GigaChat\n",
"\n",
"llm = GigaChat(verify_ssl_certs=False)"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 3,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The capital of Russia is Moscow.\n"
]
}
],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain.chains import LLMChain\n",
"\n",
"template = \"What is capital of {country}?\"\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"country\"])\n",
"\n",
"llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
"\n",
"generated = llm_chain.run(country=\"Russia\")\n",
"print(generated)"
],
"metadata": {
"collapsed": false
}
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 0
}

View File

@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# GCP Vertex AI\n",
"# Google Cloud Vertex AI\n",
"\n",
"**Note:** This is separate from the `Google PaLM` integration, it exposes [Vertex AI PaLM API](https://cloud.google.com/vertex-ai/docs/generative-ai/learn/overview) on `Google Cloud`. \n"
]
@@ -41,7 +41,7 @@
},
"outputs": [],
"source": [
"#!pip install langchain google-cloud-aiplatform"
"#!pip install langchain google-cloud-aiplatform\n"
]
},
{
@@ -50,7 +50,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import VertexAI"
"from langchain.llms import VertexAI\n"
]
},
{
@@ -74,7 +74,7 @@
],
"source": [
"llm = VertexAI()\n",
"print(llm(\"What are some of the pros and cons of Python as a programming language?\"))"
"print(llm(\"What are some of the pros and cons of Python as a programming language?\"))\n"
]
},
{
@@ -90,7 +90,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate"
"from langchain.prompts import PromptTemplate\n"
]
},
{
@@ -102,7 +102,7 @@
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"prompt = PromptTemplate.from_template(template)"
"prompt = PromptTemplate.from_template(template)\n"
]
},
{
@@ -111,7 +111,7 @@
"metadata": {},
"outputs": [],
"source": [
"chain = prompt | llm"
"chain = prompt | llm\n"
]
},
{
@@ -130,7 +130,7 @@
],
"source": [
"question = \"Who was the president in the year Justin Beiber was born?\"\n",
"print(chain.invoke({\"question\": question}))"
"print(chain.invoke({\"question\": question}))\n"
]
},
{
@@ -159,7 +159,7 @@
},
"outputs": [],
"source": [
"llm = VertexAI(model_name=\"code-bison\", max_output_tokens=1000, temperature=0.3)"
"llm = VertexAI(model_name=\"code-bison\", max_output_tokens=1000, temperature=0.3)\n"
]
},
{
@@ -168,7 +168,7 @@
"metadata": {},
"outputs": [],
"source": [
"question = \"Write a python function that checks if a string is a valid email address\""
"question = \"Write a python function that checks if a string is a valid email address\"\n"
]
},
{
@@ -193,7 +193,7 @@
}
],
"source": [
"print(llm(question))"
"print(llm(question))\n"
]
},
{
@@ -223,7 +223,7 @@
],
"source": [
"result = llm.generate([question])\n",
"result.generations"
"result.generations\n"
]
},
{
@@ -243,7 +243,7 @@
"source": [
"# If running in a Jupyter notebook you'll need to install nest_asyncio\n",
"\n",
"# !pip install nest_asyncio"
"# !pip install nest_asyncio\n"
]
},
{
@@ -254,7 +254,7 @@
"source": [
"import asyncio\n",
"# import nest_asyncio\n",
"# nest_asyncio.apply()"
"# nest_asyncio.apply()\n"
]
},
{
@@ -274,7 +274,7 @@
}
],
"source": [
"asyncio.run(llm.agenerate([question]))"
"asyncio.run(llm.agenerate([question]))\n"
]
},
{
@@ -292,7 +292,7 @@
"metadata": {},
"outputs": [],
"source": [
"import sys"
"import sys\n"
]
},
{
@@ -337,7 +337,7 @@
"source": [
"for chunk in llm.stream(question):\n",
" sys.stdout.write(chunk)\n",
" sys.stdout.flush()"
" sys.stdout.flush()\n"
]
},
{
@@ -360,7 +360,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import VertexAIModelGarden"
"from langchain.llms import VertexAIModelGarden\n"
]
},
{
@@ -372,7 +372,7 @@
"llm = VertexAIModelGarden(\n",
" project=\"YOUR PROJECT\",\n",
" endpoint_id=\"YOUR ENDPOINT_ID\"\n",
")"
")\n"
]
},
{
@@ -381,7 +381,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(llm(\"What is the meaning of life?\"))"
"print(llm(\"What is the meaning of life?\"))\n"
]
},
{
@@ -397,7 +397,7 @@
"metadata": {},
"outputs": [],
"source": [
"prompt = PromptTemplate.from_template(\"What is the meaning of {thing}?\")"
"prompt = PromptTemplate.from_template(\"What is the meaning of {thing}?\")\n"
]
},
{
@@ -407,7 +407,7 @@
"outputs": [],
"source": [
"chian = prompt | llm\n",
"print(chain.invoke({\"thing\": \"life\"}))"
"print(chain.invoke({\"thing\": \"life\"}))\n"
]
}
],

View File

@@ -7,7 +7,7 @@
"source": [
"# JSONFormer\n",
"\n",
"[JSONFormer](https://github.com/1rgs/jsonformer) is a library that wraps local HuggingFace pipeline models for structured decoding of a subset of the JSON Schema.\n",
"[JSONFormer](https://github.com/1rgs/jsonformer) is a library that wraps local Hugging Face pipeline models for structured decoding of a subset of the JSON Schema.\n",
"\n",
"It works by filling in the structure tokens and then sampling the content tokens from the model.\n",
"\n",
@@ -31,7 +31,7 @@
"id": "66bd89f1-8daa-433d-bb8f-5b0b3ae34b00",
"metadata": {},
"source": [
"### HuggingFace Baseline\n",
"### Hugging Face Baseline\n",
"\n",
"First, let's establish a qualitative baseline by checking the output of the model without structured decoding."
]

View File

@@ -0,0 +1,93 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# AliCloud PAI EAS\n",
"Machine Learning Platform for AI of Alibaba Cloud is a machine learning or deep learning engineering platform intended for enterprises and developers. It provides easy-to-use, cost-effective, high-performance, and easy-to-scale plug-ins that can be applied to various industry scenarios. With over 140 built-in optimization algorithms, Machine Learning Platform for AI provides whole-process AI engineering capabilities including data labeling (PAI-iTAG), model building (PAI-Designer and PAI-DSW), model training (PAI-DLC), compilation optimization, and inference deployment (PAI-EAS). PAI-EAS supports different types of hardware resources, including CPUs and GPUs, and features high throughput and low latency. It allows you to deploy large-scale complex models with a few clicks and perform elastic scale-ins and scale-outs in real time. It also provides a comprehensive O&M and monitoring system."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms.pai_eas_endpoint import PaiEasEndpoint\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain.chains import LLMChain\n",
"\n",
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"One who want to use eas llms must set up eas service first. When the eas service is launched, eas_service_rul and eas_service token can be got. Users can refer to https://www.alibabacloud.com/help/en/pai/user-guide/service-deployment/ for more information,"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"EAS_SERVICE_URL\"] = \"Your_EAS_Service_URL\"\n",
"os.environ[\"EAS_SERVICE_TOKEN\"] = \"Your_EAS_Service_Token\"\n",
"llm = PaiEasEndpoint(eas_service_url=os.environ[\"EAS_SERVICE_URL\"], eas_service_token=os.environ[\"EAS_SERVICE_TOKEN\"])"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' Thank you for asking! However, I must respectfully point out that the question contains an error. Justin Bieber was born in 1994, and the Super Bowl was first played in 1967. Therefore, it is not possible for any NFL team to have won the Super Bowl in the year Justin Bieber was born.\\n\\nI hope this clarifies things! If you have any other questions, please feel free to ask.'"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
"\n",
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"llm_chain.run(question)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -82,6 +82,15 @@
"]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example to initialize with external boto3 session\n",
"\n",
"### for cross account scenarios"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -92,7 +101,77 @@
"source": [
"from typing import Dict\n",
"\n",
"from langchain.prompts import PromptTemplate\nfrom langchain.llms import SagemakerEndpoint\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain.llms import SagemakerEndpoint\n",
"from langchain.llms.sagemaker_endpoint import LLMContentHandler\n",
"from langchain.chains.question_answering import load_qa_chain\n",
"import json\n",
"import boto3\n",
"\n",
"query = \"\"\"How long was Elizabeth hospitalized?\n",
"\"\"\"\n",
"\n",
"prompt_template = \"\"\"Use the following pieces of context to answer the question at the end.\n",
"\n",
"{context}\n",
"\n",
"Question: {question}\n",
"Answer:\"\"\"\n",
"PROMPT = PromptTemplate(\n",
" template=prompt_template, input_variables=[\"context\", \"question\"]\n",
")\n",
"\n",
"roleARN = 'arn:aws:iam::123456789:role/cross-account-role'\n",
"sts_client = boto3.client('sts')\n",
"response = sts_client.assume_role(RoleArn=roleARN, \n",
" RoleSessionName='CrossAccountSession')\n",
"\n",
"client = boto3.client(\n",
" \"sagemaker-runtime\",\n",
" region_name=\"us-west-2\", \n",
" aws_access_key_id=response['Credentials']['AccessKeyId'],\n",
" aws_secret_access_key=response['Credentials']['SecretAccessKey'],\n",
" aws_session_token = response['Credentials']['SessionToken']\n",
")\n",
"\n",
"class ContentHandler(LLMContentHandler):\n",
" content_type = \"application/json\"\n",
" accepts = \"application/json\"\n",
"\n",
" def transform_input(self, prompt: str, model_kwargs: Dict) -> bytes:\n",
" input_str = json.dumps({prompt: prompt, **model_kwargs})\n",
" return input_str.encode(\"utf-8\")\n",
"\n",
" def transform_output(self, output: bytes) -> str:\n",
" response_json = json.loads(output.read().decode(\"utf-8\"))\n",
" return response_json[0][\"generated_text\"]\n",
"\n",
"\n",
"content_handler = ContentHandler()\n",
"\n",
"chain = load_qa_chain(\n",
" llm=SagemakerEndpoint(\n",
" endpoint_name=\"endpoint-name\",\n",
" client=client,\n",
" model_kwargs={\"temperature\": 1e-10},\n",
" content_handler=content_handler,\n",
" ),\n",
" prompt=PROMPT,\n",
")\n",
"\n",
"chain({\"input_documents\": docs, \"question\": query}, return_only_outputs=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from typing import Dict\n",
"\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain.llms import SagemakerEndpoint\n",
"from langchain.llms.sagemaker_endpoint import LLMContentHandler\n",
"from langchain.chains.question_answering import load_qa_chain\n",
"import json\n",

View File

@@ -30,7 +30,6 @@ Access PaLM chat models like `chat-bison` and `codechat-bison` via Google Cloud.
from langchain.chat_models import ChatVertexAI
```
## Document Loader
### Google BigQuery
@@ -51,7 +50,7 @@ from langchain.document_loaders import BigQueryLoader
### Google Cloud Storage
>[Google Cloud Storage](https://en.wikipedia.org/wiki/Google_Cloud_Storage) is a managed service for storing unstructured data.
> [Google Cloud Storage](https://en.wikipedia.org/wiki/Google_Cloud_Storage) is a managed service for storing unstructured data.
First, we need to install the `google-cloud-storage` python package.
@@ -74,11 +73,11 @@ from langchain.document_loaders import GCSFileLoader
### Google Drive
>[Google Drive](https://en.wikipedia.org/wiki/Google_Drive) is a file storage and synchronization service developed by Google.
> [Google Drive](https://en.wikipedia.org/wiki/Google_Drive) is a file storage and synchronization service developed by Google.
Currently, only `Google Docs` are supported.
First, we need to install several python package.
First, we need to install several python packages.
```bash
pip install google-api-python-client google-auth-httplib2 google-auth-oauthlib
@@ -91,10 +90,11 @@ from langchain.document_loaders import GoogleDriveLoader
```
## Vector Store
### Google Vertex AI MatchingEngine
### Google Vertex AI Vector Search
> [Google Vertex AI Matching Engine](https://cloud.google.com/vertex-ai/docs/matching-engine/overview) provides
> the industry's leading high-scale low latency vector database. These vector databases are commonly
> [Google Vertex AI Vector Search](https://cloud.google.com/vertex-ai/docs/matching-engine/overview),
> formerly known as Vertex AI Matching Engine, provides the industry's leading high-scale
> low latency vector database. These vector databases are commonly
> referred to as vector similarity-matching or an approximate nearest neighbor (ANN) service.
We need to install several python packages.
@@ -181,14 +181,28 @@ There exists a `GoogleSearchAPIWrapper` utility which wraps this API. To import
```python
from langchain.utilities import GoogleSearchAPIWrapper
```
For a more detailed walkthrough of this wrapper, see [this notebook](/docs/integrations/tools/google_search.html).
We can easily load this wrapper as a Tool (to use with an Agent). We can do this with:
```python
from langchain.agents import load_tools
tools = load_tools(["google-search"])
```
### Google Places
See a [usage example](/docs/integrations/tools/google_places).
```
pip install googlemaps
```
```python
from langchain.tools import GooglePlacesTool
```
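
A minimal usage sketch, assuming the `GPLACES_API_KEY` environment variable holds a Google Places API key (verify the variable name against the usage example linked above; the query string is just an illustration):

```python
import os

from langchain.tools import GooglePlacesTool

os.environ["GPLACES_API_KEY"] = "<your-api-key>"  # assumed env var name

places = GooglePlacesTool()
print(places.run("coffee shops near Mountain View"))
```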
## Document Transformer
### Google Document AI
@@ -216,3 +230,40 @@ See a [usage example](/docs/integrations/document_transformers/docai).
from langchain.document_loaders.blob_loaders import Blob
from langchain.document_loaders.parsers import DocAIParser
```
## Chat loaders
### Gmail
> [Gmail](https://en.wikipedia.org/wiki/Gmail) is a free email service provided by Google.
First, we need to install several python packages.
```bash
pip install --upgrade google-auth google-auth-oauthlib google-auth-httplib2 google-api-python-client
```
See a [usage example and authorizing instructions](/docs/integrations/chat_loaders/gmail).
```python
from langchain.chat_loaders.gmail import GMailLoader
```
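
A minimal sketch, assuming you have already completed Google's OAuth flow per the authorizing instructions linked above; the `token.json` path and the `n` argument (capping the number of loaded emails) are assumptions to verify against the loader's signature:

```python
from google.oauth2.credentials import Credentials

from langchain.chat_loaders.gmail import GMailLoader

# Load previously authorized user credentials (token.json is produced by
# Google's standard OAuth flow; see the linked authorizing instructions).
creds = Credentials.from_authorized_user_file("token.json")

loader = GMailLoader(creds=creds, n=3)  # load up to 3 emails
chat_sessions = loader.load()
```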
## Agents and Toolkits
### Gmail
See a [usage example and authorizing instructions](/docs/integrations/toolkits/gmail).
```python
from langchain.agents.agent_toolkits import GmailToolkit
toolkit = GmailToolkit()
```
### Google Drive
See a [usage example and authorizing instructions](/docs/integrations/toolkits/google_drive).
```python
from langchain_googledrive.utilities.google_drive import GoogleDriveAPIWrapper
from langchain_googledrive.tools.google_drive.tool import GoogleDriveSearchTool
```

View File

@@ -1,6 +1,6 @@
# Microsoft
All functionality related to Microsoft Azure
All functionality related to `Microsoft Azure` and other `Microsoft` products.
## LLM
### Azure OpenAI
@@ -161,3 +161,59 @@ See a [usage example](/docs/integrations/retrievers/azure_cognitive_search).
from langchain.retrievers import AzureCognitiveSearchRetriever
```
## Utilities
### Bing Search API
See a [usage example](/docs/integrations/tools/bing_search).
```python
from langchain.utilities import BingSearchAPIWrapper
```
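
A minimal sketch, assuming the wrapper is configured through the `BING_SUBSCRIPTION_KEY` and `BING_SEARCH_URL` environment variables (the endpoint shown is the standard Bing Web Search v7 endpoint; verify both names against the usage example):

```python
import os

from langchain.utilities import BingSearchAPIWrapper

os.environ["BING_SUBSCRIPTION_KEY"] = "<your-subscription-key>"
os.environ["BING_SEARCH_URL"] = "https://api.bing.microsoft.com/v7.0/search"

search = BingSearchAPIWrapper(k=3)  # return the top 3 results
print(search.run("Azure OpenAI Service"))
```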
## Toolkits
### Azure Cognitive Services
We need to install several python packages.
```bash
pip install azure-ai-formrecognizer azure-cognitiveservices-speech azure-ai-vision
```
See a [usage example](/docs/integrations/toolkits/azure_cognitive_services).
```python
from langchain.agents.agent_toolkits import AzureCognitiveServicesToolkit
```
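
A minimal sketch, assuming the toolkit reads its Cognitive Services key, endpoint, and region from the `AZURE_COGS_KEY`, `AZURE_COGS_ENDPOINT`, and `AZURE_COGS_REGION` environment variables (treat these names as assumptions to verify against the usage example):

```python
import os

from langchain.agents.agent_toolkits import AzureCognitiveServicesToolkit

os.environ["AZURE_COGS_KEY"] = "<your-key>"  # assumed env var names
os.environ["AZURE_COGS_ENDPOINT"] = "<your-endpoint>"
os.environ["AZURE_COGS_REGION"] = "<your-region>"

toolkit = AzureCognitiveServicesToolkit()
# List the tools the toolkit exposes (form recognition, speech, image analysis, ...)
print([tool.name for tool in toolkit.get_tools()])
```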
### Microsoft Office 365 email and calendar
We need to install the `O365` python package.
```bash
pip install O365
```
See a [usage example](/docs/integrations/toolkits/office365).
```python
from langchain.agents.agent_toolkits import O365Toolkit
```
### Microsoft Azure PowerBI
We need to install the `azure-identity` python package.
```bash
pip install azure-identity
```
See a [usage example](/docs/integrations/toolkits/powerbi).
```python
from langchain.agents.agent_toolkits import PowerBIToolkit
from langchain.utilities.powerbi import PowerBIDataset
```
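
A minimal sketch, assuming an Azure AD identity with access to your Power BI dataset; the dataset id, table name, and the use of `ChatOpenAI` as the agent LLM are placeholders and assumptions, not part of an official example:

```python
from azure.identity import DefaultAzureCredential

from langchain.agents.agent_toolkits import PowerBIToolkit
from langchain.chat_models import ChatOpenAI
from langchain.utilities.powerbi import PowerBIDataset

# Wrap a Power BI dataset; DefaultAzureCredential picks up your Azure identity.
powerbi = PowerBIDataset(
    dataset_id="<dataset-id>",  # placeholder
    table_names=["<table-name>"],  # placeholder
    credential=DefaultAzureCredential(),
)

toolkit = PowerBIToolkit(powerbi=powerbi, llm=ChatOpenAI(temperature=0))
tools = toolkit.get_tools()
```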

View File

@@ -10,16 +10,27 @@ It is broken into two parts: installation and setup, and then references to spec
## Available Models
DeepInfra provides a range of Open Source LLMs ready for deployment.
You can list supported models [here](https://deepinfra.com/models?type=text-generation).
You can list supported models for
[text-generation](https://deepinfra.com/models?type=text-generation) and
[embeddings](https://deepinfra.com/models?type=embeddings).
google/flan\* models can be viewed [here](https://deepinfra.com/models?type=text2text-generation).
You can view a list of request and response parameters [here](https://deepinfra.com/databricks/dolly-v2-12b#API)
You can view a [list of request and response parameters](https://deepinfra.com/meta-llama/Llama-2-70b-chat-hf/api).
## Wrappers
### LLM
There exists a DeepInfra LLM wrapper, which you can access with
```python
from langchain.llms import DeepInfra
```
### Embeddings
There is also a DeepInfra Embeddings wrapper, which you can access with
```python
from langchain.embeddings import DeepInfraEmbeddings
```
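
A minimal sketch, assuming the `DEEPINFRA_API_TOKEN` environment variable and a text-generation model id picked from the list above (treat the exact model name as an assumption):

```python
import os

from langchain.llms import DeepInfra

os.environ["DEEPINFRA_API_TOKEN"] = "<your-api-token>"

# Any model id from https://deepinfra.com/models?type=text-generation
llm = DeepInfra(model_id="meta-llama/Llama-2-70b-chat-hf")
print(llm("What is the capital of France?"))
```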

View File

@@ -1,14 +1,14 @@
# Dingo
# DingoDB
This page covers how to use the Dingo ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Dingo wrappers.
This page covers how to use the DingoDB ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific DingoDB wrappers.
## Installation and Setup
- Install the Python SDK with `pip install dingodb`
## VectorStore
There exists a wrapper around Dingo indexes, allowing you to use it as a vectorstore,
There exists a wrapper around DingoDB indexes, allowing you to use it as a vectorstore,
whether for semantic search or example selection.
To import this vectorstore:
@@ -16,4 +16,4 @@ To import this vectorstore:
from langchain.vectorstores import Dingo
```
For a more detailed walkthrough of the Dingo wrapper, see [this notebook](/docs/integrations/vectorstores/dingo.html)
For a more detailed walkthrough of the DingoDB wrapper, see [this notebook](/docs/integrations/vectorstores/dingo.html)

View File

@@ -1,28 +0,0 @@
# Google Document AI
>[Document AI](https://cloud.google.com/document-ai/docs/overview) is a `Google Cloud Platform`
> service to transform unstructured data from documents into structured data, making it easier
> to understand, analyze, and consume.
## Installation and Setup
You need to set up a [`GCS` bucket and create your own OCR processor](https://cloud.google.com/document-ai/docs/create-processor)
The `GCS_OUTPUT_PATH` should be a path to a folder on GCS (starting with `gs://`)
and a processor name should look like `projects/PROJECT_NUMBER/locations/LOCATION/processors/PROCESSOR_ID`.
You can get it either programmatically or copy from the `Prediction endpoint` section of the `Processor details`
tab in the Google Cloud Console.
```bash
pip install google-cloud-documentai
pip install google-cloud-documentai-toolbox
```
## Document Transformer
See a [usage example](/docs/integrations/document_transformers/docai).
```python
from langchain.document_loaders.blob_loaders import Blob
from langchain.document_loaders.parsers import DocAIParser
```

View File

@@ -1,4 +1,4 @@
# Google Serper
# Serper - Google Search API
This page covers how to use the [Serper](https://serper.dev) Google Search API within LangChain. Serper is a low-cost Google Search API that can be used to add answer box, knowledge graph, and organic results data from Google Search.
It is broken into two parts: setup, and then references to the specific Google Serper wrapper.

View File

@@ -0,0 +1,29 @@
# Salute Devices
Salute Devices provides the GigaChat family of large language models.
For more information on how to get access to GigaChat, [follow this guide](https://developers.sber.ru/docs/ru/gigachat/api/integration).
## Installation and Setup
The GigaChat package can be installed via pip from PyPI:
```bash
pip install gigachat
```
## LLMs
See a [usage example](/docs/integrations/llms/gigachat).
```python
from langchain.llms import GigaChat
```
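
A minimal sketch mirroring the usage example notebook: credentials are read from the `GIGACHAT_CREDENTIALS` environment variable, and `verify_ssl_certs=False` matches the notebook (enable certificate verification in production):

```python
import os
from getpass import getpass

from langchain.llms import GigaChat

os.environ["GIGACHAT_CREDENTIALS"] = getpass()

llm = GigaChat(verify_ssl_certs=False)
print(llm("What is the capital of Russia?"))
```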
## Chat models
See a [usage example](/docs/integrations/chat/gigachat).
```python
from langchain.chat_models import GigaChat
```

View File

@@ -1,272 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Google Cloud Enterprise Search\n",
"\n",
"\n",
"[Enterprise Search](https://cloud.google.com/enterprise-search) is a part of the Generative AI App Builder suite of tools offered by Google Cloud.\n",
"\n",
"Gen AI App Builder lets developers, even those with limited machine learning skills, quickly and easily tap into the power of Googles foundation models, search expertise, and conversational AI technologies to create enterprise-grade generative AI applications. \n",
"\n",
"Enterprise Search lets organizations quickly build generative AI powered search engines for customers and employees.Enterprise Search is underpinned by a variety of Google Search technologies, including semantic search, which helps deliver more relevant results than traditional keyword-based search techniques by using natural language processing and machine learning techniques to infer relationships within the content and intent from the users query input. Enterprise Search also benefits from Googles expertise in understanding how users search and factors in content relevance to order displayed results. \n",
"\n",
"Google Cloud offers Enterprise Search via Gen App Builder in Google Cloud Console and via an API for enterprise workflow integration. \n",
"\n",
"This notebook demonstrates how to configure Enterprise Search and use the Enterprise Search retriever. The Enterprise Search retriever encapsulates the [Generative AI App Builder Python client library](https://cloud.google.com/generative-ai-app-builder/docs/libraries#client-libraries-install-python) and uses it to access the Enterprise Search [Search Service API](https://cloud.google.com/python/docs/reference/discoveryengine/latest/google.cloud.discoveryengine_v1beta.services.search_service)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Install pre-requisites\n",
"\n",
"You need to install the `google-cloud-discoverengine` package to use the Enterprise Search retriever."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! pip install google-cloud-discoveryengine"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Configure access to Google Cloud and Google Cloud Enterprise Search\n",
"\n",
"Enterprise Search is generally available for the allowlist (which means customers need to be approved for access) as of June 6, 2023. Contact your Google Cloud sales team for access and pricing details. We are previewing additional features that are coming soon to the generally available offering as part of our [Trusted Tester](https://cloud.google.com/ai/earlyaccess/join?hl=en) program. Sign up for [Trusted Tester](https://cloud.google.com/ai/earlyaccess/join?hl=en) and contact your Google Cloud sales team for an expedited trial.\n",
"\n",
"Before you can run this notebook you need to:\n",
"- Set or create a Google Cloud project and turn on Gen App Builder\n",
"- Create and populate an unstructured data store\n",
"- Set credentials to access `Enterprise Search API`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set or create a Google Cloud poject and turn on Gen App Builder\n",
"\n",
"Follow the instructions in the [Enterprise Search Getting Started guide](https://cloud.google.com/generative-ai-app-builder/docs/before-you-begin) to set/create a GCP project and enable Gen App Builder.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create and populate an unstructured data store\n",
"\n",
"[Use Google Cloud Console to create an unstructured data store](https://cloud.google.com/generative-ai-app-builder/docs/create-engine-es#unstructured-data) and populate it with the example PDF documents from the `gs://cloud-samples-data/gen-app-builder/search/alphabet-investor-pdfs` Cloud Storage folder. Make sure to use the `Cloud Storage (without metadata)` option."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set credentials to access Enterprise Search API\n",
"\n",
"The [Gen App Builder client libraries](https://cloud.google.com/generative-ai-app-builder/docs/libraries) used by the Enterprise Search retriever provide high-level language support for authenticating to Gen App Builder programmatically. Client libraries support [Application Default Credentials (ADC)](https://cloud.google.com/docs/authentication/application-default-credentials); the libraries look for credentials in a set of defined locations and use those credentials to authenticate requests to the API. With ADC, you can make credentials available to your application in a variety of environments, such as local development or production, without needing to modify your application code.\n",
"\n",
"If running in [Google Colab](https://colab.google) authenticate with `google.colab.google.auth` otherwise follow one of the [supported methods](https://cloud.google.com/docs/authentication/application-default-credentials) to make sure that you Application Default Credentials are properly set."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sys\n",
"\n",
"if \"google.colab\" in sys.modules:\n",
" from google.colab import auth as google_auth\n",
"\n",
" google_auth.authenticate_user()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Configure and use the Enterprise Search retriever\n",
"\n",
"The Enterprise Search retriever is implemented in the `langchain.retriever.GoogleCloudEntepriseSearchRetriever` class. The `get_relevant_documents` method returns a list of `langchain.schema.Document` documents where the `page_content` field of each document is populated the document content.\n",
"Depending on the data type used in Enterprise search (structured or unstructured) the `page_content` field is populated as follows:\n",
"- Structured data source: either an `extractive segment` or an `extractive answer` that matches a query. The `metadata` field is populated with metadata (if any) of the document from which the segments or answers were extracted.\n",
"- Unstructured data source: a string json containing all the fields returned from the structured data source. The `metadata` field is populated with metadata (if any) of the document \n",
"\n",
"### Only for Unstructured data sources:\n",
"An extractive answer is verbatim text that is returned with each search result. It is extracted directly from the original document. Extractive answers are typically displayed near the top of web pages to provide an end user with a brief answer that is contextually relevant to their query. Extractive answers are available for website and unstructured search.\n",
"\n",
"An extractive segment is verbatim text that is returned with each search result. An extractive segment is usually more verbose than an extractive answer. Extractive segments can be displayed as an answer to a query, and can be used to perform post-processing tasks and as input for large language models to generate answers or new text. Extractive segments are available for unstructured search.\n",
"\n",
"For more information about extractive segments and extractive answers refer to [product documentation](https://cloud.google.com/generative-ai-app-builder/docs/snippets).\n",
"\n",
"When creating an instance of the retriever you can specify a number of parameters that control which Enterprise data store to access and how a natural language query is processed, including configurations for extractive answers and segments.\n",
"\n",
"\n",
"### The mandatory parameters are:\n",
"\n",
"- `project_id` - Your Google Cloud PROJECT_ID\n",
"- `search_engine_id` - The ID of the data store you want to use. \n",
"\n",
"The `project_id` and `search_engine_id` parameters can be provided explicitly in the retriever's constructor or through the environment variables - `PROJECT_ID` and `SEARCH_ENGINE_ID`.\n",
"\n",
"You can also configure a number of optional parameters, including:\n",
"\n",
"- `max_documents` - The maximum number of documents used to provide extractive segments or extractive answers\n",
"- `get_extractive_answers` - By default, the retriever is configured to return extractive segments. Set this field to `True` to return extractive answers. This is used only when `engine_data_type` set to 0 (unstructured) \n",
"- `max_extractive_answer_count` - The maximum number of extractive answers returned in each search result.\n",
" At most 5 answers will be returned. This is used only when `engine_data_type` set to 0 (unstructured) \n",
"- `max_extractive_segment_count` - The maximum number of extractive segments returned in each search result.\n",
" Currently one segment will be returned. This is used only when `engine_data_type` set to 0 (unstructured) \n",
"- `filter` - The filter expression that allows you filter the search results based on the metadata associated with the documents in the searched data store. \n",
"- `query_expansion_condition` - Specification to determine under which conditions query expansion should occur.\n",
" 0 - Unspecified query expansion condition. In this case, server behavior defaults to disabled.\n",
" 1 - Disabled query expansion. Only the exact search query is used, even if SearchResponse.total_size is zero.\n",
" 2 - Automatic query expansion built by the Search API.\n",
"- `engine_data_type` - Defines the enterprise search data type\n",
" 0 - Unstructured data \n",
" 1 - Structured data\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Configure and use the retriever for **unstructured** data with extractve segments "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.retrievers import GoogleCloudEnterpriseSearchRetriever\n",
"\n",
"PROJECT_ID = \"<YOUR PROJECT ID>\" # Set to your Project ID\n",
"SEARCH_ENGINE_ID = \"<YOUR SEARCH ENGINE ID>\" # Set to your data store ID"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"retriever = GoogleCloudEnterpriseSearchRetriever(\n",
" project_id=PROJECT_ID,\n",
" search_engine_id=SEARCH_ENGINE_ID,\n",
" max_documents=3,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"query = \"What are Alphabet's Other Bets?\"\n",
"\n",
"result = retriever.get_relevant_documents(query)\n",
"for doc in result:\n",
" print(doc)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Configure and use the retriever for **unstructured** data with extractve answers "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"retriever = GoogleCloudEnterpriseSearchRetriever(\n",
" project_id=PROJECT_ID,\n",
" search_engine_id=SEARCH_ENGINE_ID,\n",
" max_documents=3,\n",
" max_extractive_answer_count=3,\n",
" get_extractive_answers=True,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"query = \"What are Alphabet's Other Bets?\"\n",
"\n",
"result = retriever.get_relevant_documents(query)\n",
"for doc in result:\n",
" print(doc)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Configure and use the retriever for **structured** data with extractve answers "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"retriever = GoogleCloudEnterpriseSearchRetriever(\n",
" project_id=PROJECT_ID,\n",
" search_engine_id=SEARCH_ENGINE_ID,\n",
" max_documents=3,\n",
" engine_data_type=1\n",
")\n",
"\n",
"result = retriever.get_relevant_documents(query)\n",
"for doc in result:\n",
" print(doc)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.10"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -30,7 +30,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install google-cloud-discoveryengine"
"! pip install google-cloud-discoveryengine\n"
]
},
{
@@ -80,7 +80,7 @@
"if \"google.colab\" in sys.modules:\n",
" from google.colab import auth as google_auth\n",
"\n",
" google_auth.authenticate_user()"
" google_auth.authenticate_user()\n"
]
},
{
@@ -90,12 +90,13 @@
"## Configure and use the Vertex AI Search retriever\n",
"\n",
"The Vertex AI Search retriever is implemented in the `langchain.retriever.GoogleVertexAISearchRetriever` class. The `get_relevant_documents` method returns a list of `langchain.schema.Document` documents where the `page_content` field of each document is populated the document content.\n",
"Depending on the data type used in Vertex AI Search (structured or unstructured) the `page_content` field is populated as follows:\n",
"Depending on the data type used in Vertex AI Search (website, structured or unstructured) the `page_content` field is populated as follows:\n",
"\n",
"- Structured data source: either an `extractive segment` or an `extractive answer` that matches a query. The `metadata` field is populated with metadata (if any) of the document from which the segments or answers were extracted.\n",
"- Unstructured data source: a string json containing all the fields returned from the structured data source. The `metadata` field is populated with metadata (if any) of the document\n",
"- Website with advanced indexing: an `extractive answer` that matches a query. The `metadata` field is populated with metadata (if any) of the document from which the segments or answers were extracted.\n",
"- Unstructured data source: either an `extractive segment` or an `extractive answer` that matches a query. The `metadata` field is populated with metadata (if any) of the document from which the segments or answers were extracted.\n",
"- Structured data source: a string json containing all the fields returned from the structured data source. The `metadata` field is populated with metadata (if any) of the document\n",
"\n",
"### Only for Unstructured data sources:\n",
"### Extractive answers & extractive segments\n",
"\n",
"An extractive answer is verbatim text that is returned with each search result. It is extracted directly from the original document. Extractive answers are typically displayed near the top of web pages to provide an end user with a brief answer that is contextually relevant to their query. Extractive answers are available for website and unstructured search.\n",
"\n",
@@ -136,6 +137,7 @@
"- `engine_data_type` - Defines the Vertex AI Search data type\n",
" - `0` - Unstructured data\n",
" - `1` - Structured data\n",
" - `2` - Website data with [Advanced Website Indexing](https://cloud.google.com/generative-ai-app-builder/docs/about-advanced-features#advanced-website-indexing)\n",
"\n",
"### Migration guide for `GoogleCloudEnterpriseSearchRetriever`\n",
"\n",
@@ -165,7 +167,7 @@
"\n",
"PROJECT_ID = \"<YOUR PROJECT ID>\" # Set to your Project ID\n",
"LOCATION_ID = \"<YOUR LOCATION>\" # Set to your data store location\n",
"DATA_STORE_ID = \"<YOUR DATA STORE ID>\" # Set to your data store ID"
"DATA_STORE_ID = \"<YOUR DATA STORE ID>\" # Set to your data store ID\n"
]
},
{
@@ -179,7 +181,7 @@
" location_id=LOCATION_ID,\n",
" data_store_id=DATA_STORE_ID,\n",
" max_documents=3,\n",
")"
")\n"
]
},
{
@@ -192,7 +194,7 @@
"\n",
"result = retriever.get_relevant_documents(query)\n",
"for doc in result:\n",
" print(doc)"
" print(doc)\n"
]
},
{
@@ -219,7 +221,7 @@
"\n",
"result = retriever.get_relevant_documents(query)\n",
"for doc in result:\n",
" print(doc)"
" print(doc)\n"
]
},
{
@@ -245,21 +247,44 @@
"\n",
"result = retriever.get_relevant_documents(query)\n",
"for doc in result:\n",
" print(doc)"
" print(doc)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Configure and use the retrieve for multi-turn search"
"### Configure and use the retriever for **website** data with Advanced Website Indexing\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"retriever = GoogleVertexAISearchRetriever(\n",
" project_id=PROJECT_ID,\n",
" location_id=LOCATION_ID,\n",
" data_store_id=DATA_STORE_ID,\n",
" max_documents=3,\n",
" max_extractive_answer_count=3,\n",
" get_extractive_answers=True,\n",
" engine_data_type=2,\n",
")\n",
"\n",
"result = retriever.get_relevant_documents(query)\n",
"for doc in result:\n",
" print(doc)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Search with follow-ups is [based](https://cloud.google.com/generative-ai-app-builder/docs/multi-turn-search) on generative AI models and it is different from the regular unstructured data search."
"### Configure and use the retriever for multi-turn search\n",
"\n",
"[Search with follow-ups](https://cloud.google.com/generative-ai-app-builder/docs/multi-turn-search) is based on generative AI models and it is different from the regular unstructured data search.\n"
]
},
{
@@ -276,7 +301,7 @@
"\n",
"result = retriever.get_relevant_documents(query)\n",
"for doc in result:\n",
" print(doc)"
" print(doc)\n"
]
}
],

View File

@@ -64,13 +64,13 @@
},
{
"cell_type": "code",
"execution_count": 20,
"execution_count": 3,
"id": "b4d4d386-2a6b-4942-863e-9202f5a9f1d6",
"metadata": {},
"outputs": [],
"source": [
"from langchain.retrievers import KayAiRetriever\n",
"import os\n",
"from langchain.retrievers import KayAiRetriever\n",
"from kay.rag.retrievers import KayRetriever\n",
"os.environ[\"KAY_API_KEY\"] = KAY_API_KEY\n",
"retriever = KayAiRetriever.create(dataset_id=\"company\", data_types=[\"10-K\", \"10-Q\", \"PressRelease\"], num_contexts=3)\n",
@@ -79,19 +79,19 @@
},
{
"cell_type": "code",
"execution_count": 21,
"execution_count": 4,
"id": "04ee2d6b-c2ab-4e15-8a8b-afaf6ef8c0f6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Company Name: ROKU INC\\nCompany Industry: CABLE & OTHER PAY TELEVISION SERVICES\\nArticle Title: Roku and FreeWheel Announce Strategic Partnership to Bring Rokus Leading Ad Tech to FreeWheel Customers\\nText: Additionally, eMarketer Link: https://cts.businesswire.com/ct/CT?id=smartlink&url=https%3A%2F%2Fwww.insiderintelligence.com%2Finsights%2Favod-more-than-50-percent-of-us-digital-video-viewers%2F&esheet=53451144&newsitemid=20230712907788&lan=en-US&anchor=eMarketer&index=4&md5=b64dea72bcf6b6379474462602781d83 projects 57% of U.S. digital video users will stream an advertising-based video on demand (AVOD) service this year.\\nHaving solutions aimed at driving greater interoperability and automation will help accelerate this growth.\\nKey highlights of this collaboration include:\\nStreamlined Integration: Roku has now integrated its demand application programming interface (dAPI) with FreeWheel s TV platform. Roku s demand API gives publishers direct, automatic and real-time access to more advertiser demand. This enhanced integration allows for streamlined ad operation workflows and better inventory quality control, both of which will improve publisher yield and revenue.\\nSeamless Data Targeting: Publishers can now use Roku platform signals to enable advertisers to target audiences and measure campaign performance without relying on cookies. Additionally, FreeWheel and Roku will rely on data clean room technology to enable the activation of additional data sets providing better measurement and monetization to publishers and agencies.', metadata={'_additional': {'id': '962b79e0-f9d1-43ae-9f7a-8a9b42bc7a9a'}, 'chunk_type': 'text', 'chunk_years_mentioned': [], 'company_name': 'ROKU INC', 'company_sic_code_description': 'CABLE & OTHER PAY TELEVISION SERVICES', 'data_source': 'PressRelease', 'data_source_link': 'https://www.nasdaq.com/press-release/roku-and-freewheel-announce-strategic-partnership-to-bring-rokus-leading-ad-tech-to', 'data_source_publish_date': '2023-07-12T00:00:00Z', 'data_source_uid': 'a46f309c-705d-3946-96db-87aa4e73261f', 'title': 'ROKU INC | Roku and FreeWheel Announce Strategic Partnership to Bring Rokus Leading Ad Tech to FreeWheel Customers'}),\n",
" Document(page_content='Company Name: ROKU INC \\n Company Industry: CABLE & OTHER PAY TELEVISION SERVICES \\n Form Title: 10-K 2022-FY \\n Form Section: Risk Factors \\n Text: nd the Note Regarding Forward Looking Statements.This section of this Annual Report generally discusses fiscal years 2022 and 2021 and year to year comparisons between those years.Discussions of fiscal year 2020 and year to year comparisons between fiscal years 2021 and 2020 that are not included in this Annual Report can be found in Management\\'s Discussion and Analysis of Financial Condition and Results of Operations in Part II, Item 7 of our Annual Report for the fiscal year ended December 31, 2021 filed with the SEC on February 18, 2022.Overview Effective as of the fourth quarter of fiscal 2022, we reorganized our reportable segments to better align with management\\'s reporting of information reviewed by the Chief Operating Decision Maker (\"CODM\") for each segment.We renamed our \"player\" segment to \"devices\" which now includes our licensing arrangements with service operators and licensed Roku TV partners in addition to sales of our streaming players, audio products, smart home products and Roku branded TVs that will be designed, made, and sold by us in 2023.Our historical segment information is recast to conform to our new presentation in our financial statements and accompanying notes included in Item 8 of this Annual Report.Our two reportable segments are the platform segment and the devices segment.', metadata={'_additional': {'id': 'a76c5fed-5d63-45a7-b63a-2c30e05140fc'}, 'chunk_type': 'text', 'chunk_years_mentioned': [2020, 2021, 2022, 2023], 'company_name': 'ROKU INC', 'company_sic_code_description': 'CABLE & OTHER PAY TELEVISION SERVICES', 'data_source': '10-K', 'data_source_link': 'https://www.sec.gov/Archives/edgar/data/1428439/000142843923000007', 'data_source_publish_date': '2022-01-01T00:00:00Z', 'data_source_uid': '0001428439-23-000007', 'title': 'ROKU INC | 10-K 2022-FY '}),\n",
" Document(page_content='Company Name: ROKU INC \\n Company Industry: CABLE & OTHER PAY TELEVISION SERVICES \\n Form Title: 10-Q 2023-Q1 \\n Form Section: Risk Factors \\n Text: Our current and potential partners include TV brands, cable and satellite companies, and telecommunication providers.Under these license arrangements, we generally have limited or no control over the amount and timing of resources these entities dedicate to the relationship.In the past, our licensed Roku TV partners have failed to meet their forecasts and anticipated market launch dates for distributing Roku TV models, and they may fail to meet their forecasts or such launches in the future.If our licensed Roku TV partners or service operator partners fail to meet their forecasts or such launches for distributing licensed streaming devices or choose to deploy competing streaming solutions within their product lines, our business may be harmed.We depend on a small number of content publishers for a majority of our streaming hours, and if we fail to maintain these relationships, our business could be harmed.*Historically, a small number of content publishers have accounted for a significant portion of the hours streamed on our platform.In the three months ended March 31, 2023, the top three streaming services represented over 50% of all hours streamed in the period.If, for any reason, we cease distributing channels that have historically streamed a large percentage of the aggregate streaming hours on our platform, our streaming hours, our active accounts, or Roku streaming device sales may be adversely affected, and our business may be harmed.', metadata={'_additional': {'id': '2a92b2bb-02a0-4e15-8b64-d7e04078a205'}, 'chunk_type': 'text', 'chunk_years_mentioned': [2023], 'company_name': 'ROKU INC', 'company_sic_code_description': 'CABLE & OTHER PAY TELEVISION SERVICES', 'data_source': '10-Q', 'data_source_link': 'https://www.sec.gov/Archives/edgar/data/1428439/000142843923000017', 'data_source_publish_date': '2023-01-01T00:00:00Z', 'data_source_uid': '0001428439-23-000017', 'title': 'ROKU INC | 10-Q 2023-Q1 '})]"
"[Document(page_content='Company Name: ROKU INC\\nCompany Industry: CABLE & OTHER PAY TELEVISION SERVICES\\nArticle Title: Roku Is One of Fast Company\\'s Most Innovative Companies for 2023\\nText: The company launched several new devices, including the Roku Voice Remote Pro; upgraded its most premium player, the Roku Ultra; and expanded its products with a new line of smart home devices such as video doorbells, lights, and plugs integrated into the Roku ecosystem. Recently, the company announced it will launch Roku-branded TVs this spring to offer more choice and innovation to both consumers and Roku TV partners. Throughout 2022, Roku also updated its operating system (OS), the only OS purpose-built for TV, with more personalization features and enhancements across search, audio, and content discovery, launching The Buzz, Sports, and What to Watch, which provides tailored movie and TV recommendations on the Home Screen Menu. The company also released a new feature for streamers, Photo Streams, that allows customers to display and share photo albums through Roku streaming devices. Additionally, Roku unveiled Shoppable Ads, a new ad innovation that makes shopping on TV streaming as easy as it is on social media. Viewers simply press \"OK\" with their Roku remote on a shoppable ad and proceed to check out with their shipping and payment details pre-populated from Roku Pay, its proprietary payments platform. Walmart was the exclusive retailer for the launch, a first-of-its-kind partnership.', metadata={'chunk_type': 'text', 'chunk_years_mentioned': [2022, 2023], 'company_name': 'ROKU INC', 'company_sic_code_description': 'CABLE & OTHER PAY TELEVISION SERVICES', 'data_source': 'PressRelease', 'data_source_link': 'https://newsroom.roku.com/press-releases', 'data_source_publish_date': '2023-03-02T09:30:00-04:00', 'data_source_uid': '963d4a81-f58e-3093-af68-987fb1758c15', 'title': \"ROKU INC | Roku Is One of Fast Company's Most Innovative Companies for 2023\"}),\n",
" Document(page_content='Company Name: ROKU INC\\nCompany Industry: CABLE & OTHER PAY TELEVISION SERVICES\\nArticle Title: Roku Is One of Fast Company\\'s Most Innovative Companies for 2023\\nText: Finally, Roku grew its content offering with thousands of apps and watching options for users, including content on The Roku Channel, a top five app by reach and engagement on the Roku platform in the U.S. in 2022. In November, Roku released its first feature film, \"WEIRD: The Weird Al\\' Yankovic Story,\" a biopic starring Daniel Radcliffe. Throughout the year, The Roku Channel added FAST channels from NBCUniversal and the National Hockey League, as well as an exclusive AMC channel featuring its signature drama \"Mad Men.\" This year, the company announced a deal with Warner Bros. Discovery, launching new channels that will include \"Westworld\" and \"The Bachelor,\" in addition to 2,000 hours of on-demand content. Read more about Roku\\'s journey here . Fast Company\\'s Most Innovative Companies issue (March/April 2023) is available online here , as well as in-app via iTunes and on newsstands beginning March 14. About Roku, Inc.\\nRoku pioneered streaming to the TV. We connect users to the streaming content they love, enable content publishers to build and monetize large audiences, and provide advertisers with unique capabilities to engage consumers. Roku streaming players and TV-related audio devices are available in the U.S. and in select countries through direct retail sales and licensing arrangements with service operators. Roku TV models are available in the U.S. and select countries through licensing arrangements with TV OEM brands.', metadata={'chunk_type': 'text', 'chunk_years_mentioned': [2022, 2023], 'company_name': 'ROKU INC', 'company_sic_code_description': 'CABLE & OTHER PAY TELEVISION SERVICES', 'data_source': 'PressRelease', 'data_source_link': 'https://newsroom.roku.com/press-releases', 'data_source_publish_date': '2023-03-02T09:30:00-04:00', 'data_source_uid': '963d4a81-f58e-3093-af68-987fb1758c15', 'title': \"ROKU INC | Roku Is One of Fast Company's Most Innovative Companies for 2023\"}),\n",
" Document(page_content='Company Name: ROKU INC\\nCompany Industry: CABLE & OTHER PAY TELEVISION SERVICES\\nArticle Title: Roku\\'s New NFL Zone Gives Fans Easy Access to NFL Games Right On Time for 2023 Season\\nText: In partnership with the NFL, the new NFL Zone offers viewers an easy way to find where to watch NFL live games Today, Roku (NASDAQ: ROKU ) and the National Football League (NFL) announced the recently launched NFL Zone within the Roku Sports experience to kick off the 2023 NFL season. This strategic partnership between Roku and the NFL marks the first official league-branded zone within Roku\\'s Sports experience. Available now, the NFL Zone offers football fans a centralized location to find live and upcoming games, so they can spend less time figuring out where to watch the game and more time rooting for their favorite teams. Users can also tune in for weekly game previews, League highlights, and additional NFL content, all within the zone. This press release features multimedia. View the full release here: In partnership with the NFL, Roku\\'s new NFL Zone offers viewers an easy way to find where to watch NFL live games (Photo: Business Wire) \"Last year we introduced the Sports experience for our highly engaged sports audience, making it simpler for Roku users to watch sports programming,\" said Gidon Katz, President, Consumer Experience, at Roku. \"As we start the biggest sports season of the year, providing easy access to NFL games and content to our millions of users is a top priority for us. We look forward to fans immersing themselves within the NFL Zone and making it their destination to find NFL games.', metadata={'chunk_type': 'text', 'chunk_years_mentioned': [2023], 'company_name': 'ROKU INC', 'company_sic_code_description': 'CABLE & OTHER PAY TELEVISION SERVICES', 'data_source': 'PressRelease', 'data_source_link': 'https://newsroom.roku.com/press-releases', 'data_source_publish_date': '2023-09-12T09:00:00-04:00', 'data_source_uid': '963d4a81-f58e-3093-af68-987fb1758c15', 'title': \"ROKU INC | Roku's New NFL Zone Gives Fans Easy Access to NFL Games Right On Time for 2023 Season\"})]"
]
},
"execution_count": 21,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}

View File

@@ -28,19 +28,29 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 11,
"id": "63a8af5b",
"metadata": {
"tags": []
},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[33mWARNING: You are using pip version 22.0.4; however, version 23.3 is available.\n",
"You should consider upgrading via the '/Users/joe/projects/elastic/langchain/libs/langchain/.venv/bin/python3 -m pip install --upgrade pip' command.\u001b[0m\u001b[33m\n",
"\u001b[0m"
]
}
],
"source": [
"#!pip install lark elasticsearch"
"#!pip install -qU lark elasticsearch"
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 1,
"id": "cb4a5787",
"metadata": {
"tags": []
@@ -60,7 +70,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 2,
"id": "bcbe04d9",
"metadata": {
"tags": []
@@ -115,7 +125,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 3,
"id": "86e34dbf",
"metadata": {
"tags": []
@@ -164,17 +174,10 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 4,
"id": "38a126e9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"query='dinosaur' filter=None limit=None\n"
]
},
{
"data": {
"text/plain": [
@@ -184,7 +187,7 @@
" Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'year': 2006, 'director': 'Satoshi Kon', 'rating': 8.6})]"
]
},
"execution_count": 10,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
@@ -196,24 +199,17 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 5,
"id": "b19d4da0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"query='women' filter=Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='director', value='Greta Gerwig') limit=None\n"
]
},
{
"data": {
"text/plain": [
"[Document(page_content='A bunch of normal-sized women are supremely wholesome and some men pine after them', metadata={'year': 2019, 'director': 'Greta Gerwig', 'rating': 8.3})]"
]
},
"execution_count": 11,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
@@ -237,7 +233,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 6,
"id": "bff36b88-b506-4877-9c63-e5a1a8d78e64",
"metadata": {
"tags": []
@@ -256,19 +252,12 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 7,
"id": "2758d229-4f97-499c-819f-888acaf8ee10",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"query='dinosaur' filter=None limit=2\n"
]
},
{
"data": {
"text/plain": [
@@ -276,7 +265,7 @@
" Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'})]"
]
},
"execution_count": 13,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
@@ -297,24 +286,17 @@
},
{
"cell_type": "code",
"execution_count": 18,
"execution_count": 8,
"id": "e460da93",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"query='animated toys' filter=Operation(operator=<Operator.AND: 'and'>, arguments=[Operation(operator=<Operator.OR: 'or'>, arguments=[Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='genre', value='animated'), Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='genre', value='comedy')]), Comparison(comparator=<Comparator.GTE: 'gte'>, attribute='year', value=1990)]) limit=None\n"
]
},
{
"data": {
"text/plain": [
"[Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'})]"
]
},
"execution_count": 18,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
@@ -325,21 +307,10 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": null,
"id": "0851fc42",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"ObjectApiResponse({'acknowledged': True})"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"outputs": [],
"source": [
"vectorstore.client.indices.delete(index=\"elasticsearch-self-query-demo\")"
]
@@ -361,7 +332,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
"version": "3.10.3"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,120 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "ab66dd43",
"metadata": {},
"source": [
"# SingleStoreDB\n",
"\n",
">[SingleStoreDB](https://singlestore.com/) is a high-performance distributed SQL database that supports deployment both in the [cloud](https://www.singlestore.com/cloud/) and on-premises. It provides vector storage, and vector functions including [dot_product](https://docs.singlestore.com/managed-service/en/reference/sql-reference/vector-functions/dot_product.html) and [euclidean_distance](https://docs.singlestore.com/managed-service/en/reference/sql-reference/vector-functions/euclidean_distance.html), thereby supporting AI applications that require text similarity matching. \n",
"\n",
"\n",
"This notebook shows how to use a retriever that uses `SingleStoreDB`.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "51b49135-a61a-49e8-869d-7c1d76794cd7",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# A connection to the database is established through the singlestoredb Python connector.\n",
"# Please ensure that this connector is installed in your working environment.\n",
"!pip install singlestoredb"
]
},
{
"cell_type": "markdown",
"id": "aaf80e7f",
"metadata": {},
"source": [
"## Create Retriever from vector store"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bcb3c8c2",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"# We want to use OpenAIEmbeddings so we have to get the OpenAI API Key.\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")\n",
"\n",
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import SingleStoreDB\n",
"from langchain.document_loaders import TextLoader\n",
"\n",
"loader = TextLoader(\"../../modules/state_of_the_union.txt\")\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()\n",
"\n",
"# Set up the connection URL as an environment variable\n",
"os.environ[\"SINGLESTOREDB_URL\"] = \"root:pass@localhost:3306/db\"\n",
"\n",
"# Load documents to the store\n",
"docsearch = SingleStoreDB.from_documents(\n",
" docs,\n",
" embeddings,\n",
" table_name=\"notebook\", # use table with a custom name\n",
")\n",
"\n",
"# create retriever from the vector store\n",
"retriever = docsearch.as_retriever(search_kwargs={\"k\": 2})"
]
},
{
"cell_type": "markdown",
"id": "fc0915db",
"metadata": {},
"source": [
"## Search with retriever"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "b605284d",
"metadata": {},
"outputs": [],
"source": [
"result = retriever.get_relevant_documents(\"What did the president say about Ketanji Brown Jackson\")\n",
"print(result[0].page_content)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
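
As a usage note beyond the notebook above: once you have the retriever, it plugs into any retrieval-aware chain. Below is a minimal sketch (not part of the notebook; it assumes the `retriever` and the OpenAI key set up in the cells above):

```python
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI

# `retriever` is the SingleStoreDB retriever created above.
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(temperature=0),
    chain_type="stuff",  # stuff the retrieved docs directly into the prompt
    retriever=retriever,
)
print(qa.run("What did the president say about Ketanji Brown Jackson?"))
```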

View File

@@ -43,7 +43,7 @@
"from langchain.embeddings import SagemakerEndpointEmbeddings\n",
"from langchain.embeddings.sagemaker_endpoint import EmbeddingsContentHandler\n",
"import json\n",
"\n",
"import boto3\n",
"\n",
"class ContentHandler(EmbeddingsContentHandler):\n",
" content_type = \"application/json\"\n",
@@ -87,7 +87,18 @@
" endpoint_name=\"huggingface-pytorch-inference-2023-03-21-16-14-03-834\",\n",
" region_name=\"us-east-1\",\n",
" content_handler=content_handler,\n",
")"
")\n",
"\n",
"\n",
"# client = boto3.client(\n",
"# \"sagemaker-runtime\",\n",
"# region_name=\"us-west-2\" \n",
"# )\n",
"# embeddings = SagemakerEndpointEmbeddings(\n",
"# endpoint_name=\"huggingface-pytorch-inference-2023-03-21-16-14-03-834\", \n",
"# client=client,\n",
"# content_handler=content_handler,\n",
"# )"
]
},
{
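
For context, either of the two configurations above yields a standard LangChain embeddings object; a minimal sketch of how it is then used (assuming the `embeddings` instance from the cell above):

```python
# Embed a single query string -> one vector (a list of floats)
query_vector = embeddings.embed_query("What did the president say?")

# Embed a batch of documents -> one vector per document
doc_vectors = embeddings.embed_documents(["first document", "second document"])

print(len(query_vector), len(doc_vectors))
```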

View File

@@ -55,7 +55,7 @@
"id": "ac5c88ce",
"metadata": {},
"source": [
"Let's add some files to the the sandbox"
"Let's add some files to the sandbox"
]
},
{

View File

@@ -0,0 +1,367 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# E2B Data Analysis\n",
"\n",
"[E2B's cloud environments](https://e2b.dev) are great runtime sandboxes for LLMs.\n",
"\n",
"E2B's Data Analysis sandbox allows for safe code execution in a sandboxed environment. This is ideal for building tools such as code interpreters, or Advanced Data Analysis like ChatGPT's.\n",
"\n",
"The E2B Data Analysis sandbox allows you to:\n",
"- Run Python code\n",
"- Generate charts via matplotlib\n",
"- Install Python packages dynamically during runtime\n",
"- Install system packages dynamically during runtime\n",
"- Run shell commands\n",
"- Upload and download files\n",
"\n",
"We'll create a simple OpenAI agent that uses E2B's Data Analysis sandbox to perform analysis on uploaded files using Python."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Get your OpenAI API key and [E2B API key here](https://e2b.dev/docs/getting-started/api-key) and set them as environment variables.\n",
"\n",
"You can find the full API documentation [here](https://e2b.dev/docs).\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You'll need to install `e2b` to get started:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install langchain e2b"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.tools import E2BDataAnalysisTool\n",
"from langchain.agents import initialize_agent, AgentType\n",
"\n",
"os.environ[\"E2B_API_KEY\"] = \"<E2B_API_KEY>\"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"<OPENAI_API_KEY>\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"When creating an instance of the `E2BDataAnalysisTool`, you can pass callbacks to listen to the output of the sandbox. This is useful, for example, when building a more responsive UI, especially in combination with streaming output from LLMs."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# Artifacts are charts created by matplotlib when `plt.show()` is called\n",
"def save_artifact(artifact):\n",
" print(\"New matplotlib chart generated:\", artifact.name)\n",
"    # Download the artifact as `bytes` and leave it up to the user to display it (on the frontend, for example)\n",
" file = artifact.download()\n",
" basename = os.path.basename(artifact.name)\n",
"\n",
" # Save the chart to the `charts` directory\n",
" with open(f\"./charts/{basename}\", \"wb\") as f:\n",
" f.write(file)\n",
"\n",
"e2b_data_analysis_tool = E2BDataAnalysisTool(\n",
" # Pass environment variables to the sandbox\n",
" env_vars={\"MY_SECRET\": \"secret_value\"},\n",
" on_stdout=lambda stdout: print(\"stdout:\", stdout),\n",
" on_stderr=lambda stderr: print(\"stderr:\", stderr),\n",
" on_artifact=save_artifact,\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Upload an example CSV data file to the sandbox so we can analyze it with our agent. You can use, for example, [this file](https://storage.googleapis.com/e2b-examples/netflix.csv) about Netflix TV shows."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"name='netflix.csv' remote_path='/home/user/netflix.csv' description='Data about Netflix tv shows including their title, category, director, release date, casting, age rating, etc.'\n"
]
}
],
"source": [
"with open(\"./netflix.csv\") as f:\n",
" remote_path = e2b_data_analysis_tool.upload_file(\n",
" file=f,\n",
" description=\"Data about Netflix tv shows including their title, category, director, release date, casting, age rating, etc.\",\n",
" )\n",
" print(remote_path)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a `Tool` object and initialize the LangChain agent."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"\n",
"\n",
"tools = [e2b_data_analysis_tool.as_tool()]\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-4\", temperature=0)\n",
"agent = initialize_agent(\n",
" tools, llm, agent=AgentType.OPENAI_FUNCTIONS, verbose=True, handle_parsing_errors=True\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can ask the agent questions about the CSV file we uploaded earlier."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\n",
"Invoking: `e2b_data_analysis` with `{'python_code': \"import pandas as pd\\n\\n# Load the data\\nnetflix_data = pd.read_csv('/home/user/netflix.csv')\\n\\n# Convert the 'release_year' column to integer\\nnetflix_data['release_year'] = netflix_data['release_year'].astype(int)\\n\\n# Filter the data for movies released between 2000 and 2010\\nfiltered_data = netflix_data[(netflix_data['release_year'] >= 2000) & (netflix_data['release_year'] <= 2010) & (netflix_data['type'] == 'Movie')]\\n\\n# Remove rows where 'duration' is not available\\nfiltered_data = filtered_data[filtered_data['duration'].notna()]\\n\\n# Convert the 'duration' column to integer\\nfiltered_data['duration'] = filtered_data['duration'].str.replace(' min','').astype(int)\\n\\n# Get the top 5 longest movies\\nlongest_movies = filtered_data.nlargest(5, 'duration')\\n\\n# Create a bar chart\\nimport matplotlib.pyplot as plt\\n\\nplt.figure(figsize=(10,5))\\nplt.barh(longest_movies['title'], longest_movies['duration'], color='skyblue')\\nplt.xlabel('Duration (minutes)')\\nplt.title('Top 5 Longest Movies on Netflix (2000-2010)')\\nplt.gca().invert_yaxis()\\nplt.savefig('/home/user/longest_movies.png')\\n\\nlongest_movies[['title', 'duration']]\"}`\n",
"\n",
"\n",
"\u001b[0mstdout: title duration\n",
"stdout: 1019 Lagaan 224\n",
"stdout: 4573 Jodhaa Akbar 214\n",
"stdout: 2731 Kabhi Khushi Kabhie Gham 209\n",
"stdout: 2632 No Direction Home: Bob Dylan 208\n",
"stdout: 2126 What's Your Raashee? 203\n",
"\u001b[36;1m\u001b[1;3m{'stdout': \" title duration\\n1019 Lagaan 224\\n4573 Jodhaa Akbar 214\\n2731 Kabhi Khushi Kabhie Gham 209\\n2632 No Direction Home: Bob Dylan 208\\n2126 What's Your Raashee? 203\", 'stderr': ''}\u001b[0m\u001b[32;1m\u001b[1;3mThe 5 longest movies on Netflix released between 2000 and 2010 are:\n",
"\n",
"1. Lagaan - 224 minutes\n",
"2. Jodhaa Akbar - 214 minutes\n",
"3. Kabhi Khushi Kabhie Gham - 209 minutes\n",
"4. No Direction Home: Bob Dylan - 208 minutes\n",
"5. What's Your Raashee? - 203 minutes\n",
"\n",
"Here is the chart showing their lengths:\n",
"\n",
"![Longest Movies](sandbox:/home/user/longest_movies.png)\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"\"The 5 longest movies on Netflix released between 2000 and 2010 are:\\n\\n1. Lagaan - 224 minutes\\n2. Jodhaa Akbar - 214 minutes\\n3. Kabhi Khushi Kabhie Gham - 209 minutes\\n4. No Direction Home: Bob Dylan - 208 minutes\\n5. What's Your Raashee? - 203 minutes\\n\\nHere is the chart showing their lengths:\\n\\n![Longest Movies](sandbox:/home/user/longest_movies.png)\""
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"What are the 5 longest movies on netflix released between 2000 and 2010? Create a chart with their lengths.\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"E2B also allows you to install both Python and system (via `apt`) packages dynamically during runtime like this:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"stdout: Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (2.1.1)\n",
"stdout: Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from pandas) (2.8.2)\n",
"stdout: Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas) (2023.3.post1)\n",
"stdout: Requirement already satisfied: numpy>=1.22.4 in /usr/local/lib/python3.10/dist-packages (from pandas) (1.26.1)\n",
"stdout: Requirement already satisfied: tzdata>=2022.1 in /usr/local/lib/python3.10/dist-packages (from pandas) (2023.3)\n",
"stdout: Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.2->pandas) (1.16.0)\n"
]
}
],
"source": [
"# Install Python package\n",
"e2b_data_analysis_tool.install_python_packages('pandas')"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Additionally, you can download any file from the sandbox like this:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"# The path is a remote path in the sandbox\n",
"files_in_bytes = e2b_data_analysis_tool.download_file('/home/user/netflix.csv')"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Lastly, you can run any shell command inside the sandbox via `run_command`."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"stderr: \n",
"stderr: WARNING: apt does not have a stable CLI interface. Use with caution in scripts.\n",
"stderr: \n",
"stdout: Hit:1 http://security.ubuntu.com/ubuntu jammy-security InRelease\n",
"stdout: Hit:2 http://archive.ubuntu.com/ubuntu jammy InRelease\n",
"stdout: Hit:3 http://archive.ubuntu.com/ubuntu jammy-updates InRelease\n",
"stdout: Hit:4 http://archive.ubuntu.com/ubuntu jammy-backports InRelease\n",
"stdout: Reading package lists...\n",
"stdout: Building dependency tree...\n",
"stdout: Reading state information...\n",
"stdout: All packages are up to date.\n",
"stdout: Reading package lists...\n",
"stdout: Building dependency tree...\n",
"stdout: Reading state information...\n",
"stdout: Suggested packages:\n",
"stdout: sqlite3-doc\n",
"stdout: The following NEW packages will be installed:\n",
"stdout: sqlite3\n",
"stdout: 0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.\n",
"stdout: Need to get 768 kB of archives.\n",
"stdout: After this operation, 1873 kB of additional disk space will be used.\n",
"stdout: Get:1 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 sqlite3 amd64 3.37.2-2ubuntu0.1 [768 kB]\n",
"stderr: debconf: delaying package configuration, since apt-utils is not installed\n",
"stdout: Fetched 768 kB in 0s (2258 kB/s)\n",
"stdout: Selecting previously unselected package sqlite3.\n",
"(Reading database ... 23999 files and directories currently installed.)\n",
"stdout: Preparing to unpack .../sqlite3_3.37.2-2ubuntu0.1_amd64.deb ...\n",
"stdout: Unpacking sqlite3 (3.37.2-2ubuntu0.1) ...\n",
"stdout: Setting up sqlite3 (3.37.2-2ubuntu0.1) ...\n",
"stdout: 3.37.2 2022-01-06 13:25:41 872ba256cbf61d9290b571c0e6d82a20c224ca3ad82971edc46b29818d5dalt1\n",
"version: 3.37.2 2022-01-06 13:25:41 872ba256cbf61d9290b571c0e6d82a20c224ca3ad82971edc46b29818d5dalt1\n",
"error: \n",
"exit code: 0\n"
]
}
],
"source": [
"# Install SQLite\n",
"e2b_data_analysis_tool.run_command(\"sudo apt update\")\n",
"e2b_data_analysis_tool.install_system_packages(\"sqlite3\")\n",
"\n",
"# Check the SQLite version\n",
"output = e2b_data_analysis_tool.run_command(\"sqlite3 --version\")\n",
"print(\"version: \", output[\"stdout\"])\n",
"print(\"error: \", output[\"stderr\"])\n",
"print(\"exit code: \", output[\"exit_code\"])"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"When your agent is finished, don't forget to close the sandbox."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"e2b_data_analysis_tool.close()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -0,0 +1,102 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Google Scholar\n",
"\n",
"This notebook goes through how to use the Google Scholar tool."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: google-search-results in /home/mohtashimkhan/mambaforge/envs/langchain/lib/python3.9/site-packages (2.4.2)\n",
"Requirement already satisfied: requests in /home/mohtashimkhan/mambaforge/envs/langchain/lib/python3.9/site-packages (from google-search-results) (2.31.0)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /home/mohtashimkhan/mambaforge/envs/langchain/lib/python3.9/site-packages (from requests->google-search-results) (3.3.0)\n",
"Requirement already satisfied: idna<4,>=2.5 in /home/mohtashimkhan/mambaforge/envs/langchain/lib/python3.9/site-packages (from requests->google-search-results) (3.4)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /home/mohtashimkhan/mambaforge/envs/langchain/lib/python3.9/site-packages (from requests->google-search-results) (1.26.17)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /home/mohtashimkhan/mambaforge/envs/langchain/lib/python3.9/site-packages (from requests->google-search-results) (2023.5.7)\n"
]
}
],
"source": [
"!pip install google-search-results"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"from langchain.tools.google_scholar import GoogleScholarQueryRun\n",
"from langchain.utilities.google_scholar import GoogleScholarAPIWrapper\n",
"import os"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Title: Large language models (LLM) and ChatGPT: what will the impact on nuclear medicine be?\\nAuthors: IL Alberts,K Shi\\nSummary: IL Alberts, L Mercolli, T Pyka, G Prenosil, K Shi… - European journal of …, 2023 - Springer\\nTotal-Citations: 28\\n\\nTitle: Dynamic Planning with a LLM\\nAuthors: G Dagan,F Keller,A Lascarides\\nSummary: G Dagan, F Keller, A Lascarides - arXiv preprint arXiv:2308.06391, 2023 - arxiv.org\\nTotal-Citations: 3\\n\\nTitle: Openagi: When llm meets domain experts\\nAuthors: Y Ge,W Hua,J Ji,J Tan,S Xu,Y Zhang\\nSummary: Y Ge, W Hua, J Ji, J Tan, S Xu, Y Zhang - arXiv preprint arXiv:2304.04370, 2023 - arxiv.org\\nTotal-Citations: 19\\n\\nTitle: Llm-planner: Few-shot grounded planning for embodied agents with large language models\\nAuthors: CH Song\\nSummary: CH Song, J Wu, C Washington… - Proceedings of the …, 2023 - openaccess.thecvf.com\\nTotal-Citations: 28\\n\\nTitle: The science of detecting llm-generated texts\\nAuthors: R Tang,YN Chuang,X Hu\\nSummary: R Tang, YN Chuang, X Hu - arXiv preprint arXiv:2303.07205, 2023 - arxiv.org\\nTotal-Citations: 23\\n\\nTitle: X-llm: Bootstrapping advanced large language models by treating multi-modalities as foreign languages\\nAuthors: F Chen,M Han,J Shi\\nSummary: F Chen, M Han, H Zhao, Q Zhang, J Shi, S Xu… - arXiv preprint arXiv …, 2023 - arxiv.org\\nTotal-Citations: 12\\n\\nTitle: 3d-llm: Injecting the 3d world into large language models\\nAuthors: Y Hong,H Zhen,P Chen,S Zheng,Y Du\\nSummary: Y Hong, H Zhen, P Chen, S Zheng, Y Du… - arXiv preprint arXiv …, 2023 - arxiv.org\\nTotal-Citations: 4\\n\\nTitle: The internal state of an llm knows when its lying\\nAuthors: A Azaria,T Mitchell\\nSummary: A Azaria, T Mitchell - arXiv preprint arXiv:2304.13734, 2023 - arxiv.org\\nTotal-Citations: 18\\n\\nTitle: LLM-Pruner: On the Structural Pruning of Large Language Models\\nAuthors: X Ma,G Fang,X Wang\\nSummary: X Ma, G Fang, X Wang - arXiv preprint arXiv:2305.11627, 2023 - arxiv.org\\nTotal-Citations: 15\\n\\nTitle: Large language models are few-shot testers: Exploring llm-based general bug reproduction\\nAuthors: S Kang,J Yoon,S Yoo\\nSummary: S Kang, J Yoon, S Yoo - 2023 IEEE/ACM 45th International …, 2023 - ieeexplore.ieee.org\\nTotal-Citations: 17'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"os.environ[\"SERP_API_KEY\"] = \"\"\n",
"tool = GoogleScholarQueryRun(api_wrapper=GoogleScholarAPIWrapper())\n",
"tool.run(\"LLM Models\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.16 ('langchain')",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "15e58ce194949b77a891bd4339ce3d86a9bd138e905926019517993f97db9e6c"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}
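
A small sketch of configuring the wrapper explicitly rather than through the `SERP_API_KEY` environment variable; note that `top_k_results` is an assumption here, so check the `GoogleScholarAPIWrapper` fields in your installed version:

```python
from langchain.tools.google_scholar import GoogleScholarQueryRun
from langchain.utilities.google_scholar import GoogleScholarAPIWrapper

# Passing the key directly; `top_k_results` (number of hits returned) is an
# assumed field name -- verify against the wrapper's docstring.
wrapper = GoogleScholarAPIWrapper(serp_api_key="<SERP_API_KEY>", top_k_results=5)
tool = GoogleScholarQueryRun(api_wrapper=wrapper)
print(tool.run("large language model agents"))
```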

View File

@@ -0,0 +1,140 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "a6f91f20",
"metadata": {},
"source": [
"# Tavily Search"
]
},
{
"cell_type": "markdown",
"id": "5e24a889",
"metadata": {},
"source": [
"Tavily Search is a robust search API tailored specifically for LLM Agents. It seamlessly integrates with diverse data sources to ensure a superior, relevant search experience.\n",
"\n",
"Set up an API key [here](https://app.tavily.com/)."
]
},
{
"cell_type": "markdown",
"id": "b50d6c92",
"metadata": {},
"source": [
"## Try it out!"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "8cc8ded6",
"metadata": {
"ExecuteTime": {
"end_time": "2023-10-21T13:15:37.974229Z",
"start_time": "2023-10-21T13:15:10.007898Z"
},
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I'm not aware of the current situation regarding the Burning Man event. I'll need to search for recent news about any flooding that might have affected it.\n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"tavily_search_results_json\",\n",
" \"action_input\": {\"query\": \"Burning Man floods latest news\"}\n",
"}\n",
"```\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m[{'url': 'https://www.theguardian.com/culture/2023/sep/03/burning-man-nevada-festival-floods', 'content': 'More on this story\\nMore on this story\\nBurning Man revelers begin exodus from festival after road reopens\\nBurning Man festival-goers trapped in desert as rain turns site to mud\\n\\nOfficials investigate death at Burning Man as thousands stranded by floods\\n\\nBurning Man festivalgoers surrounded by mud in Nevada desert video\\nBurning Man attendees roadblocked by climate activists: They have a privileged mindset\\n\\nin our favor. We will let you know. It could be sooner, and it could be later,” said an update on the Burning Man website on Saturday evening.'}, {'url': 'https://www.npr.org/2023/09/03/1197497458/the-latest-on-the-burning-man-flooding', 'content': \"National\\nThe latest on the Burning Man flooding\\nClaudia Peschiutta\\n\\nClaudia Peschiutta\\nAuthorities are investigating a death at the Burning Man festival in the Nevada desert after tens of thousands of people are stuck in camps because of rain.\\nSCOTT DETROW, HOST:\\n\\nDETROW: Well, that's NPR's Claudia Peschiutta covered and caked in a lot of mud at Burning Man. Thanks for talking to us.\\nPESCHIUTTA: Confirmed.\\nDETROW: Stay dry as much as you can.\\n\\nwith NPR's Claudia Peschiutta, who's at her first burn, and she told me it's muddy where she is, but that she and her camp family have been making the best of things.\"}, {'url': 'https://www.npr.org/2023/09/03/1197497458/the-latest-on-the-burning-man-flooding', 'content': \"National\\nThe latest on the Burning Man flooding\\nClaudia Peschiutta\\n\\nClaudia Peschiutta\\nAuthorities are investigating a death at the Burning Man festival in the Nevada desert after tens of thousands of people are stuck in camps because of rain.\\nSCOTT DETROW, HOST:\\n\\nDETROW: Well, that's NPR's Claudia Peschiutta covered and caked in a lot of mud at Burning Man. 
Thanks for talking to us.\\nPESCHIUTTA: Confirmed.\\nDETROW: Stay dry as much as you can.\\n\\nwith NPR's Claudia Peschiutta, who's at her first burn, and she told me it's muddy where she is, but that she and her camp family have been making the best of things.\"}, {'url': 'https://abcnews.go.com/US/burning-man-flooding-happened-stranded-festivalgoers/story?id=102908331', 'content': 'Tens of thousands of Burning Man attendees are now able to leave the festival after a downpour and massive flooding left them stranded over the weekend.\\n\\nIn 2013, according to a blog post in the \"Burning Man Journal,\" a rainstorm similarly rolled in, unexpectedly \"trapping 160 people on the playa overnight.\"\\n\\nABC News\\nVideo\\nLive\\nShows\\nElection 2024\\n538\\nStream on\\nBurning Man flooding: What happened to stranded festivalgoers?\\nSome 64,000 people were still on site Monday as the exodus began.\\n\\nBurning Man has been hosted for over 30 years, according to a statement from the organizers.'}, {'url': 'https://www.today.com/news/what-is-burning-man-flood-death-rcna103231', 'content': 'Tens of thousands of Burning Man festivalgoers are slowly making their way home from the Nevada desert after muddy conditions from heavy rains made it nearly impossible to leave over the weekend.\\n\\naccording to burningman.org.\\n\\nPresident Biden was notified of the situation and, according to a spokesperson, administration officials monitored and received updates on the latest details.\\nWhy are people stranded at Burning Man?\\n\\n\"Thank goodness this community knows how to take care of each other,\" the Instagram page for Burning Man Information Radio wrote on a post predicting more rain.'}]\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe latest Burning Man event was severely affected by heavy rainfall that led to flooding. This resulted in tens of thousands of festival attendees getting stuck in their camps due to the muddy conditions. As a result, the exodus from the festival was delayed. An unfortunate incident also occurred, with a death being investigated at the festival. The situation was severe enough that President Biden was informed about it and administration officials were monitoring it. However, it seems that the festival goers were able to handle the situation well, as the Burning Man community is known for looking out for each other. This is not the first time a rainstorm has disrupted the Burning Man event; a similar incident occurred in 2013 where a sudden storm trapped people overnight. \n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"Final Answer\",\n",
" \"action_input\": \"The latest Burning Man event was severely affected by heavy rainfall that led to flooding. This resulted in tens of thousands of festival attendees getting stuck in their camps due to the muddy conditions, delaying their exit from the festival. An unfortunate incident also occurred, with a death being investigated at the festival. The situation was severe enough that President Biden was informed about it and administration officials were monitoring it. However, the festival goers were able to handle the situation well, as the Burning Man community is known for looking out for each other. This is not the first time a rainstorm has disrupted the Burning Man event; a similar incident occurred in 2013 when a sudden storm trapped people overnight.\"\n",
"}\n",
"```\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The latest Burning Man event was severely affected by heavy rainfall that led to flooding. This resulted in tens of thousands of festival attendees getting stuck in their camps due to the muddy conditions, delaying their exit from the festival. An unfortunate incident also occurred, with a death being investigated at the festival. The situation was severe enough that President Biden was informed about it and administration officials were monitoring it. However, the festival goers were able to handle the situation well, as the Burning Man community is known for looking out for each other. This is not the first time a rainstorm has disrupted the Burning Man event; a similar incident occurred in 2013 when a sudden storm trapped people overnight.'"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# libraries\n",
"import os\n",
"from langchain.utilities.tavily_search import TavilySearchAPIWrapper\n",
"from langchain.agents import initialize_agent, AgentType\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.tools.tavily_search import TavilySearchResults\n",
"\n",
"# set up API key\n",
"os.environ[\"TAVILY_API_KEY\"] = \"...\"\n",
"\n",
"# set up the agent\n",
"llm = ChatOpenAI(model_name=\"gpt-4\", temperature=0.7)\n",
"search = TavilySearchAPIWrapper()\n",
"tavily_tool = TavilySearchResults(api_wrapper=search)\n",
"\n",
"# initialize the agent\n",
"agent_chain = initialize_agent(\n",
" [tavily_tool],\n",
" llm,\n",
" agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,\n",
" verbose=True,\n",
")\n",
"\n",
"# run the agent\n",
"agent_chain.run(\n",
" \"What happened in the latest burning man floods?\",\n",
")\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "86cd0a02",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
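
The tool can also be called directly, without an agent, since every LangChain tool exposes a `run` method; a minimal sketch reusing `tavily_tool` from the cell above:

```python
# Direct invocation: returns the search results with no agent reasoning on top.
results = tavily_tool.run("latest burning man floods")
print(results)
```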

View File

@@ -7,6 +7,8 @@
"source": [
"# Zapier Natural Language Actions\n",
"\n",
"**Deprecated:** This API will be sunset on 2023-11-17: https://nla.zapier.com/start/\n",
" \n",
">[Zapier Natural Language Actions](https://nla.zapier.com/start/) gives you access to the 5k+ apps, 20k+ actions on Zapier's platform through a natural language API interface.\n",
">\n",
">NLA supports apps like `Gmail`, `Salesforce`, `Trello`, `Slack`, `Asana`, `HubSpot`, `Google Sheets`, `Microsoft Teams`, and thousands more apps: https://zapier.com/apps\n",
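
For reference while the API remains available, here is a minimal sketch of wiring NLA tools into an agent; it assumes a `ZAPIER_NLA_API_KEY` and uses the toolkit classes as they existed at the time of writing:

```python
import os

from langchain.agents import AgentType, initialize_agent
from langchain.agents.agent_toolkits import ZapierToolkit
from langchain.llms import OpenAI
from langchain.utilities.zapier import ZapierNLAWrapper

os.environ["ZAPIER_NLA_API_KEY"] = "<ZAPIER_NLA_API_KEY>"

llm = OpenAI(temperature=0)
# The wrapper reads the NLA key from the environment; the toolkit exposes
# each exposed Zapier action as a LangChain tool.
toolkit = ZapierToolkit.from_zapier_nla_wrapper(ZapierNLAWrapper())
agent = initialize_agent(
    toolkit.get_tools(), llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
agent.run("Summarize the last email I received and send the summary to Slack.")
```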

View File

@@ -5,9 +5,9 @@
"id": "683953b3",
"metadata": {},
"source": [
"# Dingo\n",
"# DingoDB\n",
"\n",
">[Dingo](https://dingodb.readthedocs.io/en/latest/) is a distributed multi-mode vector database, which combines the characteristics of data lakes and vector databases, and can store data of any type and size (Key-Value, PDF, audio, video, etc.). It has real-time low-latency processing capabilities to achieve rapid insight and response, and can efficiently conduct instant analysis and process multi-modal data.\n",
">[DingoDB](https://dingodb.readthedocs.io/en/latest/) is a distributed multi-mode vector database, which combines the characteristics of data lakes and vector databases, and can store data of any type and size (Key-Value, PDF, audio, video, etc.). It has real-time low-latency processing capabilities to achieve rapid insight and response, and can efficiently conduct instant analysis and process multi-modal data.\n",
"\n",
"This notebook shows how to use functionality related to the DingoDB vector database.\n",
"\n",

View File

@@ -139,7 +139,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 1,
"id": "67ab8afa-f7c6-4fbf-b596-cb512da949da",
"metadata": {
"id": "67ab8afa-f7c6-4fbf-b596-cb512da949da",
@@ -172,7 +172,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 2,
"id": "aac9563e",
"metadata": {
"id": "aac9563e",
@@ -186,7 +186,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"id": "a3c3999a",
"metadata": {
"id": "a3c3999a",
@@ -207,7 +207,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 4,
"id": "12eb86d8",
"metadata": {
"id": "12eb86d8",
@@ -218,7 +218,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"[Document(page_content='One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': '../../modules/state_of_the_union.txt'}), Document(page_content='One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': '../../modules/state_of_the_union.txt', 'date': '2016-01-01', 'rating': 2, 'author': 'John Doe'}), Document(page_content='One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': '../../modules/state_of_the_union.txt', 'date': '2010-01-01', 'rating': 1, 'author': 'John Doe'}), Document(page_content='As I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. \\n\\nWhile it often appears that we never agree, that isnt true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice.', metadata={'source': '../../modules/state_of_the_union.txt'})]\n"
"[Document(page_content='One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': '../../modules/state_of_the_union.txt'}), Document(page_content='As I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. \\n\\nWhile it often appears that we never agree, that isnt true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice.', metadata={'source': '../../modules/state_of_the_union.txt'}), Document(page_content='A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \\n\\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system.', metadata={'source': '../../modules/state_of_the_union.txt'}), Document(page_content='This is personal to me and Jill, to Kamala, and to so many of you. \\n\\nCancer is the #2 cause of death in Americasecond only to heart disease. \\n\\nLast month, I announced our plan to supercharge \\nthe Cancer Moonshot that President Obama asked me to lead six years ago. \\n\\nOur goal is to cut the cancer death rate by at least 50% over the next 25 years, turn more cancers from death sentences into treatable diseases. \\n\\nMore support for patients and families.', metadata={'source': '../../modules/state_of_the_union.txt'})]\n"
]
}
],
@@ -247,7 +247,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 5,
"id": "5d076412",
"metadata": {},
"outputs": [
@@ -284,12 +284,13 @@
"## Filtering Metadata\n",
"With metadata added to the documents, you can add metadata filtering at query time. \n",
"\n",
"### Example: Filter by keyword"
"### Example: Filter by exact keyword\n",
"Note: we are using the `keyword` subfield, which is not analyzed."
]
},
{
"cell_type": "code",
"execution_count": 57,
"execution_count": 6,
"id": "b2a4bd1b",
"metadata": {},
"outputs": [
@@ -297,12 +298,42 @@
"name": "stdout",
"output_type": "stream",
"text": [
"{'source': '../../modules/state_of_the_union.txt', 'date': '2010-01-01', 'rating': 1, 'author': 'John Doe', 'geo_location': {'lat': 40.12, 'lon': -71.34}}\n"
"{'source': '../../modules/state_of_the_union.txt', 'date': '2016-01-01', 'rating': 2, 'author': 'John Doe'}\n"
]
}
],
"source": [
"docs = db.similarity_search(query, filter=[{ \"match\": { \"metadata.author\": \"John Doe\"}}])\n",
"docs = db.similarity_search(query, filter=[{ \"term\": { \"metadata.author.keyword\": \"John Doe\"}}])\n",
"print(docs[0].metadata)"
]
},
{
"cell_type": "markdown",
"id": "1898ab77",
"metadata": {},
"source": [
"### Example: Filter by Partial Match\n",
"This example shows how to filter by partial match. This is useful when you don't know the exact value of a metadata field, for example when you only know part of an author's name. Fuzzy matching is also supported.\n",
"\n",
"\"Jon\" matches on \"John Doe\" because \"Jon\" is a close match to the \"John\" token."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "f3d294ff",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'source': '../../modules/state_of_the_union.txt', 'date': '2016-01-01', 'rating': 2, 'author': 'John Doe'}\n"
]
}
],
"source": [
"docs = db.similarity_search(query, filter=[{ \"match\": { \"metadata.author\": { \"query\": \"Jon\", \"fuzziness\": \"AUTO\" } }}])\n",
"print(docs[0].metadata)"
]
},
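
Beyond `term` and `match`, any Elasticsearch query DSL clause can go in the `filter` list; for instance, a sketch of a `range` clause over the numeric `rating` metadata field, following the same pattern as the cells above:

```python
# Keep only documents whose metadata.rating is at least 2.
docs = db.similarity_search(query, filter=[{"range": {"metadata.rating": {"gte": 2}}}])
print(docs[0].metadata)
```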

View File

@@ -5,11 +5,11 @@
"id": "655b8f55-2089-4733-8b09-35dea9580695",
"metadata": {},
"source": [
"# Google Vertex AI MatchingEngine\n",
"# Google Vertex AI Vector Search\n",
"\n",
"This notebook shows how to use functionality related to the `GCP Vertex AI MatchingEngine` vector database.\n",
"This notebook shows how to use functionality related to the `Google Cloud Vertex AI Vector Search` vector database.\n",
"\n",
"> Vertex AI [Matching Engine](https://cloud.google.com/vertex-ai/docs/matching-engine/overview) provides the industry's leading high-scale low latency vector database. These vector databases are commonly referred to as vector similarity-matching or an approximate nearest neighbor (ANN) service.\n",
"> [Google Vertex AI Vector Search](https://cloud.google.com/vertex-ai/docs/matching-engine/overview), formerly known as Vertex AI Matching Engine, provides the industry's leading high-scale low latency vector database. These vector databases are commonly referred to as vector similarity-matching or an approximate nearest neighbor (ANN) service.\n",
"\n",
"**Note**: This module expects an endpoint and a deployed index to already exist, as creation takes close to one hour. To see how to create an index, refer to the section [Create Index and deploy it to an Endpoint](#create-index-and-deploy-it-to-an-endpoint)"
]
@@ -29,7 +29,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.vectorstores import MatchingEngine"
"from langchain.vectorstores import MatchingEngine\n"
]
},
{
@@ -61,7 +61,7 @@
"\n",
"vector_store.add_texts(texts=texts)\n",
"\n",
"vector_store.similarity_search(\"lunch\", k=2)"
"vector_store.similarity_search(\"lunch\", k=2)\n"
]
},
{
@@ -93,7 +93,7 @@
"!pip install tensorflow \\\n",
" google-cloud-aiplatform \\\n",
" tensorflow-hub \\\n",
" tensorflow-text "
" tensorflow-text \n"
]
},
{
@@ -108,7 +108,7 @@
"\n",
"from google.cloud import aiplatform\n",
"import tensorflow_hub as hub\n",
"import tensorflow_text"
"import tensorflow_text\n"
]
},
{
@@ -137,7 +137,7 @@
"VPC_NETWORK_FULL = f\"projects/{PROJECT_NUMBER}/global/networks/{VPC_NETWORK}\"\n",
"\n",
"# Change this if you need the VPC to be created.\n",
"CREATE_VPC = False"
"CREATE_VPC = False\n"
]
},
{
@@ -148,7 +148,7 @@
"outputs": [],
"source": [
"# Set the project id\n",
"! gcloud config set project {PROJECT_ID}"
"! gcloud config set project {PROJECT_ID}\n"
]
},
{
@@ -177,7 +177,7 @@
"\n",
" # Set up peering with service networking\n",
" # Your account must have the \"Compute Network Admin\" role to run the following.\n",
" ! gcloud services vpc-peerings connect --service=servicenetworking.googleapis.com --network={VPC_NETWORK} --ranges={PEERING_RANGE_NAME} --project={PROJECT_ID}"
" ! gcloud services vpc-peerings connect --service=servicenetworking.googleapis.com --network={VPC_NETWORK} --ranges={PEERING_RANGE_NAME} --project={PROJECT_ID}\n"
]
},
{
@@ -188,7 +188,7 @@
"outputs": [],
"source": [
"# Creating bucket.\n",
"! gsutil mb -l $REGION -p $PROJECT_ID $BUCKET_URI"
"! gsutil mb -l $REGION -p $PROJECT_ID $BUCKET_URI\n"
]
},
{
@@ -208,7 +208,7 @@
"source": [
"# Load the Universal Sentence Encoder module\n",
"module_url = \"https://tfhub.dev/google/universal-sentence-encoder-multilingual/3\"\n",
"model = hub.load(module_url)"
"model = hub.load(module_url)\n"
]
},
{
@@ -219,7 +219,7 @@
"outputs": [],
"source": [
"# Generate embeddings for each word\n",
"embeddings = model([\"banana\"])"
"embeddings = model([\"banana\"])\n"
]
},
{
@@ -245,7 +245,7 @@
"with open(\"data.json\", \"w\") as f:\n",
" json.dump(initial_config, f)\n",
"\n",
"!gsutil cp data.json {EMBEDDING_DIR}/file.json"
"!gsutil cp data.json {EMBEDDING_DIR}/file.json\n"
]
},
{
@@ -255,7 +255,7 @@
"metadata": {},
"outputs": [],
"source": [
"aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_URI)"
"aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_URI)\n"
]
},
{
@@ -279,7 +279,7 @@
" dimensions=DIMENSIONS,\n",
" approximate_neighbors_count=150,\n",
" distance_measure_type=\"DOT_PRODUCT_DISTANCE\",\n",
")"
")\n"
]
},
{
@@ -300,7 +300,7 @@
"my_index_endpoint = aiplatform.MatchingEngineIndexEndpoint.create(\n",
" display_name=f\"{DISPLAY_NAME}-endpoint\",\n",
" network=VPC_NETWORK_FULL,\n",
")"
")\n"
]
},
{
@@ -322,7 +322,7 @@
" index=my_index, deployed_index_id=DEPLOYED_INDEX_ID\n",
")\n",
"\n",
"my_index_endpoint.deployed_indexes"
"my_index_endpoint.deployed_indexes\n"
]
}
],
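
To tie the two halves of this notebook together: the `vector_store` used in the earlier cells is typically constructed from the index and endpoint created here. A sketch only; the keyword names below are assumptions to verify against the `MatchingEngine.from_components` signature in your installed version:

```python
from langchain.embeddings import VertexAIEmbeddings
from langchain.vectorstores import MatchingEngine

# Keyword names are assumptions -- check MatchingEngine.from_components.
vector_store = MatchingEngine.from_components(
    project_id=PROJECT_ID,
    region=REGION,
    gcs_bucket_name=BUCKET_URI.replace("gs://", ""),
    index_id=my_index.name,
    endpoint_id=my_index_endpoint.name,
    embedding=VertexAIEmbeddings(),
)
```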

View File

@@ -27,7 +27,7 @@
"# Setup\n",
"\n",
"You will need a Vectara account to use Vectara with LangChain. To get started, use the following steps (see our [quickstart](https://docs.vectara.com/docs/quickstart) guide):\n",
"1. [Sign up](https://console.vectara.com/signup) for a Vectara account if you don't already have one. Once you have completed your sign up you will have a Vectara customer ID. You can find your customer ID by clicking on your name, on the top-right of the Vectara console window.\n",
"1. [Sign up](https://vectara.com/integrations/langchain) for a Vectara account if you don't already have one. Once you have completed your sign up you will have a Vectara customer ID. You can find your customer ID by clicking on your name, on the top-right of the Vectara console window.\n",
"2. Within your account you can create one or more corpora. Each corpus represents an area that stores text data upon ingest from input documents. To create a corpus, use the **\"Create Corpus\"** button. You then provide a name to your corpus as well as a description. Optionally you can define filtering attributes and apply some advanced options. If you click on your created corpus, you can see its name and corpus ID right on the top.\n",
"3. Next you'll need to create API keys to access the corpus. Click on the **\"Authorization\"** tab in the corpus view and then the **\"Create API Key\"** button. Give your key a name, and choose whether you want query only or query+index for your key. Click \"Create\" and you now have an active API key. Keep this key confidential. \n",
"\n",

View File

@@ -0,0 +1,374 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "7d46647d-638f-497e-b51a-52bf8dd76e39",
"metadata": {},
"source": [
"# LLM\n",
"\n",
"The most common type of chaining in any LLM application is combining a prompt template with an LLM and optionally an output parser.\n",
"\n",
"The recommended way to do this is using LangChain Expression Language. We also continue to support the legacy `LLMChain`, which is a single class for composing these three components."
]
},
{
"cell_type": "markdown",
"id": "0ad20b88-f2e8-4ba0-b8e6-1892ab4d2190",
"metadata": {},
"source": [
"## Using LCEL\n",
"\n",
"`BasePromptTemplate`, `BaseLanguageModel` and `BaseOutputParser` all implement the `Runnable` interface and are designed to be piped into one another, making LCEL composition very easy:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "92ad7c9d-a1d2-49bd-a4a3-0f6f0fd1656b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'VibrantSocks'"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.schema import StrOutputParser\n",
"\n",
"prompt = PromptTemplate.from_template(\"What is a good name for a company that makes {product}?\")\n",
"runnable = prompt | ChatOpenAI() | StrOutputParser()\n",
"runnable.invoke({\"product\": \"colorful socks\"})"
]
},
{
"cell_type": "markdown",
"id": "784d8083-a2c8-4172-92b8-0bd0d74f032a",
"metadata": {},
"source": [
"Head to the [LCEL](/docs/expression_language) section for more on the interface, built-in features, and cookbook examples."
]
},
{
"cell_type": "markdown",
"id": "efee07bb-fc45-4e06-999f-a776e6d53333",
"metadata": {},
"source": [
"## [Legacy] LLMChain\n",
"\n",
":::note This is a legacy class. Using LCEL as shown above is preferred.\n",
"\n",
"An `LLMChain` is a simple chain that adds some functionality around language models. It is used widely throughout LangChain, including in other chains and agents.\n",
"\n",
"An `LLMChain` consists of a `PromptTemplate` and a language model (either an LLM or chat model). It formats the prompt template using the input key values provided (and also memory key values, if available), passes the formatted string to the LLM, and returns the LLM output.\n",
"\n",
"### Get started"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "fc0b7d6c-b808-48d9-bdb5-818ab4a1ccca",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'product': 'colorful socks', 'text': '\\n\\nSocktastic!'}"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain.llms import OpenAI\n",
"from langchain.chains import LLMChain\n",
"\n",
"prompt_template = \"What is a good name for a company that makes {product}?\"\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"llm_chain = LLMChain(\n",
" llm=llm,\n",
" prompt=PromptTemplate.from_template(prompt_template)\n",
")\n",
"llm_chain(\"colorful socks\")"
]
},
{
"cell_type": "markdown",
"id": "040634f0-fe60-4b0e-b3f6-e9c15146e2cd",
"metadata": {},
"source": [
"### Additional ways of running `LLMChain`\n",
"\n",
"Aside from the `__call__` and `run` methods shared by all `Chain` objects, `LLMChain` offers a few more ways of calling the chain logic:\n",
"\n",
"- `apply` allows you to run the chain against a list of inputs:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "8cd8dd72-6d5a-488f-80a6-1a9324c743e8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'text': '\\n\\nSocktastic!'},\n",
" {'text': '\\n\\nTechCore Solutions.'},\n",
" {'text': '\\n\\nFootwear Factory.'}]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"input_list = [\n",
" {\"product\": \"socks\"},\n",
" {\"product\": \"computer\"},\n",
" {\"product\": \"shoes\"}\n",
"]\n",
"llm_chain.apply(input_list)"
]
},
{
"cell_type": "markdown",
"id": "18624d04-474a-425e-bcf3-58748b747e08",
"metadata": {},
"source": [
"- `generate` is similar to `apply`, except it returns an `LLMResult` instead of a string. `LLMResult` often contains useful generation information, such as token usage and finish reason."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "67e72139-686d-40eb-9c1e-4342d3b1abfe",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"LLMResult(generations=[[Generation(text='\\n\\nSocktastic!', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\\n\\nTechCore Solutions.', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\\n\\nFootwear Factory.', generation_info={'finish_reason': 'stop', 'logprobs': None})]], llm_output={'token_usage': {'completion_tokens': 19, 'prompt_tokens': 36, 'total_tokens': 55}, 'model_name': 'text-davinci-003'}, run=[RunInfo(run_id=UUID('9a423a43-6d35-4e8f-9aca-cacfc8e0dc49')), RunInfo(run_id=UUID('a879c077-b521-461c-8f29-ba63adfc327c')), RunInfo(run_id=UUID('40b892fa-e8c2-47d0-a309-4f7a4ed5b64a'))])"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm_chain.generate(input_list)"
]
},
{
"cell_type": "markdown",
"id": "0480da3a-865d-4ec5-9366-e29c3967fef3",
"metadata": {},
"source": [
"- `predict` is similar to the `run` method, except that the input keys are specified as keyword arguments instead of a Python dict."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "f4afb8a4-9113-4082-85cb-55a2d406c99a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'\\n\\nSocktastic!'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Single input example\n",
"llm_chain.predict(product=\"colorful socks\")"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "e58eaab1-4db4-43cb-b523-7b3380332cad",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'\\n\\nQ: What did the duck say when his friend died?\\nA: Quack, quack, goodbye.'"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Multiple inputs example\n",
"template = \"\"\"Tell me a {adjective} joke about {subject}.\"\"\"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"adjective\", \"subject\"])\n",
"llm_chain = LLMChain(prompt=prompt, llm=OpenAI(temperature=0))\n",
"\n",
"llm_chain.predict(adjective=\"sad\", subject=\"ducks\")"
]
},
{
"cell_type": "markdown",
"id": "63f02d9e-6470-41d3-b91c-b064baf84733",
"metadata": {},
"source": [
"### Parsing the outputs\n",
"\n",
"By default, `LLMChain` does not parse the output even if the underlying `prompt` object has an output parser. If you would like to apply that output parser on the LLM output, use `predict_and_parse` instead of `predict` and `apply_and_parse` instead of `apply`.\n",
"\n",
"With `predict`:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "134126ca-2f1c-4829-94ba-810d91c92138",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'\\n\\nRed, orange, yellow, green, blue, indigo, violet'"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.output_parsers import CommaSeparatedListOutputParser\n",
"\n",
"output_parser = CommaSeparatedListOutputParser()\n",
"template = \"\"\"List all the colors in a rainbow\"\"\"\n",
"prompt = PromptTemplate(template=template, input_variables=[], output_parser=output_parser)\n",
"llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
"\n",
"llm_chain.predict()"
]
},
{
"cell_type": "markdown",
"id": "7a46f1e8-daaf-43d6-8045-9b187655631b",
"metadata": {},
"source": [
"With `predict_and_parse`:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "7ef9b74d-7ef5-4b80-80cc-f8226f79259b",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/bagatur/langchain/libs/langchain/langchain/chains/llm.py:280: UserWarning: The predict_and_parse method is deprecated, instead pass an output parser directly to LLMChain.\n",
" warnings.warn(\n"
]
},
{
"data": {
"text/plain": [
"['Red', 'orange', 'yellow', 'green', 'blue', 'indigo', 'violet']"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm_chain.predict_and_parse()"
]
},
{
"cell_type": "markdown",
"id": "93446f7f-0a2d-4fc5-99a1-a26cc0605b4b",
"metadata": {},
"source": [
"### Initialize from string\n",
"\n",
"You can also construct an `LLMChain` from a string template directly."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "7e324174-e8ab-4095-87cb-17874a058da9",
"metadata": {},
"outputs": [],
"source": [
"template = \"\"\"Tell me a {adjective} joke about {subject}.\"\"\"\n",
"llm_chain = LLMChain.from_string(llm=llm, template=template)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "a4f10407-6519-4174-89fe-e7507765f1ae",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'\\n\\nQ: What did the duck say when his friend died?\\nA: Quack, quack, goodbye.'"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm_chain.predict(adjective=\"sad\", subject=\"ducks\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
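
One more note on the LCEL version above: because the composed `runnable` implements the full `Runnable` interface, the same chain supports batching and streaming with no extra code; a small sketch reusing the pipeline from the LCEL cell:

```python
# Run the chain over several inputs at once.
print(runnable.batch([{"product": "colorful socks"}, {"product": "shoes"}]))

# Stream the output as it is generated.
for chunk in runnable.stream({"product": "colorful socks"}):
    print(chunk, end="", flush=True)
```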

View File

@@ -1,171 +0,0 @@
# LLM
An `LLMChain` is a simple chain that adds some functionality around language models. It is used widely throughout LangChain, including in other chains and agents.
An `LLMChain` consists of a `PromptTemplate` and a language model (either an LLM or chat model). It formats the prompt template using the input key values provided (and also memory key values, if available), passes the formatted string to the LLM, and returns the LLM output.
## Get started
```python
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain
prompt_template = "What is a good name for a company that makes {product}?"
llm = OpenAI(temperature=0)
llm_chain = LLMChain(
llm=llm,
prompt=PromptTemplate.from_template(prompt_template)
)
llm_chain("colorful socks")
```
<CodeOutputBlock lang="python">
```
{'product': 'colorful socks', 'text': '\n\nSocktastic!'}
```
</CodeOutputBlock>
## Additional ways of running `LLMChain`
Aside from the `__call__` and `run` methods shared by all `Chain` objects, `LLMChain` offers a few more ways of calling the chain logic:
- `apply` allows you to run the chain against a list of inputs:
```python
input_list = [
{"product": "socks"},
{"product": "computer"},
{"product": "shoes"}
]
llm_chain.apply(input_list)
```
<CodeOutputBlock lang="python">
```
[{'text': '\n\nSocktastic!'},
{'text': '\n\nTechCore Solutions.'},
{'text': '\n\nFootwear Factory.'}]
```
</CodeOutputBlock>
- `generate` is similar to `apply`, except it returns an `LLMResult` instead of a string. `LLMResult` often contains useful generation information, such as token usage and finish reason.
```python
llm_chain.generate(input_list)
```
<CodeOutputBlock lang="python">
```
LLMResult(generations=[[Generation(text='\n\nSocktastic!', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\n\nTechCore Solutions.', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\n\nFootwear Factory.', generation_info={'finish_reason': 'stop', 'logprobs': None})]], llm_output={'token_usage': {'prompt_tokens': 36, 'total_tokens': 55, 'completion_tokens': 19}, 'model_name': 'text-davinci-003'})
```
</CodeOutputBlock>
- `predict` is similar to the `run` method, except that the input keys are specified as keyword arguments instead of a Python dict.
```python
# Single input example
llm_chain.predict(product="colorful socks")
```
<CodeOutputBlock lang="python">
```
'\n\nSocktastic!'
```
</CodeOutputBlock>
```python
# Multiple inputs example
template = """Tell me a {adjective} joke about {subject}."""
prompt = PromptTemplate(template=template, input_variables=["adjective", "subject"])
llm_chain = LLMChain(prompt=prompt, llm=OpenAI(temperature=0))
llm_chain.predict(adjective="sad", subject="ducks")
```
<CodeOutputBlock lang="python">
```
'\n\nQ: What did the duck say when his friend died?\nA: Quack, quack, goodbye.'
```
</CodeOutputBlock>
## Parsing the outputs
By default, `LLMChain` does not parse the output even if the underlying `prompt` object has an output parser. If you would like to apply that output parser on the LLM output, use `predict_and_parse` instead of `predict` and `apply_and_parse` instead of `apply`.
With `predict`:
```python
from langchain.output_parsers import CommaSeparatedListOutputParser
output_parser = CommaSeparatedListOutputParser()
template = """List all the colors in a rainbow"""
prompt = PromptTemplate(template=template, input_variables=[], output_parser=output_parser)
llm_chain = LLMChain(prompt=prompt, llm=llm)
llm_chain.predict()
```
<CodeOutputBlock lang="python">
```
'\n\nRed, orange, yellow, green, blue, indigo, violet'
```
</CodeOutputBlock>
With `predict_and_parse`:
```python
llm_chain.predict_and_parse()
```
<CodeOutputBlock lang="python">
```
['Red', 'orange', 'yellow', 'green', 'blue', 'indigo', 'violet']
```
</CodeOutputBlock>
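`apply_and_parse` works the same way over a batch of inputs. As a minimal sketch (this prompt takes no input variables, so each input is an empty dict; the actual output depends on the model):
```python
# One parsed result (here, a list of color strings) is returned per input dict.
llm_chain.apply_and_parse([{}])
```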
## Initialize from string
You can also construct an `LLMChain` from a string template directly.
```python
template = """Tell me a {adjective} joke about {subject}."""
llm_chain = LLMChain.from_string(llm=llm, template=template)
```
```python
llm_chain.predict(adjective="sad", subject="ducks")
```
<CodeOutputBlock lang="python">
```
'\n\nQ: What did the duck say when his friend died?\nA: Quack, quack, goodbye.'
```
</CodeOutputBlock>

View File

@@ -7,7 +7,168 @@
"source": [
"# Router\n",
"\n",
"This notebook demonstrates how to use the `RouterChain` paradigm to create a chain that dynamically selects the next chain to use for a given input. \n",
"Routing allows you to create non-deterministic chains where the output of a previous step defines the next step. Routing helps provide structure and consistency around interactions with LLMs.\n",
"\n",
"As a very simple example, let's suppose we have two templates optimized for different types of questions, and we want to choose the template based on the user input."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "8d11fa5c",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"\n",
"\n",
"physics_template = \"\"\"You are a very smart physics professor. \\\n",
"You are great at answering questions about physics in a concise and easy to understand manner. \\\n",
"When you don't know the answer to a question you admit that you don't know.\n",
"\n",
"Here is a question:\n",
"{input}\"\"\"\n",
"physics_prompt = PromptTemplate.from_template(physics_template)\n",
"\n",
"math_template = \"\"\"You are a very good mathematician. You are great at answering math questions. \\\n",
"You are so good because you are able to break down hard problems into their component parts, \\\n",
"answer the component parts, and then put them together to answer the broader question.\n",
"\n",
"Here is a question:\n",
"{input}\"\"\"\n",
"math_prompt = PromptTemplate.from_template(math_template)"
]
},
{
"cell_type": "markdown",
"id": "892bb71f-e4f4-431e-8321-fe6a40e71b78",
"metadata": {},
"source": [
"## Using LCEL\n",
"\n",
"We can easily do this using a `RunnableBranch`. A `RunnableBranch` is initialized with a list of (condition, runnable) pairs and a default runnable. It selects which branch by passing each condition the input it's invoked with. It selects the first condition to evaluate to True, and runs the corresponding runnable to that condition with the input. \n",
"\n",
"If no provided conditions match, it runs the default runnable."
]
},
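{
"cell_type": "markdown",
"id": "branch-semantics-note",
"metadata": {},
"source": [
"As a minimal, standalone sketch of these semantics (an illustrative cell, not part of the original notebook; the lambdas are hypothetical):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "branch-semantics-sketch",
"metadata": {},
"outputs": [],
"source": [
"from langchain.schema.runnable import RunnableBranch, RunnableLambda\n",
"\n",
"branch = RunnableBranch(\n",
"    # Conditions are checked in order; the first one returning True wins.\n",
"    (lambda x: isinstance(x, int), RunnableLambda(lambda x: x + 1)),\n",
"    (lambda x: isinstance(x, str), RunnableLambda(lambda x: x.upper())),\n",
"    # If no condition matches, the default runnable runs.\n",
"    RunnableLambda(lambda x: x),\n",
")\n",
"\n",
"branch.invoke(\"hello\")  # -> 'HELLO'"
]
},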
{
"cell_type": "code",
"execution_count": 7,
"id": "f2c4cdb4-1108-491c-9f6f-bbceeb452e29",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.schema.output_parser import StrOutputParser\n",
"from langchain.schema.runnable import RunnableBranch"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "49308c5c-8722-4fb0-b78d-3b1dac0e656d",
"metadata": {},
"outputs": [],
"source": [
"general_prompt = PromptTemplate.from_template(\n",
" \"You are a helpful assistant. Answer the question as accurately as you can.\\n\\n{input}\"\n",
")\n",
"prompt_branch = RunnableBranch(\n",
" (lambda x: x[\"topic\"] == \"math\", math_prompt),\n",
" (lambda x: x[\"topic\"] == \"physics\", physics_prompt),\n",
" general_prompt\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "750da8ec-7c1c-4a0e-9c94-e3a1da49319b",
"metadata": {},
"outputs": [],
"source": [
"from typing import Literal\n",
"\n",
"from langchain.pydantic_v1 import BaseModel\n",
"from langchain.output_parsers.openai_functions import PydanticAttrOutputFunctionsParser\n",
"from langchain.utils.openai_functions import convert_pydantic_to_openai_function\n",
"\n",
"\n",
"class TopicClassifier(BaseModel):\n",
" \"Classify the topic of the user question\"\n",
" \n",
" topic: Literal[\"math\", \"physics\", \"general\"]\n",
" \"The topic of the user question. One of 'math', 'phsyics' or 'general'.\"\n",
"\n",
"\n",
"classifier_function = convert_pydantic_to_openai_function(TopicClassifier)\n",
"llm = ChatOpenAI().bind(functions=[classifier_function], function_call={\"name\": \"TopicClassifier\"}) \n",
"parser = PydanticAttrOutputFunctionsParser(pydantic_schema=TopicClassifier, attr_name=\"topic\")\n",
"classifier_chain = llm | parser"
]
},
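{
"cell_type": "markdown",
"id": "classifier-check-note",
"metadata": {},
"source": [
"As a quick sanity check (an illustrative cell, not part of the original notebook), the classifier maps a raw question string to one of the topic labels:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "classifier-check",
"metadata": {},
"outputs": [],
"source": [
"# Expected to return 'math' for this input (actual output depends on the model).\n",
"classifier_chain.invoke(\"What is the integral of x^2?\")"
]
},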
{
"cell_type": "code",
"execution_count": 13,
"id": "35be97db-2b31-4503-af56-2cae802a9822",
"metadata": {},
"outputs": [],
"source": [
"from operator import itemgetter\n",
"\n",
"from langchain.schema.output_parser import StrOutputParser\n",
"from langchain.schema.runnable import RunnablePassthrough\n",
"\n",
"\n",
"final_chain = (\n",
" RunnablePassthrough.assign(topic=itemgetter(\"input\") | classifier_chain) \n",
" | prompt_branch \n",
" | ChatOpenAI()\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "9b161436-432b-4ecd-9752-5f458a7b1d54",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"Thank you for your kind words! I'll be happy to help you with this math question.\\n\\nTo find the first prime number greater than 40 that satisfies the given condition, we need to follow a step-by-step approach. \\n\\nFirstly, let's list the prime numbers greater than 40:\\n41, 43, 47, 53, 59, 61, 67, 71, ...\\n\\nNow, we need to check if one plus each of these prime numbers is divisible by 3. We can do this by calculating the remainder when dividing each number by 3.\\n\\nFor 41, (41 + 1) % 3 = 42 % 3 = 0. It is divisible by 3.\\n\\nFor 43, (43 + 1) % 3 = 44 % 3 = 2. It is not divisible by 3.\\n\\nFor 47, (47 + 1) % 3 = 48 % 3 = 0. It is divisible by 3.\\n\\nSince 41 and 47 are both greater than 40 and satisfy the condition, the first prime number greater than 40 such that one plus the prime number is divisible by 3 is 41.\\n\\nTherefore, the answer to the question is 41.\""
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"final_chain.invoke(\n",
" {\"input\": \"What is the first prime number greater than 40 such that one plus the prime number is divisible by 3?\"}\n",
")"
]
},
{
"cell_type": "markdown",
"id": "aa11d8fb-b9f2-4427-9c7f-2146f84cba72",
"metadata": {},
"source": [
"For more on routing with LCEL [head here](/docs/expression_language/how_to/routing)."
]
},
{
"cell_type": "markdown",
"id": "681af961-388e-4b37-9572-4f084365abba",
"metadata": {},
"source": [
"## [Legacy] RouterChain\n",
"\n",
":::note The preferred approach as of version `0.0.293` is to use LCEL as above.\n",
"\n",
"Here we show how to use the `RouterChain` paradigm to create a chain that dynamically selects the next chain to use for a given input. \n",
"\n",
"Router chains are made up of two components:\n",
"\n",
@@ -15,12 +176,12 @@
"- `destination_chains`: chains that the router chain can route to\n",
"\n",
"\n",
"In this notebook, we will focus on the different types of routing chains. We will show these routing chains used in a `MultiPromptChain` to create a question-answering chain that selects the prompt which is most relevant for a given question, and then answers the question using that prompt."
"In this example, we will focus on the different types of routing chains. We will show these routing chains used in a `MultiPromptChain` to create a question-answering chain that selects the prompt which is most relevant for a given question, and then answers the question using that prompt."
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 16,
"id": "e8d624d4",
"metadata": {},
"outputs": [],
@@ -28,36 +189,22 @@
"from langchain.chains.router import MultiPromptChain\n",
"from langchain.llms import OpenAI\n",
"from langchain.chains import ConversationChain\n",
"from langchain.chains.llm import LLMChain\n",
"from langchain.prompts import PromptTemplate"
"from langchain.chains.llm import LLMChain"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "8d11fa5c",
"cell_type": "markdown",
"id": "83cea2d5",
"metadata": {},
"outputs": [],
"source": [
"physics_template = \"\"\"You are a very smart physics professor. \\\n",
"You are great at answering questions about physics in a concise and easy to understand manner. \\\n",
"When you don't know the answer to a question you admit that you don't know.\n",
"### [Legacy] LLMRouterChain\n",
"\n",
"Here is a question:\n",
"{input}\"\"\"\n",
"\n",
"\n",
"math_template = \"\"\"You are a very good mathematician. You are great at answering math questions. \\\n",
"You are so good because you are able to break down hard problems into their component parts, \\\n",
"answer the component parts, and then put them together to answer the broader question.\n",
"\n",
"Here is a question:\n",
"{input}\"\"\""
"This chain uses an LLM to determine how to route things."
]
},
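{
"cell_type": "markdown",
"id": "llm-router-sketch-note",
"metadata": {},
"source": [
"The router construction elided in the hunks below typically looks something like this minimal sketch (illustrative only; it assumes `llm`, `destination_chains`, and `default_chain` are defined as in the surrounding cells):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "llm-router-sketch",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains.router.llm_router import LLMRouterChain, RouterOutputParser\n",
"from langchain.chains.router.multi_prompt_prompt import MULTI_PROMPT_ROUTER_TEMPLATE\n",
"\n",
"# Describe each destination so the router LLM can pick one by name.\n",
"destinations = \"physics: Good for answering questions about physics\\nmath: Good for answering math questions\"\n",
"router_template = MULTI_PROMPT_ROUTER_TEMPLATE.format(destinations=destinations)\n",
"router_prompt = PromptTemplate(\n",
"    template=router_template,\n",
"    input_variables=[\"input\"],\n",
"    output_parser=RouterOutputParser(),\n",
")\n",
"router_chain = LLMRouterChain.from_llm(llm, router_prompt)\n",
"\n",
"chain = MultiPromptChain(\n",
"    router_chain=router_chain,\n",
"    destination_chains=destination_chains,\n",
"    default_chain=default_chain,\n",
"    verbose=True,\n",
")"
]
},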
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 17,
"id": "d0b8856e",
"metadata": {},
"outputs": [],
@@ -78,7 +225,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 18,
"id": "de2dc0f0",
"metadata": {},
"outputs": [],
@@ -88,7 +235,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 19,
"id": "f27c154a",
"metadata": {},
"outputs": [],
@@ -103,19 +250,9 @@
"default_chain = ConversationChain(llm=llm, output_key=\"text\")"
]
},
{
"cell_type": "markdown",
"id": "83cea2d5",
"metadata": {},
"source": [
"## LLMRouterChain\n",
"\n",
"This chain uses an LLM to determine how to route things."
]
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 20,
"id": "60142895",
"metadata": {},
"outputs": [],
@@ -126,7 +263,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 21,
"id": "60769f96",
"metadata": {},
"outputs": [],
@@ -144,7 +281,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 22,
"id": "db679975",
"metadata": {},
"outputs": [],
@@ -159,7 +296,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 23,
"id": "90fd594c",
"metadata": {},
"outputs": [
@@ -169,12 +306,26 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new MultiPromptChain chain...\u001b[0m\n",
"\u001b[1m> Entering new MultiPromptChain chain...\u001b[0m\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/bagatur/langchain/libs/langchain/langchain/chains/llm.py:280: UserWarning: The predict_and_parse method is deprecated, instead pass an output parser directly to LLMChain.\n",
" warnings.warn(\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"physics: {'input': 'What is black body radiation?'}\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\n",
"Black body radiation is the term used to describe the electromagnetic radiation emitted by a “black body”—an object that absorbs all radiation incident upon it. A black body is an idealized physical body that absorbs all incident electromagnetic radiation, regardless of frequency or angle of incidence. It does not reflect, emit or transmit energy. This type of radiation is the result of the thermal motion of the body's atoms and molecules, and it is emitted at all wavelengths. The spectrum of radiation emitted is described by Planck's law and is known as the black body spectrum.\n"
"Black body radiation is the thermal electromagnetic radiation within or surrounding a body in thermodynamic equilibrium with its environment, or emitted by a black body (an idealized physical body which absorbs all incident electromagnetic radiation). It is a characteristic of the temperature of the body; if the body has a uniform temperature, the radiation is also uniform across the spectrum of frequencies. The spectral characteristics of the radiation are determined by the temperature of the body, which implies that a black body at a given temperature will emit the same amount of radiation at every frequency.\n"
]
}
],
@@ -184,7 +335,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 24,
"id": "b8c83765",
"metadata": {},
"outputs": [
@@ -194,12 +345,33 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new MultiPromptChain chain...\u001b[0m\n",
"\u001b[1m> Entering new MultiPromptChain chain...\u001b[0m\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/bagatur/langchain/libs/langchain/langchain/chains/llm.py:280: UserWarning: The predict_and_parse method is deprecated, instead pass an output parser directly to LLMChain.\n",
" warnings.warn(\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"math: {'input': 'What is the first prime number greater than 40 such that one plus the prime number is divisible by 3?'}\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"?\n",
"\n",
"The answer is 43. One plus 43 is 44 which is divisible by 3.\n"
"\n",
"The first prime number greater than 40 such that one plus the prime number is divisible by 3 is 43. This can be seen by breaking down the problem:\n",
"\n",
"1) We know that a prime number is a number that is only divisible by itself and one. \n",
"2) We also know that if a number is divisible by 3, the sum of its digits must be divisible by 3. \n",
"\n",
"So, if we want to find the first prime number greater than 40 such that one plus the prime number is divisible by 3, we can start counting up from 40, testing each number to see if it is prime and if the sum of the number and one is divisible by three. \n",
"\n",
"The first number we come to that satisfies these conditions is 43.\n"
]
}
],
@@ -213,7 +385,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 25,
"id": "74c6bba7",
"metadata": {},
"outputs": [
@@ -223,10 +395,26 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new MultiPromptChain chain...\u001b[0m\n",
"None: {'input': 'What is the name of the type of cloud that rains?'}\n",
"\u001b[1m> Entering new MultiPromptChain chain...\u001b[0m\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/bagatur/langchain/libs/langchain/langchain/chains/llm.py:280: UserWarning: The predict_and_parse method is deprecated, instead pass an output parser directly to LLMChain.\n",
" warnings.warn(\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"physics: {'input': 'What is the name of the type of cloud that rains?'}\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
" The type of cloud that rains is called a cumulonimbus cloud. It is a tall and dense cloud that is often accompanied by thunder and lightning.\n"
"\n",
"\n",
"The type of cloud that rains is called a cumulonimbus cloud.\n"
]
}
],
@@ -239,14 +427,14 @@
"id": "239d4743",
"metadata": {},
"source": [
"## EmbeddingRouterChain\n",
"## [Legacy] EmbeddingRouterChain\n",
"\n",
"The `EmbeddingRouterChain` uses embeddings and similarity to route between destination chains."
]
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 26,
"id": "55c3ed0e",
"metadata": {},
"outputs": [],
@@ -258,7 +446,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 27,
"id": "572a5082",
"metadata": {},
"outputs": [],
@@ -271,18 +459,10 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 28,
"id": "50221efe",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Using embedded DuckDB without persistence: data will be transient\n"
]
}
],
"outputs": [],
"source": [
"router_chain = EmbeddingRouterChain.from_names_and_descriptions(\n",
" names_and_descriptions, Chroma, CohereEmbeddings(), routing_keys=[\"input\"]\n",
@@ -291,7 +471,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 29,
"id": "ff7996a0",
"metadata": {},
"outputs": [],
@@ -306,7 +486,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 30,
"id": "99270cc9",
"metadata": {},
"outputs": [
@@ -321,7 +501,7 @@
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\n",
"Black body radiation is the emission of energy from an idealized physical body (known as a black body) that is in thermal equilibrium with its environment. It is emitted in a characteristic pattern of frequencies known as a black-body spectrum, which depends only on the temperature of the body. The study of black body radiation is an important part of astrophysics and atmospheric physics, as the thermal radiation emitted by stars and planets can often be approximated as black body radiation.\n"
"Black body radiation is the electromagnetic radiation emitted by a black body, which is an idealized physical body that absorbs all incident electromagnetic radiation. This radiation is related to the temperature of the body, with higher temperatures leading to higher radiation levels. The spectrum of the radiation is continuous, and is described by the Planck's law of black body radiation.\n"
]
}
],
@@ -331,7 +511,7 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 31,
"id": "b5ce6238",
"metadata": {},
"outputs": [
@@ -344,9 +524,9 @@
"\u001b[1m> Entering new MultiPromptChain chain...\u001b[0m\n",
"math: {'input': 'What is the first prime number greater than 40 such that one plus the prime number is divisible by 3?'}\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"?\n",
"\n",
"Answer: The first prime number greater than 40 such that one plus the prime number is divisible by 3 is 43.\n"
"\n",
"The first prime number greater than 40 such that one plus the prime number is divisible by 3 is 43. This is because 43 is a prime number, and 1 + 43 = 44, which is divisible by 3.\n"
]
}
],
@@ -357,14 +537,6 @@
" )\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "20f3d047",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -383,7 +555,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
"version": "3.9.1"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,418 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "119af92c-f02c-4729-84ac-0f69d6208c1b",
"metadata": {},
"source": [
"# Sequential\n",
"\n",
"The next step after calling a language model is to make a series of calls to a language model. This is particularly useful when you want to take the output from one call and use it as the input to another.\n",
"\n",
"The recommended way to do this is using the LangChain Expression Language. The legacy way is using the `SequentialChain`, which we continue to document here for backwards compatibility.\n",
"\n",
"As a toy example, let's suppose we want to create a chain that first creates a play synopsis and then generates a play review based on the synopsis."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "443e62b9-8a68-468e-b91d-f19de2993fe8",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"\n",
"synopsis_prompt = PromptTemplate.from_template(\n",
" \"\"\"You are a playwright. Given the title of play, it is your job to write a synopsis for that title.\n",
"\n",
"Title: {title}\n",
"Playwright: This is a synopsis for the above play:\"\"\"\n",
")\n",
"\n",
"review_prompt = PromptTemplate.from_template(\n",
" \"\"\"You are a play critic from the New York Times. Given the synopsis of play, it is your job to write a review for that play.\n",
"\n",
"Play Synopsis:\n",
"{synopsis}\n",
"Review from a New York Times play critic of the above play:\"\"\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "7d1b284f-73b4-4f3c-ab88-e4c6f4b0bf76",
"metadata": {},
"source": [
"## Using LCEL\n",
"\n",
"Creating a sequence of calls (to LLMs or any other component/arbitrary function) is precisely what LangChain Expression Language was designed for."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "c0a43154-7624-41b7-9832-f2022af41fba",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'In \"Tragedy at Sunset on the Beach,\" playwright has crafted a deeply affecting drama that delves into the complexities of human relationships and the consequences that arise from one fateful evening. Set against the breathtaking backdrop of a serene beach at sunset, the play takes audiences on an emotional journey as it explores the lives of four individuals whose paths intertwine in unexpected and tragic ways.\\n\\nAt the center of the story is Sarah, a young woman grappling with the recent loss of her husband. Seeking solace and a fresh start, she embarks on a solitary trip to the beach, hoping to find peace and clarity. It is here that she encounters James, a charismatic but troubled artist, lost in his own world of anguish and self-doubt. The unlikely connection they form becomes the catalyst for a series of heart-wrenching events, as their emotional baggage and personal demons collide.\\n\\nThe play skillfully weaves together the narratives of Sarah, James, and Rachel, Sarah\\'s best friend. As Rachel arrives on the beach with the intention of helping Sarah heal, she unknowingly carries a secret that threatens to shatter their friendship forever. Against the backdrop of crashing waves and vibrant sunsets, the characters\\' lives unravel, exposing hidden desires, betrayals, and deeply buried secrets. The boundaries of love, friendship, and loyalty blur, forcing each character to confront their own vulnerabilities and face the consequences of their choices.\\n\\nWhat sets \"Tragedy at Sunset on the Beach\" apart is its ability to evoke genuine emotion from its audience. The playwright\\'s poignant exploration of the human condition touches upon universal themes of loss, forgiveness, and the lengths we go to protect the ones we love. The richly drawn characters come alive on stage, their struggles and triumphs resonating deeply with the audience. Moments of intense emotion are skillfully crafted, leaving spectators captivated and moved.\\n\\nThe play\\'s evocative setting adds another layer of depth to the storytelling. The picturesque beach at sunset becomes a metaphor for the fragility of life and the fleeting nature of happiness. The crashing waves and vibrant colors serve as a backdrop to the characters\\' unraveling lives, heightening the emotional impact of their stories.\\n\\nWhile \"Tragedy at Sunset on the Beach\" is undeniably a heavy and somber play, it ultimately leaves audiences questioning the power of redemption. The characters\\' journeys, though tragic, offer glimpses of hope and the potential for healing. It reminds us that even amidst the darkest moments, there is still a chance for redemption and forgiveness.\\n\\nOverall, \"Tragedy at Sunset on the Beach\" is a thought-provoking and emotionally charged play that will captivate audiences from start to finish. The playwright\\'s skillful storytelling, evocative setting, and richly drawn characters make for a truly memorable theatrical experience. This is a play that will leave spectators questioning their own lives and the choices they make, long after the curtain falls.'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.schema import StrOutputParser\n",
"\n",
"llm = ChatOpenAI()\n",
"chain = {\"synopsis\": synopsis_prompt | llm | StrOutputParser()} | review_prompt | llm | StrOutputParser()\n",
"chain.invoke({\"title\": \"Tragedy at sunset on the beach\"})"
]
},
{
"cell_type": "markdown",
"id": "c37f72d5-a005-444b-b97e-39df86c515c7",
"metadata": {},
"source": [
"If we wanted to get back the synopsis as well we could do:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "9f9fb8ad-b6eb-49c3-a1d1-83f4460525e6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'synopsis': 'Tragedy at Sunset on the Beach is a gripping and emotionally charged drama that delves into the complexities of human relationships and the fragility of life. Set against the backdrop of a picturesque beach at sunset, the play follows a group of friends who gather to celebrate a joyous occasion.\\n\\nAs the sun begins its descent, tensions simmer beneath the surface, and long-held secrets and resentments come to light. The characters find themselves entangled in a web of love, betrayal, and loss, as they confront their deepest fears and desires.\\n\\nThe main focus revolves around Sarah, a vibrant and free-spirited woman who becomes the center of a tragic event. Through a series of flashback scenes, we witness the unraveling of her life, exploring her complicated relationships with her closest friends and romantic partners.\\n\\nThe play explores themes of regret, redemption, and the consequences of our choices. It delves into the human condition, questioning the nature of happiness and the value of time. The audience is taken on an emotional rollercoaster, experiencing moments of laughter, heartache, and profound reflection.\\n\\nTragedy at Sunset on the Beach challenges conventional notions of tragedy, evoking a sense of empathy and understanding for the flawed and vulnerable characters. It serves as a reminder that life is unpredictable and fragile, urging us to cherish every moment and embrace the beauty that exists even amidst tragedy.',\n",
" 'review': \"In Tragedy at Sunset on the Beach, playwright John Smithson delivers a powerful and thought-provoking exploration of the human experience. Set against the stunning backdrop of a beach at sunset, this emotionally charged drama takes the audience on a journey through the complexities of relationships, the fragility of life, and the profound impact of our choices.\\n\\nSmithson skillfully weaves together a tale of love, betrayal, and loss, as a group of friends gather to celebrate a joyous occasion. As the sun sets, tensions rise, and long-held secrets and resentments are exposed, leaving the characters entangled in a web of emotions. Through a series of poignant flashback scenes, we witness the unraveling of Sarah's life, a vibrant and free-spirited woman who becomes the center of a tragic event.\\n\\nWhat sets Tragedy at Sunset on the Beach apart is its ability to challenge conventional notions of tragedy. Smithson masterfully portrays flawed and vulnerable characters with such empathy and understanding that the audience can't help but empathize with their struggles. This play serves as a reminder that life is unpredictable and fragile, urging us to cherish every moment and embrace the beauty that exists even amidst tragedy.\\n\\nThe performances in this production are nothing short of extraordinary. The actors effortlessly navigate the emotional rollercoaster of the script, eliciting moments of laughter, heartache, and profound reflection from the audience. Their ability to convey the complexities of their characters' relationships and inner turmoil is truly commendable.\\n\\nThe direction by Jane Anderson is impeccable, capturing the essence of the beach at sunset and utilizing the space to create an immersive experience for the audience. The use of flashbacks adds depth and nuance to the narrative, allowing for a deeper understanding of the characters and their motivations.\\n\\nTragedy at Sunset on the Beach is not a play for the faint of heart. It tackles heavy themes of regret, redemption, and the consequences of our choices. However, it is precisely this raw and unflinching exploration of the human condition that makes it such a compelling piece of theater. Smithson's writing, combined with the exceptional performances and direction, make this play a must-see for theatergoers looking for a thought-provoking and emotionally resonant experience.\\n\\nIn a city renowned for its theater scene, Tragedy at Sunset on the Beach stands out as a shining example of the power of live performance to evoke empathy, provoke contemplation, and remind us of the fragile beauty of life. It is a production that will linger in the minds and hearts of its audience long after the final curtain falls.\"}"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.schema.runnable import RunnablePassthrough\n",
"\n",
"synopsis_chain = synopsis_prompt | llm | StrOutputParser() \n",
"review_chain = review_prompt | llm | StrOutputParser()\n",
"chain = {\"synopsis\": synopsis_chain} | RunnablePassthrough.assign(review=review_chain)\n",
"chain.invoke({\"title\": \"Tragedy at sunset on the beach\"})"
]
},
{
"cell_type": "markdown",
"id": "5b145aac-cd8f-466c-a5ba-92b376a711f8",
"metadata": {},
"source": [
"Head to the [LCEL](/docs/expression_language) section for more on the interface, built-in features, and cookbook examples."
]
},
{
"cell_type": "markdown",
"id": "9af35228-d3ff-4c95-8168-506c72618ace",
"metadata": {},
"source": [
"## [Legacy] SequentialChain\n",
"\n",
":::note This is a legacy class, using LCEL as shown above is preffered.\n",
"\n",
"Sequential chains allow you to connect multiple chains and compose them into pipelines that execute some specific scenario. There are two types of sequential chains:\n",
"\n",
"- `SimpleSequentialChain`: The simplest form of sequential chains, where each step has a singular input/output, and the output of one step is the input to the next.\n",
"- `SequentialChain`: A more general form of sequential chains, allowing for multiple inputs/outputs."
]
},
{
"cell_type": "markdown",
"id": "6c25c84e-c9f6-43be-8282-78fbd1525091",
"metadata": {},
"source": [
"### SimpleSequentialChain"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "7ed84b1a-66a6-463c-ba61-1e98434e1958",
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import OpenAI\n",
"from langchain.chains import LLMChain\n",
"from langchain.prompts import PromptTemplate\n",
"\n",
"# This is an LLMChain to write a synopsis given a title of a play.\n",
"llm = OpenAI(temperature=.7)\n",
"synopsis_chain = LLMChain(llm=llm, prompt=synopsis_prompt)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "a3173022-e2c7-478b-a9b8-4a535d905a1c",
"metadata": {},
"outputs": [],
"source": [
"# This is an LLMChain to write a review of a play given a synopsis.\n",
"llm = OpenAI(temperature=.7)\n",
"review_chain = LLMChain(llm=llm, prompt=review_prompt)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "b2fdd3d9-cd49-4606-b016-678e27d2b6e0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new SimpleSequentialChain chain...\u001b[0m\n",
"\u001b[36;1m\u001b[1;3m\n",
"\n",
"Tragedy at Sunset on the Beach is a modern tragedy about a young couple in love. The couple, Jack and Jill, are deeply in love and plan to spend the day together on the beach at sunset. However, when they arrive, they are shocked to discover that the beach is an abandoned, dilapidated wasteland. With no one else around, they explore the beach and start to reminisce about their relationship and the good times theyve shared. \n",
"\n",
"But then, out of the blue, a mysterious figure emerges from the shadows and reveals a dark secret. The figure tells the couple that the beach is no ordinary beach, but is in fact the site of a terrible tragedy that took place many years ago. As the figure explains what happened, Jack and Jill become overwhelmed with grief. \n",
"\n",
"In the end, Jack and Jill are forced to confront the truth about the tragedy and its consequences. The play is ultimately a reflection on the power of tragedy and the human capacity to confront and overcome it.\u001b[0m\n",
"\u001b[33;1m\u001b[1;3m\n",
"\n",
"Tragedy at Sunset on the Beach is a powerful, thought-provoking modern tragedy that is sure to leave a lasting impression on its audience. The play follows the story of Jack and Jill, a young couple deeply in love, as they explore an abandoned beach and discover a dark secret from the past.\n",
"\n",
"The play brilliantly captures the raw emotions of Jack and Jill as they learn of the tragedy that has occurred on the beach. The writing is masterful, and the actors do a wonderful job of conveying the couples grief and pain. The play is ultimately a reflection on the power of tragedy and the human capacity to confront and overcome it.\n",
"\n",
"Overall, Tragedy at Sunset on the Beach is a must-see for anyone looking for a thought-provoking and emotionally moving play. This play is sure to stay with its audience long after the curtain closes. Highly recommended.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
}
],
"source": [
"# This is the overall chain where we run these two chains in sequence.\n",
"from langchain.chains import SimpleSequentialChain\n",
"\n",
"overall_chain = SimpleSequentialChain(chains=[synopsis_chain, review_chain], verbose=True)\n",
"\n",
"review = overall_chain.run(\"Tragedy at sunset on the beach\")"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "5f023a0c-9305-4a14-ae24-23fff9933861",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"Tragedy at Sunset on the Beach is a powerful, thought-provoking modern tragedy that is sure to leave a lasting impression on its audience. The play follows the story of Jack and Jill, a young couple deeply in love, as they explore an abandoned beach and discover a dark secret from the past.\n",
"\n",
"The play brilliantly captures the raw emotions of Jack and Jill as they learn of the tragedy that has occurred on the beach. The writing is masterful, and the actors do a wonderful job of conveying the couples grief and pain. The play is ultimately a reflection on the power of tragedy and the human capacity to confront and overcome it.\n",
"\n",
"Overall, Tragedy at Sunset on the Beach is a must-see for anyone looking for a thought-provoking and emotionally moving play. This play is sure to stay with its audience long after the curtain closes. Highly recommended.\n"
]
}
],
"source": [
"print(review)"
]
},
{
"cell_type": "markdown",
"id": "d09df151-6a66-4982-8424-44ec3c92422d",
"metadata": {},
"source": [
"### SequentialChain\n",
"Of course, not all sequential chains will be as simple as passing a single string as an argument and getting a single string as output for all steps in the chain. In this next example, we will experiment with more complex chains that involve multiple inputs, and where there also multiple final outputs.\n",
"\n",
"Of particular importance is how we name the input/output variables. In the above example we didn't have to think about that because we were just passing the output of one chain directly as input to the next, but here we do have worry about that because we have multiple inputs."
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "7481ed64-22f3-47dc-9796-c372eeb4f7bd",
"metadata": {},
"outputs": [],
"source": [
"# This is an LLMChain to write a synopsis given a title of a play and the era it is set in.\n",
"llm = OpenAI(temperature=.7)\n",
"synopsis_template = \"\"\"You are a playwright. Given the title of play and the era it is set in, it is your job to write a synopsis for that title.\n",
"\n",
"Title: {title}\n",
"Era: {era}\n",
"Playwright: This is a synopsis for the above play:\"\"\"\n",
"synopsis_prompt_template = PromptTemplate(input_variables=[\"title\", \"era\"], template=synopsis_template)\n",
"synopsis_chain = LLMChain(llm=llm, prompt=synopsis_prompt_template, output_key=\"synopsis\")"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "6d5ec9be-7101-460a-9fc1-7ef2d02434e7",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new SequentialChain chain...\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"{'title': 'Tragedy at sunset on the beach',\n",
" 'era': 'Victorian England',\n",
" 'synopsis': \"\\n\\nThe play is set in Victorian England and follows the story of a young couple, Mary and John, who were deeply in love and had just gotten engaged. On the night of their engagement, they decided to take a romantic walk along the beach at sunset. Unexpectedly, John is shot by a stranger and killed right in front of Mary. In a state of shock and anguish, Mary is left alone, struggling to comprehend what has just occurred. \\n\\nThe play follows Mary as she searches for answers to John's death. As Mary's investigation begins, she discovers that John was actually involved in a dark and dangerous plot to overthrow the government. Unbeknownst to Mary, John had been working as a spy in a secret mission to uncover the truth behind a political scandal. \\n\\nNow, Mary must face the consequences of her beloved's actions and find a way to save the future of England. As the story unfolds, Mary must confront her own beliefs as well as the powerful people who are determined to end her mission. \\n\\nAt the end of the play, all of Mary's questions are answered and she is able to make a choice that will ultimately decide the fate of the nation. Tragedy at Sunset on the Beach is a\",\n",
" 'review': \"\\n\\nSet against the backdrop of Victorian England, Tragedy at Sunset on the Beach tells a heart-wrenching story of love, loss, and tragedy. The play follows Mary and John, a young couple deeply in love, who experience an unexpected tragedy on the night of their engagement. When John is shot and killed by a stranger, Mary is left alone to uncover the truth behind her beloved's death.\\n\\nWhat follows is an intense and gripping journey as Mary discovers that John was a spy in a secret mission to uncover a powerful political scandal. As Mary faces off against those determined to end her mission, she must confront her own beliefs and ultimately decide the fate of the nation.\\n\\nThe play is skillfully crafted and brilliantly performed. The actors portray a range of emotions from joy to sorrow that will leave the audience moved and captivated. The production is a beautiful testament to the power of love and the strength of the human spirit, and it is sure to leave a lasting impression. Highly recommended.\"}"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# This is an LLMChain to write a review of a play given a synopsis.\n",
"llm = OpenAI(temperature=.7)\n",
"template = \"\"\"You are a play critic from the New York Times. Given the synopsis of play, it is your job to write a review for that play.\n",
"\n",
"Play Synopsis:\n",
"{synopsis}\n",
"Review from a New York Times play critic of the above play:\"\"\"\n",
"prompt_template = PromptTemplate(input_variables=[\"synopsis\"], template=template)\n",
"review_chain = LLMChain(llm=llm, prompt=prompt_template, output_key=\"review\")\n",
"\n",
"# This is the overall chain where we run these two chains in sequence.\n",
"from langchain.chains import SequentialChain\n",
"overall_chain = SequentialChain(\n",
" chains=[synopsis_chain, review_chain],\n",
" input_variables=[\"era\", \"title\"],\n",
" # Here we return multiple variables\n",
" output_variables=[\"synopsis\", \"review\"],\n",
" verbose=True)\n",
"\n",
"\n",
"overall_chain({\"title\":\"Tragedy at sunset on the beach\", \"era\": \"Victorian England\"})"
]
},
{
"cell_type": "markdown",
"id": "282f3c01-566b-4285-9615-dd07c8d43d54",
"metadata": {},
"source": [
"#### Memory in Sequential Chains\n",
"Sometimes you may want to pass along some context to use in each step of the chain or in a later part of the chain, but maintaining and chaining together the input/output variables can quickly get messy. Using `SimpleMemory` is a convenient way to do manage this and clean up your chains.\n",
"\n",
"For example, using the previous playwright `SequentialChain`, lets say you wanted to include some context about date, time and location of the play, and using the generated synopsis and review, create some social media post text. You could add these new context variables as `input_variables`, or we can add a `SimpleMemory` to the chain to manage this context:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "e50d0da6-dea1-428f-94eb-c7dfc3d298e3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new SequentialChain chain...\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"{'title': 'Tragedy at sunset on the beach',\n",
" 'era': 'Victorian England',\n",
" 'time': 'December 25th, 8pm PST',\n",
" 'location': 'Theater in the Park',\n",
" 'social_post_text': \"Experience a heartbreaking love story this Christmas as we bring you 'Tragedy at Sunset on the Beach', set in Victorian England on December 25th at 8pm PST at the Theater in the Park. Follow the story of two young lovers, George and Mary, and their fight against overwhelming odds. Will their love prevail? Find out this Christmas Day! #TragedyAtSunset #LoveStory #Christmas #VictorianEngland\"}"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.chains import SequentialChain\n",
"from langchain.memory import SimpleMemory\n",
"\n",
"llm = OpenAI(temperature=.7)\n",
"template = \"\"\"You are a social media manager for a theater company. Given the title of play, the era it is set in, the date,time and location, the synopsis of the play, and the review of the play, it is your job to write a social media post for that play.\n",
"\n",
"Here is some context about the time and location of the play:\n",
"Date and Time: {time}\n",
"Location: {location}\n",
"\n",
"Play Synopsis:\n",
"{synopsis}\n",
"Review from a New York Times play critic of the above play:\n",
"{review}\n",
"\n",
"Social Media Post:\n",
"\"\"\"\n",
"prompt_template = PromptTemplate(input_variables=[\"synopsis\", \"review\", \"time\", \"location\"], template=template)\n",
"social_chain = LLMChain(llm=llm, prompt=prompt_template, output_key=\"social_post_text\")\n",
"\n",
"overall_chain = SequentialChain(\n",
" memory=SimpleMemory(memories={\"time\": \"December 25th, 8pm PST\", \"location\": \"Theater in the Park\"}),\n",
" chains=[synopsis_chain, review_chain, social_chain],\n",
" input_variables=[\"era\", \"title\"],\n",
" # Here we return multiple variables\n",
" output_variables=[\"social_post_text\"],\n",
" verbose=True)\n",
"\n",
"overall_chain({\"title\":\"Tragedy at sunset on the beach\", \"era\": \"Victorian England\"})"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,229 +0,0 @@
# Sequential
The next step after calling a language model is to make a series of calls to a language model. This is particularly useful when you want to take the output from one call and use it as the input to another.
In this notebook we will walk through some examples of how to do this, using sequential chains. Sequential chains allow you to connect multiple chains and compose them into pipelines that execute some specific scenario. There are two types of sequential chains:
- `SimpleSequentialChain`: The simplest form of sequential chains, where each step has a singular input/output, and the output of one step is the input to the next.
- `SequentialChain`: A more general form of sequential chains, allowing for multiple inputs/outputs.
```python
from langchain.llms import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
```
```python
# This is an LLMChain to write a synopsis given a title of a play.
llm = OpenAI(temperature=.7)
synopsis_template = """You are a playwright. Given the title of play, it is your job to write a synopsis for that title.
Title: {title}
Playwright: This is a synopsis for the above play:"""
synopsis_prompt_template = PromptTemplate(input_variables=["title"], template=synopsis_template)
synopsis_chain = LLMChain(llm=llm, prompt=synopsis_prompt_template)
```
```python
# This is an LLMChain to write a review of a play given a synopsis.
llm = OpenAI(temperature=.7)
template = """You are a play critic from the New York Times. Given the synopsis of play, it is your job to write a review for that play.
Play Synopsis:
{synopsis}
Review from a New York Times play critic of the above play:"""
prompt_template = PromptTemplate(input_variables=["synopsis"], template=template)
review_chain = LLMChain(llm=llm, prompt=prompt_template)
```
```python
# This is the overall chain where we run these two chains in sequence.
from langchain.chains import SimpleSequentialChain
overall_chain = SimpleSequentialChain(chains=[synopsis_chain, review_chain], verbose=True)
```
```python
review = overall_chain.run("Tragedy at sunset on the beach")
```
<CodeOutputBlock lang="python">
```
> Entering new SimpleSequentialChain chain...
Tragedy at Sunset on the Beach is a story of a young couple, Jack and Sarah, who are in love and looking forward to their future together. On the night of their anniversary, they decide to take a walk on the beach at sunset. As they are walking, they come across a mysterious figure, who tells them that their love will be tested in the near future.
The figure then tells the couple that the sun will soon set, and with it, a tragedy will strike. If Jack and Sarah can stay together and pass the test, they will be granted everlasting love. However, if they fail, their love will be lost forever.
The play follows the couple as they struggle to stay together and battle the forces that threaten to tear them apart. Despite the tragedy that awaits them, they remain devoted to one another and fight to keep their love alive. In the end, the couple must decide whether to take a chance on their future together or succumb to the tragedy of the sunset.
Tragedy at Sunset on the Beach is an emotionally gripping story of love, hope, and sacrifice. Through the story of Jack and Sarah, the audience is taken on a journey of self-discovery and the power of love to overcome even the greatest of obstacles.
The play's talented cast brings the characters to life, allowing us to feel the depths of their emotion and the intensity of their struggle. With its compelling story and captivating performances, this play is sure to draw in audiences and leave them on the edge of their seats.
The play's setting of the beach at sunset adds a touch of poignancy and romanticism to the story, while the mysterious figure serves to keep the audience enthralled. Overall, Tragedy at Sunset on the Beach is an engaging and thought-provoking play that is sure to leave audiences feeling inspired and hopeful.
> Finished chain.
```
</CodeOutputBlock>
```python
print(review)
```
<CodeOutputBlock lang="python">
```
Tragedy at Sunset on the Beach is an emotionally gripping story of love, hope, and sacrifice. Through the story of Jack and Sarah, the audience is taken on a journey of self-discovery and the power of love to overcome even the greatest of obstacles.
The play's talented cast brings the characters to life, allowing us to feel the depths of their emotion and the intensity of their struggle. With its compelling story and captivating performances, this play is sure to draw in audiences and leave them on the edge of their seats.
The play's setting of the beach at sunset adds a touch of poignancy and romanticism to the story, while the mysterious figure serves to keep the audience enthralled. Overall, Tragedy at Sunset on the Beach is an engaging and thought-provoking play that is sure to leave audiences feeling inspired and hopeful.
```
</CodeOutputBlock>
## Sequential Chain
Of course, not all sequential chains will be as simple as passing a single string as an argument and getting a single string as output for all steps in the chain. In this next example, we will experiment with more complex chains that involve multiple inputs, and where there are also multiple final outputs.
Of particular importance is how we name the input/output variables. In the above example we didn't have to think about that because we were just passing the output of one chain directly as input to the next, but here we do have to worry about it because we have multiple inputs.
```python
# This is an LLMChain to write a synopsis given a title of a play and the era it is set in.
llm = OpenAI(temperature=.7)
synopsis_template = """You are a playwright. Given the title of play and the era it is set in, it is your job to write a synopsis for that title.
Title: {title}
Era: {era}
Playwright: This is a synopsis for the above play:"""
synopsis_prompt_template = PromptTemplate(input_variables=["title", "era"], template=synopsis_template)
synopsis_chain = LLMChain(llm=llm, prompt=synopsis_prompt_template, output_key="synopsis")
```
```python
# This is an LLMChain to write a review of a play given a synopsis.
llm = OpenAI(temperature=.7)
template = """You are a play critic from the New York Times. Given the synopsis of play, it is your job to write a review for that play.
Play Synopsis:
{synopsis}
Review from a New York Times play critic of the above play:"""
prompt_template = PromptTemplate(input_variables=["synopsis"], template=template)
review_chain = LLMChain(llm=llm, prompt=prompt_template, output_key="review")
```
```python
# This is the overall chain where we run these two chains in sequence.
from langchain.chains import SequentialChain
overall_chain = SequentialChain(
chains=[synopsis_chain, review_chain],
input_variables=["era", "title"],
# Here we return multiple variables
output_variables=["synopsis", "review"],
verbose=True)
```
```python
overall_chain({"title":"Tragedy at sunset on the beach", "era": "Victorian England"})
```
<CodeOutputBlock lang="python">
```
> Entering new SequentialChain chain...
> Finished chain.
{'title': 'Tragedy at sunset on the beach',
'era': 'Victorian England',
'synopsis': "\n\nThe play follows the story of John, a young man from a wealthy Victorian family, who dreams of a better life for himself. He soon meets a beautiful young woman named Mary, who shares his dream. The two fall in love and decide to elope and start a new life together.\n\nOn their journey, they make their way to a beach at sunset, where they plan to exchange their vows of love. Unbeknownst to them, their plans are overheard by John's father, who has been tracking them. He follows them to the beach and, in a fit of rage, confronts them. \n\nA physical altercation ensues, and in the struggle, John's father accidentally stabs Mary in the chest with his sword. The two are left in shock and disbelief as Mary dies in John's arms, her last words being a declaration of her love for him.\n\nThe tragedy of the play comes to a head when John, broken and with no hope of a future, chooses to take his own life by jumping off the cliffs into the sea below. \n\nThe play is a powerful story of love, hope, and loss set against the backdrop of 19th century England.",
'review': "\n\nThe latest production from playwright X is a powerful and heartbreaking story of love and loss set against the backdrop of 19th century England. The play follows John, a young man from a wealthy Victorian family, and Mary, a beautiful young woman with whom he falls in love. The two decide to elope and start a new life together, and the audience is taken on a journey of hope and optimism for the future.\n\nUnfortunately, their dreams are cut short when John's father discovers them and in a fit of rage, fatally stabs Mary. The tragedy of the play is further compounded when John, broken and without hope, takes his own life. The storyline is not only realistic, but also emotionally compelling, drawing the audience in from start to finish.\n\nThe acting was also commendable, with the actors delivering believable and nuanced performances. The playwright and director have successfully crafted a timeless tale of love and loss that will resonate with audiences for years to come. Highly recommended."}
```
</CodeOutputBlock>
### Memory in Sequential Chains
Sometimes you may want to pass along some context to use in each step of the chain or in a later part of the chain, but maintaining and chaining together the input/output variables can quickly get messy. Using `SimpleMemory` is a convenient way to manage this and clean up your chains.
For example, using the previous playwright `SequentialChain`, let's say you wanted to include some context about the date, time, and location of the play, and then use the generated synopsis and review to create some social media post text. You could add these new context variables as `input_variables`, or you could add a `SimpleMemory` to the chain to manage this context:
```python
from langchain.chains import SequentialChain
from langchain.memory import SimpleMemory
llm = OpenAI(temperature=.7)
template = """You are a social media manager for a theater company. Given the title of play, the era it is set in, the date,time and location, the synopsis of the play, and the review of the play, it is your job to write a social media post for that play.
Here is some context about the time and location of the play:
Date and Time: {time}
Location: {location}
Play Synopsis:
{synopsis}
Review from a New York Times play critic of the above play:
{review}
Social Media Post:
"""
prompt_template = PromptTemplate(input_variables=["synopsis", "review", "time", "location"], template=template)
social_chain = LLMChain(llm=llm, prompt=prompt_template, output_key="social_post_text")
overall_chain = SequentialChain(
memory=SimpleMemory(memories={"time": "December 25th, 8pm PST", "location": "Theater in the Park"}),
chains=[synopsis_chain, review_chain, social_chain],
input_variables=["era", "title"],
# Here we return multiple variables
output_variables=["social_post_text"],
verbose=True)
overall_chain({"title":"Tragedy at sunset on the beach", "era": "Victorian England"})
```
<CodeOutputBlock lang="python">
```
> Entering new SequentialChain chain...
> Finished chain.
{'title': 'Tragedy at sunset on the beach',
'era': 'Victorian England',
'time': 'December 25th, 8pm PST',
'location': 'Theater in the Park',
'social_post_text': "\nSpend your Christmas night with us at Theater in the Park and experience the heartbreaking story of love and loss that is 'A Walk on the Beach'. Set in Victorian England, this romantic tragedy follows the story of Frances and Edward, a young couple whose love is tragically cut short. Don't miss this emotional and thought-provoking production that is sure to leave you in tears. #AWalkOnTheBeach #LoveAndLoss #TheaterInThePark #VictorianEngland"}
```
</CodeOutputBlock>

View File

@@ -7,21 +7,27 @@
"source": [
"# Transformation\n",
"\n",
"This notebook showcases using a generic transformation chain.\n",
"Often we want to transform inputs as they are passed from one component to another.\n",
"\n",
"As an example, we will create a dummy transformation that takes in a super long text, filters the text to only the first 3 paragraphs, and then passes that into an `LLMChain` to summarize those."
"As an example, we will create a dummy transformation that takes in a super long text, filters the text to only the first 3 paragraphs, and then passes that into a chain to summarize those."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "bbbb4330",
"execution_count": 2,
"id": "d257f50d-c53d-41b7-be8a-df23fbd7c017",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains import TransformChain, LLMChain, SimpleSequentialChain\n",
"from langchain.llms import OpenAI\n",
"from langchain.prompts import PromptTemplate"
"from langchain.prompts import PromptTemplate\n",
"\n",
"prompt = PromptTemplate.from_template(\n",
" \"\"\"Summarize this text:\n",
"\n",
"{output_text}\n",
"\n",
"Summary:\"\"\"\n",
")"
]
},
{
@@ -35,9 +41,67 @@
" state_of_the_union = f.read()"
]
},
{
"cell_type": "markdown",
"id": "4c938536-e3fb-45eb-a1b3-cb82be410e32",
"metadata": {},
"source": [
"## Using LCEL\n",
"\n",
"With LCEL this is trivial, since we can add functions in any `RunnableSequence`."
]
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 17,
"id": "1e53e851-b1bd-424f-a144-5f2e8b413dcf",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'The speaker acknowledges the presence of important figures in the government and addresses the audience as fellow Americans. They highlight the impact of COVID-19 on keeping people apart in the previous year but express joy in being able to come together again. The speaker emphasizes the unity of Democrats, Republicans, and Independents as Americans.'"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.schema import StrOutputParser\n",
"\n",
"runnable = {\"output_text\": lambda text: \"\\n\\n\".join(text.split(\"\\n\\n\")[:3])} | prompt | ChatOpenAI() | StrOutputParser()\n",
"runnable.invoke(state_of_the_union)"
]
},
{
"cell_type": "markdown",
"id": "a9b9bd07-155f-4777-9215-509d39ecfe3f",
"metadata": {},
"source": [
"## [Legacy] TransformationChain\n",
"\n",
":::note This is a legacy class, using LCEL as shown above is preffered.\n",
"\n",
"This notebook showcases using a generic transformation chain."
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "bbbb4330",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains import TransformChain, LLMChain, SimpleSequentialChain\n",
"from langchain.llms import OpenAI\n"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "98739592",
"metadata": {},
"outputs": [],
@@ -47,7 +111,6 @@
" shortened_text = \"\\n\\n\".join(text.split(\"\\n\\n\")[:3])\n",
" return {\"output_text\": shortened_text}\n",
"\n",
"\n",
"transform_chain = TransformChain(\n",
" input_variables=[\"text\"], output_variables=[\"output_text\"], transform=transform_func\n",
")"
@@ -55,7 +118,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 14,
"id": "e9397934",
"metadata": {},
"outputs": [],
@@ -71,7 +134,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 15,
"id": "06f51f17",
"metadata": {},
"outputs": [],
@@ -81,17 +144,17 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 16,
"id": "f7caa1ee",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' The speaker addresses the nation, noting that while last year they were kept apart due to COVID-19, this year they are together again. They are reminded that regardless of their political affiliations, they are all Americans.'"
"' In an address to the nation, the speaker acknowledges the hardships of the past year due to the COVID-19 pandemic, but emphasizes that regardless of political affiliation, all Americans can come together.'"
]
},
"execution_count": 7,
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
@@ -99,14 +162,6 @@
"source": [
"sequential_chain.run(state_of_the_union)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e3ca6409",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -125,7 +180,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
"version": "3.9.1"
}
},
"nbformat": 4,

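Assembled from the notebook JSON above, the new LCEL version reads roughly as follows (a sketch pieced together from the diff; it assumes `state_of_the_union` was loaded in the earlier cell and that an OpenAI API key is configured):

```python
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.schema import StrOutputParser

prompt = PromptTemplate.from_template(
    """Summarize this text:

{output_text}

Summary:"""
)

# The dict entry is the transformation step: keep only the first three paragraphs.
runnable = (
    {"output_text": lambda text: "\n\n".join(text.split("\n\n")[:3])}
    | prompt
    | ChatOpenAI()
    | StrOutputParser()
)
summary = runnable.invoke(state_of_the_union)
```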
View File

@@ -127,9 +127,9 @@ len(docs)
## Auto-detect file encodings with TextLoader
In this example we will see some strategies that can be useful when loading a big list of arbitrary files from a directory using the `TextLoader` class.
In this example we will see some strategies that can be useful when loading a large list of arbitrary files from a directory using the `TextLoader` class.
First to illustrate the problem, let's try to load multiple text with arbitrary encodings.
First to illustrate the problem, let's try to load multiple texts with arbitrary encodings.
```python

View File

@@ -66,7 +66,7 @@
"\n",
"The record manager relies on a time-based mechanism to determine what content can be cleaned up (when using `full` or `incremental` cleanup modes).\n",
"\n",
"If two tasks run back-to-back, and the first task finishes before the the clock time changes, then the second task may not be able to clean up content.\n",
"If two tasks run back-to-back, and the first task finishes before the clock time changes, then the second task may not be able to clean up content.\n",
"\n",
"This is unlikely to be an issue in actual settings for the following reasons:\n",
"\n",

View File

@@ -12,7 +12,7 @@
"- [Memory in LLMChain](/docs/modules/memory/how_to/adding_memory.html)\n",
"- [Custom Agents](/docs/modules/agents/how_to/custom_agent.html)\n",
"\n",
"In order to add a memory to an agent we are going to the the following steps:\n",
"In order to add a memory to an agent we are going to perform the following steps:\n",
"\n",
"1. We are going to create an `LLMChain` with memory.\n",
"2. We are going to use that `LLMChain` to create a custom Agent.\n",

View File

@@ -4,9 +4,8 @@ This output parser can be used when you want to return a list of comma-separated
```python
from langchain.output_parsers import CommaSeparatedListOutputParser
from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
output_parser = CommaSeparatedListOutputParser()
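# A hedged sketch of the typical wiring (continues the snippet above; the
# "subject" prompt is illustrative, not part of the original page):
format_instructions = output_parser.get_format_instructions()
prompt = PromptTemplate(
    template="List five {subject}.\n{format_instructions}",
    input_variables=["subject"],
    partial_variables={"format_instructions": format_instructions},
)
output_parser.parse("red, orange, yellow")  # -> ['red', 'orange', 'yellow']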

View File

@@ -7,11 +7,9 @@ But we can do other things besides throw errors. Specifically, we can pass the m
For this example, we'll use the above Pydantic output parser. Here's what happens if we pass it a result that does not comply with the schema:
```python
from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field, validator
from langchain.pydantic_v1 import BaseModel, Field
from typing import List
```
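
One common way to act on such a failure, sketched here assuming an `Actor` schema like the one used elsewhere in these docs, is to wrap the parser in an `OutputFixingParser` that forwards the bad output back to a model for repair:

```python
from typing import List

from langchain.chat_models import ChatOpenAI
from langchain.output_parsers import OutputFixingParser, PydanticOutputParser
from langchain.pydantic_v1 import BaseModel, Field

class Actor(BaseModel):
    name: str = Field(description="name of an actor")
    film_names: List[str] = Field(description="list of films they starred in")

parser = PydanticOutputParser(pydantic_object=Actor)
# On a parse failure, OutputFixingParser sends the bad completion plus the
# format instructions back to the wrapped LLM and asks it to fix the output.
fixing_parser = OutputFixingParser.from_llm(parser=parser, llm=ChatOpenAI())
fixing_parser.parse("{'name': 'Tom Hanks', 'film_names': ['Forrest Gump']}")
```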

View File

@@ -28,3 +28,8 @@ design and secure your applications.
## Reporting a Vulnerability
Please report security vulnerabilities by email to security@langchain.dev. This will ensure the issue is promptly triaged and acted upon as needed.
## Enterprise solutions
LangChain may offer enterprise solutions for customers who have additional security
requirements. Please contact us at sales@langchain.dev.

View File

@@ -181,7 +181,7 @@
"source": [
"To enable use of GPU on Apple Silicon, follow the steps [here](https://github.com/abetlen/llama-cpp-python/blob/main/docs/install/macos.md) to use the Python binding `with Metal support`.\n",
"\n",
"In particular, ensure that `conda` is using the correct virtual enviorment that you created (`miniforge3`).\n",
"In particular, ensure that `conda` is using the correct virtual environment that you created (`miniforge3`).\n",
"\n",
"E.g., for me:\n",
"\n",

View File

@@ -65,7 +65,7 @@ qa.run(query)
</CodeOutputBlock>
The above way allows you to really simply change the chain_type, but it doesn't provide a ton of flexibility over parameters to that chain type. If you want to control those parameters, you can load the chain directly (as you did in [this notebook](/docs/modules/chains/additional/question_answering.html)) and then pass that directly to the the RetrievalQA chain with the `combine_documents_chain` parameter. For example:
The above way allows you to really simply change the chain_type, but it doesn't provide a ton of flexibility over parameters to that chain type. If you want to control those parameters, you can load the chain directly (as you did in [this notebook](/docs/modules/chains/additional/question_answering.html)) and then pass that directly to the RetrievalQA chain with the `combine_documents_chain` parameter. For example:
```python
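# A minimal sketch of that pattern (hypothetical `llm`, `docsearch`, and
# `query` objects assumed; the page's own snippet is elided in this diff):
from langchain.chains import RetrievalQA
from langchain.chains.question_answering import load_qa_chain

qa_chain = load_qa_chain(llm, chain_type="stuff")
qa = RetrievalQA(
    combine_documents_chain=qa_chain,
    retriever=docsearch.as_retriever(),
)
qa.run(query)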

View File

@@ -160,29 +160,60 @@ const config = {
label: "Integrations",
},
{
to: "https://api.python.langchain.com",
href: "https://api.python.langchain.com",
label: "API",
position: "left",
},
{
to: "/docs/community",
label: "Community",
type: "dropdown",
label: "More",
position: "left",
items: [
{
to: "/docs/community",
label: "Community",
},
{
to: "/docs/additional_resources/dependents",
label: "Dependents",
},
{
to: "/docs/additional_resources/tutorials",
label: "Tutorials"
},
{
to: "/docs/additional_resources/youtube",
label: "YouTube videos"
},
{ label: "Gallery", href: "https://github.com/kyrolabs/awesome-langchain" }
]
},
{
to: "https://chat.langchain.com",
label: "Chat our docs",
position: "right",
},
{
to: "https://smith.langchain.com",
label: "LangSmith",
position: "right",
},
{
to: "https://js.langchain.com/docs",
label: "JS/TS Docs",
type: "dropdown",
label: "Also by LangChain",
position: "right",
items: [
{
href: "https://chat.langchain.com",
label: "Chat our docs",
},
{
href: "https://smith.langchain.com",
label: "LangSmith",
},
{
href: "https://smith.langchain.com/hub",
label: "LangChain Hub",
},
{
href: "https://github.com/langchain-ai/langserve",
label: "LangServe",
},
{
href: "https://js.langchain.com/docs",
label: "JS/TS",
},
]
},
// Please keep GitHub link to the right for consistency.
{

View File

@@ -1,65 +0,0 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "91c6a7ef",
"metadata": {},
"source": [
"# SingleStoreDB Chat Message History\n",
"\n",
"This notebook goes over how to use SingleStoreDB to store chat message history."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d15e3302",
"metadata": {},
"outputs": [],
"source": [
"from langchain.memory import SingleStoreDBChatMessageHistory\n",
"\n",
"history = SingleStoreDBChatMessageHistory(\n",
" session_id=\"foo\",\n",
" host=\"root:pass@localhost:3306/db\"\n",
")\n",
"\n",
"history.add_user_message(\"hi!\")\n",
"\n",
"history.add_ai_message(\"whats up?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "64fc465e",
"metadata": {},
"outputs": [],
"source": [
"history.messages"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -68,19 +68,6 @@ module.exports = {
description: 'Design guides for key parts of the development process',
slug: "guides",
},
},
{
type: "category",
label: "More",
collapsed: true,
items: [
{ type: "autogenerated", dirName: "additional_resources" },
{ type: "link", label: "Gallery", href: "https://github.com/kyrolabs/awesome-langchain" }
],
link: {
type: 'generated-index',
slug: "additional_resources",
},
}
],
integrations: [

View File

@@ -1,8 +0,0 @@
---
title: Cookbook
hide_table_of_contents: true
---
# Cookbook
The page you're looking for has been moved to the [cookbook section of the repo](https://github.com/langchain-ai/langchain/tree/master/cookbook) as a notebook.

Binary file not shown.

Before: 150 KiB

After: 185 KiB

View File

@@ -3848,8 +3848,12 @@
"destination": "/docs/additional_resources/dependents"
},
{
"source": "docs/integrations/retrievers/google_cloud_enterprise_search",
"destination": "docs/integrations/retrievers/google_vertex_ai_search"
"source": "/docs/integrations/retrievers/google_cloud_enterprise_search",
"destination": "/docs/integrations/retrievers/google_vertex_ai_search"
},
{
"source": "/docs/integrations/providers/google_document_ai",
"destination": "/docs/integrations/platforms/google#google-document-ai"
}
]
}

View File

@@ -48,4 +48,5 @@ python3.11 -m pip install --upgrade pip
python3.11 -m pip install -r vercel_requirements.txt
python3.11 scripts/model_feat_table.py
nbdoc_build --srcdir docs
cp ../cookbook/README.md src/pages/cookbook.mdx
python3.11 scripts/generate_api_reference_links.py

View File

@@ -0,0 +1 @@
"""CSV toolkit."""

View File

@@ -0,0 +1,37 @@
from io import IOBase
from typing import Any, List, Optional, Union
from langchain.agents.agent import AgentExecutor
from langchain.schema.language_model import BaseLanguageModel
from langchain_experimental.agents.agent_toolkits.pandas.base import (
create_pandas_dataframe_agent,
)
def create_csv_agent(
llm: BaseLanguageModel,
path: Union[str, IOBase, List[Union[str, IOBase]]],
pandas_kwargs: Optional[dict] = None,
**kwargs: Any,
) -> AgentExecutor:
"""Create csv agent by loading to a dataframe and using pandas agent."""
try:
import pandas as pd
except ImportError:
raise ImportError(
"pandas package not found, please install with `pip install pandas`"
)
_kwargs = pandas_kwargs or {}
if isinstance(path, (str, IOBase)):
df = pd.read_csv(path, **_kwargs)
elif isinstance(path, list):
df = []
for item in path:
if not isinstance(item, (str, IOBase)):
raise ValueError(f"Expected str or file-like object, got {type(path)}")
df.append(pd.read_csv(item, **_kwargs))
else:
raise ValueError(f"Expected str, list, or file-like object, got {type(path)}")
return create_pandas_dataframe_agent(llm, df, **kwargs)
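
A usage sketch for the new helper (hypothetical CSV file and question; the import path assumes the file lands at `langchain_experimental/agents/agent_toolkits/csv/base.py`, as the sibling `pandas` import above suggests):

```python
from langchain.llms import OpenAI
from langchain_experimental.agents.agent_toolkits.csv.base import create_csv_agent

# Builds a pandas dataframe agent over the CSV under the hood.
agent = create_csv_agent(OpenAI(temperature=0), "titanic.csv")
agent.run("How many rows are in the file?")
```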

View File

@@ -1,10 +1,8 @@
from __future__ import annotations # allows pydantic model to reference itself
import re
from typing import Any, Optional, Union
from typing import Any, List, Optional, Union
import duckdb
import pandas as pd
from langchain.graphs.networkx_graph import NetworkxEntityGraph
from langchain_experimental.cpal.constants import Constant
@@ -38,7 +36,7 @@ class EntityModel(BaseModel):
name: str = Field(description="entity name")
code: str = Field(description="entity actions")
value: float = Field(description="entity initial value")
depends_on: list[str] = Field(default=[], description="ancestor entities")
depends_on: List[str] = Field(default=[], description="ancestor entities")
# TODO: generalize to multivariate math
# TODO: acyclic graph
@@ -54,7 +52,7 @@ class EntityModel(BaseModel):
class CausalModel(BaseModel):
attribute: str = Field(description="name of the attribute to be calculated")
entities: list[EntityModel] = Field(description="entities in the story")
entities: List[EntityModel] = Field(description="entities in the story")
# TODO: root validate each `entity.depends_on` using system's entity names
@@ -101,8 +99,8 @@ class InterventionModel(BaseModel):
}
"""
entity_settings: list[EntitySettingModel]
system_settings: Optional[list[SystemSettingModel]] = None
entity_settings: List[EntitySettingModel]
system_settings: Optional[List[SystemSettingModel]] = None
@validator("system_settings")
def lower_case_name(cls, v: str) -> Union[str, None]:
@@ -129,7 +127,7 @@ class StoryModel(BaseModel):
causal_operations: Any = Field(required=True)
intervention: Any = Field(required=True)
query: Any = Field(required=True)
_outcome_table: pd.DataFrame = PrivateAttr(default=None)
_outcome_table: Any = PrivateAttr(default=None)
_networkx_wrapper: Any = PrivateAttr(default=None)
def __init__(self, **kwargs: Any):
@@ -190,6 +188,12 @@ class StoryModel(BaseModel):
self.causal_operations.entities.sort(key=lambda x: sorted_nodes.index(x.name))
def _forward_propagate(self) -> None:
try:
import pandas as pd
except ImportError as e:
raise ImportError(
"Unable to import pandas, please install with `pip install pandas`."
) from e
entity_scope = {
entity.name: entity for entity in self.causal_operations.entities
}
@@ -217,11 +221,17 @@ class StoryModel(BaseModel):
if self.query.llm_error_msg == "":
try:
import duckdb
df = self._outcome_table # noqa
query_result = duckdb.sql(self.query.expression).df()
self.query._result_table = query_result
except duckdb.BinderException as e:
self.query._result_table = humanize_sql_error_msg(str(e))
except ImportError as e:
raise ImportError(
"Unable to import duckdb, please install with `pip install duckdb`."
) from e
except Exception as e:
self.query._result_table = str(e)
else:

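The hunks above move the module-level `pandas` and `duckdb` imports into the code paths that use them. A minimal sketch of that lazy optional-import pattern, independent of this file:

```python
def forward_propagate_example() -> None:
    # Import lazily so the module can be imported even when the optional
    # dependency is missing; fail with an actionable message on first use.
    try:
        import pandas as pd
    except ImportError as e:
        raise ImportError(
            "Unable to import pandas, please install with `pip install pandas`."
        ) from e
    print(pd.DataFrame({"x": [1, 2, 3]}).sum())
```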
View File

@@ -5,14 +5,12 @@ from typing import Any, DefaultDict, Dict, List, Optional
from langchain.callbacks.manager import (
CallbackManagerForLLMRun,
Callbacks,
)
from langchain.chat_models.anthropic import ChatAnthropic
from langchain.chat_models.base import BaseChatModel
from langchain.schema import (
ChatGeneration,
ChatResult,
LLMResult,
)
from langchain.schema.messages import (
AIMessage,
@@ -196,18 +194,6 @@ class AnthropicFunctions(BaseChatModel):
else:
return ChatResult(generations=[ChatGeneration(message=response)])
async def agenerate(
self,
messages: List[List[BaseMessage]],
stop: Optional[List[str]] = None,
callbacks: Callbacks = None,
*,
tags: Optional[List[str]] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs: Any,
) -> LLMResult:
raise NotImplementedError
@property
def _llm_type(self) -> str:
return "anthropic_functions"

View File

@@ -12,7 +12,7 @@ class ToTDFSMemory:
"""
def __init__(self, stack: Optional[List[Thought]] = None):
self.stack: list[Thought] = stack or []
self.stack: List[Thought] = stack or []
def top(self) -> Optional[Thought]:
"Get the top of the stack without popping it."

Some files were not shown because too many files have changed in this diff.