Compare commits

98 Commits

Author SHA1 Message Date
Edmar Ferreira
5b48ab8db3 add mako template 2022-11-20 12:20:15 -03:00
Harrison Chase
a19ad935b3 Harrison/verbose prompt (#159)
Add printing of prompt to LLMChain
2022-11-19 20:39:35 -08:00
Harrison Chase
c02eb199b6 add few shot example (#148) 2022-11-19 20:32:45 -08:00
Harrison Chase
8869b0ab0e bump version to 0.0.16 (#157) 2022-11-18 06:09:03 -08:00
Harrison Chase
b15c84e19d Harrison/chain lab (#156) 2022-11-18 05:50:02 -08:00
Harrison Chase
0ac08bbca6 bump version to 0.0.15 (#154) 2022-11-16 23:22:05 -08:00
Nicholas Larus-Stone
0c3ae78ec1 chore: update ascii colors to work with dark mode (#152) 2022-11-16 22:05:28 -08:00
Nicholas Larus-Stone
ca4b10bb74 feat: add option to ignore or restrict to SQL tables (#151)
`SQLDatabase` now accepts two `init` arguments:
1. `ignore_tables` to pass in a list of tables to not search over
2. `include_tables` to restrict to a list of tables to consider
2022-11-16 22:04:50 -08:00
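The table-filtering behaviour described in this commit message can be sketched roughly as follows. This is only an illustration, not the code from the PR: `select_tables` is a hypothetical helper, and treating the two options as mutually exclusive is an assumption made here for simplicity.

```python
from typing import List, Optional, Sequence


def select_tables(
    all_tables: Sequence[str],
    ignore_tables: Optional[Sequence[str]] = None,
    include_tables: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return the tables to search over (hypothetical helper, not the PR's code)."""
    # Assumption: the two options are not meant to be combined.
    if ignore_tables and include_tables:
        raise ValueError("Pass only one of ignore_tables or include_tables")
    if include_tables:
        missing = set(include_tables) - set(all_tables)
        if missing:
            raise ValueError(f"Unknown tables: {sorted(missing)}")
        wanted = set(include_tables)
        # Keep the database's own ordering, restricted to the requested tables.
        return [name for name in all_tables if name in wanted]
    ignored = set(ignore_tables or [])
    # Drop ignored tables; with no options, every table is kept.
    return [name for name in all_tables if name not in ignored]


tables = ["Album", "Artist", "Track"]
print(select_tables(tables, ignore_tables=["Track"]))
print(select_tables(tables, include_tables=["Artist"]))
```

With `ignore_tables` the listed tables are dropped; with `include_tables` only the listed tables remain.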
Harrison Chase
d2f9288be6 add metadata to documents (#153)
add concept of metadata to document
2022-11-16 21:58:05 -08:00
Harrison Chase
d775ddd749 add apply functionality (#150) 2022-11-16 21:39:02 -08:00
thesved
47e35d7d0e Fix notebook links (#149)
Example notebook links were broken.
2022-11-16 15:13:12 -08:00
Harrison Chase
4f1bf159f4 bump version to 0.0.14 (#145) 2022-11-14 22:07:54 -08:00
Harrison Chase
b504cd739f Harrison/cleanup env check (#144) 2022-11-14 22:05:41 -08:00
Harrison Chase
a4b502d92f fix env var loader (#143) 2022-11-14 21:42:43 -08:00
Harrison Chase
1835e8a681 prompt nit (#141)
doing some cleanup, and i think this just simplifies things...
2022-11-14 21:30:33 -08:00
Harrison Chase
bbb405a492 update colors (#140) 2022-11-14 20:27:36 -08:00
Predrag Gruevski
1a95252f00 Use pull_request not pull_request_target in GitHub Actions. (#139)
`pull_request` runs on the merge commit between the opened PR and the
target branch where the PR is to be merged — `master` in this case. This
is desirable because that way the new changes get linted and tested.

The existing `pull_request_target` specifier causes lint and test to run
_on the target branch itself_ (i.e. `master` in this case). That way the
new code in the PR doesn't get linted and tested at all. This can also
lead to security vulnerabilities, as described in the GitHub docs:

![image](https://user-images.githubusercontent.com/2348618/201735153-c5dd0c03-2490-45e9-b7f9-f0d47eb0109f.png)

Screenshot from here:
https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#pull_request_target
Link from the screenshot:
https://securitylab.github.com/research/github-actions-preventing-pwn-requests/
2022-11-14 11:34:08 -08:00
Harrison Chase
9f223e6ccc Harrison/fix lint (#138) 2022-11-14 08:55:59 -08:00
Delip Rao
76cecf8165 A fix for Jupyter environment variable issue (#135)
- fixes the Jupyter environment variable issues mentioned in issue #134 
- fixes format/lint issues in some unrelated files (from make format/lint)


![image](https://user-images.githubusercontent.com/347398/201599322-090af858-362d-4d69-bf59-208aea65419a.png)
2022-11-14 08:34:01 -08:00
Harrison Chase
ced29b816b remove extra run from merge conflict (#133) 2022-11-13 21:07:20 -08:00
Harrison Chase
11d37d556e bump version 0.0.13 (#132) 2022-11-13 21:06:50 -08:00
Harrison Chase
b1b6b27c5f Harrison/redo docs (#130)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2022-11-13 20:13:23 -08:00
Harrison Chase
f23b3ceb49 consolidate run functions (#126)
consolidating logic for when a chain is able to run with single input
text, single output text

open to feedback on naming, logic, usefulness
2022-11-13 18:14:35 -08:00
Harrison Chase
1fe3a4f724 extra requires (#129)
add extra requires
2022-11-13 17:34:58 -08:00
Eugene Yurtsev
2910f50a3c Fix a few typos and wrapped f-strings (#128)
Fix a few typos and wrapped f-strings
2022-11-13 13:16:19 -08:00
Edmar Ferreira
8a5ec894e7 Prompt from file proof of concept using plain text (#127)
This is a simple proof of concept of using external files as templates. 
I'm still feeling my way around the codebase.
As a user, I want to use files as prompts, so it will be easier to
manage and test prompts.
The future direction is to use a template engine, most likely Mako.
2022-11-13 13:15:30 -08:00
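A minimal sketch of the "prompt from a plain-text file" idea described above, assuming the template uses Python `str.format` placeholders. The helper names `load_prompt_template` and `render_prompt` and the file name are hypothetical, and this is not the implementation from the PR.

```python
import tempfile
from pathlib import Path


def load_prompt_template(path):
    """Read a plain-text prompt template from disk (hypothetical helper)."""
    return Path(path).read_text(encoding="utf-8")


def render_prompt(template, **variables):
    """Fill {placeholders} using str.format; a missing variable raises KeyError."""
    return template.format(**variables)


# Demo: write a throwaway template file, then load and render it.
with tempfile.TemporaryDirectory() as tmp_dir:
    template_path = Path(tmp_dir) / "question_prompt.txt"
    template_path.write_text("Question: {question}\nAnswer:", encoding="utf-8")
    rendered = render_prompt(
        load_prompt_template(template_path), question="What is 2 + 2?"
    )

print(rendered)
```

Keeping the template in its own file means a prompt can be edited and tested without touching the code that calls it, which is the motivation the commit message gives.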
Harrison Chase
d87e73ddb1 huggingface tokenizer (#75) 2022-11-13 09:37:44 -08:00
Eugene Yurtsev
b542941234 Bumping python version for read the docs (#122)
Haven't checked whether things work with the new Python version; hoping
any error will be caught by CI.
2022-11-12 13:43:39 -08:00
Eugene Yurtsev
6df08eec52 Readme: Fix link to embeddings example and use python markup for code examples (#123)
* Fix URL to embeddings notebook
* Specify python is used for the code block
2022-11-12 11:26:08 -08:00
Eugene Yurtsev
f5a588a165 Add py.typed marker to package (#121)
- Update
- update
2022-11-12 11:22:32 -08:00
Harrison Chase
47af2bcee4 vector db qa (#71) 2022-11-12 07:24:49 -08:00
Harrison Chase
4c0b684f79 new manifest notebook (#118) 2022-11-11 06:49:06 -08:00
Harrison Chase
7467243a42 bump version 0.0.12 (#116) 2022-11-11 06:41:07 -08:00
Harrison Chase
e43534d41c add integration with manifest (#62) 2022-11-10 11:24:11 -08:00
Harrison Chase
5e76c12455 Harrison/fix docs (#115) 2022-11-10 08:59:51 -08:00
Harrison Chase
9f878e43d8 Harrison/lintai21 (#114) 2022-11-10 08:46:35 -08:00
tomeras91
d8734ce5ad Add AI21 LLMs (#99)
Integrate AI21 /complete API into langchain, to allow access to Jurassic
models.
2022-11-10 08:12:28 -08:00
Harrison Chase
2179ea3103 remove unnecc variables (#113)
I don't think either of these variables is used?
2022-11-10 07:53:45 -08:00
Harrison Chase
da445e474d version 0.0.11 (#112) 2022-11-09 23:10:29 -08:00
Harrison Chase
b92e9abdf1 Harrison/fix name (#111) 2022-11-09 23:10:16 -08:00
Samantha Whitmore
a0780cc930 OptimizedPrompt -- k-shot example choice backed by semantic search (#91) 2022-11-09 21:15:42 -08:00
Delip Rao
3ee6e332dd Implements NLTK and Spacy-based TextSplitters (#103)
This PR is for Issue #88 

- [x] `make format`
- [x] `make lint`
- [x] `make tests`
2022-11-09 20:45:30 -08:00
issam9
28282ad099 Issam9/cohere embeddings (#105)
Add support for cohere embeddings
2022-11-09 13:44:27 -08:00
Delip Rao
95dd2f140e Make Integration Tests "work" again (#106)
This fixes Issue #104 

The tests for HF Embeddings are skipped because of the segfault issue
mentioned there. Perhaps a new issue should be created for that?
2022-11-09 13:26:58 -08:00
Nicholas Larus-Stone
abe4fc04fa docs: fix some minor typos in README (#107)
Small docs fixes
2022-11-09 13:23:29 -08:00
Delip Rao
bd462e9df0 Fix pip install issue due to FAISS (#102)
- change requirements.txt to fix Issue #101
- update .gitignore to support VSCode dev environment
2022-11-09 13:23:17 -08:00
Samantha Whitmore
386a14a19f Change NLPCloud default model (#100) 2022-11-09 09:30:09 -08:00
Harrison Chase
5b7aed34a3 bump version to 0.0.10 (#98) 2022-11-09 00:20:46 -08:00
Harrison Chase
db37bd089f model laboratory (#95) 2022-11-08 22:17:10 -08:00
Samantha Whitmore
2ddab88c06 Update VectorStore interface to contain from_texts, enforce common interface (#97)
2022-11-08 21:55:22 -08:00
Samantha Whitmore
61f12229df Create VectorStore interface (#92) 2022-11-08 18:19:39 -08:00
Harrison Chase
b9f61390e9 add text2text generation (#93)
fixes issue #90
2022-11-08 18:08:46 -08:00
Samantha Whitmore
e48e562ea5 ElasticVectorSearch: Add in vector search backed by Elastic (#67)
![image](https://user-images.githubusercontent.com/6690839/200147455-33a68e20-c3c0-4045-9bff-598b38ae8fb2.png)

woo!

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2022-11-08 07:01:42 -08:00
Samantha Whitmore
efbc03bda8 NLPCloud client integration (#81)
lots of kwargs! generation docs here:
https://docs.nlpcloud.com/#generation

This somewhat breaks the paradigm introduced in LLM base class as the
stop sequence isn't a list, and should rightfully be introduced at the
time of initialization of the class, along with the other kwargs that
depend on its presence (e.g. remove_end_sequence, etc.) curious if you'd
want to refactor LLM base class to take out stop as a specific named
kwarg?
2022-11-08 06:24:23 -08:00
Harrison Chase
6d8a657676 bump to version 0.0.9 (#82) 2022-11-07 20:42:25 -08:00
Harrison Chase
6cff2837bb Harrison/fix lint (#80) 2022-11-07 15:22:37 -08:00
Cameron Whitehead
54e325be2f Improve credential handing to allow passing in constructors (#79)
Addresses the issue in #76 by either using the relevant environment
variable if set or using a string passed in the constructor.

Prefers the constructor string over the environment variable, which
seemed like the natural choice to me.
2022-11-07 13:34:45 -08:00
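The precedence described in this commit message (constructor argument first, environment variable second) could look roughly like the sketch below. `resolve_api_key` and `DEMO_API_KEY` are hypothetical names used only for illustration, and raising `ValueError` when neither source is set is an assumption; the PR's actual code may differ.

```python
import os
from typing import Optional


def resolve_api_key(explicit_key: Optional[str], env_var: str) -> str:
    """Pick an API key, preferring the constructor value (hypothetical helper)."""
    # A string passed to the constructor wins over the environment.
    if explicit_key:
        return explicit_key
    # Otherwise fall back to the environment variable, if it is set.
    env_value = os.environ.get(env_var)
    if env_value:
        return env_value
    # Assumption: fail loudly when no credential is available.
    raise ValueError(f"No API key passed and {env_var} is not set")


os.environ["DEMO_API_KEY"] = "from-env"
print(resolve_api_key(None, "DEMO_API_KEY"))
print(resolve_api_key("from-constructor", "DEMO_API_KEY"))
```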
Harrison Chase
9679bdc34c run workflows on forks (#78)
per
https://stackoverflow.com/questions/58221321/is-github-actions-available-on-forked-repositories
2022-11-07 05:53:17 -08:00
Harrison Chase
95d0e5f368 fix lint (#77) 2022-11-07 05:52:57 -08:00
issam9
990cd821cc Issam/hf embeddings (#68)
Add support of HuggingFace embedding models
2022-11-07 05:46:44 -08:00
Harrison Chase
84e164e44b update version to 0.0.8 (#74) 2022-11-06 16:19:06 -08:00
Harrison Chase
a00f659555 undo notebook changes (#73) 2022-11-06 16:11:00 -08:00
Harrison Chase
eb36317f9a Harrison/fix imports (#72)
fix imports and add section to notebook
2022-11-06 16:06:40 -08:00
Samantha Whitmore
a5b61d59e1 Refactor prompts into module, add example generation utils (#64) 2022-11-06 15:40:33 -08:00
Harrison Chase
dce26dfcec handle search errors (#70)
better error handling when serpapi raises an error (usually invalid key)
2022-11-06 15:40:21 -08:00
Harrison Chase
a7d14cad00 add link to socratic models (#69) 2022-11-06 14:10:26 -08:00
Harrison Chase
f772934108 improve logging (#66) 2022-11-05 15:13:12 -07:00
Harrison Chase
818b06ebbc Harrison/add twitter (#65)
add twitter to readme
2022-11-05 14:51:24 -07:00
Harrison Chase
2456a547de mrkl (#42) 2022-11-05 14:41:53 -07:00
Samantha Whitmore
c636488fe5 DynamicPrompt class creation (#49)
Checking that this structure looks generally ok -- going to sub in logic
where the TODO comment is then add a test.
2022-11-05 12:43:21 -07:00
Harrison Chase
618611f4dd update glossary (#63) 2022-11-05 08:44:37 -07:00
Samantha Whitmore
4bbaa9b2d0 Add BasePrompt as abstract base class (#60) 2022-11-04 08:42:45 -07:00
Harrison Chase
8f907161e3 Harrison/initial glossary (#61) 2022-11-04 08:02:21 -07:00
Harrison Chase
8764ac2b55 bump to 007 (#59) 2022-11-03 07:23:16 -07:00
Harrison Chase
4cc18d6c2a Harrison/pretty print (#57)
make stuff look nice
2022-11-03 00:41:07 -07:00
Harrison Chase
dfb81c969f bump version 0.0.6 (#56) 2022-11-02 09:22:02 -07:00
Harrison Chase
76aff023d7 FAISS and embedding support (#48)
also adds embeddings and an in memory docstore
2022-11-01 21:29:39 -07:00
Harrison Chase
798deaec2b add license (#50) 2022-11-01 21:12:02 -07:00
Harrison Chase
d3c1872902 Improve docs (#51) 2022-11-01 21:09:39 -07:00
Harrison Chase
e982cf4b2e Harrison/update docstore (#47)
change docstore interface
2022-10-31 21:18:52 -07:00
Harrison Chase
b45b126d9b bump version (#46) 2022-10-31 21:14:29 -07:00
Harrison Chase
160af4ba6b Harrison/map reduce (#36) 2022-10-31 20:17:22 -07:00
Harrison Chase
4ac5345012 add developer guide (#44) 2022-10-30 22:48:52 -07:00
Harrison Chase
fba30e07d1 factor out mock python repl (#43) 2022-10-30 18:09:04 -07:00
Harrison Chase
7b0d02ac51 prompt templating (#41)
Co-authored-by: Samantha Whitmore <whitmore.samantha@gmail.com>
2022-10-30 09:45:27 -07:00
Harrison Chase
52383a485d bump version to 0.0.4 (#37) 2022-10-27 23:41:30 -07:00
Harrison Chase
af81e9ca9c add sql database (#35) 2022-10-27 23:21:47 -07:00
Harrison Chase
90a6e578bc fix type hint (#34) 2022-10-27 18:20:16 -07:00
Michael
6a3dca888b Fix cohere integration (#33)
Currently the cohere module uses a non-supported model. Updating this to
use the default model if one is not specified.
2022-10-27 18:17:03 -07:00
Harrison Chase
c7f9c62532 bump version to 0.0.3 (#29) 2022-10-26 21:35:44 -07:00
Harrison Chase
ab731f1f8c add wikipedia to readme (#30) 2022-10-26 21:35:23 -07:00
Harrison Chase
ce7b14b843 Harrison/add react chain (#24)
from https://arxiv.org/abs/2210.03629

still need to think if docstore abstraction makes sense
2022-10-26 21:02:23 -07:00
Harrison Chase
61a51b7a76 bump version to 0.0.2 (#28) 2022-10-25 22:05:24 -07:00
Harrison Chase
e40ec861f5 clean up example (#27)
clean up jupyter notebook for huggingface
2022-10-25 22:05:11 -07:00
Harrison Chase
020c42dcae Harrison/add huggingface hub (#23)
Add support for huggingface hub

I could not find a good way to enforce stop tokens over the huggingface
hub api - that needs to hopefully be cleaned up in the future
2022-10-25 22:00:33 -07:00
Harrison Chase
316aae8223 rename validator (#25)
more appropriate name
2022-10-25 20:56:48 -07:00
Harrison Chase
d2fdcba29d fix test name (#22) 2022-10-25 20:22:16 -07:00
Harrison Chase
21b10ffb13 update readme (#21) 2022-10-25 08:47:42 -07:00
168 changed files with 8733 additions and 392 deletions

View File

@@ -1,6 +1,6 @@
name: lint
on: [push]
on: [push, pull_request]
jobs:
build:
@@ -17,7 +17,7 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install -r test_requirements.txt
- name: Analysing the code with our lint
run: |
make lint

View File

@@ -1,6 +1,6 @@
name: test
on: [push]
on: [push, pull_request]
jobs:
build:

1
.gitignore vendored
View File

@@ -1,3 +1,4 @@
.vscode/
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]

21
LICENSE Normal file
View File

@@ -0,0 +1,21 @@
The MIT License
Copyright (c) Harrison Chase
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

View File

@@ -1,2 +1,3 @@
include langchain/py.typed
include langchain/VERSION
include LICENSE

View File

@@ -2,9 +2,7 @@
⚡ Building applications with LLMs through composability ⚡
[![lint](https://github.com/hwchase17/langchain/actions/workflows/lint.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/lint.yml) [![test](https://github.com/hwchase17/langchain/actions/workflows/test.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/test.yml) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![lint](https://github.com/hwchase17/langchain/actions/workflows/lint.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/lint.yml) [![test](https://github.com/hwchase17/langchain/actions/workflows/test.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/test.yml) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![Twitter](https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI)](https://twitter.com/langchainai) [![](https://dcbadge.vercel.app/api/server/6adMQxSpJS?compact=true&style=flat)](https://discord.gg/6adMQxSpJS)
## Quick Install
@@ -20,16 +18,18 @@ combine them with other sources of computation or knowledge.
This library is aimed at assisting in the development of those types of applications.
It aims to create:
1. a comprehensive collection of pieces you would ever want to combine
2. a flexible interface for combining pieces into a single comprehensive "chain"
3. a schema for easily saving and sharing those chains
## 🔧 Setting up your environment
## 📖 Documentation
Besides the installation of this python package, you will also need to set environment variables for the services that call out to authenticated APIs. You do not need to set an environment variable unless you plan on using that API. Please see below for a comprehensive list of the APIs that require an API key, and the associated environment variable that you should set.
- OpenAI: `OPENAI_API_KEY`
- Cohere: `COHERE_API_KEY`
- SerpAPI (Google Search): `SERPAPI_API_KEY`
Please see [here](https://langchain.readthedocs.io/en/latest/?) for full documentation on:
- Getting started (installation, setting up environment, simple examples)
- How-To examples (demos, integrations, helper functions)
- Reference (full API docs)
- Resources (high level explanation of core concepts)
## 🚀 What can I do with this
@@ -37,9 +37,9 @@ This project was largely inspired by a few projects seen on Twitter for which we
**[Self-ask-with-search](https://ofir.io/self-ask.pdf)**
To recreate this paper, use the following code snippet or checkout the [example notebook](https://github.com/hwchase17/langchain/blob/master/examples/self_ask_with_search.ipynb).
To recreate this paper, use the following code snippet or checkout the [example notebook](https://github.com/hwchase17/langchain/blob/master/docs/examples/demos/self_ask_with_search.ipynb).
```
```python
from langchain import SelfAskWithSearchChain, OpenAI, SerpAPIChain
llm = OpenAI(temperature=0)
@@ -52,9 +52,9 @@ self_ask_with_search.run("What is the hometown of the reigning men's U.S. Open c
**[LLM Math](https://twitter.com/amasad/status/1568824744367259648?s=20&t=-7wxpXBJinPgDuyHLouP1w)**
To recreate this example, use the following code snippet or check out the [example notebook](https://github.com/hwchase17/langchain/blob/master/examples/llm_math.ipynb).
To recreate this example, use the following code snippet or check out the [example notebook](https://github.com/hwchase17/langchain/blob/master/docs/examples/demos/llm_math.ipynb).
```
```python
from langchain import OpenAI, LLMMathChain
llm = OpenAI(temperature=0)
@@ -65,23 +65,68 @@ llm_math.run("How many of the integers between 0 and 99 inclusive are divisible
**Generic Prompting**
You can also use this for simple prompting pipelines, as in the below example and this [example notebook](https://github.com/hwchase17/langchain/blob/master/examples/simple_prompts.ipynb).
You can also use this for simple prompting pipelines, as in the below example and this [example notebook](https://github.com/hwchase17/langchain/blob/master/docs/examples/demos/simple_prompts.ipynb).
```
from langchain import Prompt, OpenAI, LLMChain
```python
from langchain import PromptTemplate, OpenAI, LLMChain
template = """Question: {question}
Answer: Let's think step by step."""
prompt = Prompt(template=template, input_variables=["question"])
llm_chain = LLMChain(prompt=prompt, llm=OpenAI(temperature=0))
prompt = PromptTemplate(template=template, input_variables=["question"])
llm = OpenAI(temperature=0)
llm_chain = LLMChain(prompt=prompt, llm=llm)
question = "What NFL team won the Super Bowl in the year Justin Beiber was born?"
question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"
llm_chain.predict(question=question)
```
## 📖 Documentation
**Embed & Search Documents**
The above examples are probably the most user friendly documentation that exists,
but full API docs can be found [here](https://langchain.readthedocs.io/en/latest/?).
We support two vector databases to store and search embeddings -- FAISS and Elasticsearch. Here's a code snippet showing how to use FAISS to store embeddings and search for text similar to a query. Both database backends are featured in this [example notebook](https://github.com/hwchase17/langchain/blob/master/docs/examples/integrations/embeddings.ipynb).
```python
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.faiss import FAISS
from langchain.text_splitter import CharacterTextSplitter
with open('state_of_the_union.txt') as f:
state_of_the_union = f.read()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_text(state_of_the_union)
embeddings = OpenAIEmbeddings()
docsearch = FAISS.from_texts(texts, embeddings)
query = "What did the president say about Ketanji Brown Jackson"
docs = docsearch.similarity_search(query)
```
## 🤖 Developer Guide
To begin developing on this project, first clone the repo locally.
To install requirements, run `pip install -r requirements.txt`.
This will install all requirements for running the package, examples, linting, formatting, and tests.
Formatting for this project is a combination of [Black](https://black.readthedocs.io/en/stable/) and [isort](https://pycqa.github.io/isort/).
To run formatting for this project, run `make format`.
Linting for this project is a combination of [Black](https://black.readthedocs.io/en/stable/), [isort](https://pycqa.github.io/isort/), [flake8](https://flake8.pycqa.org/en/latest/), and [mypy](http://mypy-lang.org/).
To run linting for this project, run `make lint`.
We recognize linting can be annoying - if you do not want to do it, please contact a project maintainer and they can help you with it. We do not want this to be a blocker for good code getting contributed.
Unit tests cover modular logic that does not require calls to outside APIs.
To run unit tests, run `make tests`.
If you add new logic, please add a unit test.
Integration tests cover logic that requires making calls to outside APIs (often integration with other services).
To run integration tests, run `make integration_tests`.
If you add support for a new external API, please add a new integration test.
If you are adding a Jupyter notebook example, you can run `pip install -e .` to build the langchain package from your local changes, so your new logic can be imported into the notebook.
Docs are largely autogenerated by [sphinx](https://www.sphinx-doc.org/en/master/) from the code.
For that reason, we ask that you add good documentation to all classes and methods.
Similar to linting, we recognize documentation can be annoying - if you do not want to do it, please contact a project maintainer and they can help you with it. We do not want this to be a blocker for good code getting contributed.

View File

@@ -37,9 +37,14 @@ extensions = [
"sphinx.ext.autodoc.typehints",
"sphinx.ext.autosummary",
"sphinx.ext.napoleon",
"sphinx.ext.viewcode",
"sphinxcontrib.autodoc_pydantic",
"myst_parser",
"nbsphinx",
"sphinx_panels",
]
autodoc_pydantic_model_show_json = False
autodoc_pydantic_field_list_validators = False
autodoc_pydantic_config_members = False
@@ -68,6 +73,14 @@ exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]
html_theme = "sphinx_rtd_theme"
# html_theme = "sphinx_typlog_theme"
html_context = {
"display_github": True, # Integrate GitHub
"github_user": "hwchase17", # Username
"github_repo": "langchain", # Repo name
"github_version": "master", # Version
"conf_py_path": "/docs/", # Path in the checkout to the docs root
}
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".

10
docs/examples/demos.rst Normal file
View File

@@ -0,0 +1,10 @@
Demos
=====
The examples here are all end-to-end chains of specific applications.
.. toctree::
:maxdepth: 1
:glob:
demos/*

View File

@@ -1,11 +1,43 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "e71e720f",
"metadata": {},
"source": [
"# LLM Math\n",
"\n",
"This notebook showcases using LLMs and Python REPLs to do complex word math problems."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "44e9ba31",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"How many of the integers between 0 and 99 inclusive are divisible by 8?\u001b[102m\n",
"\n",
"```python\n",
"count = 0\n",
"for i in range(100):\n",
" if i % 8 == 0:\n",
" count += 1\n",
"print(count)\n",
"```\n",
"\u001b[0m\n",
"Answer: \u001b[103m13\n",
"\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
@@ -21,7 +53,7 @@
"from langchain import OpenAI, LLMMathChain\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"llm_math = LLMMathChain(llm=llm)\n",
"llm_math = LLMMathChain(llm=llm, verbose=True)\n",
"\n",
"llm_math.run(\"How many of the integers between 0 and 99 inclusive are divisible by 8?\")"
]

View File

@@ -0,0 +1,93 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "d9a0131f",
"metadata": {},
"source": [
"# Map Reduce\n",
"\n",
"This notebook showcases an example of map-reduce chains: recursive summarization."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e9db25f3",
"metadata": {},
"outputs": [],
"source": [
"from langchain import OpenAI, PromptTemplate, LLMChain\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.chains.mapreduce import MapReduceChain\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"\n",
"_prompt = \"\"\"Write a concise summary of the following:\n",
"\n",
"\n",
"{text}\n",
"\n",
"\n",
"CONCISE SUMMARY:\"\"\"\n",
"prompt = PromptTemplate(template=_prompt, input_variables=[\"text\"])\n",
"\n",
"text_splitter = CharacterTextSplitter()\n",
"\n",
"mp_chain = MapReduceChain.from_params(llm, prompt, text_splitter)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "99bbe19b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"\\n\\nThe President discusses the recent aggression by Russia, and the response by the United States and its allies. He announces new sanctions against Russia, and says that the free world is united in holding Putin accountable. The President also discusses the American Rescue Plan, the Bipartisan Infrastructure Law, and the Bipartisan Innovation Act. Finally, the President addresses the need for women's rights and equality for LGBTQ+ Americans.\""
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with open('../state_of_the_union.txt') as f:\n",
" state_of_the_union = f.read()\n",
"mp_chain.run(state_of_the_union)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b581501e",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,226 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "f1390152",
"metadata": {},
"source": [
"# MRKL\n",
"\n",
"This notebook showcases using the MRKL chain to route between tasks"
]
},
{
"cell_type": "markdown",
"id": "39ea3638",
"metadata": {},
"source": [
"This uses the example Chinook database.\n",
"To set it up follow the instructions on https://database.guide/2-sample-databases-sqlite/, placing the `.db` file in a notebooks folder at the root of this repository."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "ac561cc4",
"metadata": {},
"outputs": [],
"source": [
"from langchain import LLMMathChain, OpenAI, SerpAPIChain, MRKLChain, SQLDatabase, SQLDatabaseChain\n",
"from langchain.chains.mrkl.base import ChainConfig"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "07e96d99",
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)\n",
"search = SerpAPIChain()\n",
"llm_math_chain = LLMMathChain(llm=llm, verbose=True)\n",
"db = SQLDatabase.from_uri(\"sqlite:///../../../notebooks/Chinook.db\")\n",
"db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)\n",
"chains = [\n",
" ChainConfig(\n",
" action_name = \"Search\",\n",
" action=search.run,\n",
" action_description=\"useful for when you need to answer questions about current events\"\n",
" ),\n",
" ChainConfig(\n",
" action_name=\"Calculator\",\n",
" action=llm_math_chain.run,\n",
" action_description=\"useful for when you need to answer questions about math\"\n",
" ),\n",
" \n",
" ChainConfig(\n",
" action_name=\"FooBar DB\",\n",
" action=db_chain.run,\n",
" action_description=\"useful for when you need to answer questions about FooBar. Input should be in the form of a question\"\n",
" )\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "a069c4b6",
"metadata": {},
"outputs": [],
"source": [
"mrkl = MRKLChain.from_chains(llm, chains, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "e603cd7d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"What is the age of Olivia Wilde's boyfriend raised to the 0.23 power?\n",
"Thought:\u001b[102m I need to find the age of Olivia Wilde's boyfriend\n",
"Action: Search\n",
"Action Input: \"Olivia Wilde's boyfriend\"\u001b[0m\n",
"Observation: \u001b[104mOlivia Wilde started dating Harry Styles after ending her years-long engagement to Jason Sudeikis — see their relationship timeline.\u001b[0m\n",
"Thought:\u001b[102m I need to find the age of Harry Styles\n",
"Action: Search\n",
"Action Input: \"Harry Styles age\"\u001b[0m\n",
"Observation: \u001b[104m28 years\u001b[0m\n",
"Thought:\u001b[102m I need to calculate 28 to the 0.23 power\n",
"Action: Calculator\n",
"Action Input: 28^0.23\u001b[0m\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"28^0.23\u001b[102m\n",
"\n",
"```python\n",
"print(28**0.23)\n",
"```\n",
"\u001b[0m\n",
"Answer: \u001b[103m2.1520202182226886\n",
"\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"Observation: \u001b[103mAnswer: 2.1520202182226886\n",
"\u001b[0m\n",
"Thought:\u001b[102m I now know the final answer\n",
"Final Answer: 2.1520202182226886\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'2.1520202182226886'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mrkl.run(\"What is the age of Olivia Wilde's boyfriend raised to the 0.23 power?\")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "a5c07010",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"Who recently released an album called 'The Storm Before the Calm' and are they in the FooBar database? If so, what albums of theirs are in the FooBar database?\n",
"Thought:\u001b[102m I need to find an album called 'The Storm Before the Calm'\n",
"Action: Search\n",
"Action Input: \"The Storm Before the Calm album\"\u001b[0m\n",
"Observation: \u001b[104mThe Storm Before the Calm (stylized in all lowercase) is the tenth (and eighth international) studio album by Canadian-American singer-songwriter Alanis ...\u001b[0m\n",
"Thought:\u001b[102m I need to check if Alanis is in the FooBar database\n",
"Action: FooBar DB\n",
"Action Input: \"Does Alanis Morissette exist in the FooBar database?\"\u001b[0m\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"Does Alanis Morissette exist in the FooBar database?\n",
"SQLQuery:\u001b[102m SELECT * FROM Artist WHERE Name = 'Alanis Morissette'\u001b[0m\n",
"SQLResult: \u001b[103m[(4, 'Alanis Morissette')]\u001b[0m\n",
"Answer:\u001b[102m Yes\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"Observation: \u001b[101m Yes\u001b[0m\n",
"Thought:\u001b[102m I need to find out what albums of Alanis's are in the FooBar database\n",
"Action: FooBar DB\n",
"Action Input: \"What albums by Alanis Morissette are in the FooBar database?\"\u001b[0m\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"What albums by Alanis Morissette are in the FooBar database?\n",
"SQLQuery:\u001b[102m SELECT Title FROM Album WHERE ArtistId = (SELECT ArtistId FROM Artist WHERE Name = 'Alanis Morissette')\u001b[0m\n",
"SQLResult: \u001b[103m[('Jagged Little Pill',)]\u001b[0m\n",
"Answer:\u001b[102m Jagged Little Pill\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"Observation: \u001b[101m Jagged Little Pill\u001b[0m\n",
"Thought:\u001b[102m I now know the final answer\n",
"Final Answer: The album is by Alanis Morissette and the albums in the FooBar database by her are Jagged Little Pill\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The album is by Alanis Morissette and the albums in the FooBar database by her are Jagged Little Pill'"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mrkl.run(\"Who recently released an album called 'The Storm Before the Calm' and are they in the FooBar database? If so, what albums of theirs are in the FooBar database?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d7c2e6ac",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
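
The recorded output above is interleaved with ANSI color codes (`\u001b[1m`, `\u001b[102m`, and so on), which makes the trace hard to read outside a terminal. A minimal sketch for stripping them and re-checking the calculator step follows; `strip_ansi` is a hypothetical helper written for this illustration, not part of LangChain.

```python
import re

# Matches SGR color/style sequences such as "\x1b[1m" or "\x1b[102m".
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text: str) -> str:
    """Return text with ANSI color/style escape sequences removed."""
    return ANSI_SGR.sub("", text)


# Lines copied from the recorded notebook output above.
trace = [
    "\x1b[1m> Entering new chain...\x1b[0m",
    "Answer: \x1b[103m2.1520202182226886",
    "\x1b[1m> Finished chain.\x1b[0m",
]
plain = [strip_ansi(line) for line in trace]
print("\n".join(plain))

# Re-check the arithmetic that the calculator step reported.
reported = float(plain[1].split("Answer: ")[1])
recomputed = 28 ** 0.23
print(abs(reported - recomputed) < 1e-6)
```

The same regular expression can be applied to any of the `stdout` streams in these notebooks before diffing or searching them.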


@@ -46,7 +46,7 @@ if __name__ == "__main__":
     try:
         while True:
             browser_content = "\n".join(_crawler.crawl())
-            llm_command = nat_bot_chain.run(_crawler.page.url, browser_content)
+            llm_command = nat_bot_chain.execute(_crawler.page.url, browser_content)
             if not quiet:
                 print("URL: " + _crawler.page.url)
                 print("Objective: " + objective)


@@ -0,0 +1,98 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "82140df0",
"metadata": {},
"source": [
"# ReAct\n",
"\n",
"This notebook showcases the implementation of the ReAct chain logic."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "4e272b47",
"metadata": {},
"outputs": [],
"source": [
"from langchain import OpenAI, ReActChain, Wikipedia\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"react = ReActChain(llm=llm, docstore=Wikipedia(), verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "8078c8f1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"Author David Chanoff has collaborated with a U.S. Navy admiral who served as the ambassador to the United Kingdom under which President?\n",
"Thought 1:\u001b[102m I need to search David Chanoff and find the U.S. Navy admiral he\n",
"collaborated with.\n",
"Action 1: Search[David Chanoff]\u001b[0m\n",
"Observation 1: \u001b[103mDavid Chanoff is a noted author of non-fiction work. His work has typically involved collaborations with the principal protagonist of the work concerned. His collaborators have included; Augustus A. White, Joycelyn Elders, Đoàn Văn Toại, William J. Crowe, Ariel Sharon, Kenneth Good and Felix Zandman. He has also written about a wide range of subjects including literary history, education and foreign for The Washington Post, The New Republic and The New York Times Magazine. He has published more than twelve books.\u001b[0m\n",
"Thought 2:\u001b[102m The U.S. Navy admiral David Chanoff collaborated with is William J. Crowe.\n",
"Action 2: Search[William J. Crowe]\u001b[0m\n",
"Observation 2: \u001b[103mWilliam James Crowe Jr. (January 2, 1925 – October 18, 2007) was a United States Navy admiral and diplomat who served as the 11th chairman of the Joint Chiefs of Staff under Presidents Ronald Reagan and George H. W. Bush, and as the ambassador to the United Kingdom and Chair of the Intelligence Oversight Board under President Bill Clinton.\u001b[0m\n",
"Thought 3:\u001b[102m William J. Crowe served as the ambassador to the United Kingdom under President Bill Clinton. So the answer is Bill Clinton.\n",
"Action 3: Finish[Bill Clinton]\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'Bill Clinton'"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question = \"Author David Chanoff has collaborated with a U.S. Navy admiral who served as the ambassador to the United Kingdom under which President?\"\n",
"react.run(question)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0a6bd3b4",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
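
The trace in this notebook follows a fixed `Action N: Name[argument]` shape (`Search[David Chanoff]`, `Finish[Bill Clinton]`). As a rough illustration of how such lines could be split into an action name and its argument, here is a small sketch; `parse_action` and its regular expression are assumptions made for this example and are not taken from the ReAct chain's actual implementation.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: an action name followed by a bracketed argument.
ACTION_RE = re.compile(r"^(\w+)\[(.*)\]$")


def parse_action(line: str) -> Optional[Tuple[str, str]]:
    """Split 'Search[David Chanoff]' into ('Search', 'David Chanoff')."""
    match = ACTION_RE.match(line.strip())
    if match is None:
        return None  # the line does not look like an action
    return match.group(1), match.group(2)


# Action strings copied from the recorded trace above.
actions = [
    "Search[David Chanoff]",
    "Search[William J. Crowe]",
    "Finish[Bill Clinton]",
]
parsed = [parse_action(action) for action in actions]
for name, argument in parsed:
    print(f"{name}: {argument}")
```

In the recorded run, the argument of the final `Finish` action is what the notebook cell returns.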


@@ -1,5 +1,15 @@
{
"cells": [
+{
+"cell_type": "markdown",
+"id": "0c3f1df8",
+"metadata": {},
+"source": [
+"# Self Ask With Search\n",
+"\n",
+"This notebook showcases the Self Ask With Search chain."
+]
+},
{
"cell_type": "code",
"execution_count": 1,
@@ -10,19 +20,23 @@
"name": "stdout",
"output_type": "stream",
"text": [
+"\n",
+"\n",
+"\u001b[1m> Entering new chain...\u001b[0m\n",
"What is the hometown of the reigning men's U.S. Open champion?\n",
"Are follow up questions needed here:\u001b[102m Yes.\n",
"Follow up: Who is the reigning men's U.S. Open champion?\u001b[0m\n",
-"Intermediate answer: \u001b[106mCarlos Alcaraz\u001b[0m.\u001b[102m\n",
+"Intermediate answer: \u001b[103mCarlos Alcaraz won the 2022 Men's single title while Poland's Iga Swiatek won the Women's single title defeating Tunisian's Ons Jabeur..\u001b[0m\u001b[102m\n",
"Follow up: Where is Carlos Alcaraz from?\u001b[0m\n",
-"Intermediate answer: \u001b[106mEl Palmar, Murcia, Spain\u001b[0m.\u001b[102m\n",
-"So the final answer is: El Palmar, Murcia, Spain\u001b[0m"
+"Intermediate answer: \u001b[103mEl Palmar, Murcia, Spain.\u001b[0m\u001b[102m\n",
+"So the final answer is: El Palmar, Murcia, Spain\u001b[0m\n",
+"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
-"\"What is the hometown of the reigning men's U.S. Open champion?\\nAre follow up questions needed here: Yes.\\nFollow up: Who is the reigning men's U.S. Open champion?\\nIntermediate answer: Carlos Alcaraz.\\nFollow up: Where is Carlos Alcaraz from?\\nIntermediate answer: El Palmar, Murcia, Spain.\\nSo the final answer is: El Palmar, Murcia, Spain\""
+"'\\nSo the final answer is: El Palmar, Murcia, Spain'"
]
},
"execution_count": 1,
@@ -36,7 +50,7 @@
"llm = OpenAI(temperature=0)\n",
"search = SerpAPIChain()\n",
"\n",
-"self_ask_with_search = SelfAskWithSearchChain(llm=llm, search_chain=search)\n",
+"self_ask_with_search = SelfAskWithSearchChain(llm=llm, search_chain=search, verbose=True)\n",
"\n",
"self_ask_with_search.run(\"What is the hometown of the reigning men's U.S. Open champion?\")"
]
@@ -44,7 +58,7 @@
{
"cell_type": "code",
"execution_count": null,
-"id": "6195fc82",
+"id": "683d69e7",
"metadata": {},
"outputs": [],
"source": []


@@ -1,34 +1,59 @@
{
"cells": [
+{
+"cell_type": "markdown",
+"id": "d8a5c5d4",
+"metadata": {},
+"source": [
+"# Simple Example\n",
+"\n",
+"This notebook showcases a simple chain."
+]
+},
{
"cell_type": "code",
-"execution_count": 7,
+"execution_count": 1,
"id": "51a54c4d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mQuestion: What NFL team won the Super Bowl in the year Justin Beiber was born?\n",
"\n",
"Answer: Let's think step by step.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"' The year Justin Beiber was born was 1994. In 1994, the Dallas Cowboys won the Super Bowl.'"
]
},
-"execution_count": 7,
+"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
-"from langchain import Prompt, OpenAI, LLMChain\n",
+"from langchain import PromptTemplate, OpenAI, LLMChain\n",
"\n",
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
-"prompt = Prompt(template=template, input_variables=[\"question\"])\n",
-"llm_chain = LLMChain(prompt=prompt, llm=OpenAI(temperature=0))\n",
+"prompt = PromptTemplate(template=template, input_variables=[\"question\"])\n",
+"llm_chain = LLMChain(prompt=prompt, llm=OpenAI(temperature=0), verbose=True)\n",
"\n",
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
-"llm_chain.predict(question=question)"
+"llm_chain.run(question)"
]
},
{
@@ -56,7 +81,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.7.6"
+"version": "3.8.7"
}
},
"nbformat": 4,


@@ -0,0 +1,129 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "0ed6aab1",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"# SQLite example\n",
"\n",
"This example showcases hooking up an LLM to answer questions over a database."
]
},
{
"cell_type": "markdown",
"id": "b2f66479",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"This uses the example Chinook database.\n",
"To set it up, follow the instructions at https://database.guide/2-sample-databases-sqlite/ and place the `.db` file in a `notebooks` folder at the root of this repository."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "d0e27d88",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"from langchain import OpenAI, SQLDatabase, SQLDatabaseChain"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "72ede462",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"db = SQLDatabase.from_uri(\"sqlite:///../../../notebooks/Chinook.db\")\n",
"llm = OpenAI(temperature=0)\n",
"db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "15ff81df",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"How many employees are there?\n",
"SQLQuery:\u001b[102m SELECT COUNT(*) FROM Employee\u001b[0m\n",
"SQLResult: \u001b[103m[(8,)]\u001b[0m\n",
"Answer:\u001b[102m 8\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"' 8'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"db_chain.run(\"How many employees are there?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "61d91b85",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
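
The chain output above shows a generated query (`SELECT COUNT(*) FROM Employee`) and a raw result (`[(8,)]`). To make the shape of that result concrete without needing the Chinook file or an API key, the sketch below runs the same query with the standard-library `sqlite3` module against an in-memory table. The table contents here are made up for illustration; only the table name and the query come from the notebook.

```python
import sqlite3

# In-memory stand-in for the Chinook database (hypothetical rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (EmployeeId INTEGER PRIMARY KEY, LastName TEXT)"
)
conn.executemany(
    "INSERT INTO Employee (LastName) VALUES (?)",
    [(f"Person{i}",) for i in range(8)],
)

# The query the chain generated in the recorded run.
rows = conn.execute("SELECT COUNT(*) FROM Employee").fetchall()
print(rows)  # a list with one single-element tuple: [(8,)]

# The final answer is the single value inside that tuple.
answer = rows[0][0]
conn.close()
```

The `SQLResult` line in the trace is this list of tuples; the `Answer` line is the language model's restatement of it.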


@@ -0,0 +1,104 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "07c1e3b9",
"metadata": {},
"source": [
"# Vector DB Question/Answering\n",
"\n",
"This example showcases question answering over a vector database."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "82525493",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.vectorstores.faiss import FAISS\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain import OpenAI, VectorDBQA"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "5c7049db",
"metadata": {},
"outputs": [],
"source": [
"with open('../state_of_the_union.txt') as f:\n",
"    state_of_the_union = f.read()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_text(state_of_the_union)\n",
"\n",
"embeddings = OpenAIEmbeddings()\n",
"docsearch = FAISS.from_texts(texts, embeddings)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "3018f865",
"metadata": {},
"outputs": [],
"source": [
"qa = VectorDBQA(llm=OpenAI(), vectorstore=docsearch)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "032a47f8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' The President said that Ketanji Brown Jackson is a consensus builder and has received a broad range of support since she was nominated.'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"qa.run(query)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0f20b92",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}


@@ -0,0 +1,10 @@
Integrations
============

The examples here all highlight a specific type of integration.

.. toctree::
   :maxdepth: 1
   :glob:

   integrations/*


@@ -0,0 +1,177 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "7ef4d402-6662-4a26-b612-35b542066487",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"# Embeddings & VectorStores\n",
"\n",
"This notebook showcases how to use embeddings to create a VectorStore."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "965eecee",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores.elastic_vector_search import ElasticVectorSearch\n",
"from langchain.vectorstores.faiss import FAISS"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "68481687",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"with open('../state_of_the_union.txt') as f:\n",
"    state_of_the_union = f.read()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_text(state_of_the_union)\n",
"\n",
"embeddings = OpenAIEmbeddings()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "015f4ff5",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"docsearch = FAISS.from_texts(texts, embeddings)\n",
"\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = docsearch.similarity_search(query)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "67baf32e",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tonight, I'd like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation's top legal minds, who will continue Justice Breyer's legacy of excellence. \n",
"\n",
"A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she's been nominated, she's received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n",
"\n",
"And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \n"
]
}
],
"source": [
"print(docs[0].page_content)"
]
},
{
"cell_type": "markdown",
"id": "eea6e627",
"metadata": {},
"source": [
"## Requires having Elasticsearch set up"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "4906b8a3",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"docsearch = ElasticVectorSearch.from_texts(texts, embeddings, elasticsearch_url=\"http://localhost:9200\")\n",
"\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = docsearch.similarity_search(query)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "95f9eee9",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tonight, I'd like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation's top legal minds, who will continue Justice Breyer's legacy of excellence. \n",
"\n",
"A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she's been nominated, she's received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n",
"\n",
"And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \n"
]
}
],
"source": [
"print(docs[0].page_content)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
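
Both vector stores in this notebook are used the same way: build an index from texts, then call `similarity_search(query)` to get the closest passages. The sketch below illustrates that idea only; it swaps in a toy bag-of-words vector and brute-force cosine similarity in place of OpenAI embeddings and FAISS or Elasticsearch, and the function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a stand-in for a real model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_search(query: str, texts: List[str], k: int = 1) -> List[str]:
    """Return the k texts most similar to the query (brute force)."""
    query_vec = embed(query)
    ranked = sorted(texts, key=lambda t: cosine(query_vec, embed(t)), reverse=True)
    return ranked[:k]


# Short excerpts based on the notebook's recorded output.
texts = [
    "I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson.",
    "We need to secure the Border and fix the immigration system.",
    "Last year COVID-19 kept us apart.",
]
top = similarity_search("What did the president say about Ketanji Brown Jackson", texts)
print(top[0])
```

A real embedding model captures meaning rather than exact word overlap, and a real vector store avoids scanning every text, but the query-then-rank flow is the same.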


@@ -0,0 +1,71 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "959300d4",
"metadata": {},
"source": [
"# HuggingFace Hub\n",
"\n",
"This example showcases how to connect to the HuggingFace Hub."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "3acf0069",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The Seattle Seahawks won the Super Bowl in 2010. Justin Beiber was born in 2010. The\n"
]
}
],
"source": [
"from langchain import PromptTemplate, HuggingFaceHub, LLMChain\n",
"\n",
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])\n",
"llm_chain = LLMChain(prompt=prompt, llm=HuggingFaceHub(repo_id=\"google/flan-t5-xl\", model_kwargs={\"temperature\":1e-10}))\n",
"\n",
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"print(llm_chain.run(question))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ae4559c7",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
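
The prompt in this notebook is a template string with one named placeholder, `{question}`, plus a list of expected input variables. As a rough sketch of what filling such a template involves, the example below uses plain `str.format`; `format_prompt` is a hypothetical helper for illustration and does not reproduce `PromptTemplate` itself.

```python
from typing import List

# Template text copied from the notebook cell above.
template = """Question: {question}

Answer: Let's think step by step."""


def format_prompt(template: str, input_variables: List[str], **kwargs: str) -> str:
    """Fill the template, failing early if a declared variable is missing."""
    missing = [name for name in input_variables if name not in kwargs]
    if missing:
        raise KeyError(f"missing prompt variables: {missing}")
    return template.format(**kwargs)


prompt_text = format_prompt(
    template, ["question"], question="What color is a flamingo?"
)
print(prompt_text)
```

The formatted string is what gets sent to the model; the "Let's think step by step." suffix nudges it to reason before answering.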


@@ -0,0 +1,180 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "b118c9dc",
"metadata": {},
"source": [
"# HuggingFace Tokenizers\n",
"\n",
"This notebook showcases how to use HuggingFace tokenizers to split text."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e82c4685",
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import CharacterTextSplitter"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "a8ce51d5",
"metadata": {},
"outputs": [],
"source": [
"from transformers import GPT2TokenizerFast\n",
"\n",
"tokenizer = GPT2TokenizerFast.from_pretrained(\"gpt2\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "ca5e72c0",
"metadata": {},
"outputs": [],
"source": [
"with open('../state_of_the_union.txt') as f:\n",
"    state_of_the_union = f.read()\n",
"text_splitter = CharacterTextSplitter.from_huggingface_tokenizer(tokenizer, chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_text(state_of_the_union)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "37cdfbeb",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans. \n",
"\n",
"Last year COVID-19 kept us apart. This year we are finally together again. \n",
"\n",
"Tonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \n",
"\n",
"With a duty to one another to the American people to the Constitution. \n",
"\n",
"And with an unwavering resolve that freedom will always triumph over tyranny. \n",
"\n",
"Six days ago, Russia's Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \n",
"\n",
"He thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \n",
"\n",
"He met the Ukrainian people. \n",
"\n",
"From President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world. \n",
"\n",
"Groups of citizens blocking tanks with their bodies. Everyone from students to retirees teachers turned soldiers defending their homeland. \n",
"\n",
"In this struggle as President Zelenskyy said in his speech to the European Parliament “Light will win over darkness.” The Ukrainian Ambassador to the United States is here tonight. \n",
"\n",
"Let each of us here tonight in this Chamber send an unmistakable signal to Ukraine and to the world. \n",
"\n",
"Please rise if you are able and show that, Yes, we the United States of America stand with the Ukrainian people. \n",
"\n",
"Throughout our history we've learned this lesson when dictators do not pay a price for their aggression they cause more chaos. \n",
"\n",
"They keep moving. \n",
"\n",
"And the costs and the threats to America and the world keep rising. \n",
"\n",
"That's why the NATO Alliance was created to secure peace and stability in Europe after World War 2. \n",
"\n",
"The United States is a member along with 29 other nations. \n",
"\n",
"It matters. American diplomacy matters. American resolve matters. \n",
"\n",
"Putin's latest attack on Ukraine was premeditated and unprovoked. \n",
"\n",
"He rejected repeated efforts at diplomacy. \n",
"\n",
"He thought the West and NATO wouldn't respond. And he thought he could divide us at home. Putin was wrong. We were ready. Here is what we did. \n",
"\n",
"We prepared extensively and carefully. \n",
"\n",
"We spent months building a coalition of other freedom-loving nations from Europe and the Americas to Asia and Africa to confront Putin. \n",
"\n",
"I spent countless hours unifying our European allies. We shared with the world in advance what we knew Putin was planning and precisely how he would try to falsely justify his aggression. \n",
"\n",
"We countered Russia's lies with truth. \n",
"\n",
"And now that he has acted the free world is holding him accountable. \n",
"\n",
"Along with twenty-seven members of the European Union including France, Germany, Italy, as well as countries like the United Kingdom, Canada, Japan, Korea, Australia, New Zealand, and many others, even Switzerland. \n",
"\n",
"We are inflicting pain on Russia and supporting the people of Ukraine. Putin is now isolated from the world more than ever. \n",
"\n",
"Together with our allies we are right now enforcing powerful economic sanctions. \n",
"\n",
"We are cutting off Russia's largest banks from the international financial system. \n",
"\n",
"Preventing Russia's central bank from defending the Russian Ruble making Putin's $630 Billion “war fund” worthless. \n",
"\n",
"We are choking off Russia's access to technology that will sap its economic strength and weaken its military for years to come. \n",
"\n",
"Tonight I say to the Russian oligarchs and corrupt leaders who have bilked billions of dollars off this violent regime no more. \n",
"\n",
"The U.S. Department of Justice is assembling a dedicated task force to go after the crimes of Russian oligarchs. \n",
"\n",
"We are joining with our European allies to find and seize your yachts your luxury apartments your private jets. We are coming for your ill-begotten gains. \n",
"\n",
"And tonight I am announcing that we will join our allies in closing off American air space to all Russian flights further isolating Russia and adding an additional squeeze on their economy. The Ruble has lost 30% of its value. \n",
"\n",
"The Russian stock market has lost 40% of its value and trading remains suspended. Russia's economy is reeling and Putin alone is to blame. \n",
"\n",
"Together with our allies we are providing support to the Ukrainians in their fight for freedom. Military assistance. Economic assistance. Humanitarian assistance. \n",
"\n",
"We are giving more than $1 Billion in direct assistance to Ukraine. \n",
"\n",
"And we will continue to aid the Ukrainian people as they defend their country and to help ease their suffering. \n",
"\n",
"Let me be clear, our forces are not engaged and will not engage in conflict with Russian forces in Ukraine. \n",
"\n",
"Our forces are not going to Europe to fight in Ukraine, but to defend our NATO Allies in the event that Putin decides to keep moving west. \n"
]
}
],
"source": [
"print(texts[0])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d214aec2",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
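
The splitter in this notebook is built with `from_huggingface_tokenizer(tokenizer, chunk_size=1000, chunk_overlap=0)`, so the chunk budget appears to be measured with the tokenizer rather than in raw characters. The sketch below shows the general idea of a splitter with a pluggable length function; it is a simplified stand-in written for this illustration (no overlap handling, word counts instead of GPT-2 tokens) and is not the library's implementation.

```python
from typing import Callable, List


def split_text(
    text: str,
    chunk_size: int,
    length_fn: Callable[[str], int] = len,
    separator: str = "\n\n",
) -> List[str]:
    """Greedily merge separator-delimited pieces into chunks within chunk_size."""
    chunks: List[str] = []
    current: List[str] = []
    for piece in text.split(separator):
        candidate = separator.join(current + [piece])
        if current and length_fn(candidate) > chunk_size:
            # Adding this piece would exceed the budget: close the chunk.
            chunks.append(separator.join(current))
            current = [piece]
        else:
            current.append(piece)
    if current:
        chunks.append(separator.join(current))
    return chunks


def count_words(text: str) -> int:
    """Stand-in for a tokenizer-based length function."""
    return len(text.split())


text = "one two three\n\nfour five\n\nsix seven eight nine\n\nten"
chunks = split_text(text, chunk_size=5, length_fn=count_words)
for chunk in chunks:
    print(repr(chunk))
```

With a real tokenizer, the length function would instead count the tokens produced by encoding the candidate chunk, which is why the first chunk printed in the notebook is much longer than 1000 characters.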


@@ -0,0 +1,215 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "b4462a94",
"metadata": {},
"source": [
"# Manifest\n",
"\n",
"This notebook goes over how to use Manifest and LangChain."
]
},
{
"cell_type": "markdown",
"id": "59fcaebc",
"metadata": {},
"source": [
"For more detailed information on `manifest`, and how to use it with local Hugging Face models as in this example, see https://github.com/HazyResearch/manifest"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "04a0170a",
"metadata": {},
"outputs": [],
"source": [
"from manifest import Manifest\n",
"from langchain.llms.manifest import ManifestWrapper"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "de250a6a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'model_name': 'bigscience/T0_3B', 'model_path': 'bigscience/T0_3B'}\n"
]
}
],
"source": [
"manifest = Manifest(\n",
"    client_name=\"huggingface\",\n",
"    client_connection=\"http://127.0.0.1:5000\"\n",
")\n",
"print(manifest.client.get_model_params())"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "67b719d6",
"metadata": {},
"outputs": [],
"source": [
"llm = ManifestWrapper(client=manifest, llm_kwargs={\"temperature\": 0.001, \"max_tokens\": 256})"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "5af505a8",
"metadata": {},
"outputs": [],
"source": [
"# Map reduce example\n",
"from langchain import PromptTemplate\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.chains.mapreduce import MapReduceChain\n",
"\n",
"\n",
"_prompt = \"\"\"Write a concise summary of the following:\n",
"\n",
"\n",
"{text}\n",
"\n",
"\n",
"CONCISE SUMMARY:\"\"\"\n",
"prompt = PromptTemplate(template=_prompt, input_variables=[\"text\"])\n",
"\n",
"text_splitter = CharacterTextSplitter()\n",
"\n",
"mp_chain = MapReduceChain.from_params(llm, prompt, text_splitter)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "485b3ec3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'President Obama delivered his annual State of the Union address on Tuesday night, laying out his priorities for the coming year. Obama said the government will provide free flu vaccines to all Americans, ending the government shutdown and allowing businesses to reopen. The president also said that the government will continue to send vaccines to 112 countries, more than any other nation. \"We have lost so much to COVID-19,\" Trump said. \"Time with one another. And worst of all, so much loss of life.\" He said the CDC is working on a vaccine for kids under 5, and that the government will be ready with plenty of vaccines when they are available. Obama says the new guidelines are a \"great step forward\" and that the virus is no longer a threat. He says the government is launching a \"Test to Treat\" initiative that will allow people to get tested at a pharmacy and get antiviral pills on the spot at no cost. Obama says the new guidelines are a \"great step forward\" and that the virus is no longer a threat. He says the government will continue to send vaccines to 112 countries, more than any other nation. \"We are coming for your'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with open('../state_of_the_union.txt') as f:\n",
"    state_of_the_union = f.read()\n",
"mp_chain.run(state_of_the_union)"
]
},
{
"cell_type": "markdown",
"id": "6e9d45a8",
"metadata": {},
"source": [
"## Compare HF Models"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "33407ab3",
"metadata": {},
"outputs": [],
"source": [
"from langchain.model_laboratory import ModelLaboratory\n",
"\n",
"manifest1 = ManifestWrapper(\n",
"    client=Manifest(\n",
"        client_name=\"huggingface\",\n",
"        client_connection=\"http://127.0.0.1:5000\"\n",
"    ),\n",
"    llm_kwargs={\"temperature\": 0.01}\n",
")\n",
"manifest2 = ManifestWrapper(\n",
"    client=Manifest(\n",
"        client_name=\"huggingface\",\n",
"        client_connection=\"http://127.0.0.1:5001\"\n",
"    ),\n",
"    llm_kwargs={\"temperature\": 0.01}\n",
")\n",
"manifest3 = ManifestWrapper(\n",
"    client=Manifest(\n",
"        client_name=\"huggingface\",\n",
"        client_connection=\"http://127.0.0.1:5002\"\n",
"    ),\n",
"    llm_kwargs={\"temperature\": 0.01}\n",
")\n",
"llms = [manifest1, manifest2, manifest3]\n",
"model_lab = ModelLaboratory(llms)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "448935c3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"What color is a flamingo?\n",
"\n",
"\u001b[1mManifestWrapper\u001b[0m\n",
"Params: {'model_name': 'bigscience/T0_3B', 'model_path': 'bigscience/T0_3B', 'temperature': 0.01}\n",
"\u001b[104mpink\u001b[0m\n",
"\n",
"\u001b[1mManifestWrapper\u001b[0m\n",
"Params: {'model_name': 'EleutherAI/gpt-neo-125M', 'model_path': 'EleutherAI/gpt-neo-125M', 'temperature': 0.01}\n",
"\u001b[103mA flamingo is a small, round\u001b[0m\n",
"\n",
"\u001b[1mManifestWrapper\u001b[0m\n",
"Params: {'model_name': 'google/flan-t5-xl', 'model_path': 'google/flan-t5-xl', 'temperature': 0.01}\n",
"\u001b[101mpink\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab.compare(\"What color is a flamingo?\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.7"
},
"vscode": {
"interpreter": {
"hash": "51b9b5b89a4976ad21c8b4273a6c78d700e2954ce7d7452948b7774eb33bbce4"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}


@@ -0,0 +1,254 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "920a3c1a",
"metadata": {},
"source": [
"# Model Laboratory\n",
"\n",
"This example goes over basic functionality of how to use the ModelLaboratory to test out and try different models."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "ab9e95ad",
"metadata": {},
"outputs": [],
"source": [
"from langchain import LLMChain, OpenAI, Cohere, HuggingFaceHub, PromptTemplate\n",
"from langchain.model_laboratory import ModelLaboratory"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "32cb94e6",
"metadata": {},
"outputs": [],
"source": [
"llms = [\n",
" OpenAI(temperature=0), \n",
" Cohere(model=\"command-xlarge-20221108\", max_tokens=20, temperature=0), \n",
" HuggingFaceHub(repo_id=\"google/flan-t5-xl\", model_kwargs={\"temperature\":1})\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "14cde09d",
"metadata": {},
"outputs": [],
"source": [
"model_lab = ModelLaboratory.from_llms(llms)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "f186c741",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"What color is a flamingo?\n",
"\n",
"\u001b[1mOpenAI\u001b[0m\n",
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
"\u001b[36;1m\u001b[1;3m\n",
"\n",
"Flamingos are pink.\u001b[0m\n",
"\n",
"\u001b[1mCohere\u001b[0m\n",
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 20, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
"\u001b[33;1m\u001b[1;3m\n",
"\n",
"Pink\u001b[0m\n",
"\n",
"\u001b[1mHuggingFaceHub\u001b[0m\n",
"Params: {'repo_id': 'google/flan-t5-xl', 'temperature': 1}\n",
"\u001b[38;5;200m\u001b[1;3mpink\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab.compare(\"What color is a flamingo?\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "248b652a",
"metadata": {},
"outputs": [],
"source": [
"prompt = PromptTemplate(template=\"What is the capital of {state}?\", input_variables=[\"state\"])\n",
"model_lab_with_prompt = ModelLaboratory.from_llms(llms, prompt=prompt)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "f64377ac",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"New York\n",
"\n",
"\u001b[1mOpenAI\u001b[0m\n",
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
"\u001b[36;1m\u001b[1;3m\n",
"\n",
"The capital of New York is Albany.\u001b[0m\n",
"\n",
"\u001b[1mCohere\u001b[0m\n",
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 20, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
"\u001b[33;1m\u001b[1;3m\n",
"\n",
"The capital of New York is Albany.\u001b[0m\n",
"\n",
"\u001b[1mHuggingFaceHub\u001b[0m\n",
"Params: {'repo_id': 'google/flan-t5-xl', 'temperature': 1}\n",
"\u001b[38;5;200m\u001b[1;3mst john s\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab_with_prompt.compare(\"New York\")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "54336dbf",
"metadata": {},
"outputs": [],
"source": [
"from langchain import SelfAskWithSearchChain, SerpAPIChain\n",
"\n",
"open_ai_llm = OpenAI(temperature=0)\n",
"search = SerpAPIChain()\n",
"self_ask_with_search_openai = SelfAskWithSearchChain(llm=open_ai_llm, search_chain=search, verbose=True)\n",
"\n",
"cohere_llm = Cohere(temperature=0, model=\"command-xlarge-20221108\")\n",
"search = SerpAPIChain()\n",
"self_ask_with_search_cohere = SelfAskWithSearchChain(llm=cohere_llm, search_chain=search, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "6a50a9f1",
"metadata": {},
"outputs": [],
"source": [
"chains = [self_ask_with_search_openai, self_ask_with_search_cohere]\n",
"names = [str(open_ai_llm), str(cohere_llm)]"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "d3549e99",
"metadata": {},
"outputs": [],
"source": [
"model_lab = ModelLaboratory(chains, names=names)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "362f7f57",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"What is the hometown of the reigning men's U.S. Open champion?\n",
"\n",
"\u001b[1mOpenAI\u001b[0m\n",
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"What is the hometown of the reigning men's U.S. Open champion?\n",
"Are follow up questions needed here:\u001b[32;1m\u001b[1;3m Yes.\n",
"Follow up: Who is the reigning men's U.S. Open champion?\u001b[0m\n",
"Intermediate answer: \u001b[33;1m\u001b[1;3mCarlos Alcaraz.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"Follow up: Where is Carlos Alcaraz from?\u001b[0m\n",
"Intermediate answer: \u001b[33;1m\u001b[1;3mEl Palmar, Spain.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"So the final answer is: El Palmar, Spain\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[36;1m\u001b[1;3m\n",
"So the final answer is: El Palmar, Spain\u001b[0m\n",
"\n",
"\u001b[1mCohere\u001b[0m\n",
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 256, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"What is the hometown of the reigning men's U.S. Open champion?\n",
"Are follow up questions needed here:\u001b[32;1m\u001b[1;3m Yes.\n",
"Follow up: Who is the reigning men's U.S. Open champion?\u001b[0m\n",
"Intermediate answer: \u001b[33;1m\u001b[1;3mCarlos Alcaraz.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"So the final answer is:\n",
"\n",
"Carlos Alcaraz\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[33;1m\u001b[1;3m\n",
"So the final answer is:\n",
"\n",
"Carlos Alcaraz\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab.compare(\"What is the hometown of the reigning men's U.S. Open champion?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "94159131",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
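
The notebook above runs one input through several models and prints each labelled output. A minimal sketch of that comparison loop follows, assuming nothing about how `ModelLaboratory` is actually implemented: the two model functions and the `compare` helper are hypothetical stand-ins, not LangChain APIs.

```python
from typing import Callable, Dict, List


# Hypothetical stand-ins for real LLM wrappers: each maps a prompt to text.
def always_pink(prompt: str) -> str:
    return "pink"


def echo_model(prompt: str) -> str:
    return f"You asked: {prompt}"


def compare(models: Dict[str, Callable[[str], str]], text: str) -> List[str]:
    """Run the same input through every model and collect labelled outputs."""
    lines = [f"Input: {text}"]
    for name, model in models.items():
        lines.append(f"{name}: {model(text)}")
    return lines


report = compare(
    {"always_pink": always_pink, "echo_model": echo_model},
    "What color is a flamingo?",
)
print("\n".join(report))
```

Keeping each model behind a plain callable is one way to swap in real wrappers later without changing the loop.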

docs/examples/prompts.rst Normal file

@@ -0,0 +1,10 @@
Prompts
=======

The examples here all highlight how to work with prompts.

.. toctree::
   :maxdepth: 1
   :glob:

   prompts/*


@@ -0,0 +1,4 @@
{
"input_variables": ["input", "output"],
"template": "Input: {input}\nOutput: {output}"
}


@@ -0,0 +1,4 @@
[
{"input": "happy", "output": "sad"},
{"input": "tall", "output": "short"}
]
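
As a rough illustration of how the template and the example list in the two preceding files fit together, the sketch below fills the template with each example using plain `str.format`. It does not use LangChain. The data is inlined from those files so the block is self-contained, and `render_example` is a hypothetical helper; the real loader may behave differently.

```python
# Data inlined from the two preceding JSON files so the sketch is self-contained.
example_prompt = {
    "input_variables": ["input", "output"],
    "template": "Input: {input}\nOutput: {output}",
}
examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
]


def render_example(prompt: dict, example: dict) -> str:
    """Fill the template with one example, checking the declared variables."""
    missing = set(prompt["input_variables"]) - set(example)
    if missing:
        raise KeyError(f"example is missing variables: {sorted(missing)}")
    return prompt["template"].format(**example)


rendered = [render_example(example_prompt, ex) for ex in examples]
print("\n\n".join(rendered))
```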


@@ -0,0 +1,306 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "f8b01b97",
"metadata": {},
"source": [
"# Few Shot Prompt examples\n",
"Notebook showing off how canonical prompts in LangChain can be recreated as FewShotPrompts"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "18c67cc9",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts.few_shot import FewShotPromptTemplate\n",
"from langchain.prompts.prompt import PromptTemplate"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "2a729c9f",
"metadata": {},
"outputs": [],
"source": [
"# Self Ask with Search\n",
"\n",
"examples = [\n",
" {\n",
" \"question\": \"Who lived longer, Muhammad Ali or Alan Turing?\",\n",
" \"answer\": \"Are follow up questions needed here: Yes.\\nFollow up: How old was Muhammad Ali when he died?\\nIntermediate answer: Muhammad Ali was 74 years old when he died.\\nFollow up: How old was Alan Turing when he died?\\nIntermediate answer: Alan Turing was 41 years old when he died.\\nSo the final answer is: Muhammad Ali\"\n",
" },\n",
" {\n",
" \"question\": \"When was the founder of craigslist born?\",\n",
" \"answer\": \"Are follow up questions needed here: Yes.\\nFollow up: Who was the founder of craigslist?\\nIntermediate answer: Craigslist was founded by Craig Newmark.\\nFollow up: When was Craig Newmark born?\\nIntermediate answer: Craig Newmark was born on December 6, 1952.\\nSo the final answer is: December 6, 1952\"\n",
" },\n",
" {\n",
" \"question\": \"Who was the maternal grandfather of George Washington?\",\n",
" \"answer\": \"Are follow up questions needed here: Yes.\\nFollow up: Who was the mother of George Washington?\\nIntermediate answer: The mother of George Washington was Mary Ball Washington.\\nFollow up: Who was the father of Mary Ball Washington?\\nIntermediate answer: The father of Mary Ball Washington was Joseph Ball.\\nSo the final answer is: Joseph Ball\"\n",
" },\n",
" {\n",
" \"question\": \"Are both the directors of Jaws and Casino Royale from the same country?\",\n",
" \"answer\": \"Are follow up questions needed here: Yes.\\nFollow up: Who is the director of Jaws?\\nIntermediate Answer: The director of Jaws is Steven Spielberg.\\nFollow up: Where is Steven Spielberg from?\\nIntermediate Answer: The United States.\\nFollow up: Who is the director of Casino Royale?\\nIntermediate Answer: The director of Casino Royale is Martin Campbell.\\nFollow up: Where is Martin Campbell from?\\nIntermediate Answer: New Zealand.\\nSo the final answer is: No\"\n",
" }\n",
"]\n",
"example_prompt = PromptTemplate(input_variables=[\"question\", \"answer\"], template=\"Question: {question}\\n{answer}\")\n",
"\n",
"prompt = FewShotPromptTemplate(\n",
" examples=examples, \n",
" example_prompt=example_prompt, \n",
" suffix=\"Question: {input}\", \n",
" input_variables=[\"input\"]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "95fc0059",
"metadata": {},
"outputs": [],
"source": [
"# ReAct\n",
"\n",
"examples = [\n",
" {\n",
" \"question\": \"What is the elevation range for the area that the eastern sector of the Colorado orogeny extends into?\",\n",
" \"answer\": \"Thought 1: I need to search Colorado orogeny, find the area that the eastern sector of the Colorado orogeny extends into, then find the elevation range of that area.\\nAction 1: Search[Colorado orogeny]\\nObservation 1: The Colorado orogeny was an episode of mountain building (an orogeny) in Colorado and surrounding areas.\\nThought 2: It does not mention the eastern sector. So I need to look up eastern sector.\\nAction 2: Lookup[eastern sector]\\nObservation 2: (Result 1 / 1) The eastern sector extends into the High Plains and is called the Central Plains orogeny.\\nThought 3: The eastern sector of Colorado orogeny extends into the High Plains. So I need to search High Plains and find its elevation range.\\nAction 3: Search[High Plains]\\nObservation 3: High Plains refers to one of two distinct land regions\\nThought 4: I need to instead search High Plains (United States).\\nAction 4: Search[High Plains (United States)]\\nObservation 4: The High Plains are a subregion of the Great Plains. From east to west, the High Plains rise in elevation from around 1,800 to 7,000 ft (550 to 2,130 m).[3]\\nThought 5: High Plains rise in elevation from around 1,800 to 7,000 ft, so the answer is 1,800 to 7,000 ft.\\nAction 5: Finish[1,800 to 7,000 ft]\"\n",
" },\n",
" {\n",
" \"question\": \"Musician and satirist Allie Goertz wrote a song about the \\\"The Simpsons\\\" character Milhouse, who Matt Groening named after who?\",\n",
" \"answer\": \"Thought 1: The question simplifies to \\\"The Simpsons\\\" character Milhouse is named after who. I only need to search Milhouse and find who it is named after.\\nAction 1: Search[Milhouse]\\nObservation 1: Milhouse Mussolini Van Houten is a recurring character in the Fox animated television series The Simpsons voiced by Pamela Hayden and created by Matt Groening.\\nThought 2: The paragraph does not tell who Milhouse is named after, maybe I can look up \\\"named after\\\".\\nAction 2: Lookup[named after]\\nObservation 2: (Result 1 / 1) Milhouse was named after U.S. president Richard Nixon, whose middle name was Milhous.\\nThought 3: Milhouse was named after U.S. president Richard Nixon, so the answer is Richard Nixon.\\nAction 3: Finish[Richard Nixon]\"\n",
" },\n",
" {\n",
" \"question\": \"Which documentary is about Finnish rock groups, Adam Clayton Powell or The Saimaa Gesture?\",\n",
" \"answer\": \"Thought 1: I need to search Adam Clayton Powell and The Saimaa Gesture, and find which documentary is about Finnish rock groups.\\nAction 1: Search[Adam Clayton Powell]\\nObservation 1: Could not find [Adam Clayton Powell]. Similar: [Adam Clayton Powell III, Seventh Avenue (Manhattan), Adam Clayton Powell Jr. State Office Building, Isabel Washington Powell, Adam Powell, Adam Clayton Powell (film), Giancarlo Esposito].\\nThought 2: To find the documentary, I can search Adam Clayton Powell (film).\\nAction 2: Search[Adam Clayton Powell (film)]\\nObservation 2: Adam Clayton Powell is a 1989 American documentary film directed by Richard Kilberg. The film is about the rise and fall of influential African-American politician Adam Clayton Powell Jr.[3][4] It was later aired as part of the PBS series The American Experience.\\nThought 3: Adam Clayton Powell (film) is a documentary about an African-American politician, not Finnish rock groups. So the documentary about Finnish rock groups must instead be The Saimaa Gesture.\\nAction 3: Finish[The Saimaa Gesture]\"\n",
" },\n",
" {\n",
" \"question\": \"What profession does Nicholas Ray and Elia Kazan have in common?\",\n",
" \"answer\": \"Thought 1: I need to search Nicholas Ray and Elia Kazan, find their professions, then find the profession they have in common.\\nAction 1: Search[Nicholas Ray]\\nObservation 1: Nicholas Ray (born Raymond Nicholas Kienzle Jr., August 7, 1911 - June 16, 1979) was an American film director, screenwriter, and actor best known for the 1955 film Rebel Without a Cause.\\nThought 2: Professions of Nicholas Ray are director, screenwriter, and actor. I need to search Elia Kazan next and find his professions.\\nAction 2: Search[Elia Kazan]\\nObservation 2: Elia Kazan was an American film and theatre director, producer, screenwriter and actor.\\nThought 3: Professions of Elia Kazan are director, producer, screenwriter, and actor. So profession Nicholas Ray and Elia Kazan have in common is director, screenwriter, and actor.\\nAction 3: Finish[director, screenwriter, actor]\"\n",
" },\n",
" {\n",
" \"question\": \"Which magazine was started first Arthurs Magazine or First for Women?\",\n",
" \"answer\": \"Thought 1: I need to search Arthurs Magazine and First for Women, and find which was started first.\\nAction 1: Search[Arthurs Magazine]\\nObservation 1: Arthurs Magazine (1844-1846) was an American literary periodical published in Philadelphia in the 19th century.\\nThought 2: Arthurs Magazine was started in 1844. I need to search First for Women next.\\nAction 2: Search[First for Women]\\nObservation 2: First for Women is a womans magazine published by Bauer Media Group in the USA.[1] The magazine was started in 1989.\\nThought 3: First for Women was started in 1989. 1844 (Arthurs Magazine) < 1989 (First for Women), so Arthurs Magazine was started first.\\nAction 3: Finish[Arthurs Magazine]\"\n",
" },\n",
" {\n",
" \"question\": \"Were Pavel Urysohn and Leonid Levin known for the same type of work?\",\n",
" \"answer\": \"Thought 1: I need to search Pavel Urysohn and Leonid Levin, find their types of work, then find if they are the same.\\nAction 1: Search[Pavel Urysohn]\\nObservation 1: Pavel Samuilovich Urysohn (February 3, 1898 - August 17, 1924) was a Soviet mathematician who is best known for his contributions in dimension theory.\\nThought 2: Pavel Urysohn is a mathematician. I need to search Leonid Levin next and find its type of work.\\nAction 2: Search[Leonid Levin]\\nObservation 2: Leonid Anatolievich Levin is a Soviet-American mathematician and computer scientist.\\nThought 3: Leonid Levin is a mathematician and computer scientist. So Pavel Urysohn and Leonid Levin have the same type of work.\\nAction 3: Finish[yes]\"\n",
" }\n",
"]\n",
"example_prompt = PromptTemplate(input_variables=[\"question\", \"answer\"], template=\"Question: {question}\\n{answer}\")\n",
"\n",
"prompt = FewShotPrompt(\n",
" examples=examples, \n",
" example_prompt=example_prompt, \n",
" suffix=\"Question: {input}\", \n",
" input_variables=[\"input\"]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "897d4e08",
"metadata": {},
"outputs": [],
"source": [
"# LLM Math\n",
"examples = [\n",
" {\n",
" \"question\": \"What is 37593 * 67?\",\n",
" \"answer\": \"```python\\nprint(37593 * 67)\\n```\\n```output\\n2518731\\n```\\nAnswer: 2518731\"\n",
" }\n",
"]\n",
"example_prompt = PromptTemplate(input_variables=[\"question\", \"answer\"], template=\"Question: {question}\\n\\n{answer}\")\n",
"\n",
"prompt = FewShotPromptTemplate(\n",
" examples=examples, \n",
" example_prompt=example_prompt, \n",
" suffix=\"Question: {input}\", \n",
" input_variables=[\"input\"]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "7ab7379f",
"metadata": {},
"outputs": [],
"source": [
"# NatBot\n",
"example_seperator = \"==================================================\"\n",
"content_1 = \"\"\"<link id=1>About</link>\n",
"<link id=2>Store</link>\n",
"<link id=3>Gmail</link>\n",
"<link id=4>Images</link>\n",
"<link id=5>(Google apps)</link>\n",
"<link id=6>Sign in</link>\n",
"<img id=7 alt=\"(Google)\"/>\n",
"<input id=8 alt=\"Search\"></input>\n",
"<button id=9>(Search by voice)</button>\n",
"<button id=10>(Google Search)</button>\n",
"<button id=11>(I'm Feeling Lucky)</button>\n",
"<link id=12>Advertising</link>\n",
"<link id=13>Business</link>\n",
"<link id=14>How Search works</link>\n",
"<link id=15>Carbon neutral since 2007</link>\n",
"<link id=16>Privacy</link>\n",
"<link id=17>Terms</link>\n",
"<text id=18>Settings</text>\"\"\"\n",
"content_2 = \"\"\"<link id=1>About</link>\n",
"<link id=2>Store</link>\n",
"<link id=3>Gmail</link>\n",
"<link id=4>Images</link>\n",
"<link id=5>(Google apps)</link>\n",
"<link id=6>Sign in</link>\n",
"<img id=7 alt=\"(Google)\"/>\n",
"<input id=8 alt=\"Search\"></input>\n",
"<button id=9>(Search by voice)</button>\n",
"<button id=10>(Google Search)</button>\n",
"<button id=11>(I'm Feeling Lucky)</button>\n",
"<link id=12>Advertising</link>\n",
"<link id=13>Business</link>\n",
"<link id=14>How Search works</link>\n",
"<link id=15>Carbon neutral since 2007</link>\n",
"<link id=16>Privacy</link>\n",
"<link id=17>Terms</link>\n",
"<text id=18>Settings</text>\"\"\"\n",
"content_3 = \"\"\"<button id=1>For Businesses</button>\n",
"<button id=2>Mobile</button>\n",
"<button id=3>Help</button>\n",
"<button id=4 alt=\"Language Picker\">EN</button>\n",
"<link id=5>OpenTable logo</link>\n",
"<button id=6 alt =\"search\">Search</button>\n",
"<text id=7>Find your table for any occasion</text>\n",
"<button id=8>(Date selector)</button>\n",
"<text id=9>Sep 28, 2022</text>\n",
"<text id=10>7:00 PM</text>\n",
"<text id=11>2 people</text>\n",
"<input id=12 alt=\"Location, Restaurant, or Cuisine\"></input>\n",
"<button id=13>Lets go</button>\n",
"<text id=14>It looks like you're in Peninsula. Not correct?</text>\n",
"<button id=15>Get current location</button>\n",
"<button id=16>Next</button>\"\"\"\n",
"examples = [\n",
" {\n",
" \"i\": 1,\n",
" \"content\": content_1,\n",
" \"objective\": \"Find a 2 bedroom house for sale in Anchorage AK for under $750k\",\n",
" \"current_url\": \"https://www.google.com/\",\n",
" \"command\": 'TYPESUBMIT 8 \"anchorage redfin\"'\n",
" },\n",
" {\n",
" \"i\": 2,\n",
" \"content\": content_2,\n",
" \"objective\": \"Make a reservation for 4 at Dorsia at 8pm\",\n",
" \"current_url\": \"https://www.google.com/\",\n",
" \"command\": 'TYPESUBMIT 8 \"dorsia nyc opentable\"'\n",
" },\n",
" {\n",
" \"i\": 3,\n",
" \"content\": content_3,\n",
" \"objective\": \"Make a reservation for 4 for dinner at Dorsia in New York City at 8pm\",\n",
" \"current_url\": \"https://www.opentable.com/\",\n",
" \"command\": 'TYPESUBMIT 12 \"dorsia new york city\"'\n",
" },\n",
"]\n",
"example_prompt_template=\"\"\"EXAMPLE {i}:\n",
"==================================================\n",
"CURRENT BROWSER CONTENT:\n",
"------------------\n",
"{content}\n",
"------------------\n",
"OBJECTIVE: {objective}\n",
"CURRENT URL: {current_url}\n",
"YOUR COMMAND:\n",
"{command}\"\"\"\n",
"example_prompt = PromptTemplate(input_variables=[\"i\", \"content\", \"objective\", \"current_url\", \"command\"], template=example_prompt_template)\n",
"\n",
"\n",
"prefix = \"\"\"\n",
"You are an agent controlling a browser. You are given:\n",
"\t(1) an objective that you are trying to achieve\n",
"\t(2) the URL of your current web page\n",
"\t(3) a simplified text description of what's visible in the browser window (more on that below)\n",
"You can issue these commands:\n",
"\tSCROLL UP - scroll up one page\n",
"\tSCROLL DOWN - scroll down one page\n",
"\tCLICK X - click on a given element. You can only click on links, buttons, and inputs!\n",
"\tTYPE X \"TEXT\" - type the specified text into the input with id X\n",
"\tTYPESUBMIT X \"TEXT\" - same as TYPE above, except then it presses ENTER to submit the form\n",
"The format of the browser content is highly simplified; all formatting elements are stripped.\n",
"Interactive elements such as links, inputs, buttons are represented like this:\n",
"\t\t<link id=1>text</link>\n",
"\t\t<button id=2>text</button>\n",
"\t\t<input id=3>text</input>\n",
"Images are rendered as their alt text like this:\n",
"\t\t<img id=4 alt=\"\"/>\n",
"Based on your given objective, issue whatever command you believe will get you closest to achieving your goal.\n",
"You always start on Google; you should submit a search query to Google that will take you to the best page for\n",
"achieving your objective. And then interact with that page to achieve your objective.\n",
"If you find yourself on Google and there are no search results displayed yet, you should probably issue a command\n",
"like \"TYPESUBMIT 7 \"search query\"\" to get to a more useful page.\n",
"Then, if you find yourself on a Google search results page, you might issue the command \"CLICK 24\" to click\n",
"on the first link in the search results. (If your previous command was a TYPESUBMIT your next command should\n",
"probably be a CLICK.)\n",
"Don't try to interact with elements that you can't see.\n",
"Here are some examples:\n",
"\"\"\"\n",
"suffix=\"\"\"\n",
"The current browser content, objective, and current URL follow. Reply with your next command to the browser.\n",
"CURRENT BROWSER CONTENT:\n",
"------------------\n",
"{browser_content}\n",
"------------------\n",
"OBJECTIVE: {objective}\n",
"CURRENT URL: {url}\n",
"PREVIOUS COMMAND: {previous_command}\n",
"YOUR COMMAND:\n",
"\"\"\"\n",
"PROMPT = FewShotPromptTemplate(\n",
" examples = examples,\n",
" example_prompt=example_prompt,\n",
" example_separator=example_seperator,\n",
" input_variables=[\"browser_content\", \"url\", \"previous_command\", \"objective\"],\n",
" prefix=prefix,\n",
" suffix=suffix,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ce5927c6",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}


@@ -0,0 +1,11 @@
{
"_type": "few_shot",
"input_variables": ["adjective"],
"prefix": "Write antonyms for the following words.",
"example_prompt": {
"input_variables": ["input", "output"],
"template": "Input: {input}\nOutput: {output}"
},
"examples": "examples.json",
"suffix": "Input: {adjective}\nOutput:"
}


@@ -0,0 +1,14 @@
_type: few_shot
input_variables:
    ["adjective"]
prefix:
    Write antonyms for the following words.
example_prompt:
    input_variables:
        ["input", "output"]
    template:
        "Input: {input}\nOutput: {output}"
examples:
    examples.json
suffix:
    "Input: {adjective}\nOutput:"


@@ -0,0 +1,8 @@
{
"_type": "few_shot",
"input_variables": ["adjective"],
"prefix": "Write antonyms for the following words.",
"example_prompt_path": "example_prompt.json",
"examples": "examples.json",
"suffix": "Input: {adjective}\nOutput:"
}


@@ -0,0 +1,14 @@
{
"_type": "few_shot",
"input_variables": ["adjective"],
"prefix": "Write antonyms for the following words.",
"example_prompt": {
"input_variables": ["input", "output"],
"template": "Input: {input}\nOutput: {output}"
},
"examples": [
{"input": "happy", "output": "sad"},
{"input": "tall", "output": "short"}
],
"suffix": "Input: {adjective}\nOutput:"
}
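
To make the shape of this configuration concrete, here is a minimal sketch of how a few-shot prompt could be assembled from it: the prefix, each rendered example, and the suffix are joined with a separator. This is an illustration only, not LangChain's loader. `build_few_shot_prompt` is a hypothetical helper, and the `"\n\n"` separator is an assumption.

```python
import json

# The configuration from the file above, inlined as a raw string so the JSON
# "\n" escapes are decoded by json.loads rather than by Python.
CONFIG_JSON = r"""
{
    "_type": "few_shot",
    "input_variables": ["adjective"],
    "prefix": "Write antonyms for the following words.",
    "example_prompt": {
        "input_variables": ["input", "output"],
        "template": "Input: {input}\nOutput: {output}"
    },
    "examples": [
        {"input": "happy", "output": "sad"},
        {"input": "tall", "output": "short"}
    ],
    "suffix": "Input: {adjective}\nOutput:"
}
"""


def build_few_shot_prompt(config: dict, separator: str = "\n\n", **kwargs: str) -> str:
    """Join prefix, rendered examples, and suffix into one prompt string.

    The default separator is an assumption made for this sketch.
    """
    template = config["example_prompt"]["template"]
    rendered = [template.format(**ex) for ex in config["examples"]]
    # Only the prefix and suffix take the caller's input variables.
    pieces = [
        config["prefix"].format(**kwargs),
        *rendered,
        config["suffix"].format(**kwargs),
    ]
    return separator.join(pieces)


config = json.loads(CONFIG_JSON)
prompt_text = build_few_shot_prompt(config, adjective="big")
print(prompt_text)
```

Formatting the prefix and suffix separately from the examples avoids re-interpreting any braces that appear in example text.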


@@ -0,0 +1,150 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "f5d249ee",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"# Generate Examples\n",
"\n",
"This notebook shows how to use LangChain to generate more examples similar to the ones you already have."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "1685fa2f",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"from langchain.llms.openai import OpenAI\n",
"from langchain.example_generator import generate_example\n",
"from langchain.prompts import PromptTemplate"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "334ef4f7",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"# Use examples from ReAct\n",
"examples = [\n",
" {\n",
" \"question\": \"What is the elevation range for the area that the eastern sector of the Colorado orogeny extends into?\",\n",
" \"answer\": \"Thought 1: I need to search Colorado orogeny, find the area that the eastern sector of the Colorado orogeny extends into, then find the elevation range of that area.\\nAction 1: Search[Colorado orogeny]\\nObservation 1: The Colorado orogeny was an episode of mountain building (an orogeny) in Colorado and surrounding areas.\\nThought 2: It does not mention the eastern sector. So I need to look up eastern sector.\\nAction 2: Lookup[eastern sector]\\nObservation 2: (Result 1 / 1) The eastern sector extends into the High Plains and is called the Central Plains orogeny.\\nThought 3: The eastern sector of Colorado orogeny extends into the High Plains. So I need to search High Plains and find its elevation range.\\nAction 3: Search[High Plains]\\nObservation 3: High Plains refers to one of two distinct land regions\\nThought 4: I need to instead search High Plains (United States).\\nAction 4: Search[High Plains (United States)]\\nObservation 4: The High Plains are a subregion of the Great Plains. From east to west, the High Plains rise in elevation from around 1,800 to 7,000 ft (550 to 2,130 m).[3]\\nThought 5: High Plains rise in elevation from around 1,800 to 7,000 ft, so the answer is 1,800 to 7,000 ft.\\nAction 5: Finish[1,800 to 7,000 ft]\"\n",
" },\n",
" {\n",
" \"question\": \"Musician and satirist Allie Goertz wrote a song about the \\\"The Simpsons\\\" character Milhouse, who Matt Groening named after who?\",\n",
" \"answer\": \"Thought 1: The question simplifies to \\\"The Simpsons\\\" character Milhouse is named after who. I only need to search Milhouse and find who it is named after.\\nAction 1: Search[Milhouse]\\nObservation 1: Milhouse Mussolini Van Houten is a recurring character in the Fox animated television series The Simpsons voiced by Pamela Hayden and created by Matt Groening.\\nThought 2: The paragraph does not tell who Milhouse is named after, maybe I can look up \\\"named after\\\".\\nAction 2: Lookup[named after]\\nObservation 2: (Result 1 / 1) Milhouse was named after U.S. president Richard Nixon, whose middle name was Milhous.\\nThought 3: Milhouse was named after U.S. president Richard Nixon, so the answer is Richard Nixon.\\nAction 3: Finish[Richard Nixon]\"\n",
" },\n",
" {\n",
" \"question\": \"Which documentary is about Finnish rock groups, Adam Clayton Powell or The Saimaa Gesture?\",\n",
" \"answer\": \"Thought 1: I need to search Adam Clayton Powell and The Saimaa Gesture, and find which documentary is about Finnish rock groups.\\nAction 1: Search[Adam Clayton Powell]\\nObservation 1: Could not find [Adam Clayton Powell]. Similar: [Adam Clayton Powell III, Seventh Avenue (Manhattan), Adam Clayton Powell Jr. State Office Building, Isabel Washington Powell, Adam Powell, Adam Clayton Powell (film), Giancarlo Esposito].\\nThought 2: To find the documentary, I can search Adam Clayton Powell (film).\\nAction 2: Search[Adam Clayton Powell (film)]\\nObservation 2: Adam Clayton Powell is a 1989 American documentary film directed by Richard Kilberg. The film is about the rise and fall of influential African-American politician Adam Clayton Powell Jr.[3][4] It was later aired as part of the PBS series The American Experience.\\nThought 3: Adam Clayton Powell (film) is a documentary about an African-American politician, not Finnish rock groups. So the documentary about Finnish rock groups must instead be The Saimaa Gesture.\\nAction 3: Finish[The Saimaa Gesture]\"\n",
" },\n",
" {\n",
" \"question\": \"What profession does Nicholas Ray and Elia Kazan have in common?\",\n",
" \"answer\": \"Thought 1: I need to search Nicholas Ray and Elia Kazan, find their professions, then find the profession they have in common.\\nAction 1: Search[Nicholas Ray]\\nObservation 1: Nicholas Ray (born Raymond Nicholas Kienzle Jr., August 7, 1911 - June 16, 1979) was an American film director, screenwriter, and actor best known for the 1955 film Rebel Without a Cause.\\nThought 2: Professions of Nicholas Ray are director, screenwriter, and actor. I need to search Elia Kazan next and find his professions.\\nAction 2: Search[Elia Kazan]\\nObservation 2: Elia Kazan was an American film and theatre director, producer, screenwriter and actor.\\nThought 3: Professions of Elia Kazan are director, producer, screenwriter, and actor. So profession Nicholas Ray and Elia Kazan have in common is director, screenwriter, and actor.\\nAction 3: Finish[director, screenwriter, actor]\"\n",
" },\n",
" {\n",
" \"question\": \"Which magazine was started first Arthurs Magazine or First for Women?\",\n",
" \"answer\": \"Thought 1: I need to search Arthurs Magazine and First for Women, and find which was started first.\\nAction 1: Search[Arthurs Magazine]\\nObservation 1: Arthurs Magazine (1844-1846) was an American literary periodical published in Philadelphia in the 19th century.\\nThought 2: Arthurs Magazine was started in 1844. I need to search First for Women next.\\nAction 2: Search[First for Women]\\nObservation 2: First for Women is a womans magazine published by Bauer Media Group in the USA.[1] The magazine was started in 1989.\\nThought 3: First for Women was started in 1989. 1844 (Arthurs Magazine) < 1989 (First for Women), so Arthurs Magazine was started first.\\nAction 3: Finish[Arthurs Magazine]\"\n",
" },\n",
" {\n",
" \"question\": \"Were Pavel Urysohn and Leonid Levin known for the same type of work?\",\n",
" \"answer\": \"Thought 1: I need to search Pavel Urysohn and Leonid Levin, find their types of work, then find if they are the same.\\nAction 1: Search[Pavel Urysohn]\\nObservation 1: Pavel Samuilovich Urysohn (February 3, 1898 - August 17, 1924) was a Soviet mathematician who is best known for his contributions in dimension theory.\\nThought 2: Pavel Urysohn is a mathematician. I need to search Leonid Levin next and find its type of work.\\nAction 2: Search[Leonid Levin]\\nObservation 2: Leonid Anatolievich Levin is a Soviet-American mathematician and computer scientist.\\nThought 3: Leonid Levin is a mathematician and computer scientist. So Pavel Urysohn and Leonid Levin have the same type of work.\\nAction 3: Finish[yes]\"\n",
" }\n",
"]\n",
"example_template = PromptTemplate(template=\"Question: {question}\\n{answer}\", input_variables=[\"question\", \"answer\"])"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "a7bd36bc",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"new_example = generate_example(examples, OpenAI(), example_template)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "e1efb008",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['',\n",
" '',\n",
" '',\n",
" 'Question: Is the film \"The Omen\" based on a book?',\n",
" 'Thought 1: I need to search \"The Omen\" and find if it is based on a book.',\n",
" 'Action 1: Search[\"The Omen\"]',\n",
" 'Observation 1: The Omen is a 1976 American supernatural horror film directed by Richard Donner and written by David Seltzer.',\n",
" 'Thought 2: The Omen is not based on a book.']"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"new_example.split('\\n')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1ed01ba2",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@@ -0,0 +1,540 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "43fb16cb",
"metadata": {},
"source": [
"# Prompt Management\n",
"\n",
"Managing your prompts is annoying and tedious, with everyone writing their own slightly different variants of the same ideas. But it shouldn't be this way. \n",
"\n",
"LangChain provides a standard and flexible way for specifying and managing all your prompts, as well as clear and specific terminology around them. This notebook goes through the core components of working with prompts, showing how to use them as well as explaining what they do.\n",
"\n",
"This notebook covers how to work with prompts in Python. If you are interested in how to work with serialized versions of prompts and load them from disk, see [this notebook](prompt_serialization.ipynb)."
]
},
{
"cell_type": "markdown",
"id": "890aad4d",
"metadata": {},
"source": [
"### The BasePromptTemplate Interface\n",
"\n",
"A prompt template is a mechanism for constructing a prompt to pass to the language model given some user input. Below is the interface that all different types of prompt templates should expose.\n",
"\n",
"```python\n",
"class BasePromptTemplate(ABC):\n",
"\n",
" input_variables: List[str]\n",
" \"\"\"A list of the names of the variables the prompt template expects.\"\"\"\n",
"\n",
" @abstractmethod\n",
" def format(self, **kwargs: Any) -> str:\n",
" \"\"\"Format the prompt with the inputs.\n",
"\n",
" Args:\n",
" kwargs: Any arguments to be passed to the prompt template.\n",
"\n",
" Returns:\n",
" A formatted string.\n",
"\n",
" Example:\n",
"\n",
" .. code-block:: python\n",
"\n",
" prompt.format(variable1=\"foo\")\n",
" \"\"\"\n",
"```\n",
"\n",
"The only two things that define a prompt are:\n",
"\n",
"1. `input_variables`: The user inputted variables that are needed to format the prompt.\n",
"2. `format`: A method which takes in keyword arguments are returns a formatted prompt. The keys are expected to be the input variables\n",
" \n",
"The rest of the logic of how the prompt is constructed is left up to different implementations. Let's take a look at some below."
]
},
{
"cell_type": "markdown",
"id": "cddb465e",
"metadata": {},
"source": [
"### PromptTemplate\n",
"\n",
"This is the most simple type of prompt - a string template that takes any number of input variables. The template should be formatted as a Python f-string, although we will support other formats (Jinja, Mako, etc) in the future. \n",
"\n",
"If you just want to use a hardcoded prompt template, you should use this implementation.\n",
"\n",
"Let's walk through a few examples."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "094229f4",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "ab46bd2a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Tell me a joke.'"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# An example prompt with no input variables\n",
"no_input_prompt = PromptTemplate(input_variables=[], template=\"Tell me a joke.\")\n",
"no_input_prompt.format()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "c3ad0fa8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Tell me a funny joke.'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# An example prompt with one input variable\n",
"one_input_prompt = PromptTemplate(input_variables=[\"adjective\"], template=\"Tell me a {adjective} joke.\")\n",
"one_input_prompt.format(adjective=\"funny\")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "ba577dcf",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Tell me a funny joke about chickens.'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# An example prompt with multiple input variables\n",
"multiple_input_prompt = PromptTemplate(\n",
" input_variables=[\"adjective\", \"content\"], \n",
" template=\"Tell me a {adjective} joke about {content}.\"\n",
")\n",
"multiple_input_prompt.format(adjective=\"funny\", content=\"chickens\")"
]
},
{
"cell_type": "markdown",
"id": "1492b49d",
"metadata": {},
"source": [
"### Few Shot Prompts\n",
"\n",
"A FewShotPromptTemplate is a prompt template that includes some examples. If you have collected some examples of how the task should be done, you can insert them into prompt using this class.\n",
"\n",
"Examples are datapoints that can be included in the prompt in order to give the model more context what to do. Examples are represented as a dictionary of key-value pairs, with the key being the input (or label) name, and the value being the input (or label) value. \n",
"\n",
"In addition to the example, we also need to specify how the example should be formatted when it's inserted in the prompt. We can do this using the above `PromptTemplate`!"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "3eb36972",
"metadata": {},
"outputs": [],
"source": [
"# These are some examples of a pretend task of creating antonyms.\n",
"examples = [\n",
" {\"input\": \"happy\", \"output\": \"sad\"},\n",
" {\"input\": \"tall\", \"output\": \"short\"},\n",
"]\n",
"# This how we specify how the example should be formatted.\n",
"example_prompt = PromptTemplate(\n",
" input_variables=[\"input\",\"output\"],\n",
" template=\"Input: {input}\\nOutput: {output}\",\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "80a91d96",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts.few_shot import FewShotPromptTemplate"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "7931e5f2",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Give the antonym of every input\n",
"\n",
"Input: happy\n",
"Output: sad\n",
"\n",
"Input: tall\n",
"Output: short\n",
"\n",
"Input: big\n",
"Output:\n"
]
}
],
"source": [
"prompt_from_string_examples = FewShotPromptTemplate(\n",
" # These are the examples we want to insert into the prompt.\n",
" examples=examples,\n",
" # This is how we want to format the examples when we insert them into the prompt.\n",
" example_prompt=example_prompt,\n",
" # The prefix is some text that goes before the examples in the prompt.\n",
" # Usually, this consists of intructions.\n",
" prefix=\"Give the antonym of every input\",\n",
" # The suffix is some text that goes after the examples in the prompt.\n",
" # Usually, this is where the user input will go\n",
" suffix=\"Input: {adjective}\\nOutput:\", \n",
" # The input variables are the variables that the overall prompt expects.\n",
" input_variables=[\"adjective\"],\n",
" # The example_separator is the string we will use to join the prefix, examples, and suffix together with.\n",
" example_separator=\"\\n\\n\"\n",
" \n",
")\n",
"print(prompt_from_string_examples.format(adjective=\"big\"))"
]
},
{
"cell_type": "markdown",
"id": "bf038596",
"metadata": {},
"source": [
"### ExampleSelector\n",
"If you have a large number of examples, you may need to select which ones to include in the prompt. The ExampleSelector is the class responsible for doing so. The base interface is defined as below.\n",
"\n",
"```python\n",
"class BaseExampleSelector(ABC):\n",
" \"\"\"Interface for selecting examples to include in prompts.\"\"\"\n",
"\n",
" @abstractmethod\n",
" def select_examples(self, input_variables: Dict[str, str]) -> List[dict]:\n",
" \"\"\"Select which examples to use based on the inputs.\"\"\"\n",
"\n",
"```\n",
"\n",
"The only method it needs to expose is a `select_examples` method. This takes in the input variables and then returns a list of examples. It is up to each specific implementation as to how those examples are selected. Let's take a look at some below."
]
},
{
"cell_type": "markdown",
"id": "861a4d1f",
"metadata": {},
"source": [
"### LengthBased ExampleSelector\n",
"\n",
"This ExampleSelector selects which examples to use based on length. This is useful when you are worried about constructing a prompt that will go over the length of the context window. For longer inputs, it will select fewer examples to include, while for shorter inputs it will select more.\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "7c469c95",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts.example_selector.length_based import LengthBasedExampleSelector"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "0ec6d950",
"metadata": {},
"outputs": [],
"source": [
"# These are a lot of examples of a pretend task of creating antonyms.\n",
"examples = [\n",
" {\"input\": \"happy\", \"output\": \"sad\"},\n",
" {\"input\": \"tall\", \"output\": \"short\"},\n",
" {\"input\": \"energetic\", \"output\": \"lethargic\"},\n",
" {\"input\": \"sunny\", \"output\": \"gloomy\"},\n",
" {\"input\": \"windy\", \"output\": \"calm\"},\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "207e55f7",
"metadata": {},
"outputs": [],
"source": [
"example_selector = LengthBasedExampleSelector(\n",
" # These are the examples is has available to choose from.\n",
" examples=examples, \n",
" # This is the PromptTemplate being used to format the examples.\n",
" example_prompt=example_prompt, \n",
" # This is the maximum length that the formatted examples should be.\n",
" # Length is measured by the get_text_length function below.\n",
" max_length=18,\n",
" # This is the function used to get the length of a string, which is used\n",
" # to determine which examples to include. It is commented out because\n",
" # it is provided as a default value if none is specified.\n",
" # get_text_length: Callable[[str], int] = lambda x: len(re.split(\"\\n| \", x))\n",
")\n",
"dynamic_prompt = FewShotPromptTemplate(\n",
" # We provide an ExampleSelector instead of examples.\n",
" example_selector=example_selector,\n",
" example_prompt=example_prompt,\n",
" prefix=\"Give the antonym of every input\",\n",
" suffix=\"Input: {adjective}\\nOutput:\", \n",
" input_variables=[\"adjective\"],\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "d00b4385",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Give the antonym of every input\n",
"\n",
"Input: happy\n",
"Output: sad\n",
"\n",
"Input: tall\n",
"Output: short\n",
"\n",
"Input: energetic\n",
"Output: lethargic\n",
"\n",
"Input: sunny\n",
"Output: gloomy\n",
"\n",
"Input: windy\n",
"Output: calm\n",
"\n",
"Input: big\n",
"Output:\n"
]
}
],
"source": [
"# An example with small input, so it selects all examples.\n",
"print(dynamic_prompt.format(adjective=\"big\"))"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "878bcde9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Give the antonym of every input\n",
"\n",
"Input: happy\n",
"Output: sad\n",
"\n",
"Input: big and huge and massive and large and gigantic and tall and bigger than everything else\n",
"Output:\n"
]
}
],
"source": [
"# An example with long input, so it selects only one example.\n",
"long_string = \"big and huge and massive and large and gigantic and tall and bigger than everything else\"\n",
"print(dynamic_prompt.format(adjective=long_string))"
]
},
{
"cell_type": "markdown",
"id": "2d007b0a",
"metadata": {},
"source": [
"### Similarity ExampleSelector\n",
"\n",
"The SemanticSimilarityExampleSelector selects examples based on which examples are most similar to the inputs. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs.\n"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "241bfe80",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts.example_selector.semantic_similarity import SemanticSimilarityExampleSelector\n",
"from langchain.vectorstores import FAISS\n",
"from langchain.embeddings import OpenAIEmbeddings"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "50d0a701",
"metadata": {},
"outputs": [],
"source": [
"example_selector = SemanticSimilarityExampleSelector.from_examples(\n",
" # This is the list of examples available to select from.\n",
" examples, \n",
" # This is the embedding class used to produce embeddings which are used to measure semantic similarity.\n",
" OpenAIEmbeddings(), \n",
" # This is the VectorStore class that is used to store the embeddings and do a similarity search over.\n",
" FAISS, \n",
" # This is the number of examples to produce.\n",
" k=1\n",
")\n",
"similar_prompt = FewShotPromptTemplate(\n",
" # We provide an ExampleSelector instead of examples.\n",
" example_selector=example_selector,\n",
" example_prompt=example_prompt,\n",
" prefix=\"Give the antonym of every input\",\n",
" suffix=\"Input: {adjective}\\nOutput:\", \n",
" input_variables=[\"adjective\"],\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "4c8fdf45",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Give the antonym of every input\n",
"\n",
"Input: happy\n",
"Output: sad\n",
"\n",
"Input: worried\n",
"Output:\n"
]
}
],
"source": [
"# Input is a feeling, so should select the happy/sad example\n",
"print(similar_prompt.format(adjective=\"worried\"))"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "829af21a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Give the antonym of every input\n",
"\n",
"Input: tall\n",
"Output: short\n",
"\n",
"Input: fat\n",
"Output:\n"
]
}
],
"source": [
"# Input is a measurment, so should select the tall/short example\n",
"print(similar_prompt.format(adjective=\"fat\"))"
]
},
{
"cell_type": "markdown",
"id": "dbc32551",
"metadata": {},
"source": [
"### Serialization\n",
"\n",
"PromptTemplates and examples can be serialized and loaded from disk, making it easy to share and store prompts. For a detailed walkthrough on how to do that, see [this notebook](prompt_serialization.ipynb)."
]
},
{
"cell_type": "markdown",
"id": "1e1e13c6",
"metadata": {},
"source": [
"### Customizability\n",
"The above covers all the ways currently supported in LangChain to represent prompts and example selectors. However, due to the simple interface that the base classes (`BasePromptTemplate`, `BaseExampleSelector`) expose, it should be easy to subclass them and write your own implementation in your own codebase. And of course, if you'd like to contribute that back to LangChain, we'd love that :)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c746d6f4",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
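
The length-based selection described in the Prompt Management notebook above can be sketched without LangChain. This is a minimal sketch under stated assumptions, not the library's implementation: `select_by_length` and `word_count` are hypothetical names, the budget is enforced strictly (selection stops as soon as the next example would not fit), and `max_length=25` is chosen only for illustration, so the selections may differ from the notebook outputs.

```python
import re
from typing import Callable, Dict, List


def word_count(text: str) -> int:
    """Measure length by splitting on newlines and spaces."""
    return len(re.split("\n| ", text))


def select_by_length(
    examples: List[Dict[str, str]],
    template: str,
    user_input: str,
    max_length: int,
    get_text_length: Callable[[str], int] = word_count,
) -> List[Dict[str, str]]:
    """Return the leading examples whose formatted text fits the budget.

    Hypothetical helper: the budget is max_length minus the length of the
    user input, and each formatted example consumes part of that budget.
    """
    remaining = max_length - get_text_length(user_input)
    selected: List[Dict[str, str]] = []
    for example in examples:
        cost = get_text_length(template.format(**example))
        if cost > remaining:
            break  # strict budget: stop at the first example that does not fit
        selected.append(example)
        remaining -= cost
    return selected


EXAMPLES = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"},
    {"input": "windy", "output": "calm"},
]
TEMPLATE = "Input: {input}\nOutput: {output}"
LONG_INPUT = (
    "big and huge and massive and large and gigantic and tall "
    "and bigger than everything else"
)

short_selection = select_by_length(EXAMPLES, TEMPLATE, "big", max_length=25)
long_selection = select_by_length(EXAMPLES, TEMPLATE, LONG_INPUT, max_length=25)
print(len(short_selection), len(long_selection))
```

Passing a different `get_text_length` (for instance a tokenizer-based counter) changes the unit of the budget without touching the selection loop.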

@@ -0,0 +1,538 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "43fb16cb",
"metadata": {},
"source": [
"# Prompt Serialization\n",
"\n",
"It is often preferrable to store prompts not as python code but as files. This can make it easy to share, store, and version prompts. This notebook covers how to do that in LangChain, walking through all the different types of prompts and the different serialization options.\n",
"\n",
"At a high level, the following design principles are applied to serialization:\n",
"1. Both JSON and YAML are supported. We want to support serialization methods are human readable on disk, and YAML and JSON are two of the most popular methods for that. Note that this rule applies to prompts. For other assets, like Examples, different serialization methods may be supported.\n",
"2. We support specifying everything in one file, or storing different components (templates, examples, etc) in different files and referencing them. For some cases, storing everything in file makes the most sense, but for others it is preferrable to split up some of the assets (long templates, large examples, reusable components). LangChain supports both."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "2c8d7587",
"metadata": {},
"outputs": [],
"source": [
"# All prompts are loading through the `load_prompt` function.\n",
"from langchain.prompts.loading import load_prompt"
]
},
{
"cell_type": "markdown",
"id": "cddb465e",
"metadata": {},
"source": [
"## PromptTemplate\n",
"\n",
"This section covers examples for loading a PromptTemplate."
]
},
{
"cell_type": "markdown",
"id": "4d4b40f2",
"metadata": {},
"source": [
"### Loading from YAML\n",
"This shows an example of loading a PromptTemplate from YAML."
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "2d6e5117",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"input_variables:\r\n",
" [\"adjective\", \"content\"]\r\n",
"template: \r\n",
" Tell me a {adjective} joke about {content}.\r\n"
]
}
],
"source": [
"!cat simple_prompt.yaml"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "4f4ca686",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tell me a funny joke about chickens.\n"
]
}
],
"source": [
"prompt = load_prompt(\"simple_prompt.yaml\")\n",
"print(prompt.format(adjective=\"funny\", content=\"chickens\"))"
]
},
{
"cell_type": "markdown",
"id": "362eadb2",
"metadata": {},
"source": [
"### Loading from JSON\n",
"This shows an example of loading a PromptTemplate from JSON."
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "510def23",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\r\n",
" \"input_variables\": [\"adjective\", \"content\"],\r\n",
" \"template\": \"Tell me a {adjective} joke about {content}.\"\r\n",
"}\r\n"
]
}
],
"source": [
"!cat simple_prompt.json"
]
},
{
"cell_type": "markdown",
"id": "d788a83c",
"metadata": {},
"source": [
"### Loading Template from a File\n",
"This shows an example of storing the template in a separate file and then referencing it in the config. Notice that the key changes from `template` to `template_path`."
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "5547760d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tell me a {adjective} joke about {content}."
]
}
],
"source": [
"!cat simple_template.txt"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "9cb13ac5",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\r\n",
" \"input_variables\": [\"adjective\", \"content\"],\r\n",
" \"template_path\": \"simple_template.txt\"\r\n",
"}\r\n"
]
}
],
"source": [
"!cat simple_prompt_with_template_file.json"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "762cb4bf",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tell me a funny joke about chickens.\n"
]
}
],
"source": [
"prompt = load_prompt(\"simple_prompt_with_template_file.json\")\n",
"print(prompt.format(adjective=\"funny\", content=\"chickens\"))"
]
},
{
"cell_type": "markdown",
"id": "2ae191cc",
"metadata": {},
"source": [
"## FewShotPromptTemplate\n",
"\n",
"This section covers examples for loading few shot prompt templates."
]
},
{
"cell_type": "markdown",
"id": "9828f94c",
"metadata": {},
"source": [
"### Examples\n",
"This shows an example of what examples stored as json might look like."
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "b21f5b95",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[\r\n",
" {\"input\": \"happy\", \"output\": \"sad\"},\r\n",
" {\"input\": \"tall\", \"output\": \"short\"}\r\n",
"]\r\n"
]
}
],
"source": [
"!cat examples.json"
]
},
{
"cell_type": "markdown",
"id": "8e300335",
"metadata": {},
"source": [
"### Loading from YAML\n",
"This shows an example of loading a few shot example from YAML."
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "e2bec0fc",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"_type: few_shot\r\n",
"input_variables:\r\n",
" [\"adjective\"]\r\n",
"prefix: \r\n",
" Write antonyms for the following words.\r\n",
"example_prompt:\r\n",
" input_variables:\r\n",
" [\"input\", \"output\"]\r\n",
" template:\r\n",
" \"Input: {input}\\nOutput: {output}\"\r\n",
"examples:\r\n",
" examples.json\r\n",
"suffix:\r\n",
" \"Input: {adjective}\\nOutput:\"\r\n"
]
}
],
"source": [
"!cat few_shot_prompt.yaml"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "98c8f356",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Write antonyms for the following words.\n",
"\n",
"Input: happy\n",
"Output: sad\n",
"\n",
"Input: tall\n",
"Output: short\n",
"\n",
"Input: funny\n",
"Output:\n"
]
}
],
"source": [
"prompt = load_prompt(\"few_shot_prompt.yaml\")\n",
"print(prompt.format(adjective=\"funny\"))"
]
},
{
"cell_type": "markdown",
"id": "4870aa9d",
"metadata": {},
"source": [
"### Loading from JSON\n",
"This shows an example of loading a few shot example from JSON."
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "9d996a86",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\r\n",
" \"_type\": \"few_shot\",\r\n",
" \"input_variables\": [\"adjective\"],\r\n",
" \"prefix\": \"Write antonyms for the following words.\",\r\n",
" \"example_prompt\": {\r\n",
" \"input_variables\": [\"input\", \"output\"],\r\n",
" \"template\": \"Input: {input}\\nOutput: {output}\"\r\n",
" },\r\n",
" \"examples\": \"examples.json\",\r\n",
" \"suffix\": \"Input: {adjective}\\nOutput:\"\r\n",
"} \r\n"
]
}
],
"source": [
"!cat few_shot_prompt.json"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "dd2c10bb",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Write antonyms for the following words.\n",
"\n",
"Input: happy\n",
"Output: sad\n",
"\n",
"Input: tall\n",
"Output: short\n",
"\n",
"Input: funny\n",
"Output:\n"
]
}
],
"source": [
"prompt = load_prompt(\"few_shot_prompt.json\")\n",
"print(prompt.format(adjective=\"funny\"))"
]
},
{
"cell_type": "markdown",
"id": "9d23faf4",
"metadata": {},
"source": [
"### Examples in the Config\n",
"This shows an example of referencing the examples directly in the config."
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "6cd781ef",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\r\n",
" \"_type\": \"few_shot\",\r\n",
" \"input_variables\": [\"adjective\"],\r\n",
" \"prefix\": \"Write antonyms for the following words.\",\r\n",
" \"example_prompt\": {\r\n",
" \"input_variables\": [\"input\", \"output\"],\r\n",
" \"template\": \"Input: {input}\\nOutput: {output}\"\r\n",
" },\r\n",
" \"examples\": [\r\n",
" {\"input\": \"happy\", \"output\": \"sad\"},\r\n",
" {\"input\": \"tall\", \"output\": \"short\"}\r\n",
" ],\r\n",
" \"suffix\": \"Input: {adjective}\\nOutput:\"\r\n",
"} \r\n"
]
}
],
"source": [
"!cat few_shot_prompt_examples_in.json"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "533ab8a7",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Write antonyms for the following words.\n",
"\n",
"Input: happy\n",
"Output: sad\n",
"\n",
"Input: tall\n",
"Output: short\n",
"\n",
"Input: funny\n",
"Output:\n"
]
}
],
"source": [
"prompt = load_prompt(\"few_shot_prompt_examples_in.json\")\n",
"print(prompt.format(adjective=\"funny\"))"
]
},
{
"cell_type": "markdown",
"id": "2e86139e",
"metadata": {},
"source": [
"### Example Prompt from a File\n",
"This shows an example of loading the PromptTemplate that is used to format the examples from a separate file. Note that the key changes from `example_prompt` to `example_prompt_path`."
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "0b6dd7b8",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\r\n",
" \"input_variables\": [\"input\", \"output\"],\r\n",
" \"template\": \"Input: {input}\\nOutput: {output}\" \r\n",
"}\r\n"
]
}
],
"source": [
"!cat example_prompt.json"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "76a1065d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\r\n",
" \"_type\": \"few_shot\",\r\n",
" \"input_variables\": [\"adjective\"],\r\n",
" \"prefix\": \"Write antonyms for the following words.\",\r\n",
" \"example_prompt_path\": \"example_prompt.json\",\r\n",
" \"examples\": \"examples.json\",\r\n",
" \"suffix\": \"Input: {adjective}\\nOutput:\"\r\n",
"} \r\n"
]
}
],
"source": [
"!cat few_shot_prompt_example_prompt.json "
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "744d275d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Write antonyms for the following words.\n",
"\n",
"Input: happy\n",
"Output: sad\n",
"\n",
"Input: tall\n",
"Output: short\n",
"\n",
"Input: funny\n",
"Output:\n"
]
}
],
"source": [
"prompt = load_prompt(\"few_shot_prompt_example_prompt.json\")\n",
"print(prompt.format(adjective=\"funny\"))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dcfc7176",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
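
The `template` versus `template_path` convention shown in the Prompt Serialization notebook above can be illustrated with the standard library alone. This is a rough sketch, not LangChain's `load_prompt`: `load_prompt_config` and `demo` are hypothetical names, and resolving `template_path` relative to the config file's directory is an assumption (the library may resolve paths differently).

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def load_prompt_config(path: Path) -> Dict[str, Any]:
    """Read a JSON prompt config and inline a template stored in another file.

    Hypothetical helper: `template_path` is resolved relative to the
    directory that contains the config file (an assumption).
    """
    config = json.loads(path.read_text())
    if "template_path" in config:
        if "template" in config:
            raise ValueError("Specify either `template` or `template_path`, not both.")
        template_file = path.parent / config.pop("template_path")
        config["template"] = template_file.read_text()
    return config


def demo() -> str:
    """Recreate the two fixture files in a temp directory and format the prompt."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "simple_template.txt").write_text(
            "Tell me a {adjective} joke about {content}."
        )
        (root / "simple_prompt_with_template_file.json").write_text(
            json.dumps(
                {
                    "input_variables": ["adjective", "content"],
                    "template_path": "simple_template.txt",
                }
            )
        )
        config = load_prompt_config(root / "simple_prompt_with_template_file.json")
    # The template is a plain string by now, so the temp files are no longer needed.
    return config["template"].format(adjective="funny", content="chickens")


print(demo())
```

Keeping the template in its own file is mostly useful for long templates, since the config then stays short and the template can be edited without JSON escaping.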

@@ -0,0 +1,4 @@
{
"input_variables": ["adjective", "content"],
"template": "Tell me a {adjective} joke about {content}."
}

@@ -0,0 +1,4 @@
input_variables:
["adjective", "content"]
template:
Tell me a {adjective} joke about {content}.

@@ -0,0 +1,4 @@
{
"input_variables": ["adjective", "content"],
"template_path": "simple_template.txt"
}

@@ -0,0 +1 @@
Tell me a {adjective} joke about {content}.

@@ -0,0 +1,723 @@
Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.
Last year COVID-19 kept us apart. This year we are finally together again.
Tonight, we meet as Democrats Republicans and Independents. But most importantly as Americans.
With a duty to one another to the American people to the Constitution.
And with an unwavering resolve that freedom will always triumph over tyranny.
Six days ago, Russia’s Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated.
He thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined.
He met the Ukrainian people.
From President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world.
Groups of citizens blocking tanks with their bodies. Everyone from students to retirees teachers turned soldiers defending their homeland.
In this struggle as President Zelenskyy said in his speech to the European Parliament “Light will win over darkness.” The Ukrainian Ambassador to the United States is here tonight.
Let each of us here tonight in this Chamber send an unmistakable signal to Ukraine and to the world.
Please rise if you are able and show that, Yes, we the United States of America stand with the Ukrainian people.
Throughout our history we’ve learned this lesson when dictators do not pay a price for their aggression they cause more chaos.
They keep moving.
And the costs and the threats to America and the world keep rising.
That’s why the NATO Alliance was created to secure peace and stability in Europe after World War 2.
The United States is a member along with 29 other nations.
It matters. American diplomacy matters. American resolve matters.
Putin’s latest attack on Ukraine was premeditated and unprovoked.
He rejected repeated efforts at diplomacy.
He thought the West and NATO wouldn’t respond. And he thought he could divide us at home. Putin was wrong. We were ready. Here is what we did.
We prepared extensively and carefully.
We spent months building a coalition of other freedom-loving nations from Europe and the Americas to Asia and Africa to confront Putin.
I spent countless hours unifying our European allies. We shared with the world in advance what we knew Putin was planning and precisely how he would try to falsely justify his aggression.
We countered Russia’s lies with truth.
And now that he has acted the free world is holding him accountable.
Along with twenty-seven members of the European Union including France, Germany, Italy, as well as countries like the United Kingdom, Canada, Japan, Korea, Australia, New Zealand, and many others, even Switzerland.
We are inflicting pain on Russia and supporting the people of Ukraine. Putin is now isolated from the world more than ever.
Together with our allies we are right now enforcing powerful economic sanctions.
We are cutting off Russia’s largest banks from the international financial system.
Preventing Russia’s central bank from defending the Russian Ruble making Putin’s $630 Billion “war fund” worthless.
We are choking off Russia’s access to technology that will sap its economic strength and weaken its military for years to come.
Tonight I say to the Russian oligarchs and corrupt leaders who have bilked billions of dollars off this violent regime no more.
The U.S. Department of Justice is assembling a dedicated task force to go after the crimes of Russian oligarchs.
We are joining with our European allies to find and seize your yachts your luxury apartments your private jets. We are coming for your ill-begotten gains.
And tonight I am announcing that we will join our allies in closing off American air space to all Russian flights further isolating Russia and adding an additional squeeze on their economy. The Ruble has lost 30% of its value.
The Russian stock market has lost 40% of its value and trading remains suspended. Russias economy is reeling and Putin alone is to blame.
Together with our allies we are providing support to the Ukrainians in their fight for freedom. Military assistance. Economic assistance. Humanitarian assistance.
We are giving more than $1 Billion in direct assistance to Ukraine.
And we will continue to aid the Ukrainian people as they defend their country and to help ease their suffering.
Let me be clear, our forces are not engaged and will not engage in conflict with Russian forces in Ukraine.
Our forces are not going to Europe to fight in Ukraine, but to defend our NATO Allies in the event that Putin decides to keep moving west.
For that purpose weve mobilized American ground forces, air squadrons, and ship deployments to protect NATO countries including Poland, Romania, Latvia, Lithuania, and Estonia.
As I have made crystal clear the United States and our Allies will defend every inch of territory of NATO countries with the full force of our collective power.
And we remain clear-eyed. The Ukrainians are fighting back with pure courage. But the next few days, weeks, and months will be hard on them.
Putin has unleashed violence and chaos. But while he may make gains on the battlefield he will pay a continuing high price over the long run.
And a proud Ukrainian people, who have known 30 years of independence, have repeatedly shown that they will not tolerate anyone who tries to take their country backwards.
To all Americans, I will be honest with you, as Ive always promised. A Russian dictator, invading a foreign country, has costs around the world.
And Im taking robust action to make sure the pain of our sanctions is targeted at Russias economy. And I will use every tool at our disposal to protect American businesses and consumers.
Tonight, I can announce that the United States has worked with 30 other countries to release 60 Million barrels of oil from reserves around the world.
America will lead that effort, releasing 30 Million barrels from our own Strategic Petroleum Reserve. And we stand ready to do more if necessary, unified with our allies.
These steps will help blunt gas prices here at home. And I know the news about whats happening can seem alarming.
But I want you to know that we are going to be okay.
When the history of this era is written Putins war on Ukraine will have left Russia weaker and the rest of the world stronger.
While it shouldnt have taken something so terrible for people around the world to see whats at stake now everyone sees it clearly.
We see the unity among leaders of nations and a more unified Europe a more unified West. And we see unity among the people who are gathering in cities in large crowds around the world even in Russia to demonstrate their support for Ukraine.
In the battle between democracy and autocracy, democracies are rising to the moment, and the world is clearly choosing the side of peace and security.
This is a real test. Its going to take time. So let us continue to draw inspiration from the iron will of the Ukrainian people.
To our fellow Ukrainian Americans who forge a deep bond that connects our two nations we stand with you.
Putin may circle Kyiv with tanks, but he will never gain the hearts and souls of the Ukrainian people.
He will never extinguish their love of freedom. He will never weaken the resolve of the free world.
We meet tonight in an America that has lived through two of the hardest years this nation has ever faced.
The pandemic has been punishing.
And so many families are living paycheck to paycheck, struggling to keep up with the rising cost of food, gas, housing, and so much more.
I understand.
I remember when my Dad had to leave our home in Scranton, Pennsylvania to find work. I grew up in a family where if the price of food went up, you felt it.
Thats why one of the first things I did as President was fight to pass the American Rescue Plan.
Because people were hurting. We needed to act, and we did.
Few pieces of legislation have done more in a critical moment in our history to lift us out of crisis.
It fueled our efforts to vaccinate the nation and combat COVID-19. It delivered immediate economic relief for tens of millions of Americans.
Helped put food on their table, keep a roof over their heads, and cut the cost of health insurance.
And as my Dad used to say, it gave people a little breathing room.
And unlike the $2 Trillion tax cut passed in the previous administration that benefitted the top 1% of Americans, the American Rescue Plan helped working people—and left no one behind.
And it worked. It created jobs. Lots of jobs.
In fact—our economy created over 6.5 Million new jobs just last year, more jobs created in one year
than ever before in the history of America.
Our economy grew at a rate of 5.7% last year, the strongest growth in nearly 40 years, the first step in bringing fundamental change to an economy that hasnt worked for the working people of this nation for too long.
For the past 40 years we were told that if we gave tax breaks to those at the very top, the benefits would trickle down to everyone else.
But that trickle-down theory led to weaker economic growth, lower wages, bigger deficits, and the widest gap between those at the top and everyone else in nearly a century.
Vice President Harris and I ran for office with a new economic vision for America.
Invest in America. Educate Americans. Grow the workforce. Build the economy from the bottom up
and the middle out, not from the top down.
Because we know that when the middle class grows, the poor have a ladder up and the wealthy do very well.
America used to have the best roads, bridges, and airports on Earth.
Now our infrastructure is ranked 13th in the world.
We wont be able to compete for the jobs of the 21st Century if we dont fix that.
Thats why it was so important to pass the Bipartisan Infrastructure Law—the most sweeping investment to rebuild America in history.
This was a bipartisan effort, and I want to thank the members of both parties who worked to make it happen.
Were done talking about infrastructure weeks.
Were going to have an infrastructure decade.
It is going to transform America and put us on a path to win the economic competition of the 21st Century that we face with the rest of the world—particularly with China.
As Ive told Xi Jinping, it is never a good bet to bet against the American people.
Well create good jobs for millions of Americans, modernizing roads, airports, ports, and waterways all across America.
And well do it all to withstand the devastating effects of the climate crisis and promote environmental justice.
Well build a national network of 500,000 electric vehicle charging stations, begin to replace poisonous lead pipes—so every child—and every American—has clean water to drink at home and at school, provide affordable high-speed internet for every American—urban, suburban, rural, and tribal communities.
4,000 projects have already been announced.
And tonight, Im announcing that this year we will start fixing over 65,000 miles of highway and 1,500 bridges in disrepair.
When we use taxpayer dollars to rebuild America we are going to Buy American: buy American products to support American jobs.
The federal government spends about $600 Billion a year to keep the country safe and secure.
Theres been a law on the books for almost a century
to make sure taxpayers dollars support American jobs and businesses.
Every Administration says theyll do it, but we are actually doing it.
We will buy American to make sure everything from the deck of an aircraft carrier to the steel on highway guardrails are made in America.
But to compete for the best jobs of the future, we also need to level the playing field with China and other competitors.
Thats why it is so important to pass the Bipartisan Innovation Act sitting in Congress that will make record investments in emerging technologies and American manufacturing.
Let me give you one example of why its so important to pass it.
If you travel 20 miles east of Columbus, Ohio, youll find 1,000 empty acres of land.
It wont look like much, but if you stop and look closely, youll see a “Field of dreams,” the ground on which Americas future will be built.
This is where Intel, the American company that helped build Silicon Valley, is going to build its $20 billion semiconductor “mega site”.
Up to eight state-of-the-art factories in one place. 10,000 new good-paying jobs.
Some of the most sophisticated manufacturing in the world to make computer chips the size of a fingertip that power the world and our everyday lives.
Smartphones. The Internet. Technology we have yet to invent.
But thats just the beginning.
Intels CEO, Pat Gelsinger, who is here tonight, told me they are ready to increase their investment from
$20 billion to $100 billion.
That would be one of the biggest investments in manufacturing in American history.
And all theyre waiting for is for you to pass this bill.
So lets not wait any longer. Send it to my desk. Ill sign it.
And we will really take off.
And Intel is not alone.
Theres something happening in America.
Just look around and youll see an amazing story.
The rebirth of the pride that comes from stamping products “Made In America.” The revitalization of American manufacturing.
Companies are choosing to build new factories here, when just a few years ago, they would have built them overseas.
Thats what is happening. Ford is investing $11 billion to build electric vehicles, creating 11,000 jobs across the country.
GM is making the largest investment in its history—$7 billion to build electric vehicles, creating 4,000 jobs in Michigan.
All told, we created 369,000 new manufacturing jobs in America just last year.
Powered by people Ive met like JoJo Burgess, from generations of union steelworkers from Pittsburgh, whos here with us tonight.
As Ohio Senator Sherrod Brown says, “It's time to bury the label ‘Rust Belt.’”
Its time.
But with all the bright spots in our economy, record job growth and higher wages, too many families are struggling to keep up with the bills.
Inflation is robbing them of the gains they might otherwise feel.
I get it. Thats why my top priority is getting prices under control.
Look, our economy roared back faster than most predicted, but the pandemic meant that businesses had a hard time hiring enough workers to keep up production in their factories.
The pandemic also disrupted global supply chains.
When factories close, it takes longer to make goods and get them from the warehouse to the store, and prices go up.
Look at cars.
Last year, there werent enough semiconductors to make all the cars that people wanted to buy.
And guess what, prices of automobiles went up.
So—we have a choice.
One way to fight inflation is to drive down wages and make Americans poorer.
I have a better plan to fight inflation.
Lower your costs, not your wages.
Make more cars and semiconductors in America.
More infrastructure and innovation in America.
More goods moving faster and cheaper in America.
More jobs where you can earn a good living in America.
And instead of relying on foreign supply chains, lets make it in America.
Economists call it “increasing the productive capacity of our economy.”
I call it building a better America.
My plan to fight inflation will lower your costs and lower the deficit.
17 Nobel laureates in economics say my plan will ease long-term inflationary pressures. Top business leaders and most Americans support my plan. And heres the plan:
First cut the cost of prescription drugs. Just look at insulin. One in ten Americans has diabetes. In Virginia, I met a 13-year-old boy named Joshua Davis.
He and his Dad both have Type 1 diabetes, which means they need insulin every day. Insulin costs about $10 a vial to make.
But drug companies charge families like Joshua and his Dad up to 30 times more. I spoke with Joshuas mom.
Imagine what its like to look at your child who needs insulin and have no idea how youre going to pay for it.
What it does to your dignity, your ability to look your child in the eye, to be the parent you expect to be.
Joshua is here with us tonight. Yesterday was his birthday. Happy birthday, buddy.
For Joshua, and for the 200,000 other young people with Type 1 diabetes, lets cap the cost of insulin at $35 a month so everyone can afford it.
Drug companies will still do very well. And while were at it let Medicare negotiate lower prices for prescription drugs, like the VA already does.
Look, the American Rescue Plan is helping millions of families on Affordable Care Act plans save $2,400 a year on their health care premiums. Lets close the coverage gap and make those savings permanent.
Second cut energy costs for families an average of $500 a year by combatting climate change.
Lets provide investments and tax credits to weatherize your homes and businesses to be energy efficient and you get a tax credit; double Americas clean energy production in solar, wind, and so much more; lower the price of electric vehicles, saving you another $80 a month because youll never have to pay at the gas pump again.
Third cut the cost of child care. Many families pay up to $14,000 a year for child care per child.
Middle-class and working families shouldnt have to pay more than 7% of their income for care of young children.
My plan will cut the cost in half for most families and help parents, including millions of women, who left the workforce during the pandemic because they couldnt afford child care, to be able to get back to work.
My plan doesnt stop there. It also includes home and long-term care. More affordable housing. And Pre-K for every 3- and 4-year-old.
All of these will lower costs.
And under my plan, nobody earning less than $400,000 a year will pay an additional penny in new taxes. Nobody.
The one thing all Americans agree on is that the tax system is not fair. We have to fix it.
Im not looking to punish anyone. But lets make sure corporations and the wealthiest Americans start paying their fair share.
Just last year, 55 Fortune 500 corporations earned $40 billion in profits and paid zero dollars in federal income tax.
Thats simply not fair. Thats why Ive proposed a 15% minimum tax rate for corporations.
We got more than 130 countries to agree on a global minimum tax rate so companies cant get out of paying their taxes at home by shipping jobs and factories overseas.
Thats why Ive proposed closing loopholes so the very wealthy dont pay a lower tax rate than a teacher or a firefighter.
So thats my plan. It will grow the economy and lower costs for families.
So what are we waiting for? Lets get this done. And while youre at it, confirm my nominees to the Federal Reserve, which plays a critical role in fighting inflation.
My plan will not only lower costs to give families a fair shot, it will lower the deficit.
The previous Administration not only ballooned the deficit with tax cuts for the very wealthy and corporations, it undermined the watchdogs whose job was to keep pandemic relief funds from being wasted.
But in my administration, the watchdogs have been welcomed back.
Were going after the criminals who stole billions in relief money meant for small businesses and millions of Americans.
And tonight, Im announcing that the Justice Department will name a chief prosecutor for pandemic fraud.
By the end of this year, the deficit will be down to less than half what it was before I took office.
The only president ever to cut the deficit by more than one trillion dollars in a single year.
Lowering your costs also means demanding more competition.
Im a capitalist, but capitalism without competition isnt capitalism.
Its exploitation—and it drives up prices.
When corporations dont have to compete, their profits go up, your prices go up, and small businesses and family farmers and ranchers go under.
We see it happening with ocean carriers moving goods in and out of America.
During the pandemic, these foreign-owned companies raised prices by as much as 1,000% and made record profits.
Tonight, Im announcing a crackdown on these companies overcharging American businesses and consumers.
And as Wall Street firms take over more nursing homes, quality in those homes has gone down and costs have gone up.
That ends on my watch.
Medicare is going to set higher standards for nursing homes and make sure your loved ones get the care they deserve and expect.
Well also cut costs and keep the economy going strong by giving workers a fair shot, provide more training and apprenticeships, hire them based on their skills not degrees.
Lets pass the Paycheck Fairness Act and paid leave.
Raise the minimum wage to $15 an hour and extend the Child Tax Credit, so no one has to raise a family in poverty.
Lets increase Pell Grants and increase our historic support of HBCUs, and invest in what Jill—our First Lady who teaches full-time—calls Americas best-kept secret: community colleges.
And lets pass the PRO Act when a majority of workers want to form a union—they shouldnt be stopped.
When we invest in our workers, when we build the economy from the bottom up and the middle out together, we can do something we havent done in a long time: build a better America.
For more than two years, COVID-19 has impacted every decision in our lives and the life of the nation.
And I know youre tired, frustrated, and exhausted.
But I also know this.
Because of the progress weve made, because of your resilience and the tools we have, tonight I can say
we are moving forward safely, back to more normal routines.
Weve reached a new moment in the fight against COVID-19, with severe cases down to a level not seen since last July.
Just a few days ago, the Centers for Disease Control and Prevention—the CDC—issued new mask guidelines.
Under these new guidelines, most Americans in most of the country can now be mask free.
And based on the projections, more of the country will reach that point across the next couple of weeks.
Thanks to the progress we have made this past year, COVID-19 need no longer control our lives.
I know some are talking about “living with COVID-19”. Tonight I say that we will never just accept living with COVID-19.
We will continue to combat the virus as we do other diseases. And because this is a virus that mutates and spreads, we will stay on guard.
Here are four common sense steps as we move forward safely.
First, stay protected with vaccines and treatments. We know how incredibly effective vaccines are. If youre vaccinated and boosted you have the highest degree of protection.
We will never give up on vaccinating more Americans. Now, I know parents with kids under 5 are eager to see a vaccine authorized for their children.
The scientists are working hard to get that done and well be ready with plenty of vaccines when they do.
Were also ready with anti-viral treatments. If you get COVID-19, the Pfizer pill reduces your chances of ending up in the hospital by 90%.
Weve ordered more of these pills than anyone in the world. And Pfizer is working overtime to get us 1 Million pills this month and more than double that next month.
And were launching the “Test to Treat” initiative so people can get tested at a pharmacy, and if theyre positive, receive antiviral pills on the spot at no cost.
If youre immunocompromised or have some other vulnerability, we have treatments and free high-quality masks.
Were leaving no one behind or ignoring anyones needs as we move forward.
And on testing, we have made hundreds of millions of tests available for you to order for free.
Even if you already ordered free tests tonight, I am announcing that you can order more from covidtests.gov starting next week.
Second we must prepare for new variants. Over the past year, weve gotten much better at detecting new variants.
If necessary, well be able to deploy new vaccines within 100 days instead of many more months or years.
And, if Congress provides the funds we need, well have new stockpiles of tests, masks, and pills ready if needed.
I cannot promise a new variant wont come. But I can promise you well do everything within our power to be ready if it does.
Third we can end the shutdown of schools and businesses. We have the tools we need.
Its time for Americans to get back to work and fill our great downtowns again. People working from home can feel safe to begin to return to the office.
Were doing that here in the federal government. The vast majority of federal workers will once again work in person.
Our schools are open. Lets keep it that way. Our kids need to be in school.
And with 75% of adult Americans fully vaccinated and hospitalizations down by 77%, most Americans can remove their masks, return to work, stay in the classroom, and move forward safely.
We achieved this because we provided free vaccines, treatments, tests, and masks.
Of course, continuing this costs money.
I will soon send Congress a request.
The vast majority of Americans have used these tools and may want to again, so I expect Congress to pass it quickly.
Fourth, we will continue vaccinating the world.
Weve sent 475 Million vaccine doses to 112 countries, more than any other nation.
And we wont stop.
We have lost so much to COVID-19. Time with one another. And worst of all, so much loss of life.
Lets use this moment to reset. Lets stop looking at COVID-19 as a partisan dividing line and see it for what it is: A God-awful disease.
Lets stop seeing each other as enemies, and start seeing each other for who we really are: Fellow Americans.
We cant change how divided weve been. But we can change how we move forward—on COVID-19 and other issues we must face together.
I recently visited the New York City Police Department days after the funerals of Officer Wilbert Mora and his partner, Officer Jason Rivera.
They were responding to a 9-1-1 call when a man shot and killed them with a stolen gun.
Officer Mora was 27 years old.
Officer Rivera was 22.
Both Dominican Americans whod grown up on the same streets they later chose to patrol as police officers.
I spoke with their families and told them that we are forever in debt for their sacrifice, and we will carry on their mission to restore the trust and safety every community deserves.
Ive worked on these issues a long time.
I know what works: Investing in crime prevention and community police officers who'll walk the beat, who'll know the neighborhood, and who can restore trust and safety.
So lets not abandon our streets. Or choose between safety and equal justice.
Lets come together to protect our communities, restore trust, and hold law enforcement accountable.
Thats why the Justice Department required body cameras, banned chokeholds, and restricted no-knock warrants for its officers.
Thats why the American Rescue Plan provided $350 Billion that cities, states, and counties can use to hire more police and invest in proven strategies like community violence interruption—trusted messengers breaking the cycle of violence and trauma and giving young people hope.
We should all agree: The answer is not to Defund the police. The answer is to FUND the police with the resources and training they need to protect our communities.
I ask Democrats and Republicans alike: Pass my budget and keep our neighborhoods safe.
And I will keep doing everything in my power to crack down on gun trafficking and ghost guns you can buy online and make at home—they have no serial numbers and cant be traced.
And I ask Congress to pass proven measures to reduce gun violence. Pass universal background checks. Why should anyone on a terrorist list be able to purchase a weapon?
Ban assault weapons and high-capacity magazines.
Repeal the liability shield that makes gun manufacturers the only industry in America that cant be sued.
These laws dont infringe on the Second Amendment. They save lives.
The most fundamental right in America is the right to vote and to have it counted. And its under assault.
In state after state, new laws have been passed, not only to suppress the vote, but to subvert entire elections.
We cannot let this happen.
Tonight, I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you're at it, pass the Disclose Act so Americans can know who is funding our elections.
Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service.
One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court.
And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.
A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since shes been nominated, shes received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans.
And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system.
We can do both. At our border, weve installed new technology like cutting-edge scanners to better detect drug smuggling.
Weve set up joint patrols with Mexico and Guatemala to catch more human traffickers.
Were putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster.
Were securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.
We can do all this while keeping lit the torch of liberty that has led generations of immigrants to this land—my forefathers and so many of yours.
Provide a pathway to citizenship for Dreamers, those on temporary status, farm workers, and essential workers.
Revise our laws so businesses have the workers they need and families dont wait decades to reunite.
Its not only the right thing to do—its the economically smart thing to do.
Thats why immigration reform is supported by everyone from labor unions to religious leaders to the U.S. Chamber of Commerce.
Lets get it done once and for all.
Advancing liberty and justice also requires protecting the rights of women.
The constitutional right affirmed in Roe v. Wade—standing precedent for half a century—is under attack as never before.
If we want to go forward—not backward—we must protect access to health care. Preserve a womans right to choose. And lets continue to advance maternal health care in America.
And for our LGBTQ+ Americans, lets finally get the bipartisan Equality Act to my desk. The onslaught of state laws targeting transgender Americans and their families is wrong.
As I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential.
While it often appears that we never agree, that isnt true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice.
And soon, well strengthen the Violence Against Women Act that I first wrote three decades ago. It is important for us to show the nation that we can come together and do big things.
So tonight Im offering a Unity Agenda for the Nation. Four big things we can do together.
First, beat the opioid epidemic.
There is so much we can do. Increase funding for prevention, treatment, harm reduction, and recovery.
Get rid of outdated rules that stop doctors from prescribing treatments. And stop the flow of illicit drugs by working with state and local law enforcement to go after traffickers.
If youre suffering from addiction, know you are not alone. I believe in recovery, and I celebrate the 23 million Americans in recovery.
Second, lets take on mental health. Especially among our children, whose lives and education have been turned upside down.
The American Rescue Plan gave schools money to hire teachers and help students make up for lost learning.
I urge every parent to make sure your school does just that. And we can all play a part—sign up to be a tutor or a mentor.
Children were also struggling before the pandemic. Bullying, violence, trauma, and the harms of social media.
As Frances Haugen, who is here with us tonight, has shown, we must hold social media platforms accountable for the national experiment theyre conducting on our children for profit.
Its time to strengthen privacy protections, ban targeted advertising to children, demand tech companies stop collecting personal data on our children.
And lets get all Americans the mental health services they need. More people they can turn to for help, and full parity between physical and mental health care.
Third, support our veterans.
Veterans are the best of us.
Ive always believed that we have a sacred obligation to equip all those we send to war and care for them and their families when they come home.
My administration is providing assistance with job training and housing, and now helping lower-income veterans get VA care debt-free.
Our troops in Iraq and Afghanistan faced many dangers.
One was stationed at bases and breathing in toxic smoke from “burn pits” that incinerated wastes of war—medical and hazard material, jet fuel, and more.
When they came home, many of the worlds fittest and best trained warriors were never the same.
Headaches. Numbness. Dizziness.
A cancer that would put them in a flag-draped coffin.
I know.
One of those soldiers was my son Major Beau Biden.
We dont know for sure if a burn pit was the cause of his brain cancer, or the diseases of so many of our troops.
But Im committed to finding out everything we can.
Committed to military families like Danielle Robinson from Ohio.
The widow of Sergeant First Class Heath Robinson.
He was born a soldier. Army National Guard. Combat medic in Kosovo and Iraq.
Stationed near Baghdad, just yards from burn pits the size of football fields.
Heaths widow Danielle is here with us tonight. They loved going to Ohio State football games. He loved building Legos with their daughter.
But cancer from prolonged exposure to burn pits ravaged Heaths lungs and body.
Danielle says Heath was a fighter to the very end.
He didnt know how to stop fighting, and neither did she.
Through her pain she found purpose to demand we do better.
Tonight, Danielle—we are.
The VA is pioneering new ways of linking toxic exposures to diseases, already helping more veterans get benefits.
And tonight, Im announcing were expanding eligibility to veterans suffering from nine respiratory cancers.
Im also calling on Congress: pass a law to make sure veterans devastated by toxic exposures in Iraq and Afghanistan finally get the benefits and comprehensive health care they deserve.
And fourth, lets end cancer as we know it.
This is personal to me and Jill, to Kamala, and to so many of you.
Cancer is the #2 cause of death in America, second only to heart disease.
Last month, I announced our plan to supercharge
the Cancer Moonshot that President Obama asked me to lead six years ago.
Our goal is to cut the cancer death rate by at least 50% over the next 25 years, turn more cancers from death sentences into treatable diseases.
More support for patients and families.
To get there, I call on Congress to fund ARPA-H, the Advanced Research Projects Agency for Health.
It's based on DARPA, the Defense Department project that led to the Internet, GPS, and so much more.
ARPA-H will have a singular purpose: to drive breakthroughs in cancer, Alzheimer's, diabetes, and more.
A unity agenda for the nation.
We can do this.
My fellow Americans—tonight , we have gathered in a sacred space—the citadel of our democracy.
In this Capitol, generation after generation, Americans have debated great questions amid great strife, and have done great things.
We have fought for freedom, expanded liberty, defeated totalitarianism and terror.
And built the strongest, freest, and most prosperous nation the world has ever known.
Now is the hour.
Our moment of responsibility.
Our test of resolve and conscience, of history itself.
It is in this moment that our character is formed. Our purpose is found. Our future is forged.
Well I know this nation.
We will meet the test.
To protect freedom and liberty, to expand fairness and opportunity.
We will save democracy.
As hard as these times have been, I am more optimistic about America today than I have been my whole life.
Because I see the future that is within our grasp.
Because I know there is simply nothing beyond our capacity.
We are the only nation on Earth that has always turned every crisis we have faced into an opportunity.
The only nation that can be defined by a single word: possibilities.
So on this night, in our 245th year as a nation, I have come to report on the State of the Union.
And my report is this: the State of the Union is strong—because you, the American people, are strong.
We are stronger today than we were a year ago.
And we will be stronger a year from now than we are today.
Now is our moment to meet and overcome the challenges of our time.
And we will, as one people.
One America.
The United States of America.
May God bless you all. May God protect our troops.

View File

@@ -0,0 +1,27 @@
# Core Concepts
This section goes over the core concepts of LangChain.
Understanding these will go a long way in helping you understand the codebase and how to construct chains.
## PromptTemplates
PromptTemplates generically have a `format` method that takes in variables and returns a formatted string.
The simplest implementation is a template string containing some variables, which is then formatted with the incoming values.
More complex implementations dynamically construct the template string from few-shot examples, etc.
For a more detailed explanation of how LangChain approaches prompts and prompt templates, see [here](prompts.md).
## LLMs
Wrappers around Large Language Models (in particular, the `generate` ability of large language models) are some of the core functionality of LangChain.
These wrappers are classes that are callable: they take in an input string, and return the generated output string.
## Embeddings
These classes are very similar to the LLM classes in that they are wrappers around models,
but rather than return a string they return an embedding (a list of floats). They are particularly useful when
implementing semantic search functionality. They expose separate methods for embedding queries versus embedding documents.
## Vectorstores
These are datastores that store documents. They expose a method for passing in a string and finding similar documents.
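As a rough illustration of how embeddings and a vectorstore fit together, here is a self-contained toy sketch. `toy_embed` and `ToyVectorStore` are hypothetical stand-ins, not LangChain classes; a real embedding model returns learned vectors rather than letter counts, and a real vectorstore would typically use an index instead of a linear scan.

```python
import math
from typing import Iterable, List


def toy_embed(text: str) -> List[float]:
    """Hypothetical embedding: a 26-dimensional letter-count vector."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Stores texts alongside their embeddings and ranks them against a query."""

    def __init__(self, texts: Iterable[str]):
        self.texts = list(texts)
        self.vectors = [toy_embed(t) for t in self.texts]

    def similarity_search(self, query: str, k: int = 1) -> List[str]:
        query_vector = toy_embed(query)
        ranked = sorted(
            zip(self.texts, self.vectors),
            key=lambda pair: cosine_similarity(query_vector, pair[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore(["apple pie recipe", "zzz quiet sleep", "tax law"])
best_match = store.similarity_search("apple pie", k=1)
```

The store embeds every document once up front, then embeds only the query at search time.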
## Chains
These are pipelines that combine several of the above ideas.
They vary greatly in complexity and are a combination of generic, highly configurable pipelines and more narrow (but usually more complex) pipelines.
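To make the idea of a pipeline concrete, here is a minimal sketch of a chain that formats a template and then calls a model. `fake_llm` and `ToyLLMChain` are hypothetical names used only for illustration, and the `"text"` output key is an assumption; the real chain classes carry more validation and configuration.

```python
from typing import Callable, Dict, List


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM wrapper: string in, string out."""
    return f"[completion for: {prompt}]"


class ToyLLMChain:
    """Formats a template with the inputs, then calls the model on the result."""

    def __init__(self, llm: Callable[[str], str], template: str):
        self.llm = llm
        self.template = template

    def __call__(self, inputs: Dict[str, str]) -> Dict[str, str]:
        prompt = self.template.format(**inputs)
        # Return the inputs together with the model output.
        return {**inputs, "text": self.llm(prompt)}

    def apply(self, input_list: List[Dict[str, str]]) -> List[Dict[str, str]]:
        """Call the chain on every input dictionary in the list."""
        return [self(inputs) for inputs in input_list]


chain = ToyLLMChain(fake_llm, "What is a good name for a company that makes {product}?")
result = chain({"product": "colorful socks"})
```

Because the model is just a callable, the same chain can be exercised with a stub in tests and with a real wrapper in an application.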

View File

@@ -0,0 +1,74 @@
# Glossary
This is a collection of terminology commonly used when developing LLM applications.
It contains references to external papers or sources where the concept was first introduced,
as well as to places in LangChain where the concept is used.
### Chain of Thought Prompting
A prompting technique used to encourage the model to generate a series of intermediate reasoning steps.
A less formal way to induce this behavior is to include “Let's think step-by-step” in the prompt.
Resources:
- [Chain-of-Thought Paper](https://arxiv.org/pdf/2201.11903.pdf)
- [Step-by-Step Paper](https://arxiv.org/abs/2112.00114)
### Action Plan Generation
A prompt usage that uses a language model to generate actions to take.
The results of these actions can then be fed back into the language model to generate a subsequent action.
Resources:
- [WebGPT Paper](https://arxiv.org/pdf/2112.09332.pdf)
- [SayCan Paper](https://say-can.github.io/assets/palm_saycan.pdf)
### ReAct Prompting
A prompting technique that combines Chain-of-Thought prompting with action plan generation.
This induces the model to think about what action to take, then take it.
Resources:
- [Paper](https://arxiv.org/pdf/2210.03629.pdf)
- [LangChain Example](https://github.com/hwchase17/langchain/blob/master/examples/react.ipynb)
### Self-ask
A prompting method that builds on top of chain-of-thought prompting.
In this method, the model explicitly asks itself follow-up questions, which are then answered by an external search engine.
Resources:
- [Paper](https://ofir.io/self-ask.pdf)
- [LangChain Example](https://github.com/hwchase17/langchain/blob/master/examples/self_ask_with_search.ipynb)
### Prompt Chaining
Combining multiple LLM calls together, with the output of one step being the input to the next.
Resources:
- [PromptChainer Paper](https://arxiv.org/pdf/2203.06566.pdf)
- [Language Model Cascades](https://arxiv.org/abs/2207.10342)
- [ICE Primer Book](https://primer.ought.org/)
- [Socratic Models](https://socraticmodels.github.io/)
### Memetic Proxy
Encouraging the LLM to respond in a certain way by framing the discussion in a context that the model knows of and that will result in that type of response. For example, as a conversation between a student and a teacher.
Resources:
- [Paper](https://arxiv.org/pdf/2102.07350.pdf)
### Self Consistency
A decoding strategy that samples a diverse set of reasoning paths and then selects the most consistent answer.
It is most effective when combined with Chain-of-Thought prompting.
Resources:
- [Paper](https://arxiv.org/pdf/2203.11171.pdf)
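A simplified way to picture self consistency is a majority vote over the final answers of several sampled completions. The sketch below assumes a hypothetical `sample_answer` callable that returns one final answer per call; the canned answers stand in for real model samples.

```python
from collections import Counter
from typing import Callable, List


def self_consistent_answer(sample_answer: Callable[[], str], n_samples: int = 5) -> str:
    """Sample several answers and return the one that occurs most often."""
    answers: List[str] = [sample_answer() for _ in range(n_samples)]
    most_common_answer, _count = Counter(answers).most_common(1)[0]
    return most_common_answer


# Canned answers standing in for five sampled reasoning paths.
canned = iter(["18", "17", "18", "18", "20"])
final = self_consistent_answer(lambda: next(canned), n_samples=5)
```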
### Inception
Also called “First Person Instruction”.
Encouraging the model to think a certain way by including the start of the model's response in the prompt.
Resources:
- [Example](https://twitter.com/goodside/status/1583262455207460865?s=20&t=8Hz7XBnK1OF8siQrxxCIGQ)

138
docs/explanation/prompts.md Normal file
View File

@@ -0,0 +1,138 @@
# Prompts
Prompts and all the tooling around them are integral to working with language models, and therefore
really important to get right, from both an interface and a naming perspective. This is a "design doc"
of sorts explaining how we think about prompts and the related concepts, and why the interfaces
for working with them are the way they are in LangChain.
For a more code-based walkthrough of all these concepts, check out our example [here](/examples/prompts/prompt_management).
## Prompt
### Concept
A prompt is the final string that gets fed into the language model.
### LangChain Implementation
In LangChain a prompt is represented as just a string.
## Input Variables
### Concept
Input variables are parts of a prompt that are not known until runtime, e.g. because they are provided by the user.
### LangChain Implementation
In LangChain input variables are just represented as a dictionary of key-value pairs, with the key
being the variable name and the value being the variable value.
## Examples
### Concept
Examples are basically datapoints that can be used to teach the model what to do. These can be included
in prompts to better instruct the model on what to do.
### LangChain Implementation
In LangChain examples are represented as a dictionary of key-value pairs, with the key being the feature
(or label) name, and the value being the feature (or label) value.
## Example Selector
### Concept
If you have a large number of examples, you may need to select which ones to include in the prompt. The
Example Selector is the class responsible for doing so.
### LangChain Implementation
#### BaseExampleSelector
In LangChain there is a BaseExampleSelector that exposes the following interface:
```python
class BaseExampleSelector:
    def select_examples(self, input_variables: dict):
        ...  # return the examples to include in the prompt
```
Notice that it does not take in examples at runtime when it's selecting them - those are assumed to have been provided ahead of time.
#### LengthExampleSelector
The LengthExampleSelector selects examples based on the length of the input variables.
This is useful when you are worried about constructing a prompt that will go over the length
of the context window. For longer inputs, it will select fewer examples to include, while for
shorter inputs it will select more.
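The length-based idea can be sketched as a simple budget: count the words in the inputs, then keep adding examples while they still fit. The function name `select_by_length`, the word-count measure, and the `max_length` default are assumptions made for illustration, not the library's actual implementation, and the second example entry is made-up data.

```python
from typing import Dict, List


def word_count(text: str) -> int:
    return len(text.split())


def select_by_length(
    examples: List[Dict[str, str]],
    input_variables: Dict[str, str],
    max_length: int = 25,
) -> List[Dict[str, str]]:
    """Keep examples, in order, while the word budget left by the inputs allows."""
    remaining = max_length - word_count(" ".join(input_variables.values()))
    selected = []
    for example in examples:
        cost = word_count(" ".join(example.values()))
        if cost > remaining:
            break
        selected.append(example)
        remaining -= cost
    return selected


examples = [
    {"concept": "chicken", "joke": "Why did the chicken cross the road?"},
    {"concept": "socks", "joke": "Why are socks always paired up?"},
]
short_input = select_by_length(examples, {"concept": "cats"})
long_input = select_by_length(examples, {"concept": " ".join(["cats"] * 12)})
```

With the short input both examples fit in the budget; with the longer input only the first one does.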
#### SemanticSimilarityExampleSelector
The SemanticSimilarityExampleSelector selects examples based on which examples are most similar
to the inputs. It does this by finding the examples with the embeddings that have the greatest
cosine similarity with the inputs.
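The selection step can be sketched as ranking examples by cosine similarity between their embeddings and the embedding of the input. The function `select_most_similar` and the two-dimensional vectors below are hypothetical; in practice the vectors would come from an embedding model.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by the product of the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_most_similar(
    examples: List[Dict[str, str]],
    example_vectors: List[Sequence[float]],
    query_vector: Sequence[float],
    k: int = 1,
) -> List[Dict[str, str]]:
    """Return the k examples whose embeddings are closest to the query."""
    scored = sorted(
        zip(examples, example_vectors),
        key=lambda pair: cosine_similarity(query_vector, pair[1]),
        reverse=True,
    )
    return [example for example, _ in scored[:k]]


examples = [{"concept": "chicken"}, {"concept": "socks"}]
vectors = [[1.0, 0.0], [0.0, 1.0]]  # made-up 2-d embeddings for illustration
picked = select_most_similar(examples, vectors, query_vector=[0.9, 0.1])
```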
## Prompt Template
### Concept
The prompts that get fed into the language model are almost never hardcoded, but rather a combination
of parts, including Examples and Input Variables. A prompt template is responsible
for taking those parts and constructing a prompt.
### LangChain Implementation
#### BasePromptTemplate
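In LangChain there is a BasePromptTemplate that exposes the following interface:
```python
from typing import List

class BasePromptTemplate:
    @property
    def input_variables(self) -> List[str]:
        ...  # names of the variables this template expects
    def format(self, **kwargs) -> str:
        ...  # build and return the final prompt string
```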
The input variables property is used to provide introspection of the PromptTemplate and know
what inputs it expects. The format method takes in input variables and returns the prompt.
#### PromptTemplate
The PromptTemplate implementation is the simplest form of a prompt template. It consists of three parts:
- input variables: which variables this prompt template expects
- template: the template into which these variables will be formatted
- template format: the format of the template (e.g. mustache, python f-strings, etc.)
For example, if I was making an application that took a user inputted concept and asked a language model
to make a joke about that concept, I might use this specification for the PromptTemplate
- input variables = `["thing"]`
- template = `"Tell me a joke about {thing}"`
- template format = `"f-string"`
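The specification above can be sketched with a small stand-in class. `ToyPromptTemplate` is a hypothetical name, and it uses Python's `str.format` to fill f-string-style placeholders; it illustrates the interface rather than reproducing the library's implementation.

```python
from typing import List


class ToyPromptTemplate:
    """Holds a template string and the names of the variables it expects."""

    def __init__(self, input_variables: List[str], template: str):
        self.input_variables = input_variables
        self.template = template

    def format(self, **kwargs: str) -> str:
        # Fail early with a clear message if an expected variable is missing.
        missing = set(self.input_variables) - set(kwargs)
        if missing:
            raise ValueError(f"Missing input variables: {sorted(missing)}")
        return self.template.format(**kwargs)


prompt = ToyPromptTemplate(input_variables=["thing"], template="Tell me a joke about {thing}")
text = prompt.format(thing="chickens")
```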
#### FewShotPromptTemplate
A FewShotPromptTemplate is a Prompt Template that includes some examples. It consists of:
- examples OR example selector: a list of examples to use, or an Example Selector to select which examples to use
- example prompt template: a Prompt Template responsible for taking an individual example (a dictionary) and turning it into a string to be used in the prompt.
- prefix: the template put in the prompt before listing any examples
- suffix: the template put in the prompt after listing any examples
- example separator: a string separator which is used to join the prefix, the examples, and the suffix together
For example, if I wanted to turn the above example into a few shot prompt, this is what it would
look like:
First I would collect some examples, like
```python
examples = [
    {"concept": "chicken", "joke": "Why did the chicken cross the road?"},
    ...
]
```
I would then make sure to define a prompt template for how each example should be formatted
when inserted into the prompt:
```python
from langchain.prompts import PromptTemplate

prompt_template = PromptTemplate(
    input_variables=["concept", "joke"],
    template="Tell me a joke about {concept}\n{joke}",
)
```
Then, I would define the components as:
- examples: The above examples
- example_prompt: The above example prompt
- prefix = `"You are a comedian telling jokes on demand."`
- suffix = `"Tell me a joke about {concept}"`
- input variables = `["concept"]`
- template format = `"f-string"`
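To show how these components come together, here is a minimal sketch that joins the prefix, the formatted examples, and the suffix with a separator. `format_few_shot` is a hypothetical helper and the `"\n\n"` default separator is an assumption; the library class may assemble the prompt differently.

```python
from typing import Dict, List


def format_few_shot(
    examples: List[Dict[str, str]],
    example_template: str,
    prefix: str,
    suffix: str,
    example_separator: str = "\n\n",
    **kwargs: str,
) -> str:
    """Format each example, then join prefix, examples, and suffix."""
    example_strings = [example_template.format(**example) for example in examples]
    # Only the prefix and suffix are formatted with the runtime input variables.
    pieces = [prefix.format(**kwargs), *example_strings, suffix.format(**kwargs)]
    return example_separator.join(pieces)


examples = [
    {"concept": "chicken", "joke": "Why did the chicken cross the road?"},
]
prompt = format_few_shot(
    examples,
    example_template="Tell me a joke about {concept}\n{joke}",
    prefix="You are a comedian telling jokes on demand.",
    suffix="Tell me a joke about {concept}",
    concept="socks",
)
```

The resulting string ends with the unanswered request, so the model is nudged to complete it in the same style as the examples.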

View File

@@ -0,0 +1,39 @@
# Using Chains
Calling an LLM is a great first step, but it's just the beginning.
Normally when you use an LLM in an application, you are not sending user input directly to the LLM.
Instead, you are probably taking user input and constructing a prompt, and then sending that to the LLM.
For example, in the previous example, the text we passed in was hardcoded to ask for a name for a company that made colorful socks.
In this imaginary service, what we would want to do is take only the user input describing what the company does, and then format the prompt with that information.
This is easy to do with LangChain!
First, let's define the prompt:
```python
from langchain.prompts import PromptTemplate
prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?",
)
```
We can now create a very simple chain that will take user input, format the prompt with it, and then send it to the LLM:
```python
from langchain.chains import LLMChain
# `llm` is the OpenAI wrapper initialized in the previous section.
chain = LLMChain(llm=llm, prompt=prompt)
```
Now we can run that chain by specifying only the product!
```python
chain.run("colorful socks")
```
There we go! There's the first chain.
That is it for the Getting Started example.
As a next step, we would suggest checking out the more complex chains in the [Demos section](/examples/demos)

View File

@@ -0,0 +1,37 @@
# Setting up your environment
Using LangChain will usually require integrations with one or more model providers, data stores, APIs, etc.
There are two components to setting this up: installing the correct Python packages and setting the right environment variables.
## Python packages
The python package needed varies based on the integration. See the list of integrations for details.
There should also be helpful error messages raised if you try to run an integration and are missing any required python packages.
## Environment Variables
The environment variable needed varies based on the integration. See the list of integrations for details.
There should also be helpful error messages raised if you try to run an integration and are missing any required environment variables.
You can set the environment variable in a few ways.
If you are trying to set the environment variable `FOO` to value `bar`, here are the ways you could do so:
- From the command line:
```
export FOO=bar
```
- From the python notebook/script:
```python
import os
os.environ["FOO"] = "bar"
```
For the Getting Started example, we will be using OpenAI's APIs, so we will first need to install their SDK:
```
pip install openai
```
We will then need to set the environment variable. Let's do this from inside the Jupyter notebook (or Python script).
```python
import os
os.environ["OPENAI_API_KEY"] = "..."
```

View File

@@ -0,0 +1,11 @@
# Installation
LangChain is available on PyPI, so it is easily installable with:
```
pip install langchain
```
For more involved installation options, see the [Installation Reference](/installation.md) section.
That's it! LangChain is now installed. You can now use LangChain from a python script or Jupyter notebook.

View File

@@ -0,0 +1,25 @@
# Calling an LLM
The most basic building block of LangChain is calling an LLM on some input.
Let's walk through a simple example of how to do this.
For this purpose, let's pretend we are building a service that generates a company name based on what the company makes.
In order to do this, we first need to import the LLM wrapper.
```python
from langchain.llms import OpenAI
```
We can then initialize the wrapper with any arguments.
In this example, we probably want the outputs to be MORE random, so we'll initialize it with a HIGH temperature.
```python
llm = OpenAI(temperature=0.9)
```
We can now call it on some input!
```python
text = "What would be a good company name for a company that makes colorful socks?"
llm(text)
```

View File

@@ -1,10 +1,84 @@
Welcome to LangChain
==========================
.. toctree::
:maxdepth: 2
:caption: User API
Large language models (LLMs) are emerging as a transformative technology, enabling
developers to build applications that they previously could not.
But using these LLMs in isolation is often not enough to
create a truly powerful app - the real power comes when you are able to
combine them with other sources of computation or knowledge.
This library is aimed at assisting in the development of those types of applications.
It aims to create:
1. a comprehensive collection of all the pieces you would ever want to combine
2. a flexible interface for combining pieces into a single comprehensive "chain"
3. a schema for easily saving and sharing those chains
The documentation is structured into the following sections:
.. toctree::
:maxdepth: 1
:caption: Getting Started
:name: getting_started
getting_started/installation.md
getting_started/environment.md
getting_started/llm.md
getting_started/chains.md
A simple walkthrough and tutorial for setting up a chain that generates a company name based on what the company makes.
Covers installation, environment setup, calling LLMs, and using prompts.
Start here if you haven't used LangChain before.
.. toctree::
:maxdepth: 1
:caption: How-To Examples
:name: examples
examples/demos.rst
examples/integrations.rst
examples/prompts.rst
examples/model_laboratory.ipynb
More elaborate examples and walk-throughs of particular
integrations and use cases. This is the place to look if you have questions
about how to integrate certain pieces, or if you want to find examples of
common tasks or cool demos.
.. toctree::
:maxdepth: 1
:caption: Reference
:name: reference
installation.md
integrations.md
modules/prompt
modules/example_selector
modules/llms
modules/embeddings
modules/text_splitter
modules/vectorstore
modules/chains
Full API documentation. This is the place to look if you want to
see detailed information about the various classes, methods, and APIs.
.. toctree::
:maxdepth: 1
:caption: Resources
:name: resources
explanation/core_concepts.md
explanation/prompts.md
explanation/glossary.md
Discord <https://discord.gg/6adMQxSpJS>
Higher level, conceptual explanations of the LangChain components.
This is the place to go if you want to increase your high level understanding
of the problems LangChain is solving, and how we decided to go about doing so.

24
docs/installation.md Normal file
View File

@@ -0,0 +1,24 @@
# Installation Options
LangChain is available on PyPI, so it is easily installable with:
```
pip install langchain
```
That will install the bare minimum requirements of LangChain.
A lot of the value of LangChain comes when integrating it with various model providers, datastores, etc.
By default, the dependencies needed to do that are NOT installed.
However, there are two other ways to install LangChain that do bring in those dependencies.
To install modules needed for the common LLM providers, run:
```
pip install langchain[llms]
```
To install all modules needed for all integrations, run:
```
pip install langchain[all]
```

33
docs/integrations.md Normal file
View File

@@ -0,0 +1,33 @@
# Integration Reference
Besides the installation of this python package, you will also need to install packages and set environment variables depending on which chains you want to use.
Note: the reason these packages are not included in the dependencies by default is that as we imagine scaling this package, we do not want to force dependencies that are not needed.
The following use cases require specific installs and API keys:
- _OpenAI_:
- Install requirements with `pip install openai`
- Get an OpenAI api key and either set it as an environment variable (`OPENAI_API_KEY`) or pass it to the LLM constructor as `openai_api_key`.
- _Cohere_:
- Install requirements with `pip install cohere`
- Get a Cohere api key and either set it as an environment variable (`COHERE_API_KEY`) or pass it to the LLM constructor as `cohere_api_key`.
- _HuggingFace Hub_:
- Install requirements with `pip install huggingface_hub`
- Get a HuggingFace Hub api token and either set it as an environment variable (`HUGGINGFACEHUB_API_TOKEN`) or pass it to the LLM constructor as `huggingfacehub_api_token`.
- _SerpAPI_:
- Install requirements with `pip install google-search-results`
- Get a SerpAPI api key and either set it as an environment variable (`SERPAPI_API_KEY`) or pass it to the LLM constructor as `serpapi_api_key`.
- _NatBot_:
- Install requirements with `pip install playwright`
- _Wikipedia_:
- Install requirements with `pip install wikipedia`
- _Elasticsearch_:
- Install requirements with `pip install elasticsearch`
- Set up an Elasticsearch backend. If you want to run it locally, [this](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/getting-started.html) is a good guide.
- _FAISS_:
- Install requirements with `pip install faiss` for Python 3.7 and `pip install faiss-cpu` for Python 3.10+.
- _Manifest_:
- Install requirements with `pip install manifest-ml` (Note: this is only available in Python 3.8+ currently).
If you are using the `NLTKTextSplitter` or the `SpacyTextSplitter`, you will also need to install the appropriate models. For example, if you want to use the `SpacyTextSplitter`, you will need to install the `en_core_web_sm` model with `python -m spacy download en_core_web_sm`. Similarly, if you want to use the `NLTKTextSplitter`, you will need to install the `punkt` model with `python -m nltk.downloader punkt`.
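Several integrations above accept an API key either as an environment variable or as a constructor argument. The lookup order can be sketched as follows; `resolve_api_key` and the `EXAMPLE_API_KEY` variable are hypothetical names used only for illustration, not LangChain's actual loader.

```python
import os
from typing import Optional


def resolve_api_key(explicit_key: Optional[str], env_var: str) -> str:
    """Prefer an explicitly passed key, then fall back to the environment."""
    if explicit_key:
        return explicit_key
    value = os.environ.get(env_var)
    if value:
        return value
    raise ValueError(
        f"No API key found: pass it explicitly or set the {env_var} environment variable."
    )


os.environ["EXAMPLE_API_KEY"] = "from-env"  # placeholder name and value
from_env = resolve_api_key(None, "EXAMPLE_API_KEY")
from_arg = resolve_api_key("from-arg", "EXAMPLE_API_KEY")
```

Raising an error that names the missing variable keeps misconfiguration easy to diagnose.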

View File

@@ -0,0 +1,5 @@
:mod:`langchain.embeddings`
===========================
.. automodule:: langchain.embeddings
:members:

View File

@@ -0,0 +1,5 @@
:mod:`langchain.prompts.example_selector`
=========================================
.. automodule:: langchain.prompts.example_selector
:members:

View File

@@ -1,5 +1,5 @@
:mod:`langchain.prompt`
=======================
:mod:`langchain.prompts`
========================
.. automodule:: langchain.prompt
.. automodule:: langchain.prompts
:members:

View File

@@ -0,0 +1,6 @@
:mod:`langchain.text_splitter`
==============================
.. automodule:: langchain.text_splitter
:members:
:undoc-members:

View File

@@ -0,0 +1,6 @@
:mod:`langchain.vectorstores`
=============================
.. automodule:: langchain.vectorstores
:members:
:undoc-members:

View File

@@ -1,5 +1,8 @@
autodoc_pydantic==1.8.0
myst_parser
nbsphinx==0.8.9
sphinx==4.5.0
sphinx-autobuild==2021.3.14
sphinx_rtd_theme==1.0.0
sphinx-typlog-theme==0.8.0
autodoc_pydantic==1.8.0
sphinx-panels

View File

@@ -1 +1 @@
0.0.1
0.0.16

View File

@@ -8,12 +8,19 @@ with open(Path(__file__).absolute().parents[0] / "VERSION") as _f:
from langchain.chains import (
LLMChain,
LLMMathChain,
MRKLChain,
PythonChain,
ReActChain,
SelfAskWithSearchChain,
SerpAPIChain,
SQLDatabaseChain,
VectorDBQA,
)
from langchain.llms import Cohere, OpenAI
from langchain.prompt import Prompt
from langchain.docstore import Wikipedia
from langchain.llms import Cohere, HuggingFaceHub, OpenAI
from langchain.prompts import BasePromptTemplate, PromptTemplate
from langchain.sql_database import SQLDatabase
from langchain.vectorstores import FAISS, ElasticVectorSearch
__all__ = [
"LLMChain",
@@ -23,5 +30,16 @@ __all__ = [
"SerpAPIChain",
"Cohere",
"OpenAI",
"Prompt",
"BasePromptTemplate",
"DynamicPrompt",
"PromptTemplate",
"ReActChain",
"Wikipedia",
"HuggingFaceHub",
"SQLDatabase",
"SQLDatabaseChain",
"FAISS",
"MRKLChain",
"VectorDBQA",
"ElasticVectorSearch",
]

View File

@@ -1,9 +1,13 @@
"""Chains are easily reusable components which can be linked together."""
from langchain.chains.llm import LLMChain
from langchain.chains.llm_math.base import LLMMathChain
from langchain.chains.mrkl.base import MRKLChain
from langchain.chains.python import PythonChain
from langchain.chains.react.base import ReActChain
from langchain.chains.self_ask_with_search.base import SelfAskWithSearchChain
from langchain.chains.serpapi import SerpAPIChain
from langchain.chains.sql_database.base import SQLDatabaseChain
from langchain.chains.vector_db_qa.base import VectorDBQA
__all__ = [
"LLMChain",
@@ -11,4 +15,8 @@ __all__ = [
"PythonChain",
"SelfAskWithSearchChain",
"SerpAPIChain",
"ReActChain",
"SQLDatabaseChain",
"MRKLChain",
"VectorDBQA",
]

View File

@@ -2,10 +2,15 @@
from abc import ABC, abstractmethod
from typing import Any, Dict, List
from pydantic import BaseModel
class Chain(ABC):
class Chain(BaseModel, ABC):
"""Base interface that all chains should implement."""
verbose: bool = False
"""Whether to print out response text."""
@property
@abstractmethod
def input_keys(self) -> List[str]:
@@ -30,12 +35,34 @@ class Chain(ABC):
)
@abstractmethod
def _run(self, inputs: Dict[str, str]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
"""Run the logic of this chain and return the output."""
def __call__(self, inputs: Dict[str, Any]) -> Dict[str, str]:
"""Run the logic of this chain and add to output."""
self._validate_inputs(inputs)
outputs = self._run(inputs)
if self.verbose:
print("\n\n\033[1m> Entering new chain...\033[0m")
outputs = self._call(inputs)
if self.verbose:
print("\n\033[1m> Finished chain.\033[0m")
self._validate_outputs(outputs)
return {**inputs, **outputs}
def apply(self, input_list: List[Dict[str, Any]]) -> List[Dict[str, str]]:
"""Call the chain on all inputs in the list."""
return [self(inputs) for inputs in input_list]
def run(self, text: str) -> str:
"""Run text in, text out (if applicable)."""
if len(self.input_keys) != 1:
raise ValueError(
f"`run` not supported when there is not exactly "
f"one input key, got {self.input_keys}."
)
if len(self.output_keys) != 1:
raise ValueError(
f"`run` not supported when there is not exactly "
f"one output key, got {self.output_keys}."
)
return self({self.input_keys[0]: text})[self.output_keys[0]]

View File

@@ -4,8 +4,9 @@ from typing import Any, Dict, List
from pydantic import BaseModel, Extra
from langchain.chains.base import Chain
from langchain.input import print_text
from langchain.llms.base import LLM
from langchain.prompt import Prompt
from langchain.prompts.base import BasePromptTemplate
class LLMChain(Chain, BaseModel):
@@ -16,11 +17,13 @@ class LLMChain(Chain, BaseModel):
from langchain import LLMChain, OpenAI, Prompt
prompt_template = "Tell me a {adjective} joke"
prompt = Prompt(input_variables=["adjective"], template=prompt_template)
prompt = PromptTemplate(
input_variables=["adjective"], template=prompt_template
)
llm = LLMChain(llm=OpenAI(), prompt=prompt)
"""
prompt: Prompt
prompt: BasePromptTemplate
"""Prompt object to use."""
llm: LLM
"""LLM wrapper to use."""
@@ -48,10 +51,12 @@ class LLMChain(Chain, BaseModel):
"""
return [self.output_key]
def _run(self, inputs: Dict[str, Any]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, Any]) -> Dict[str, str]:
selected_inputs = {k: inputs[k] for k in self.prompt.input_variables}
prompt = self.prompt.format(**selected_inputs)
if self.verbose:
print("Prompt after formatting:")
print_text(prompt, color="green", end="\n")
kwargs = {}
if "stop" in inputs:
kwargs["stop"] = inputs["stop"]

View File

@@ -7,6 +7,7 @@ from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.chains.llm_math.prompt import PROMPT
from langchain.chains.python import PythonChain
from langchain.input import ChainedInput
from langchain.llms.base import LLM
@@ -22,8 +23,6 @@ class LLMMathChain(Chain, BaseModel):
llm: LLM
"""LLM wrapper to use."""
verbose: bool = False
"""Whether to print out the code that was executed."""
input_key: str = "question" #: :meta private:
output_key: str = "answer" #: :meta private:
@@ -49,36 +48,21 @@ class LLMMathChain(Chain, BaseModel):
"""
return [self.output_key]
def _run(self, inputs: Dict[str, str]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
llm_executor = LLMChain(prompt=PROMPT, llm=self.llm)
python_executor = PythonChain()
question = inputs[self.input_key]
t = llm_executor.predict(question=question, stop=["```output"]).strip()
chained_input = ChainedInput(inputs[self.input_key], verbose=self.verbose)
t = llm_executor.predict(question=chained_input.input, stop=["```output"])
chained_input.add(t, color="green")
t = t.strip()
if t.startswith("```python"):
code = t[9:-4]
if self.verbose:
print("[DEBUG] evaluating code")
print(code)
output = python_executor.run(code)
chained_input.add("\nAnswer: ")
chained_input.add(output, color="yellow")
answer = "Answer: " + output
elif t.startswith("Answer:"):
answer = t
else:
raise ValueError(f"unknown format from LLM: {t}")
return {self.output_key: answer}
def run(self, question: str) -> str:
"""Understand user question and execute math in Python if necessary.
Args:
question: User question that contains a math question to parse and answer.
Returns:
The answer to the question.
Example:
.. code-block:: python
answer = llm_math.run("What is one plus one?")
"""
return self({self.input_key: question})[self.output_key]

View File

@@ -1,5 +1,5 @@
# flake8: noqa
from langchain.prompt import Prompt
from langchain.prompts.prompt import PromptTemplate
_PROMPT_TEMPLATE = """You are GPT-3, and you can't do math.
@@ -35,4 +35,4 @@ Answer: 2518731
Question: {question}"""
PROMPT = Prompt(input_variables=["question"], template=_PROMPT_TEMPLATE)
PROMPT = PromptTemplate(input_variables=["question"], template=_PROMPT_TEMPLATE)

View File

@@ -0,0 +1,75 @@
"""Map-reduce chain.
Splits up a document, sends the smaller parts to the LLM with one prompt,
then combines the results with another one.
"""
from typing import Dict, List
from pydantic import BaseModel, Extra
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.llms.base import LLM
from langchain.prompts.base import BasePromptTemplate
from langchain.text_splitter import TextSplitter
class MapReduceChain(Chain, BaseModel):
"""Map-reduce chain."""
map_llm: LLMChain
"""LLM wrapper to use for the map step."""
reduce_llm: LLMChain
"""LLM wrapper to use for the reduce step."""
text_splitter: TextSplitter
"""Text splitter to use."""
input_key: str = "input_text" #: :meta private:
output_key: str = "output_text" #: :meta private:
@classmethod
def from_params(
cls, llm: LLM, prompt: BasePromptTemplate, text_splitter: TextSplitter
) -> "MapReduceChain":
"""Construct a map-reduce chain that uses the chain for map and reduce."""
llm_chain = LLMChain(llm=llm, prompt=prompt)
return cls(map_llm=llm_chain, reduce_llm=llm_chain, text_splitter=text_splitter)
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
arbitrary_types_allowed = True
@property
def input_keys(self) -> List[str]:
"""Expect input key.
:meta private:
"""
return [self.input_key]
@property
def output_keys(self) -> List[str]:
"""Return output key.
:meta private:
"""
return [self.output_key]
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
# Split the larger text into smaller chunks.
docs = self.text_splitter.split_text(inputs[self.input_key])
# Now that we have the chunks, we send them to the LLM and track results.
# This is the "map" part.
input_list = [{self.map_llm.prompt.input_variables[0]: d} for d in docs]
summary_results = self.map_llm.apply(input_list)
summaries = [res[self.map_llm.output_key] for res in summary_results]
# We then need to combine these individual parts into one.
# This is the reduce part.
summary_str = "\n".join(summaries)
inputs = {self.reduce_llm.prompt.input_variables[0]: summary_str}
output = self.reduce_llm.predict(**inputs)
return {self.output_key: output}

View File

@@ -0,0 +1 @@
"""Attempt to implement MRKL systems as described in arxiv.org/pdf/2205.00445.pdf."""

View File

@@ -0,0 +1,170 @@
"""Attempt to implement MRKL systems as described in arxiv.org/pdf/2205.00445.pdf."""
from typing import Any, Callable, Dict, List, NamedTuple, Tuple
from pydantic import BaseModel, Extra
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.chains.mrkl.prompt import BASE_TEMPLATE
from langchain.input import ChainedInput, get_color_mapping
from langchain.llms.base import LLM
from langchain.prompts import BasePromptTemplate, PromptTemplate
FINAL_ANSWER_ACTION = "Final Answer: "
class ChainConfig(NamedTuple):
"""Configuration for chain to use in MRKL system.
Args:
action_name: Name of the action.
action: Action function to call.
action_description: Description of the action.
"""
action_name: str
action: Callable
action_description: str
def get_action_and_input(llm_output: str) -> Tuple[str, str]:
"""Parse out the action and input from the LLM output."""
ps = [p for p in llm_output.split("\n") if p]
if ps[-1].startswith(FINAL_ANSWER_ACTION):
directive = ps[-1][len(FINAL_ANSWER_ACTION) :]
return FINAL_ANSWER_ACTION, directive
if not ps[-1].startswith("Action Input: "):
raise ValueError(
"The last line does not have an action input, "
"something has gone terribly wrong."
)
if not ps[-2].startswith("Action: "):
raise ValueError(
"The second to last line does not have an action, "
"something has gone terribly wrong."
)
action = ps[-2][len("Action: ") :]
action_input = ps[-1][len("Action Input: ") :]
return action, action_input.strip(" ").strip('"')
class MRKLChain(Chain, BaseModel):
"""Chain that implements the MRKL system.
Example:
.. code-block:: python
from langchain import OpenAI, MRKLChain
from langchain.prompts import PromptTemplate
from langchain.chains.mrkl.base import ChainConfig
llm = OpenAI(temperature=0)
prompt = PromptTemplate(...)
action_to_chain_map = {...}
mrkl = MRKLChain(
llm=llm,
prompt=prompt,
action_to_chain_map=action_to_chain_map
)
"""
llm: LLM
"""LLM wrapper to use as router."""
prompt: BasePromptTemplate
"""Prompt to use as router."""
action_to_chain_map: Dict[str, Callable]
"""Mapping from action name to chain to execute."""
input_key: str = "question" #: :meta private:
output_key: str = "answer" #: :meta private:
@classmethod
def from_chains(
cls, llm: LLM, chains: List[ChainConfig], **kwargs: Any
) -> "MRKLChain":
"""User friendly way to initialize the MRKL chain.
This is intended to be an easy way to get up and running with the
MRKL chain.
Args:
llm: The LLM to use as the router LLM.
chains: The chains the MRKL system has access to.
**kwargs: parameters to be passed to initialization.
Returns:
An initialized MRKL chain.
Example:
.. code-block:: python
from langchain import LLMMathChain, OpenAI, SerpAPIChain, MRKLChain
from langchain.chains.mrkl.base import ChainConfig
llm = OpenAI(temperature=0)
search = SerpAPIChain()
llm_math_chain = LLMMathChain(llm=llm)
chains = [
ChainConfig(
action_name="Search",
action=search.search,
action_description="useful for searching"
),
ChainConfig(
action_name="Calculator",
action=llm_math_chain.run,
action_description="useful for doing math"
)
]
mrkl = MRKLChain.from_chains(llm, chains)
"""
tools = "\n".join(
[f"{chain.action_name}: {chain.action_description}" for chain in chains]
)
tool_names = ", ".join([chain.action_name for chain in chains])
template = BASE_TEMPLATE.format(tools=tools, tool_names=tool_names)
prompt = PromptTemplate(template=template, input_variables=["input"])
action_to_chain_map = {chain.action_name: chain.action for chain in chains}
return cls(
llm=llm, prompt=prompt, action_to_chain_map=action_to_chain_map, **kwargs
)
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
arbitrary_types_allowed = True
@property
def input_keys(self) -> List[str]:
"""Expect input key.
:meta private:
"""
return [self.input_key]
@property
def output_keys(self) -> List[str]:
"""Return output key.
:meta private:
"""
return [self.output_key]
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
llm_chain = LLMChain(llm=self.llm, prompt=self.prompt)
chained_input = ChainedInput(
f"{inputs[self.input_key]}\nThought:", verbose=self.verbose
)
color_mapping = get_color_mapping(
list(self.action_to_chain_map.keys()), excluded_colors=["green"]
)
while True:
thought = llm_chain.predict(
input=chained_input.input, stop=["\nObservation"]
)
chained_input.add(thought, color="green")
action, action_input = get_action_and_input(thought)
if action == FINAL_ANSWER_ACTION:
return {self.output_key: action_input}
chain = self.action_to_chain_map[action]
ca = chain(action_input)
chained_input.add("\nObservation: ")
chained_input.add(ca, color=color_mapping[action])
chained_input.add("\nThought:")
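The router loop in `_call` can be exercised offline by replacing the LLM with scripted replies. The sketch below is an illustration under stated assumptions: `parse` is a condensed copy of `get_action_and_input` without its error checks, and `run_router`, `replies`, and the toy `Calculator` tool are hypothetical names that do not appear in the diff.

```python
from typing import Callable, Dict, List, Tuple

FINAL_ANSWER_ACTION = "Final Answer: "


def parse(llm_output: str) -> Tuple[str, str]:
    """Condensed version of get_action_and_input (error checks omitted)."""
    lines = [line for line in llm_output.split("\n") if line]
    if lines[-1].startswith(FINAL_ANSWER_ACTION):
        return FINAL_ANSWER_ACTION, lines[-1][len(FINAL_ANSWER_ACTION):]
    action = lines[-2][len("Action: "):]
    action_input = lines[-1][len("Action Input: "):]
    return action, action_input.strip(" ").strip('"')


def run_router(
    question: str, replies: List[str], tools: Dict[str, Callable[[str], str]]
) -> str:
    """Drive the Thought/Action/Observation loop with pre-scripted replies."""
    transcript = f"{question}\nThought:"
    for reply in replies:
        transcript += reply
        action, action_input = parse(reply)
        if action == FINAL_ANSWER_ACTION:
            return action_input
        # Look up the tool by name and feed its result back as an observation.
        observation = tools[action](action_input)
        transcript += f"\nObservation: {observation}\nThought:"
    raise RuntimeError("scripted replies ran out before a final answer")


# Toy tool (assumption): adds integers separated by "+".
tools = {"Calculator": lambda expr: str(sum(int(n) for n in expr.split("+")))}
replies = [
    ' I should add the numbers.\nAction: Calculator\nAction Input: "2+3"',
    " I now know the final answer\nFinal Answer: 5",
]
answer = run_router("What is 2+3?", replies, tools)
print(answer)
```

An action name that is not a key of `tools` raises `KeyError` here, just as an unknown action would in `self.action_to_chain_map[action]`.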

@@ -0,0 +1,19 @@
# flake8: noqa
BASE_TEMPLATE = """Answer the following questions as best you can. You have access to the following tools:
{tools}
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
Question: {{input}}"""
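The doubled braces in `{{input}}` matter because the template is formatted twice: once in `from_chains` to fill `{tools}` and `{tool_names}`, and again later when the question is supplied. A minimal sketch of that two-stage substitution with plain `str.format` follows; the shortened `BASE` string and the variable names are assumptions for illustration, not the real `BASE_TEMPLATE`.

```python
# Shortened stand-in for BASE_TEMPLATE (assumption): same placeholder layout.
BASE = "Tools:\n{tools}\nAction: one of [{tool_names}]\nQuestion: {{input}}"

# Stage one: fill in the tool list. "{{input}}" collapses to "{input}".
stage_one = BASE.format(tools="Search: useful for searching", tool_names="Search")

# Stage two: the surviving "{input}" placeholder receives the question.
stage_two = stage_one.format(input="Who wrote Dune?")

print(stage_one)
print(stage_two)
```

One consequence of this design is that a tool description containing literal braces would be treated as a placeholder in the second stage, so descriptions are safest when brace-free.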

@@ -57,7 +57,7 @@ class NatBotChain(Chain, BaseModel):
"""
return [self.output_key]
def _run(self, inputs: Dict[str, str]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
llm_executor = LLMChain(prompt=PROMPT, llm=self.llm)
url = inputs[self.input_url_key]
browser_content = inputs[self.input_browser_content_key]
@@ -71,7 +71,7 @@ class NatBotChain(Chain, BaseModel):
self.previous_command = llm_cmd
return {self.output_key: llm_cmd}
def run(self, url: str, browser_content: str) -> str:
def execute(self, url: str, browser_content: str) -> str:
"""Figure out next browser command to run.
Args:

@@ -3,8 +3,6 @@
import time
from sys import platform
from playwright.sync_api import sync_playwright
black_listed_elements = {
"html",
"head",
@@ -23,14 +21,14 @@ black_listed_elements = {
class Crawler:
def __init__(self):
self.browser = (
sync_playwright()
.start()
.chromium.launch(
headless=False,
try:
from playwright.sync_api import sync_playwright
except ImportError:
raise ValueError(
"Could not import playwright python package. "
"Please install it with `pip install playwright`."
)
)
self.browser = sync_playwright().start().chromium.launch(headless=False)
self.page = self.browser.new_page()
self.page.set_viewport_size({"width": 1280, "height": 1080})

@@ -1,5 +1,5 @@
# flake8: noqa
from langchain.prompt import Prompt
from langchain.prompts.prompt import PromptTemplate
_PROMPT_TEMPLATE = """
You are an agent controlling a browser. You are given:
@@ -30,7 +30,7 @@ Based on your given objective, issue whatever command you believe will get you c
You always start on Google; you should submit a search query to Google that will take you to the best page for
achieving your objective. And then interact with that page to achieve your objective.
If you find yourself on Google and there are no search results displayed yet, you should probably issue a command
If you find yourself on Google and there are no search results displayed yet, you should probably issue a command
like "TYPESUBMIT 7 "search query"" to get to a more useful page.
Then, if you find yourself on a Google search results page, you might issue the command "CLICK 24" to click
@@ -66,7 +66,7 @@ CURRENT BROWSER CONTENT:
------------------
OBJECTIVE: Find a 2 bedroom house for sale in Anchorage AK for under $750k
CURRENT URL: https://www.google.com/
YOUR COMMAND:
YOUR COMMAND:
TYPESUBMIT 8 "anchorage redfin"
==================================================
@@ -95,7 +95,7 @@ CURRENT BROWSER CONTENT:
------------------
OBJECTIVE: Make a reservation for 4 at Dorsia at 8pm
CURRENT URL: https://www.google.com/
YOUR COMMAND:
YOUR COMMAND:
TYPESUBMIT 8 "dorsia nyc opentable"
==================================================
@@ -114,15 +114,15 @@ CURRENT BROWSER CONTENT:
<text id=9>Sep 28, 2022</text>
<text id=10>7:00 PM</text>
<text id=11>2 people</text>
<input id=12 alt="Location, Restaurant, or Cuisine"></input>
<input id=12 alt="Location, Restaurant, or Cuisine"></input>
<button id=13>Lets go</button>
<text id=14>It looks like you're in Peninsula. Not correct?</text>
<text id=14>It looks like you're in Peninsula. Not correct?</text>
<button id=15>Get current location</button>
<button id=16>Next</button>
------------------
OBJECTIVE: Make a reservation for 4 for dinner at Dorsia in New York City at 8pm
CURRENT URL: https://www.opentable.com/
YOUR COMMAND:
YOUR COMMAND:
TYPESUBMIT 12 "dorsia new york city"
==================================================
@@ -138,7 +138,7 @@ CURRENT URL: {url}
PREVIOUS COMMAND: {previous_command}
YOUR COMMAND:
"""
PROMPT = Prompt(
PROMPT = PromptTemplate(
input_variables=["browser_content", "url", "previous_command", "objective"],
template=_PROMPT_TEMPLATE,
)

@@ -9,6 +9,7 @@ from typing import Dict, List
from pydantic import BaseModel
from langchain.chains.base import Chain
from langchain.python import PythonREPL
class PythonChain(Chain, BaseModel):
@@ -40,27 +41,11 @@ class PythonChain(Chain, BaseModel):
"""
return [self.output_key]
def _run(self, inputs: Dict[str, str]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
python_repl = PythonREPL()
old_stdout = sys.stdout
sys.stdout = mystdout = StringIO()
exec(inputs[self.input_key])
python_repl.run(inputs[self.input_key])
sys.stdout = old_stdout
output = mystdout.getvalue()
return {self.output_key: output}
def run(self, code: str) -> str:
"""Run code in python interpreter.
Args:
code: Code snippet to execute, should print out the answer.
Returns:
Answer from running the code and printing out the answer.
Example:
.. code-block:: python
answer = python_chain.run("print(1+1)")
"""
return self({self.input_key: code})[self.output_key]

@@ -0,0 +1 @@
"""Implements the ReAct paper from https://arxiv.org/pdf/2210.03629.pdf."""

@@ -0,0 +1,107 @@
"""Chain that implements the ReAct paper from https://arxiv.org/pdf/2210.03629.pdf."""
import re
from typing import Any, Dict, List, Tuple
from pydantic import BaseModel, Extra
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.chains.react.prompt import PROMPT
from langchain.docstore.base import Docstore
from langchain.docstore.document import Document
from langchain.input import ChainedInput
from langchain.llms.base import LLM
def predict_until_observation(
llm_chain: LLMChain, prompt: str, i: int
) -> Tuple[str, str, str]:
"""Generate text until an observation is needed."""
action_prefix = f"Action {i}: "
stop_seq = f"\nObservation {i}:"
ret_text = llm_chain.predict(input=prompt, stop=[stop_seq])
# Sometimes the LLM forgets to take an action, so we prompt it to.
while not ret_text.split("\n")[-1].startswith(action_prefix):
ret_text += f"\nAction {i}:"
new_text = llm_chain.predict(input=prompt + ret_text, stop=[stop_seq])
ret_text += new_text
# The action block should be the last line.
action_block = ret_text.split("\n")[-1]
action_str = action_block[len(action_prefix) :]
# Parse out the action and the directive.
re_matches = re.search(r"(.*?)\[(.*?)\]", action_str)
if re_matches is None:
raise ValueError(f"Could not parse action directive: {action_str}")
return ret_text, re_matches.group(1), re_matches.group(2)
class ReActChain(Chain, BaseModel):
"""Chain that implements the ReAct paper.
Example:
.. code-block:: python
from langchain import ReActChain, OpenAI
from langchain.docstore import Wikipedia
react = ReActChain(llm=OpenAI(), docstore=Wikipedia())
"""
llm: LLM
"""LLM wrapper to use."""
docstore: Docstore
"""Docstore to use."""
input_key: str = "question" #: :meta private:
output_key: str = "answer" #: :meta private:
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
arbitrary_types_allowed = True
@property
def input_keys(self) -> List[str]:
"""Expect input key.
:meta private:
"""
return [self.input_key]
@property
def output_keys(self) -> List[str]:
"""Return output key.
:meta private:
"""
return [self.output_key]
def _call(self, inputs: Dict[str, Any]) -> Dict[str, str]:
question = inputs[self.input_key]
llm_chain = LLMChain(llm=self.llm, prompt=PROMPT)
chained_input = ChainedInput(f"{question}\nThought 1:", verbose=self.verbose)
i = 1
document = None
while True:
ret_text, action, directive = predict_until_observation(
llm_chain, chained_input.input, i
)
chained_input.add(ret_text, color="green")
if action == "Search":
result = self.docstore.search(directive)
if isinstance(result, Document):
document = result
observation = document.summary
else:
document = None
observation = result
elif action == "Lookup":
if document is None:
raise ValueError("Cannot lookup without a successful search first")
observation = document.lookup(directive)
elif action == "Finish":
return {self.output_key: directive}
else:
raise ValueError(f"Got unknown action directive: {action}")
chained_input.add(f"\nObservation {i}: ")
chained_input.add(observation, color="yellow")
chained_input.add(f"\nThought {i + 1}:")
i += 1
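The action parsing at the end of `predict_until_observation` relies on the regular expression `(.*?)\[(.*?)\]`, which splits a string such as `Search[Colorado orogeny]` into an action name and a directive. The sketch below isolates that step; `parse_react_action` is a hypothetical helper name, while the pattern itself is copied from the diff.

```python
import re
from typing import Tuple


def parse_react_action(action_str: str) -> Tuple[str, str]:
    """Split 'Name[directive]' into (name, directive) using the diff's regex."""
    match = re.search(r"(.*?)\[(.*?)\]", action_str)
    if match is None:
        raise ValueError(f"Could not parse action directive: {action_str}")
    return match.group(1), match.group(2)


print(parse_react_action("Search[Colorado orogeny]"))
print(parse_react_action("Search[High Plains (United States)]"))
print(parse_react_action("Finish[1,800 to 7,000 ft]"))
```

Because both groups are non-greedy, the directive ends at the first `]`. Parentheses and commas pass through untouched, but a directive that itself contains a closing bracket would be truncated.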

@@ -0,0 +1,112 @@
# flake8: noqa
from langchain.prompts.prompt import PromptTemplate
EXAMPLES = [
"""Question: What is the elevation range for the area that the eastern sector of the
Colorado orogeny extends into?
Thought 1: I need to search Colorado orogeny, find the area that the eastern sector
of the Colorado orogeny extends into, then find the elevation range of the
area.
Action 1: Search[Colorado orogeny]
Observation 1: The Colorado orogeny was an episode of mountain building (an orogeny) in
Colorado and surrounding areas.
Thought 2: It does not mention the eastern sector. So I need to look up eastern
sector.
Action 2: Lookup[eastern sector]
Observation 2: (Result 1 / 1) The eastern sector extends into the High Plains and is called
the Central Plains orogeny.
Thought 3: The eastern sector of Colorado orogeny extends into the High Plains. So I
need to search High Plains and find its elevation range.
Action 3: Search[High Plains]
Observation 3: High Plains refers to one of two distinct land regions
Thought 4: I need to instead search High Plains (United States).
Action 4: Search[High Plains (United States)]
Observation 4: The High Plains are a subregion of the Great Plains. From east to west, the
High Plains rise in elevation from around 1,800 to 7,000 ft (550 to 2,130
m).[3]
Thought 5: High Plains rise in elevation from around 1,800 to 7,000 ft, so the answer
is 1,800 to 7,000 ft.
Action 5: Finish[1,800 to 7,000 ft]""",
"""Question: Musician and satirist Allie Goertz wrote a song about the "The Simpsons"
character Milhouse, who Matt Groening named after who?
Thought 1: The question simplifies to "The Simpsons" character Milhouse is named after
who. I only need to search Milhouse and find who it is named after.
Action 1: Search[Milhouse]
Observation 1: Milhouse Mussolini Van Houten is a recurring character in the Fox animated
television series The Simpsons voiced by Pamela Hayden and created by Matt
Groening.
Thought 2: The paragraph does not tell who Milhouse is named after, maybe I can look up
"named after".
Action 2: Lookup[named after]
Observation 2: (Result 1 / 1) Milhouse was named after U.S. president Richard Nixon, whose
middle name was Milhous.
Thought 3: Milhouse was named after U.S. president Richard Nixon, so the answer is
Richard Nixon.
Action 3: Finish[Richard Nixon]""",
"""Question: Which documentary is about Finnish rock groups, Adam Clayton Powell or The
Saimaa Gesture?
Thought 1: I need to search Adam Clayton Powell and The Saimaa Gesture, and find which
documentary is about Finnish rock groups.
Action 1: Search[Adam Clayton Powell]
Observation 1: Could not find [Adam Clayton Powell]. Similar: [Adam Clayton Powell
III, Seventh Avenue (Manhattan), Adam Clayton Powell Jr. State Office
Building, Isabel Washington Powell, Adam Powell, Adam Clayton Powell
(film), Giancarlo Esposito].
Thought 2: To find the documentary, I can search Adam Clayton Powell (film).
Action 2: Search[Adam Clayton Powell (film)]
Observation 2: Adam Clayton Powell is a 1989 American documentary film directed by
Richard Kilberg. The film is about the rise and fall of influential
African-American politician Adam Clayton Powell Jr.[3][4] It was later aired
as part of the PBS series The American Experience.
Thought 3: Adam Clayton Powell (film) is a documentary about an African-American
politician, not Finnish rock groups. So the documentary about Finnish rock
groups must instead be The Saimaa Gesture.
Action 3: Finish[The Saimaa Gesture]""",
"""Question: What profession does Nicholas Ray and Elia Kazan have in common?
Thought 1: I need to search Nicholas Ray and Elia Kazan, find their professions, then
find the profession they have in common.
Action 1: Search[Nicholas Ray]
Observation 1: Nicholas Ray (born Raymond Nicholas Kienzle Jr., August 7, 1911 - June 16,
1979) was an American film director, screenwriter, and actor best known for
the 1955 film Rebel Without a Cause.
Thought 2: Professions of Nicholas Ray are director, screenwriter, and actor. I need
to search Elia Kazan next and find his professions.
Action 2: Search[Elia Kazan]
Observation 2: Elia Kazan was an American film and theatre director, producer, screenwriter
and actor.
Thought 3: Professions of Elia Kazan are director, producer, screenwriter, and actor.
So profession Nicholas Ray and Elia Kazan have in common is director,
screenwriter, and actor.
Action 3: Finish[director, screenwriter, actor]""",
"""Question: Which magazine was started first Arthurs Magazine or First for Women?
Thought 1: I need to search Arthurs Magazine and First for Women, and find which was
started first.
Action 1: Search[Arthurs Magazine]
Observation 1: Arthurs Magazine (1844-1846) was an American literary periodical published
in Philadelphia in the 19th century.
Thought 2: Arthurs Magazine was started in 1844. I need to search First for Women
next.
Action 2: Search[First for Women]
Observation 2: First for Women is a womans magazine published by Bauer Media Group in the
USA.[1] The magazine was started in 1989.
Thought 3: First for Women was started in 1989. 1844 (Arthurs Magazine) < 1989 (First
for Women), so Arthurs Magazine was started first.
Action 3: Finish[Arthurs Magazine]""",
"""Question: Were Pavel Urysohn and Leonid Levin known for the same type of work?
Thought 1: I need to search Pavel Urysohn and Leonid Levin, find their types of work,
then find if they are the same.
Action 1: Search[Pavel Urysohn]
Observation 1: Pavel Samuilovich Urysohn (February 3, 1898 - August 17, 1924) was a Soviet
mathematician who is best known for his contributions in dimension theory.
Thought 2: Pavel Urysohn is a mathematician. I need to search Leonid Levin next and
find its type of work.
Action 2: Search[Leonid Levin]
Observation 2: Leonid Anatolievich Levin is a Soviet-American mathematician and computer
scientist.
Thought 3: Leonid Levin is a mathematician and computer scientist. So Pavel Urysohn
and Leonid Levin have the same type of work.
Action 3: Finish[yes]""",
]
SUFFIX = """\n\nQuestion: {input}"""
PROMPT = PromptTemplate.from_examples(EXAMPLES, SUFFIX, ["input"])

@@ -7,6 +7,7 @@ from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.chains.self_ask_with_search.prompt import PROMPT
from langchain.chains.serpapi import SerpAPIChain
from langchain.input import ChainedInput
from langchain.llms.base import LLM
@@ -113,58 +114,36 @@ class SelfAskWithSearchChain(Chain, BaseModel):
"""
return [self.output_key]
def _run(self, inputs: Dict[str, Any]) -> Dict[str, str]:
question = inputs[self.input_key]
def _call(self, inputs: Dict[str, Any]) -> Dict[str, str]:
chained_input = ChainedInput(inputs[self.input_key], verbose=self.verbose)
chained_input.add("\nAre follow up questions needed here:")
llm_chain = LLMChain(llm=self.llm, prompt=PROMPT)
intermediate = "\nIntermediate answer:"
followup = "Follow up:"
finalans = "\nSo the final answer is:"
cur_prompt = f"{question}\nAre follow up questions needed here:"
print(cur_prompt, end="")
ret_text = llm_chain.predict(input=cur_prompt, stop=[intermediate])
print(greenify(ret_text), end="")
ret_text = llm_chain.predict(input=chained_input.input, stop=[intermediate])
chained_input.add(ret_text, color="green")
while followup in get_last_line(ret_text):
cur_prompt += ret_text
question = extract_question(ret_text, followup)
external_answer = self.search_chain.search(question)
external_answer = self.search_chain.run(question)
if external_answer is not None:
cur_prompt += intermediate + " " + external_answer + "."
print(
intermediate + " " + yellowfy(external_answer) + ".",
end="",
)
chained_input.add(intermediate + " ")
chained_input.add(external_answer + ".", color="yellow")
ret_text = llm_chain.predict(
input=cur_prompt, stop=["\nIntermediate answer:"]
input=chained_input.input, stop=["\nIntermediate answer:"]
)
print(greenify(ret_text), end="")
chained_input.add(ret_text, color="green")
else:
# We only get here in the very rare case that Google returns no answer.
cur_prompt += intermediate
print(intermediate + " ")
cur_prompt += llm_chain.predict(
input=cur_prompt, stop=["\n" + followup, finalans]
chained_input.add(intermediate + " ")
preds = llm_chain.predict(
input=chained_input.input, stop=["\n" + followup, finalans]
)
chained_input.add(preds, color="green")
if finalans not in ret_text:
cur_prompt += finalans
print(finalans, end="")
ret_text = llm_chain.predict(input=cur_prompt, stop=["\n"])
print(greenify(ret_text), end="")
chained_input.add(finalans)
ret_text = llm_chain.predict(input=chained_input.input, stop=["\n"])
chained_input.add(ret_text, color="green")
return {self.output_key: cur_prompt + ret_text}
def run(self, question: str) -> str:
"""Run self ask with search chain.
Args:
question: Question to run self-ask-with-search with.
Returns:
The final answer
Example:
.. code-block:: python
answer = selfask.run("What is the capital of Idaho?")
"""
return self({self.input_key: question})[self.output_key]
return {self.output_key: ret_text}

@@ -1,5 +1,5 @@
# flake8: noqa
from langchain.prompt import Prompt
from langchain.prompts.prompt import PromptTemplate
_DEFAULT_TEMPLATE = """Question: Who lived longer, Muhammad Ali or Alan Turing?
Are follow up questions needed here: Yes.
@@ -38,7 +38,4 @@ Intermediate Answer: New Zealand.
So the final answer is: No
Question: {input}"""
PROMPT = Prompt(
input_variables=["input"],
template=_DEFAULT_TEMPLATE,
)
PROMPT = PromptTemplate(input_variables=["input"], template=_DEFAULT_TEMPLATE)

@@ -4,12 +4,12 @@ Heavily borrowed from https://github.com/ofirpress/self-ask
"""
import os
import sys
from typing import Any, Dict, List
from typing import Any, Dict, List, Optional
from pydantic import BaseModel, Extra, root_validator
from serpapi import GoogleSearch
from langchain.chains.base import Chain
from langchain.utils import get_from_dict_or_env
class HiddenPrints:
@@ -30,7 +30,8 @@ class SerpAPIChain(Chain, BaseModel):
"""Chain that calls SerpAPI.
To use, you should have the ``google-search-results`` python package installed,
and the environment variable ``SERPAPI_API_KEY`` set with your API key.
and the environment variable ``SERPAPI_API_KEY`` set with your API key, or pass
`serpapi_api_key` as a named parameter to the constructor.
Example:
.. code-block:: python
@@ -39,9 +40,12 @@ class SerpAPIChain(Chain, BaseModel):
serpapi = SerpAPIChain()
"""
search_engine: Any #: :meta private:
input_key: str = "search_query" #: :meta private:
output_key: str = "search_result" #: :meta private:
serpapi_api_key: Optional[str] = None
class Config:
"""Configuration for this pydantic object."""
@@ -66,16 +70,24 @@ class SerpAPIChain(Chain, BaseModel):
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
if "SERPAPI_API_KEY" not in os.environ:
serpapi_api_key = get_from_dict_or_env(
values, "serpapi_api_key", "SERPAPI_API_KEY"
)
values["serpapi_api_key"] = serpapi_api_key
try:
from serpapi import GoogleSearch
values["search_engine"] = GoogleSearch
except ImportError:
raise ValueError(
"Did not find SerpAPI API key, please add an environment variable"
" `SERPAPI_API_KEY` which contains it."
"Could not import serpapi python package. "
"Please install it with `pip install google-search-results`."
)
return values
def _run(self, inputs: Dict[str, Any]) -> Dict[str, str]:
def _call(self, inputs: Dict[str, Any]) -> Dict[str, str]:
params = {
"api_key": os.environ["SERPAPI_API_KEY"],
"api_key": self.serpapi_api_key,
"engine": "google",
"q": inputs[self.input_key],
"google_domain": "google.com",
@@ -83,9 +95,10 @@ class SerpAPIChain(Chain, BaseModel):
"hl": "en",
}
with HiddenPrints():
search = GoogleSearch(params)
search = self.search_engine(params)
res = search.get_dict()
if "error" in res.keys():
raise ValueError(f"Got error from SerpAPI: {res['error']}")
if "answer_box" in res.keys() and "answer" in res["answer_box"].keys():
toret = res["answer_box"]["answer"]
elif "answer_box" in res.keys() and "snippet" in res["answer_box"].keys():
@@ -100,19 +113,3 @@ class SerpAPIChain(Chain, BaseModel):
else:
toret = None
return {self.output_key: toret}
def search(self, search_question: str) -> str:
"""Run search query against SerpAPI.
Args:
search_question: Question to run against the SerpAPI.
Returns:
Answer from the search engine.
Example:
.. code-block:: python
answer = serpapi.search("What is the capital of Idaho?")
"""
return self({self.input_key: search_question})[self.output_key]
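The validator change above resolves the API key from either the constructor arguments or the environment. The body of `get_from_dict_or_env` is not shown in this diff, so the following is only a guess at the behaviour its name and call site suggest: `from_dict_or_env`, the `DEMO_API_KEY` variable, and the error message are all assumptions.

```python
import os
from typing import Any, Dict


def from_dict_or_env(values: Dict[str, Any], key: str, env_key: str) -> str:
    """Hypothetical lookup: explicit value first, then environment variable."""
    if values.get(key):
        return values[key]
    if os.environ.get(env_key):
        return os.environ[env_key]
    raise ValueError(
        f"Did not find {key}; pass it as a named parameter "
        f"or set the environment variable `{env_key}`."
    )


os.environ["DEMO_API_KEY"] = "from-env"
explicit = from_dict_or_env({"demo_api_key": "explicit"}, "demo_api_key", "DEMO_API_KEY")
fallback = from_dict_or_env({}, "demo_api_key", "DEMO_API_KEY")
print(explicit, fallback)
```

Giving the explicit argument priority lets a caller override a machine-wide key for a single chain instance.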

@@ -0,0 +1 @@
"""Chain for interacting with SQL Database."""

@@ -0,0 +1,74 @@
"""Chain for interacting with SQL Database."""
from typing import Dict, List
from pydantic import BaseModel, Extra
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.chains.sql_database.prompt import PROMPT
from langchain.input import ChainedInput
from langchain.llms.base import LLM
from langchain.sql_database import SQLDatabase
class SQLDatabaseChain(Chain, BaseModel):
"""Chain for interacting with SQL Database.
Example:
.. code-block:: python
from langchain import SQLDatabaseChain, OpenAI, SQLDatabase
db = SQLDatabase(...)
db_chain = SQLDatabaseChain(llm=OpenAI(), database=db)
"""
llm: LLM
"""LLM wrapper to use."""
database: SQLDatabase
"""SQL Database to connect to."""
input_key: str = "query" #: :meta private:
output_key: str = "result" #: :meta private:
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
arbitrary_types_allowed = True
@property
def input_keys(self) -> List[str]:
"""Return the singular input key.
:meta private:
"""
return [self.input_key]
@property
def output_keys(self) -> List[str]:
"""Return the singular output key.
:meta private:
"""
return [self.output_key]
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
llm_chain = LLMChain(llm=self.llm, prompt=PROMPT)
chained_input = ChainedInput(
inputs[self.input_key] + "\nSQLQuery:", verbose=self.verbose
)
llm_inputs = {
"input": chained_input.input,
"dialect": self.database.dialect,
"table_info": self.database.table_info,
"stop": ["\nSQLResult:"],
}
sql_cmd = llm_chain.predict(**llm_inputs)
chained_input.add(sql_cmd, color="green")
result = self.database.run(sql_cmd)
chained_input.add("\nSQLResult: ")
chained_input.add(result, color="yellow")
chained_input.add("\nAnswer:")
llm_inputs["input"] = chained_input.input
final_result = llm_chain.predict(**llm_inputs)
chained_input.add(final_result, color="green")
return {self.output_key: final_result}
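`_call` above makes two model passes over one growing prompt: the first produces the SQL, the second reads the query result and produces the answer. A self-contained sketch of that flow using the standard-library `sqlite3` module and a scripted stand-in for the model follows; `answer_with_sql`, `scripted_llm`, and the `users` table are hypothetical and exist only for illustration.

```python
import sqlite3
from typing import Callable


def answer_with_sql(
    question: str, conn: sqlite3.Connection, llm: Callable[[str], str]
) -> str:
    """Two passes over one growing prompt: write the query, then read the result."""
    prompt = f"Question: {question}\nSQLQuery:"
    sql_cmd = llm(prompt)  # first pass: the model writes the query
    rows = conn.execute(sql_cmd).fetchall()
    prompt += f"{sql_cmd}\nSQLResult: {rows}\nAnswer:"
    return llm(prompt)  # second pass: the model reads the result


def scripted_llm(prompt: str) -> str:
    """Stand-in model (assumption): fixed query, then echo the SQL result."""
    if prompt.endswith("SQLQuery:"):
        return " SELECT COUNT(*) FROM users"
    result = prompt.split("SQLResult: ")[1].split("\n")[0]
    return f" The query returned {result}."


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("ada",), ("grace",)])
final = answer_with_sql("How many users are there?", conn, scripted_llm)
print(final)
```

Executing model-written SQL directly, as both this sketch and the chain do, is only reasonable against a database where arbitrary statements are acceptable.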

@@ -0,0 +1,19 @@
# flake8: noqa
from langchain.prompts.prompt import PromptTemplate
_DEFAULT_TEMPLATE = """Given an input question, first create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.
Use the following format:
Question: "Question here"
SQLQuery: "SQL Query to run"
SQLResult: "Result of the SQLQuery"
Answer: "Final answer here"
Only use the following tables:
{table_info}
Question: {input}"""
PROMPT = PromptTemplate(
input_variables=["input", "table_info", "dialect"], template=_DEFAULT_TEMPLATE
)

@@ -0,0 +1 @@
"""Chain for question-answering against a vector database."""

@@ -0,0 +1,66 @@
"""Chain for question-answering against a vector database."""
from typing import Dict, List
from pydantic import BaseModel, Extra
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.chains.vector_db_qa.prompt import prompt
from langchain.llms.base import LLM
from langchain.vectorstores.base import VectorStore
class VectorDBQA(Chain, BaseModel):
"""Chain for question-answering against a vector database.
Example:
.. code-block:: python
from langchain import OpenAI, VectorDBQA
from langchain.faiss import FAISS
vectordb = FAISS(...)
vectordbQA = VectorDBQA(llm=OpenAI(), vectorstore=vectordb)
"""
llm: LLM
"""LLM wrapper to use."""
vectorstore: VectorStore
"""Vector Database to connect to."""
k: int = 4
"""Number of documents to query for."""
input_key: str = "query" #: :meta private:
output_key: str = "result" #: :meta private:
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
arbitrary_types_allowed = True
@property
def input_keys(self) -> List[str]:
"""Return the singular input key.
:meta private:
"""
return [self.input_key]
@property
def output_keys(self) -> List[str]:
"""Return the singular output key.
:meta private:
"""
return [self.output_key]
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
question = inputs[self.input_key]
llm_chain = LLMChain(llm=self.llm, prompt=prompt)
docs = self.vectorstore.similarity_search(question, k=self.k)
contexts = []
for j, doc in enumerate(docs):
contexts.append(f"Context {j}:\n{doc.page_content}")
# TODO: handle cases where this context is too long.
answer = llm_chain.predict(question=question, context="\n\n".join(contexts))
return {self.output_key: answer}
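The context string passed to the prompt is built by numbering each retrieved passage and joining the blocks with blank lines. The sketch below reproduces that assembly and adds one possible answer to the TODO about overly long context: a character budget. The `max_chars` parameter and the `build_context` name are assumptions and are not part of the diff.

```python
from typing import List, Optional


def build_context(passages: List[str], max_chars: Optional[int] = None) -> str:
    """Number each passage; optionally stop once a character budget is spent."""
    contexts: List[str] = []
    used = 0
    for j, passage in enumerate(passages):
        block = f"Context {j}:\n{passage}"
        # Hypothetical budget check; separators between blocks are not counted.
        if max_chars is not None and used + len(block) > max_chars:
            break
        contexts.append(block)
        used += len(block)
    return "\n\n".join(contexts)


print(build_context(["Paris is in France.", "Rome is in Italy."]))
```

Dropping whole trailing passages keeps each remaining block intact, at the cost of possibly discarding a relevant but lower-ranked passage.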

@@ -0,0 +1,12 @@
# flake8: noqa
from langchain.prompts import PromptTemplate
prompt_template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.
{context}
Question: {question}
Helpful Answer:"""
prompt = PromptTemplate(
template=prompt_template, input_variables=["context", "question"]
)

@@ -0,0 +1,4 @@
"""Wrappers on top of docstores."""
from langchain.docstore.wikipedia import Wikipedia
__all__ = ["Wikipedia"]

@@ -0,0 +1,17 @@
"""Interface to access a place that stores documents."""
from abc import ABC, abstractmethod
from typing import Union
from langchain.docstore.document import Document
class Docstore(ABC):
"""Interface to access a place that stores documents."""
@abstractmethod
def search(self, search: str) -> Union[str, Document]:
"""Search for document.
If the page exists, return it as a Document object.
If the page does not exist, return a string listing similar entries.
"""

@@ -0,0 +1,39 @@
"""Interface for interacting with a document."""
from typing import List
from pydantic import BaseModel, Field
class Document(BaseModel):
"""Interface for interacting with a document."""
page_content: str
lookup_str: str = ""
lookup_index: int = 0
metadata: dict = Field(default_factory=dict)
@property
def paragraphs(self) -> List[str]:
"""Paragraphs of the page."""
return self.page_content.split("\n\n")
@property
def summary(self) -> str:
"""Summary of the page (the first paragraph)."""
return self.paragraphs[0]
def lookup(self, string: str) -> str:
"""Lookup a term in the page, imitating cmd-F functionality."""
if string.lower() != self.lookup_str:
self.lookup_str = string.lower()
self.lookup_index = 0
else:
self.lookup_index += 1
lookups = [p for p in self.paragraphs if self.lookup_str in p.lower()]
if len(lookups) == 0:
return "No Results"
elif self.lookup_index >= len(lookups):
return "No More Results"
else:
result_prefix = f"(Result {self.lookup_index + 1}/{len(lookups)})"
return f"{result_prefix} {lookups[self.lookup_index]}"
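The `lookup` method keeps a cursor so that repeating the same term steps through successive matching paragraphs, much like pressing Enter again in a find dialog. The behaviour is easiest to see in a dependency-free re-implementation; `Page` below is a hypothetical dataclass that copies the logic from the diff and is not the pydantic `Document` class itself.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for Document with the same lookup logic."""

    content: str
    lookup_str: str = ""
    lookup_index: int = 0

    def lookup(self, term: str) -> str:
        if term.lower() != self.lookup_str:
            # New term: reset the cursor to the first match.
            self.lookup_str = term.lower()
            self.lookup_index = 0
        else:
            # Same term again: advance to the next match.
            self.lookup_index += 1
        hits = [p for p in self.content.split("\n\n") if self.lookup_str in p.lower()]
        if not hits:
            return "No Results"
        if self.lookup_index >= len(hits):
            return "No More Results"
        return f"(Result {self.lookup_index + 1}/{len(hits)}) {hits[self.lookup_index]}"


page = Page("Intro text.\n\nThe eastern sector is flat.\n\nThe Eastern plains are dry.")
first = page.lookup("eastern")
second = page.lookup("eastern")
third = page.lookup("eastern")
print(first, second, third, sep="\n")
```

Matching is case-insensitive and works at paragraph granularity, so a term that appears twice in one paragraph still counts as a single result.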

@@ -0,0 +1,20 @@
"""Simple in memory docstore in the form of a dict."""
from typing import Dict, Union

from langchain.docstore.base import Docstore
from langchain.docstore.document import Document


class InMemoryDocstore(Docstore):
    """Simple in memory docstore in the form of a dict."""

    def __init__(self, _dict: Dict[str, Document]):
        """Initialize with dict."""
        self._dict = _dict

    def search(self, search: str) -> Union[str, Document]:
        """Search via direct lookup."""
        if search not in self._dict:
            return f"ID {search} not found."
        else:
            return self._dict[search]


@@ -0,0 +1,38 @@
"""Wrapper around wikipedia API."""
from typing import Union

from langchain.docstore.base import Docstore
from langchain.docstore.document import Document


class Wikipedia(Docstore):
    """Wrapper around wikipedia API."""

    def __init__(self) -> None:
        """Check that wikipedia package is installed."""
        try:
            import wikipedia  # noqa: F401
        except ImportError:
            raise ValueError(
                "Could not import wikipedia python package. "
                "Please install it with `pip install wikipedia`."
            )

    def search(self, search: str) -> Union[str, Document]:
        """Try to search for wiki page.

        If page exists, return a Document object holding the page content.
        If page does not exist, return similar entries.
        """
        import wikipedia

        try:
            page_content = wikipedia.page(search).content
            result: Union[str, Document] = Document(page_content=page_content)
        except wikipedia.PageError:
            result = f"Could not find [{search}]. Similar: {wikipedia.search(search)}"
        except wikipedia.DisambiguationError:
            result = f"Could not find [{search}]. Similar: {wikipedia.search(search)}"
        return result


@@ -0,0 +1,6 @@
"""Wrappers around embedding modules."""
from langchain.embeddings.cohere import CohereEmbeddings
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from langchain.embeddings.openai import OpenAIEmbeddings
__all__ = ["OpenAIEmbeddings", "HuggingFaceEmbeddings", "CohereEmbeddings"]


@@ -0,0 +1,15 @@
"""Interface for embedding models."""
from abc import ABC, abstractmethod
from typing import List


class Embeddings(ABC):
    """Interface for embedding models."""

    @abstractmethod
    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Embed search docs."""

    @abstractmethod
    def embed_query(self, text: str) -> List[float]:
        """Embed query text."""
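
Review note: the interface asks implementers for two methods, one for a batch of documents and one for a single query. A minimal sketch of a conforming subclass follows, assuming a toy model: `LetterCountEmbeddings` and the `dot` helper are hypothetical names invented for this note, and the `Embeddings` class is repeated only so the snippet runs without langchain installed.

```python
from abc import ABC, abstractmethod
from typing import List


class Embeddings(ABC):
    """Same two-method interface as in the diff, repeated so the sketch runs alone."""

    @abstractmethod
    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Embed search docs."""

    @abstractmethod
    def embed_query(self, text: str) -> List[float]:
        """Embed query text."""


class LetterCountEmbeddings(Embeddings):
    """Hypothetical toy model: one dimension per letter a-z, holding its count."""

    def embed_query(self, text: str) -> List[float]:
        vector = [0.0] * 26
        for ch in text.lower():
            if "a" <= ch <= "z":
                vector[ord(ch) - ord("a")] += 1.0
        return vector

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Documents and queries share one embedding function in this toy model.
        return [self.embed_query(text) for text in texts]


def dot(u: List[float], v: List[float]) -> float:
    """Plain dot product used as a similarity score."""
    return sum(a * b for a, b in zip(u, v))


embedder = LetterCountEmbeddings()
docs = ["abc", "xyz"]
doc_vectors = embedder.embed_documents(docs)
query_vector = embedder.embed_query("cab")
scores = [dot(query_vector, v) for v in doc_vectors]
best = docs[scores.index(max(scores))]
```

Real wrappers such as the OpenAI one in this diff may use different models for documents and queries, which is why the interface keeps the two methods separate.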


@@ -0,0 +1,74 @@
"""Wrapper around Cohere embedding models."""
from typing import Any, Dict, List, Optional

from pydantic import BaseModel, Extra, root_validator

from langchain.embeddings.base import Embeddings
from langchain.utils import get_from_dict_or_env


class CohereEmbeddings(BaseModel, Embeddings):
    """Wrapper around Cohere embedding models.

    To use, you should have the ``cohere`` python package installed, and the
    environment variable ``COHERE_API_KEY`` set with your API key or pass it
    as a named parameter to the constructor.

    Example:
        .. code-block:: python

            from langchain.embeddings import CohereEmbeddings
            cohere = CohereEmbeddings(model="medium", cohere_api_key="my-api-key")
    """

    client: Any  #: :meta private:
    model: str = "medium"
    """Model name to use."""

    cohere_api_key: Optional[str] = None

    class Config:
        """Configuration for this pydantic object."""

        extra = Extra.forbid

    @root_validator()
    def validate_environment(cls, values: Dict) -> Dict:
        """Validate that api key and python package exists in environment."""
        cohere_api_key = get_from_dict_or_env(
            values, "cohere_api_key", "COHERE_API_KEY"
        )
        try:
            import cohere

            values["client"] = cohere.Client(cohere_api_key)
        except ImportError:
            raise ValueError(
                "Could not import cohere python package. "
                "Please install it with `pip install cohere`."
            )
        return values

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Call out to Cohere's embedding endpoint.

        Args:
            texts: The list of texts to embed.

        Returns:
            List of embeddings, one for each text.
        """
        embeddings = self.client.embed(model=self.model, texts=texts).embeddings
        return embeddings

    def embed_query(self, text: str) -> List[float]:
        """Call out to Cohere's embedding endpoint.

        Args:
            text: The text to embed.

        Returns:
            Embeddings for the text.
        """
        embedding = self.client.embed(model=self.model, texts=[text]).embeddings[0]
        return embedding


@@ -0,0 +1,68 @@
"""Wrapper around HuggingFace embedding models."""
from typing import Any, List

from pydantic import BaseModel, Extra

from langchain.embeddings.base import Embeddings


class HuggingFaceEmbeddings(BaseModel, Embeddings):
    """Wrapper around sentence_transformers embedding models.

    To use, you should have the ``sentence_transformers`` python package installed.

    Example:
        .. code-block:: python

            from langchain.embeddings import HuggingFaceEmbeddings
            model_name = "sentence-transformers/all-mpnet-base-v2"
            huggingface = HuggingFaceEmbeddings(model_name=model_name)
    """

    client: Any  #: :meta private:
    model_name: str = "sentence-transformers/all-mpnet-base-v2"
    """Model name to use."""

    def __init__(self, **kwargs: Any):
        """Initialize the sentence_transformer."""
        super().__init__(**kwargs)
        try:
            import sentence_transformers

            self.client = sentence_transformers.SentenceTransformer(self.model_name)
        except ImportError:
            raise ValueError(
                "Could not import sentence_transformers python package. "
                "Please install it with `pip install sentence_transformers`."
            )

    class Config:
        """Configuration for this pydantic object."""

        extra = Extra.forbid

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Compute doc embeddings using a HuggingFace transformer model.

        Args:
            texts: The list of texts to embed.

        Returns:
            List of embeddings, one for each text.
        """
        texts = list(map(lambda x: x.replace("\n", " "), texts))
        embeddings = self.client.encode(texts)
        return embeddings

    def embed_query(self, text: str) -> List[float]:
        """Compute query embeddings using a HuggingFace transformer model.

        Args:
            text: The text to embed.

        Returns:
            Embeddings for the text.
        """
        text = text.replace("\n", " ")
        embedding = self.client.encode(text)
        return embedding


@@ -0,0 +1,86 @@
"""Wrapper around OpenAI embedding models."""
from typing import Any, Dict, List, Optional

from pydantic import BaseModel, Extra, root_validator

from langchain.embeddings.base import Embeddings
from langchain.utils import get_from_dict_or_env


class OpenAIEmbeddings(BaseModel, Embeddings):
    """Wrapper around OpenAI embedding models.

    To use, you should have the ``openai`` python package installed, and the
    environment variable ``OPENAI_API_KEY`` set with your API key or pass it
    as a named parameter to the constructor.

    Example:
        .. code-block:: python

            from langchain.embeddings import OpenAIEmbeddings
            openai = OpenAIEmbeddings(model_name="davinci", openai_api_key="my-api-key")
    """

    client: Any  #: :meta private:
    model_name: str = "babbage"
    """Model name to use."""

    openai_api_key: Optional[str] = None

    class Config:
        """Configuration for this pydantic object."""

        extra = Extra.forbid

    @root_validator()
    def validate_environment(cls, values: Dict) -> Dict:
        """Validate that api key and python package exists in environment."""
        openai_api_key = get_from_dict_or_env(
            values, "openai_api_key", "OPENAI_API_KEY"
        )
        try:
            import openai

            openai.api_key = openai_api_key
            values["client"] = openai.Embedding
        except ImportError:
            raise ValueError(
                "Could not import openai python package. "
                "Please install it with `pip install openai`."
            )
        return values

    def _embedding_func(self, text: str, *, engine: str) -> List[float]:
        """Call out to OpenAI's embedding endpoint."""
        # replace newlines, which can negatively affect performance.
        text = text.replace("\n", " ")
        return self.client.create(input=[text], engine=engine)["data"][0]["embedding"]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Call out to OpenAI's embedding endpoint for embedding search docs.

        Args:
            texts: The list of texts to embed.

        Returns:
            List of embeddings, one for each text.
        """
        responses = [
            self._embedding_func(text, engine=f"text-search-{self.model_name}-doc-001")
            for text in texts
        ]
        return responses

    def embed_query(self, text: str) -> List[float]:
        """Call out to OpenAI's embedding endpoint for embedding query text.

        Args:
            text: The text to embed.

        Returns:
            Embeddings for the text.
        """
        embedding = self._embedding_func(
            text, engine=f"text-search-{self.model_name}-query-001"
        )
        return embedding


@@ -0,0 +1,23 @@
"""Utility functions for working with prompts."""
from typing import List

from langchain.chains.llm import LLMChain
from langchain.llms.base import LLM
from langchain.prompts.few_shot import FewShotPromptTemplate
from langchain.prompts.prompt import PromptTemplate

TEST_GEN_TEMPLATE_SUFFIX = "Add another example."


def generate_example(
    examples: List[dict], llm: LLM, prompt_template: PromptTemplate
) -> str:
    """Return another example given a list of examples for a prompt."""
    prompt = FewShotPromptTemplate(
        examples=examples,
        suffix=TEST_GEN_TEMPLATE_SUFFIX,
        input_variables=[],
        example_prompt=prompt_template,
    )
    chain = LLMChain(llm=llm, prompt=prompt)
    return chain.predict()


@@ -1,6 +1,7 @@
 """Utilities for formatting strings."""
 from string import Formatter
 from typing import Any, Mapping, Sequence, Union
+from mako.template import Template
 class StrictFormatter(Formatter):
@@ -28,5 +29,10 @@ class StrictFormatter(Formatter):
             )
         return super().vformat(format_string, args, kwargs)
+    def mako_format(self, format_string: str, **kwargs: Any) -> str:
+        """Format a string using mako."""
+        template = Template(format_string)
+        return template.render(**kwargs)
 formatter = StrictFormatter()

51
langchain/input.py Normal file

@@ -0,0 +1,51 @@
"""Handle chained inputs."""
from typing import Dict, List, Optional

_TEXT_COLOR_MAPPING = {
    "blue": "36;1",
    "yellow": "33;1",
    "pink": "38;5;200",
    "green": "32;1",
}


def get_color_mapping(
    items: List[str], excluded_colors: Optional[List] = None
) -> Dict[str, str]:
    """Get a mapping from each item to a supported color."""
    colors = list(_TEXT_COLOR_MAPPING.keys())
    if excluded_colors is not None:
        colors = [c for c in colors if c not in excluded_colors]
    color_mapping = {item: colors[i % len(colors)] for i, item in enumerate(items)}
    return color_mapping


def print_text(text: str, color: Optional[str] = None, end: str = "") -> None:
    """Print text with highlighting and no end characters."""
    if color is None:
        print(text, end=end)
    else:
        color_str = _TEXT_COLOR_MAPPING[color]
        print(f"\u001b[{color_str}m\033[1;3m{text}\u001b[0m", end=end)


class ChainedInput:
    """Class for working with input that is the result of chains."""

    def __init__(self, text: str, verbose: bool = False):
        """Initialize with verbose flag and initial text."""
        self._verbose = verbose
        if self._verbose:
            print_text(text, None)
        self._input = text

    def add(self, text: str, color: Optional[str] = None) -> None:
        """Add text to input, print if in verbose mode."""
        if self._verbose:
            print_text(text, color)
        self._input += text

    @property
    def input(self) -> str:
        """Return the accumulated input."""
        return self._input
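
Review note: `get_color_mapping` assigns colors round-robin, so once there are more items than available colors the assignment wraps around. The sketch below repeats the function (trimmed of type hints and docstring) so it runs alone; the item names in `tools` are invented for illustration and are not taken from the diff.

```python
_TEXT_COLOR_MAPPING = {
    "blue": "36;1",
    "yellow": "33;1",
    "pink": "38;5;200",
    "green": "32;1",
}


def get_color_mapping(items, excluded_colors=None):
    # Same logic as in the diff: cycle through the remaining colors in order.
    colors = list(_TEXT_COLOR_MAPPING.keys())
    if excluded_colors is not None:
        colors = [c for c in colors if c not in excluded_colors]
    return {item: colors[i % len(colors)] for i, item in enumerate(items)}


# Hypothetical item names; five items against four colors forces a wrap-around.
tools = ["search", "lookup", "calculator", "sql", "python"]
mapping = get_color_mapping(tools)
restricted = get_color_mapping(tools, excluded_colors=["green"])
```

One thing worth checking in review: if every color is excluded, `len(colors)` is zero and the modulo raises `ZeroDivisionError`, so callers should leave at least one color available.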


@@ -1,5 +1,7 @@
 """Wrappers on top of large language models APIs."""
 from langchain.llms.cohere import Cohere
+from langchain.llms.huggingface_hub import HuggingFaceHub
+from langchain.llms.nlpcloud import NLPCloud
 from langchain.llms.openai import OpenAI
-__all__ = ["Cohere", "OpenAI"]
+__all__ = ["Cohere", "NLPCloud", "OpenAI", "HuggingFaceHub"]

128
langchain/llms/ai21.py Normal file

@@ -0,0 +1,128 @@
"""Wrapper around AI21 APIs."""
from typing import Any, Dict, List, Mapping, Optional

import requests
from pydantic import BaseModel, Extra, root_validator

from langchain.llms.base import LLM
from langchain.utils import get_from_dict_or_env


class AI21PenaltyData(BaseModel):
    """Parameters for AI21 penalty data."""

    scale: int = 0
    applyToWhitespaces: bool = True
    applyToPunctuations: bool = True
    applyToNumbers: bool = True
    applyToStopwords: bool = True
    applyToEmojis: bool = True


class AI21(BaseModel, LLM):
    """Wrapper around AI21 large language models.

    To use, you should have the environment variable ``AI21_API_KEY``
    set with your API key.

    Example:
        .. code-block:: python

            from langchain import AI21
            ai21 = AI21(model="j1-jumbo")
    """

    model: str = "j1-jumbo"
    """Model name to use."""

    temperature: float = 0.7
    """What sampling temperature to use."""

    maxTokens: int = 256
    """The maximum number of tokens to generate in the completion."""

    minTokens: int = 0
    """The minimum number of tokens to generate in the completion."""

    topP: float = 1.0
    """Total probability mass of tokens to consider at each step."""

    presencePenalty: AI21PenaltyData = AI21PenaltyData()
    """Penalizes repeated tokens."""

    countPenalty: AI21PenaltyData = AI21PenaltyData()
    """Penalizes repeated tokens according to count."""

    frequencyPenalty: AI21PenaltyData = AI21PenaltyData()
    """Penalizes repeated tokens according to frequency."""

    numResults: int = 1
    """How many completions to generate for each prompt."""

    logitBias: Optional[Dict[str, float]] = None
    """Adjust the probability of specific tokens being generated."""

    ai21_api_key: Optional[str] = None

    class Config:
        """Configuration for this pydantic object."""

        extra = Extra.forbid

    @root_validator()
    def validate_environment(cls, values: Dict) -> Dict:
        """Validate that api key exists in environment."""
        ai21_api_key = get_from_dict_or_env(values, "ai21_api_key", "AI21_API_KEY")
        values["ai21_api_key"] = ai21_api_key
        return values

    @property
    def _default_params(self) -> Mapping[str, Any]:
        """Get the default parameters for calling AI21 API."""
        return {
            "temperature": self.temperature,
            "maxTokens": self.maxTokens,
            "minTokens": self.minTokens,
            "topP": self.topP,
            "presencePenalty": self.presencePenalty.dict(),
            "countPenalty": self.countPenalty.dict(),
            "frequencyPenalty": self.frequencyPenalty.dict(),
            "numResults": self.numResults,
            "logitBias": self.logitBias,
        }

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        """Get the identifying parameters."""
        return {**{"model": self.model}, **self._default_params}

    def __call__(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        """Call out to AI21's complete endpoint.

        Args:
            prompt: The prompt to pass into the model.
            stop: Optional list of stop words to use when generating.

        Returns:
            The string generated by the model.

        Example:
            .. code-block:: python

                response = ai21("Tell me a joke.")
        """
        if stop is None:
            stop = []
        response = requests.post(
            url=f"https://api.ai21.com/studio/v1/{self.model}/complete",
            headers={"Authorization": f"Bearer {self.ai21_api_key}"},
            json={"prompt": prompt, "stopSequences": stop, **self._default_params},
        )
        if response.status_code != 200:
            optional_detail = response.json().get("error")
            raise ValueError(
                f"AI21 /complete call failed with status code {response.status_code}."
                f" Details: {optional_detail}"
            )
        response_json = response.json()
        return response_json["completions"][0]["data"]["text"]


@@ -1,6 +1,6 @@
 """Base interface for large language models to expose."""
 from abc import ABC, abstractmethod
-from typing import List, Optional
+from typing import Any, List, Mapping, Optional
 class LLM(ABC):
@@ -9,3 +9,13 @@ class LLM(ABC):
     @abstractmethod
     def __call__(self, prompt: str, stop: Optional[List[str]] = None) -> str:
         """Run the LLM on the given prompt and input."""
+    @property
+    @abstractmethod
+    def _identifying_params(self) -> Mapping[str, Any]:
+        """Get the identifying parameters."""
+    def __str__(self) -> str:
+        """Get a string representation of the object for printing."""
+        cls_name = f"\033[1m{self.__class__.__name__}\033[0m"
+        return f"{cls_name}\nParams: {self._identifying_params}"


@@ -1,35 +1,29 @@
 """Wrapper around Cohere APIs."""
-import os
-from typing import Dict, List, Optional
+from typing import Any, Dict, List, Mapping, Optional
-import cohere
 from pydantic import BaseModel, Extra, root_validator
 from langchain.llms.base import LLM
+from langchain.llms.utils import enforce_stop_tokens
+from langchain.utils import get_from_dict_or_env
-def remove_stop_tokens(text: str, stop: List[str]) -> str:
-    """Remove stop tokens, should they occur at end."""
-    for s in stop:
-        if text.endswith(s):
-            return text[: -len(s)]
-    return text
-class Cohere(BaseModel, LLM):
+class Cohere(LLM, BaseModel):
     """Wrapper around Cohere large language models.
     To use, you should have the ``cohere`` python package installed, and the
-    environment variable ``COHERE_API_KEY`` set with your API key.
+    environment variable ``COHERE_API_KEY`` set with your API key, or pass
+    it as a named parameter to the constructor.
     Example:
         .. code-block:: python
             from langchain import Cohere
-            cohere = Cohere(model="gptd-instruct-tft")
+            cohere = Cohere(model="gptd-instruct-tft", cohere_api_key="my-api-key")
     """
-    model: str = "gptd-instruct-tft"
+    client: Any  #: :meta private:
+    model: Optional[str] = None
     """Model name to use."""
     max_tokens: int = 256
@@ -50,21 +44,47 @@ class Cohere(BaseModel, LLM):
     presence_penalty: int = 0
     """Penalizes repeated tokens."""
+    cohere_api_key: Optional[str] = None
     class Config:
         """Configuration for this pydantic object."""
         extra = Extra.forbid
     @root_validator()
-    def template_is_valid(cls, values: Dict) -> Dict:
+    def validate_environment(cls, values: Dict) -> Dict:
         """Validate that api key and python package exists in environment."""
-        if "COHERE_API_KEY" not in os.environ:
+        cohere_api_key = get_from_dict_or_env(
+            values, "cohere_api_key", "COHERE_API_KEY"
+        )
+        try:
+            import cohere
+            values["client"] = cohere.Client(cohere_api_key)
+        except ImportError:
             raise ValueError(
-                "Did not find Cohere API key, please add an environment variable"
-                " `COHERE_API_KEY` which contains it."
+                "Could not import cohere python package. "
+                "Please install it with `pip install cohere`."
             )
         return values
+    @property
+    def _default_params(self) -> Mapping[str, Any]:
+        """Get the default parameters for calling Cohere API."""
+        return {
+            "max_tokens": self.max_tokens,
+            "temperature": self.temperature,
+            "k": self.k,
+            "p": self.p,
+            "frequency_penalty": self.frequency_penalty,
+            "presence_penalty": self.presence_penalty,
+        }
+    @property
+    def _identifying_params(self) -> Mapping[str, Any]:
+        """Get the identifying parameters."""
+        return {**{"model": self.model}, **self._default_params}
     def __call__(self, prompt: str, stop: Optional[List[str]] = None) -> str:
         """Call out to Cohere's generate endpoint.
@@ -80,21 +100,12 @@ class Cohere(BaseModel, LLM):
                 response = cohere("Tell me a joke.")
         """
-        client = cohere.Client(os.environ["COHERE_API_KEY"])
-        response = client.generate(
-            model=self.model,
-            prompt=prompt,
-            max_tokens=self.max_tokens,
-            temperature=self.temperature,
-            k=self.k,
-            p=self.p,
-            frequency_penalty=self.frequency_penalty,
-            presence_penalty=self.presence_penalty,
-            stop_sequences=stop,
+        response = self.client.generate(
+            model=self.model, prompt=prompt, stop_sequences=stop, **self._default_params
         )
         text = response.generations[0].text
         # If stop tokens are provided, Cohere's endpoint returns them.
         # In order to make this consistent with other endpoints, we strip them.
         if stop is not None:
-            text = remove_stop_tokens(text, stop)
+            text = enforce_stop_tokens(text, stop)
         return text
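
Review note: this hunk swaps the local `remove_stop_tokens` helper for `enforce_stop_tokens` from `langchain.llms.utils`, whose body is not shown in this part of the diff. The sketch below contrasts the removed helper with one plausible replacement strategy, cutting the text at the first stop sequence. `truncate_at_first_stop` is a hypothetical name and an assumption about what such a helper might do, not the actual implementation; the sample completion is invented.

```python
import re
from typing import List


def remove_stop_tokens(text: str, stop: List[str]) -> str:
    """Helper removed in the diff: strip a stop token only at the very end."""
    for s in stop:
        if text.endswith(s):
            return text[: -len(s)]
    return text


def truncate_at_first_stop(text: str, stop: List[str]) -> str:
    """Hypothetical alternative: cut the text at the first stop sequence found."""
    if not stop:
        return text
    # Escape each stop sequence so characters like "?" are matched literally.
    pattern = "|".join(re.escape(s) for s in stop)
    return re.split(pattern, text, maxsplit=1)[0]


generated = "Paris\nQ: What is the capital of Spain?\nA: Madrid"
stop = ["\nQ:"]
suffix_only = remove_stop_tokens(generated, stop)
truncated = truncate_at_first_stop(generated, stop)
```

The suffix-only helper leaves the text untouched when the model keeps generating past the stop sequence, which is the case a truncating helper would cover.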


@@ -0,0 +1,112 @@
"""Wrapper around HuggingFace APIs."""
from typing import Any, Dict, List, Mapping, Optional

from pydantic import BaseModel, Extra, root_validator

from langchain.llms.base import LLM
from langchain.llms.utils import enforce_stop_tokens
from langchain.utils import get_from_dict_or_env

DEFAULT_REPO_ID = "gpt2"
VALID_TASKS = ("text2text-generation", "text-generation")


class HuggingFaceHub(LLM, BaseModel):
    """Wrapper around HuggingFaceHub models.

    To use, you should have the ``huggingface_hub`` python package installed, and the
    environment variable ``HUGGINGFACEHUB_API_TOKEN`` set with your API token, or pass
    it as a named parameter to the constructor.

    Only supports `text-generation` and `text2text-generation` for now.

    Example:
        .. code-block:: python

            from langchain import HuggingFaceHub
            hf = HuggingFaceHub(repo_id="gpt2", huggingfacehub_api_token="my-api-key")
    """

    client: Any  #: :meta private:
    repo_id: str = DEFAULT_REPO_ID
    """Model name to use."""
    task: Optional[str] = None
    """Task to call the model with. Should be a task that returns `generated_text`."""
    model_kwargs: Optional[dict] = None
    """Key word arguments to pass to the model."""

    huggingfacehub_api_token: Optional[str] = None

    class Config:
        """Configuration for this pydantic object."""

        extra = Extra.forbid

    @root_validator()
    def validate_environment(cls, values: Dict) -> Dict:
        """Validate that api key and python package exists in environment."""
        huggingfacehub_api_token = get_from_dict_or_env(
            values, "huggingfacehub_api_token", "HUGGINGFACEHUB_API_TOKEN"
        )
        try:
            from huggingface_hub.inference_api import InferenceApi

            repo_id = values.get("repo_id", DEFAULT_REPO_ID)
            client = InferenceApi(
                repo_id=repo_id,
                token=huggingfacehub_api_token,
                task=values.get("task"),
            )
            if client.task not in VALID_TASKS:
                raise ValueError(
                    f"Got invalid task {client.task}, "
                    f"currently only {VALID_TASKS} are supported"
                )
            values["client"] = client
        except ImportError:
            raise ValueError(
                "Could not import huggingface_hub python package. "
                "Please install it with `pip install huggingface_hub`."
            )
        return values

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        """Get the identifying parameters."""
        _model_kwargs = self.model_kwargs or {}
        return {**{"repo_id": self.repo_id}, **_model_kwargs}

    def __call__(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        """Call out to HuggingFace Hub's inference endpoint.

        Args:
            prompt: The prompt to pass into the model.
            stop: Optional list of stop words to use when generating.

        Returns:
            The string generated by the model.

        Example:
            .. code-block:: python

                response = hf("Tell me a joke.")
        """
        _model_kwargs = self.model_kwargs or {}
        response = self.client(inputs=prompt, params=_model_kwargs)
        if "error" in response:
            raise ValueError(f"Error raised by inference API: {response['error']}")
        if self.client.task == "text-generation":
            # Text generation return includes the starter text.
            text = response[0]["generated_text"][len(prompt) :]
        elif self.client.task == "text2text-generation":
            text = response[0]["generated_text"]
        else:
            raise ValueError(
                f"Got invalid task {self.client.task}, "
                f"currently only {VALID_TASKS} are supported"
            )
        if stop is not None:
            # This is a bit hacky, but I can't figure out a better way to enforce
            # stop tokens when making calls to huggingface_hub.
            text = enforce_stop_tokens(text, stop)
        return text

Some files were not shown because too many files have changed in this diff.