Compare commits

109 Commits

Author SHA1 Message Date
Harrison Chase
4d32441b86 bump version to 0076 (#847) 2023-02-02 10:05:39 -08:00
Harrison Chase
23d5f64bda Harrison/ngram example (#846)
Co-authored-by: Sean Spriggens <ssprigge@syr.edu>
2023-02-02 09:44:42 -08:00
Harrison Chase
0de55048b7 return code for pal (#844) 2023-02-02 08:47:20 -08:00
Harrison Chase
d564308e0f rfc: instruct embeddings (#811)
Co-authored-by: seanaedmiston <seane999@gmail.com>
2023-02-02 08:44:02 -08:00
Nick Furlotte
576609e665 Update PAL to allow passing local and global context to PythonREPL (#774)
Passing additional variables to the Python environment can be useful,
for example, if you want to generate code to analyze a dataset.

I also added a tracker for the executed code - `code_history`.
2023-02-02 08:34:23 -08:00
Harrison Chase
3f952eb597 add from string method (#820) 2023-02-02 08:23:54 -08:00
Ikko Eltociear Ashimine
ba26a879e0 Fix typo in crawler.py (#842)
seperator -> separator
2023-02-02 08:23:38 -08:00
Eli Mernit
bfabd1d5c0 Added new deployment template (#835)
This PR introduces a new template for deploying LangChain apps as web
endpoints. It includes template code, and links to a detailed
code-walkthrough.
2023-02-01 23:38:36 -08:00
Jonas Ehrenstein
f3508228df Minor fix for google search util: it's uncertain if "snippet" in results exists (#830)
The results from Google search may not always contain a "snippet". 

Example:
`{'kind': 'customsearch#result', 'title': 'FEMA Flood Map', 'htmlTitle':
'FEMA Flood Map', 'link': 'https://msc.fema.gov/portal/home',
'displayLink': 'msc.fema.gov', 'formattedUrl':
'https://msc.fema.gov/portal/home', 'htmlFormattedUrl':
'https://<b>msc</b>.fema.gov/portal/home'}`

This will cause a KeyError at line 99
`snippets.append(result["snippet"])`.
2023-02-01 23:37:52 -08:00
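The `KeyError` described in #830 can be avoided by checking for the key before reading it. The following is a minimal sketch of that guard, not the actual patch; `collect_snippets` and the sample data are hypothetical names made up for illustration.

```python
from typing import Any, Dict, List


def collect_snippets(results: List[Dict[str, Any]]) -> List[str]:
    """Collect the "snippet" field of each result, skipping results without one."""
    snippets = []
    for result in results:
        # Some results (like the FEMA example above) carry no "snippet" key.
        if "snippet" in result:
            snippets.append(result["snippet"])
    return snippets


# Hypothetical sample data: the first result has no snippet.
results = [
    {"title": "FEMA Flood Map", "link": "https://msc.fema.gov/portal/home"},
    {"title": "Example", "snippet": "An example snippet."},
]
print(collect_snippets(results))
```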
Zach Schillaci
b4eb043b81 Minor fix to SQLDatabaseChain doc (#826) 2023-02-01 23:37:38 -08:00
Istora Mandiri
06438794e1 Fix typo in textsplitter docs (#825) 2023-02-01 23:32:35 -08:00
Raza Habib
9f8e05ffd4 Update __init__.py (#827)
Remove duplicate APIChain
2023-02-01 23:31:38 -08:00
Harrison Chase
b0d560be56 add to gallery (#824) 2023-02-01 07:10:15 -08:00
Johanna Appel
ebea40ce86 Add 'truncate' parameter for CohereEmbeddings (#798)
Currently, the 'truncate' parameter of the cohere API is not supported.

This means that by default, if you try to generate an embedding for text
that is too long, the call just fails with an error (which is frustrating
when using this embedding source with, e.g., GPT-Index, because it's hard
to handle it properly when generating a lot of embeddings).
With the parameter, one can decide to either truncate the START or END
of the text to fit the max token length and still generate an embedding
without throwing the error.

In this PR, I added this parameter to the class.

_Arguably, there should be a better way to handle this error, e.g. by
optionally calling a function that gets triggered when the token limit
is reached and can split the document. Especially in the use case with
GPT-Index, it's often hard to estimate the token counts for each
document, and I'd rather sort out the troublemakers or simply split them
than interrupt the whole execution.
Thoughts?_

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-02-01 07:09:03 -08:00
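The START/END behaviour described in #798 can be illustrated locally. This is only a sketch under stated assumptions: it works on an already tokenized list, `truncate_tokens` is a hypothetical helper, and it does not call the Cohere API, which performs the truncation on its own side.

```python
from typing import List


def truncate_tokens(tokens: List[str], max_tokens: int, truncate: str = "NONE") -> List[str]:
    """Fit tokens into max_tokens by discarding from the START or the END.

    With truncate="NONE", input that is too long raises an error instead.
    """
    if max_tokens < 0:
        raise ValueError("max_tokens must be non-negative")
    if len(tokens) <= max_tokens:
        return list(tokens)
    if truncate == "START":
        # Discard the beginning, keep the last max_tokens tokens.
        return tokens[len(tokens) - max_tokens:]
    if truncate == "END":
        # Discard the end, keep the first max_tokens tokens.
        return tokens[:max_tokens]
    raise ValueError(f"input has {len(tokens)} tokens, limit is {max_tokens}")


tokens = "a b c d e".split()
print(truncate_tokens(tokens, 3, truncate="START"))
print(truncate_tokens(tokens, 3, truncate="END"))
```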
Harrison Chase
b9045f7e0d bump version to 0075 (#819) 2023-01-31 00:18:32 -08:00
Harrison Chase
7b4882a2f4 Harrison/tf embeddings (#817)
Co-authored-by: Ryohei Kuroki <10434946+yakigac@users.noreply.github.com>
2023-01-31 00:00:08 -08:00
Harrison Chase
5d4b6e4d4e conversational agent fix (#818) 2023-01-30 23:59:55 -08:00
Harrison Chase
94ae126747 return sql intermediate steps (#792) 2023-01-30 15:10:48 -08:00
bair82
ae5695ad32 Update cohere.py (#795)
When stop tokens are set in Cohere LLM constructor, they are currently
not stripped from the response, and they should be stripped
2023-01-30 14:55:44 -08:00
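One way to read the fix in #795 is that the response is cut at the first stop sequence. The sketch below shows that idea only; `strip_after_stop` is a hypothetical helper and not necessarily how `cohere.py` implements it.

```python
import re
from typing import List


def strip_after_stop(text: str, stop: List[str]) -> str:
    """Return text up to (and excluding) the first occurrence of any stop sequence."""
    if not stop:
        return text
    # Escape each stop sequence so characters like "." are matched literally.
    pattern = "|".join(re.escape(s) for s in stop)
    return re.split(pattern, text, maxsplit=1)[0]


print(strip_after_stop("Answer: 42\nObservation: ignored", ["\nObservation:"]))
```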
Johanna Appel
cacf4091c0 Fix documentation for 'model' parameter in CohereEmbeddings (#797)
Currently, the class parameter 'model_name' of the CohereEmbeddings
class is not supported, but 'model' is. The class documentation is
inconsistent with this, though, so I propose to either fix the
documentation (this PR right now) or fix the parameter.

Passing `model_name` currently raises the following error:
```
ValidationError: 1 validation error for CohereEmbeddings
model_name
  extra fields not permitted (type=value_error.extra)
```
2023-01-30 14:55:08 -08:00
Jason Liu
54f9e4287f Pass kwargs from initialize_agent into agent classmethod (#799)
# Problem
I noticed that in order to change the prefix of the prompt in the
`zero-shot-react-description` agent,
we had to dig through the agent's attributes and edit strings buried deep inside them.
This requires the user to inspect a long chain of attributes and classes.

`initialize_agent -> AgentExecutor -> Agent -> LLMChain -> Prompt from
Agent.create_prompt`

``` python
agent = initialize_agent(
    tools=tools,
    llm=fake_llm,
    agent="zero-shot-react-description"
)
prompt_str = agent.agent.llm_chain.prompt.template
new_prompt_str = change_prefix(prompt_str)
agent.agent.llm_chain.prompt.template = new_prompt_str
```

# Implemented Solution

`initialize_agent` accepts `**kwargs` but passes them only to
`AgentExecutor`, not to `ZeroShotAgent`. By simply giving the kwargs to
the agent class methods, we can support changing the prefix and suffix
for one agent while allowing future agents to take advantage of
`initialize_agent`.


```
agent = initialize_agent(
    tools=tools,
    llm=fake_llm,
    agent="zero-shot-react-description",
    agent_kwargs={"prefix": prefix, "suffix": suffix}
)
```

To be fair, this was before finding docs around custom agents here:
https://langchain.readthedocs.io/en/latest/modules/agents/examples/custom_agent.html?highlight=custom%20#custom-llmchain
but I find that my use case just needed to change the prefix a little.


# Changes

* Pass kwargs to Agent class method
* Added a test to check suffix and prefix

---------

Co-authored-by: Jason Liu <jason@jxnl.coA>
2023-01-30 14:54:09 -08:00
Roger Zurawicki
c331009440 docs: Update langchain link to PyPI (#800)
Simple one-line fix

CONTRIBUTING used a link that pointed to the `ruff` project.
2023-01-30 14:53:16 -08:00
Roy Williams
6086292252 Centralize logic for loading from LangChainHub, add ability to pin dependencies (#805)
It's generally considered to be a good practice to pin dependencies to
prevent surprise breakages when a new version of a dependency is
released. This commit adds the ability to pin dependencies when loading
from LangChainHub.

Centralizing this logic and using urllib fixes an issue that some
Windows users ran into, highlighted in this video:
https://youtu.be/aJ6IQUh8MLQ?t=537
2023-01-30 14:52:17 -08:00
Harrison Chase
b3916f74a7 enable mmr search (#807) 2023-01-30 14:48:24 -08:00
Harrison Chase
f46f1d28af expose memory key name (#808) 2023-01-30 14:48:12 -08:00
Harrison Chase
7728a848d0 Harrison/tracing docs (#806)
Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>
2023-01-29 20:49:35 -08:00
Harrison Chase
f3da4dc6ba Harrison/tracing docs (#804)
Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>
2023-01-29 20:24:22 -08:00
Harrison Chase
ae1b589f60 Harrison/add link for support (#794) 2023-01-28 22:53:04 -08:00
Harrison Chase
6a20f07f0d add link for support (#793) 2023-01-28 22:44:23 -08:00
Harrison Chase
fb2d7afe71 bump version to 0074 (#791) 2023-01-28 18:50:22 -08:00
Harrison Chase
1ad7973cc6 Harrison/tool decorator (#790)
Co-authored-by: Jason Liu <jxnl@users.noreply.github.com>
Co-authored-by: Jason Liu <jason@jxnl.coA>
2023-01-28 18:26:24 -08:00
Harrison Chase
5f73d06502 Harrison/fix caching bug (#788)
Co-authored-by: thepok <richterthepok@yahoo.de>
2023-01-28 14:24:30 -08:00
Harrison Chase
248c297f1b Sample row in table info for SQLDatabase (#769) (#782)
The agents usually benefit from understanding what the data looks like
to be able to filter effectively. Sending just one row in the table info
allows the agent to understand the data before querying and get better
results.

---------

Co-authored-by: Francisco Ingham <>

---------

Co-authored-by: Francisco Ingham <fpingham@gmail.com>
2023-01-28 13:37:07 -08:00
Francisco Ingham
213c2e33e5 Sql prompt improvement (#787)
Co-authored-by: Francisco Ingham <>
2023-01-28 13:34:15 -08:00
Harrison Chase
2e0219cac0 fixing bash util (#779) 2023-01-28 08:26:29 -08:00
Harrison Chase
966611bbfa add model kwargs to handle stop token from cohere (#773) 2023-01-28 08:24:55 -08:00
Harrison Chase
7198a1cb22 Harrison/refactor agent (#781)
Co-authored-by: Amos Ng <me@amos.ng>
2023-01-28 08:24:13 -08:00
Harrison Chase
5bb2952860 Harrison/hf pipeline (#780)
Co-authored-by: Parth Chadha <parth29@gmail.com>
2023-01-28 08:23:59 -08:00
Harrison Chase
c658f0aed3 Harrison/add to search (#778)
Co-authored-by: Enrico Shippole <enricoship@gmail.com>
2023-01-28 08:06:00 -08:00
Bill Kish
309d86e339 increase text-davinci-003 contextsize to 4097 (#748)
text-davinci-003 supports a context size of 4097 tokens so return 4097
instead of 4000 in modelname_to_contextsize() for text-davinci-003

Co-authored-by: Bill Kish <bill@cogniac.co>
2023-01-28 08:05:35 -08:00
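The context size matters because it bounds how many tokens are left for the completion once the prompt is counted. A minimal sketch of that arithmetic follows; the table holds only the value stated in #748, and `context_size` and `max_completion_tokens` are hypothetical stand-ins, not the library's functions.

```python
from typing import Dict

# Only the value stated in the commit message is listed here.
CONTEXT_SIZES: Dict[str, int] = {"text-davinci-003": 4097}


def context_size(model_name: str) -> int:
    """Look up the context window (in tokens) for a known model."""
    if model_name not in CONTEXT_SIZES:
        raise ValueError(f"unknown model: {model_name}")
    return CONTEXT_SIZES[model_name]


def max_completion_tokens(model_name: str, prompt_tokens: int) -> int:
    """Tokens left for the completion after the prompt is accounted for."""
    return context_size(model_name) - prompt_tokens


print(max_completion_tokens("text-davinci-003", prompt_tokens=97))
```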
Amos Ng
6ad360bdef Suggestions for better debugging (#765)
Please feel free to disregard any changes you disagree with
2023-01-28 08:05:20 -08:00
Albert Ziegler
5198d6f541 Add missing verb (#768)
Mini drive-by PR:

I came across this sentence in a stack trace for an error I had, and it
confused me because the verb was missing. So I added the verb.
2023-01-28 07:26:27 -08:00
Harrison Chase
a5d003f0c9 update notebook and make backwards compatible (#772) 2023-01-28 07:23:04 -08:00
Harrison Chase
924b7ecf89 pass kwargs and bump (#770) 2023-01-27 08:56:36 -08:00
Harrison Chase
fc19d14a65 bump version to 0072 (#767) 2023-01-27 08:03:41 -08:00
Harrison Chase
b9ad214801 add docs for loading from hub (#763) 2023-01-27 07:10:26 -08:00
Samantha Whitmore
be7de427ca Serialize all the chains! (#761)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-01-27 00:45:17 -08:00
Harrison Chase
e2a7fed890 Harrison/serialize from llm and tools (#760) 2023-01-26 23:30:39 -08:00
Harrison Chase
12dc7f26cc load agents from hub (#759) 2023-01-26 22:49:26 -08:00
Harrison Chase
7129f23511 output parser serialization (#758) 2023-01-26 21:51:13 -08:00
Harrison Chase
f273c50d62 add loading chains from hub (#757) 2023-01-26 21:11:31 -08:00
Harrison Chase
1b89a438cf (wip) Harrison/serialize agents (#725) 2023-01-26 19:48:47 -08:00
Harrison Chase
cc70565886 add prompt type (#730) 2023-01-26 19:48:00 -08:00
Francisco Ingham
374e510f94 Upper bound on number of iterations (#754)
Some custom agents might continue to iterate until they find the correct
answer, getting stuck in loops that generate request after request and
are really expensive for the end user. Putting an upper bound on the
number of iterations by default controls this, and the bound can be
explicitly tweaked by the user if necessary.

Co-authored-by: Francisco Ingham <>
2023-01-26 19:47:01 -08:00
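The iteration cap from #754 can be sketched as a bounded loop. Everything here is illustrative: `run_with_iteration_cap`, the `step` callback, the default of 10, and the stop message are assumptions, not the library's actual names or values.

```python
from typing import Callable, Optional

STOP_MESSAGE = "Stopped: iteration limit reached."  # hypothetical message


def run_with_iteration_cap(
    step: Callable[[int], Optional[str]],
    max_iterations: int = 10,  # arbitrary default chosen for this sketch
) -> str:
    """Call step(i) until it returns an answer or the cap is hit."""
    for i in range(max_iterations):
        answer = step(i)
        if answer is not None:
            return answer
    # The agent never finished, so stop instead of issuing more requests.
    return STOP_MESSAGE


# A step that never produces an answer is cut off after three calls.
print(run_with_iteration_cap(lambda i: None, max_iterations=3))
```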
Smit Shah
28efbb05bf Add params to reduce K dynamically to reduce it below token limit (#739)
Referring to #687, I implemented the functionality to reduce K if it
exceeds the token limit.

Edit: I should have run `make lint` locally. Also, this only applies to
`StuffDocumentChain`
2023-01-26 19:43:01 -08:00
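The idea in #739 of reducing K until the documents fit can be sketched as dropping trailing documents while the total token count exceeds the limit. This is a hedged illustration: `reduce_below_limit` is a hypothetical helper, and whitespace splitting stands in for a real tokenizer.

```python
from typing import Callable, List


def reduce_below_limit(
    docs: List[str],
    max_tokens: int,
    count_tokens: Callable[[str], int],
) -> List[str]:
    """Drop documents from the end until the total token count fits."""
    kept = list(docs)
    total = sum(count_tokens(d) for d in kept)
    while kept and total > max_tokens:
        # Remove the last (assumed least relevant) document and update the total.
        total -= count_tokens(kept.pop())
    return kept


def whitespace_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


docs = ["a b c", "d e", "f g h i"]
print(reduce_below_limit(docs, max_tokens=5, count_tokens=whitespace_tokens))
```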
Roy Williams
d2f882158f Add type information for crawler.py (#738)
Added type information to `crawler.py` to make it safer to use and
understand.
2023-01-26 19:37:31 -08:00
Harrison Chase
a80897478e bump version to 0071 (#755) 2023-01-26 18:55:25 -08:00
Ankush Gola
57609845df add tracing support to langchain (#741)
* add implementations of `BaseCallbackHandler` to support tracing:
`SharedTracer` which is thread-safe and `Tracer` which is not and is
meant to be used locally.
* Tracers persist runs to locally running `langchain-server`

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-01-26 17:38:13 -08:00
Harrison Chase
7f76a1189c bump version to 0.0.70 (#744) 2023-01-25 17:58:37 -08:00
Harrison Chase
2ba1128095 Harrison/backwards compat (#740) 2023-01-25 17:47:29 -08:00
Francisco Ingham
f9ddcb5705 Hotfix: distance_func and collection_name must not be in kwargs (#735)
If `distance_func` and `collection_name` are in `kwargs` they are sent
to the `QdrantClient` which results in an error being raised.

Co-authored-by: Francisco Ingham <>
2023-01-25 09:39:50 -08:00
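The fix in #735 amounts to removing the two store-level options from `kwargs` before the rest is forwarded to the client. A minimal sketch of that pattern follows; `split_store_kwargs` and both default values are hypothetical and not taken from the Qdrant integration.

```python
from typing import Any, Dict, Tuple


def split_store_kwargs(kwargs: Dict[str, Any]) -> Tuple[str, str, Dict[str, Any]]:
    """Pop store-level options so only client options remain in the dict."""
    remaining = dict(kwargs)  # copy, so the caller's dict is not mutated
    collection_name = remaining.pop("collection_name", "default_collection")
    distance_func = remaining.pop("distance_func", "Cosine")
    return collection_name, distance_func, remaining


name, distance, client_kwargs = split_store_kwargs(
    {"collection_name": "docs", "distance_func": "Dot", "host": "localhost"}
)
print(name, distance, client_kwargs)
```

Only `client_kwargs` would then be passed on to the client constructor.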
Amos Ng
fa6826e417 Fix sqlalchemy warnings when running tests (#733)
This has been bugging me when running my own tests that call langchain
methods :P
2023-01-25 07:14:07 -08:00
Harrison Chase
bd0bf4e0a9 Harrison/generate blog post (#732)
Co-authored-by: Ren <yirenlu92@users.noreply.github.com>
2023-01-24 22:54:12 -08:00
Harrison Chase
9194a8be89 add stop to stream (#729) 2023-01-24 22:49:24 -08:00
scadEfUr
e3df8ab6dc move hyde into chains (#728)
Co-authored-by: scadEfUr <>
2023-01-24 22:23:32 -08:00
Harrison Chase
0ffeabd14f Harrison/serialize llm chain (#671) 2023-01-24 21:36:19 -08:00
Sam Hogan
499e54edda fix typos in readme and text splitter docs (#720)
Fix typos in readme and TextSplitter documentation.
2023-01-24 10:59:23 -08:00
I-E-E-E
f62dbb018b fix a url (#719) 2023-01-24 10:56:15 -08:00
Николай Шангин
18b1466893 Fix not imported 'validator' (#715)
otherwise `@validator("input_variables")` does not work
2023-01-24 07:06:50 -08:00
Feynman Liang
2824f36401 Add namespace to Pinecone.from_index (#716)
Resolves https://github.com/hwchase17/langchain/issues/718
2023-01-24 07:02:57 -08:00
Kacper Łukawski
d4f719c34b Convert numpy arrays to lists in HuggingFaceEmbeddings (#714)
`SentenceTransformer` returns a NumPy array, not a `List[List[float]]`
or `List[float]` as specified in the interface of `Embeddings`. This PR
makes it consistent with the interface.
2023-01-24 07:01:40 -08:00
Kacper Łukawski
97c3544a1e Hotfix: Qdrant.from_text embeddings (#713)
I'm providing a hotfix for Qdrant integration. Calculating a single
embedding to obtain the vector size was a great idea. However, that
change introduced a bug in which only that single embedding was put into
the database. It's fixed. Right now all the embeddings will be pushed to
Qdrant.
2023-01-24 07:01:07 -08:00
Harrison Chase
b69b551c8b clarify use cases (#711) 2023-01-24 00:37:26 -08:00
Harrison Chase
1e4927a1d2 bump version to 0069 (#710) 2023-01-24 00:24:54 -08:00
Feynman Liang
3a38604f07 Fix typo (#705) 2023-01-23 23:08:38 -08:00
Nicolas
66fd57878a docs: Update vector_db_qa_with_sources.ipynb (#706) 2023-01-23 23:06:54 -08:00
Harrison Chase
fc4ad2db0f langchain hub docs (#704)
Co-authored-by: scadEfUr <123224380+scadEfUr@users.noreply.github.com>
2023-01-23 23:06:23 -08:00
Scott Leibrand
34932dd211 remove legacy embedding model name (#703)
Now that OpenAI has deprecated all embeddings models except
text-embedding-ada-002, we should stop specifying a legacy embedding
model in the example. This will also avoid confusion from people (like
me) trying to specify model="text-embedding-ada-002" and having that
erroneously expanded to text-search-text-embedding-ada-002-query-001
2023-01-23 14:31:31 -08:00
Harrison Chase
75edd85fed version 0068 (#701) 2023-01-23 07:24:09 -08:00
scadEfUr
4aba0abeaa added common prompt load method (#699)
Co-authored-by: scadEfUr
2023-01-22 23:46:11 -08:00
xloem
36b6b3cdf6 HuggingFacePipeline: Forward model_kwargs. (#696)
Since the tokenizer and model are constructed manually, model_kwargs
needs to be passed to their constructors. Additionally, the pipeline has
a specific named parameter to pass these with, which can provide forward
compatibility if they are used for something other than tokenizer or
model construction.
2023-01-22 23:38:47 -08:00
Harrison Chase
3a30e6daa8 Harrison/openai callback (#684) 2023-01-22 23:37:01 -08:00
Harrison Chase
aef82f5d59 fix whitespace for conversational agent (#690) 2023-01-22 22:39:53 -08:00
Amos Ng
8baf6fb920 Update examples to fix execution problems (#685)
On the [Getting Started
page](https://langchain.readthedocs.io/en/latest/modules/prompts/getting_started.html)
for prompt templates, I believe the very last example

```python
print(dynamic_prompt.format(adjective=long_string))
```

should actually be

```python
print(dynamic_prompt.format(input=long_string))
```

The existing example produces `KeyError: 'input'` as expected

***

On the [Create a custom prompt
template](https://langchain.readthedocs.io/en/latest/modules/prompts/examples/custom_prompt_template.html#id1)
page, I believe the line

```python
Function Name: {kwargs["function_name"]}
```

should actually be

```python
Function Name: {kwargs["function_name"].__name__}
```

The existing example produces the prompt:

```
        Given the function name and source code, generate an English language explanation of the function.
        Function Name: <function get_source_code at 0x7f907bc0e0e0>
        Source Code:
        def get_source_code(function_name):
    # Get the source code of the function
    return inspect.getsource(function_name)

        Explanation:
```

***

On the [Example
Selectors](https://langchain.readthedocs.io/en/latest/modules/prompts/examples/example_selectors.html)
page, the first example does not define `example_prompt`, which is also
subtly different from previous example prompts used. For user
convenience, I suggest including

```python
example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}",
)
```

in the code to be copy-pasted
2023-01-22 14:49:25 -08:00
Harrison Chase
86dbdb118b Harrison/serpapi extra tools (#691)
Co-authored-by: Bruno Bornsztein <bruno.bornsztein@gmail.com>
2023-01-22 14:48:54 -08:00
Saurav Maheshkar
b4fcdeb56c chore: move coverage config to pyproject (#694)
This PR aims to move the contents of `.coveragerc` to `pyproject.toml`
to make the overall file structure more minimal.
2023-01-22 14:48:20 -08:00
Nicolas
4ddfa82bb7 docs: small typo on serpapi.md (#693) 2023-01-22 13:10:24 -08:00
Nicolas
34cb8850e9 docs: small typo google_search.md (#692) 2023-01-22 13:09:15 -08:00
Harrison Chase
cbc146720b verbose flag (#683) 2023-01-22 12:44:14 -08:00
Harrison Chase
27cef0870d bump version to 0.0.67 (#689) 2023-01-22 10:24:03 -08:00
Samantha Whitmore
77e3d58922 ConversationEntityMemory: Chain which uses an entity extraction & sum… (#678)
…marization prompt to maintain a key-value store of memory information

cc @devennavani

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-01-22 10:10:02 -08:00
Ikko Eltociear Ashimine
64580259d0 Fix typo in hyde.ipynb (#688)
therefor -> therefore
2023-01-22 08:21:31 -08:00
dham
e04b063ff4 add faiss local saving/loading (#676)
- This uses the faiss built-in `write_index` and `load_index` to save
and load faiss indexes locally
- Also fixes #674
- The save/load functions also use the faiss library, so I refactored
the dependency into a function
2023-01-21 16:08:14 -08:00
Harrison Chase
e45f7e40e8 Harrison/few shot yaml (#682)
Co-authored-by: vintro <77507980+vintrocode@users.noreply.github.com>
2023-01-21 16:08:03 -08:00
Harrison Chase
a2eeaf3d43 strip whitespace (#680) 2023-01-21 16:03:48 -08:00
Will Olson
2f57d18b25 Update hyperlink in Custom Prompt Template page (#677)
The current link points to a non-existent page. I've updated the link to
match what is on the "Create a custom example selector" page.

<img width="584" alt="Screen Shot 2023-01-21 at 10 33 05 AM"
src="https://user-images.githubusercontent.com/6773706/213879535-d8f2953d-ac37-448d-9b32-fdeb7b73cc32.png">
2023-01-21 16:03:21 -08:00
Harrison Chase
3d41af0aba Harrison/load tools kwargs (#681)
Co-authored-by: Bruno Bornsztein <bruno.bornsztein@gmail.com>
2023-01-21 16:03:02 -08:00
trigaten
90e4b6b040 Create CITATION.cff (#672)
You may want to add doi/orcid

Followed this:
https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-citation-files
2023-01-21 15:55:58 -08:00
Harrison Chase
236ae93610 bump version to 0066 (#667) 2023-01-20 14:22:31 -08:00
Harrison Chase
0b204d8c21 Harrison/quadrant (#665)
Co-authored-by: Kacper Łukawski <kacperlukawski@users.noreply.github.com>
2023-01-20 09:45:01 -08:00
Harrison Chase
983b73f47c add search kwargs (#664) 2023-01-20 07:42:08 -08:00
vertinski
65f3a341b0 Prompt fix for empty intermediate steps in summarization (#660)
Adding quotation marks around {text} avoids generating empty or
completely random responses from OpenAI davinci-003. Empty or completely
unrelated intermediate responses in summarization mess up the final
result or make it very inaccurate.
The error from OpenAI would be: "The model predicted a completion that
begins with a stop sequence, resulting in no output. Consider adjusting
your prompt or stop sequences."
This fix corrects the prompting for the summarization chain. It works on
the API too; the images are for demonstrative purposes.
This approach can be applied to other similar prompts too.

Examples:

1) Without quotation marks
![Screenshot from 2023-01-20
07-18-19](https://user-images.githubusercontent.com/22897470/213624365-9dfc18f9-5f3f-45d2-abe1-56de67397e22.png)

2) With quotation marks
![Screenshot from 2023-01-20
07-18-35](https://user-images.githubusercontent.com/22897470/213624478-c958e742-a4a7-46fe-a163-eca6326d9dae.png)
2023-01-20 07:37:01 -08:00
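The change in #660 can be pictured with two template strings. The wording of both templates below is an assumption made for illustration, not the chain's actual prompt; the only point is the quotation marks around `{text}`, which keep the document slot visibly delimited even when the text is empty.

```python
# Hypothetical templates; only the quoting around {text} differs.
UNQUOTED_TEMPLATE = "Write a concise summary of the following:\n\n{text}\n\nCONCISE SUMMARY:"
QUOTED_TEMPLATE = 'Write a concise summary of the following:\n\n"{text}"\n\nCONCISE SUMMARY:'

chunk = "LangChain helps combine LLMs with other sources of computation."
print(QUOTED_TEMPLATE.format(text=chunk))

# With an empty chunk the quoted template still shows an explicit empty slot.
print(repr(UNQUOTED_TEMPLATE.format(text="")))
print(repr(QUOTED_TEMPLATE.format(text="")))
```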
iocuydi
69998b5fad Add ids parameter for pinecone from_texts / add_texts (#659)
Allow optionally specifying a list of ids for pinecone rather than
having them randomly generated.
This also permits editing the embedding/metadata of existing pinecone
entries, by id.
2023-01-20 06:50:03 -08:00
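The optional `ids` behaviour from #659 can be sketched as: use the caller's ids when given, otherwise generate random ones. `resolve_ids` is a hypothetical helper, and the length check is an added assumption rather than something the commit message states.

```python
import uuid
from typing import List, Optional


def resolve_ids(texts: List[str], ids: Optional[List[str]] = None) -> List[str]:
    """Return one id per text: the caller's ids if provided, else random UUIDs."""
    if ids is None:
        return [str(uuid.uuid4()) for _ in texts]
    if len(ids) != len(texts):
        raise ValueError("ids and texts must have the same length")
    return list(ids)


texts = ["first document", "second document"]
print(resolve_ids(texts, ids=["doc-1", "doc-2"]))
print(len(resolve_ids(texts)))
```

Reusing an existing id is what makes it possible to overwrite the embedding or metadata of an entry, as the commit message notes.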
Harrison Chase
54d7f1c933 fix caching (#658) 2023-01-19 15:33:45 -08:00
Harrison Chase
d0fdc6da11 Harrison/bing wrapper (#656)
Co-authored-by: Enrico Shippole <henryshippole@gmail.com>
2023-01-19 14:48:30 -08:00
iocuydi
207e319a70 Add search_kwargs option for VectorDBQAWithSourcesChain (#657)
Allows for passing additional vectorstore params like namespace, etc. to
VectorDBQAWithSourcesChain

Example:
`chain = VectorDBQAWithSourcesChain.from_llm(OpenAI(temperature=0),
vectorstore=store, search_kwargs={"namespace": namespace})`
2023-01-19 14:48:13 -08:00
Charles Frye
bfb23f4608 typo bugfixes in getting started with prompts (#651)
tl;dr: input -> word, output -> antonym, rename to dynamic_prompt
consistently

The provided code in this example doesn't run, because the keys are
`word` and `antonym`, rather than `input` and `output`.

Also, the `ExampleSelector`-based prompt is named `few_shot_prompt` when
defined and `dynamic_prompt` in the follow-up example. The former name
is less descriptive and collides with an earlier example, so I opted for
the latter.

Thanks for making a really cool library!
2023-01-19 07:05:20 -08:00
John
3adc5227cd typo (#650) 2023-01-19 07:03:11 -08:00
Harrison Chase
052c361031 pinecone docstring (#654) 2023-01-19 07:02:52 -08:00
163 changed files with 9384 additions and 1162 deletions

View File

@@ -1,2 +0,0 @@
[run]
omit = tests/*

8
CITATION.cff Normal file
View File

@@ -0,0 +1,8 @@
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Chase"
  given-names: "Harrison"
title: "LangChain"
date-released: 2022-10-17
url: "https://github.com/hwchase17/langchain"

View File

@@ -47,7 +47,7 @@ good code into the codebase.
### 🏭Release process
As of now, LangChain has an ad hoc release process: releases are cut with high frequency via by
-a developer and published to [PyPI](https://pypi.org/project/ruff/).
+a developer and published to [PyPI](https://pypi.org/project/langchain/).
LangChain follows the [semver](https://semver.org/) versioning standard. However, as pre-1.0 software,
even patch releases may contain [non-backwards-compatible changes](https://semver.org/#spec-item-4).

View File

@@ -4,6 +4,9 @@
[![lint](https://github.com/hwchase17/langchain/actions/workflows/lint.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/lint.yml) [![test](https://github.com/hwchase17/langchain/actions/workflows/test.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/test.yml) [![linkcheck](https://github.com/hwchase17/langchain/actions/workflows/linkcheck.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/linkcheck.yml) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![Twitter](https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI)](https://twitter.com/langchainai) [![](https://dcbadge.vercel.app/api/server/6adMQxSpJS?compact=true&style=flat)](https://discord.gg/6adMQxSpJS)
**Production Support:** As you move your LangChains into production, we'd love to offer more comprehensive support.
Please fill out [this form](https://forms.gle/57d8AmXBYp8PP8tZA) and we'll set up a dedicated support Slack channel.
## Quick Install
`pip install langchain`
@@ -15,7 +18,22 @@ developers to build applications that they previously could not.
But using these LLMs in isolation is often not enough to
create a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge.
This library is aimed at assisting in the development of those types of applications.
This library is aimed at assisting in the development of those types of applications. Common examples of these types of applications include:
**❓ Question Answering over specific documents**
- [Documentation](https://langchain.readthedocs.io/en/latest/use_cases/question_answering.html)
- End-to-end Example: [Question Answering over Notion Database](https://github.com/hwchase17/notion-qa)
**💬 Chatbots**
- [Documentation](https://langchain.readthedocs.io/en/latest/use_cases/chatbots.html)
- End-to-end Example: [Chat-LangChain](https://github.com/hwchase17/chat-langchain)
**🤖 Agents**
- [Documentation](https://langchain.readthedocs.io/en/latest/use_cases/agents.html)
- End-to-end Example: [GPT+WolframAlpha](https://huggingface.co/spaces/JavaFXpert/Chat-GPT-LangChain)
## 📖 Documentation

View File

@@ -22,3 +22,9 @@ This repo serves as a template for how deploy a LangChain with Gradio.
It implements a chatbot interface, with a "Bring-Your-Own-Token" approach (nice for not wracking up big bills).
It also contains instructions for how to deploy this app on the Hugging Face platform.
This is heavily influenced by James Weaver's [excellent examples](https://huggingface.co/JavaFXpert).
## [Beam](https://github.com/slai-labs/get-beam/tree/main/examples/langchain-question-answering)
This repo serves as a template for how deploy a LangChain with [Beam](https://beam.cloud).
It implements a Question Answering app and contains instructions for deploying the app as a serverless REST API.

View File

@@ -1,7 +1,7 @@
# Google Search Wrapper
This page covers how to use the Google Search API within LangChain.
-It is broken into two parts: installation and setup, and then references to specific Pinecone wrappers.
+It is broken into two parts: installation and setup, and then references to the specific Google Search wrapper.
## Installation and Setup
- Install requirements with `pip install google-api-python-client`

View File

@@ -1,7 +1,7 @@
# SerpAPI
This page covers how to use the SerpAPI search APIs within LangChain.
-It is broken into two parts: installation and setup, and then references to specific Pinecone wrappers.
+It is broken into two parts: installation and setup, and then references to the specific SerpAPI wrapper.
## Installation and Setup
- Install requirements with `pip install google-search-results`

View File

@@ -77,6 +77,17 @@ Open Source
+++
A jupyter notebook demonstrating how you could create a semantic search engine on documents in one of your Google Folders
---
.. link-button:: https://github.com/venuv/langchain_semantic_search
:type: url
:text: Google Folder Semantic Search
:classes: stretched-link btn-lg
+++
Build a GitHub support bot with GPT3, LangChain, and Python.
---

View File

@@ -7,7 +7,22 @@ But using these LLMs in isolation is often not enough to
create a truly powerful app - the real power comes when you are able to
combine them with other sources of computation or knowledge.
This library is aimed at assisting in the development of those types of applications.
This library is aimed at assisting in the development of those types of applications. Common examples of these types of applications include:
**❓ Question Answering over specific documents**
- `Documentation <./use_cases/question_answering.html>`_
- End-to-end Example: `Question Answering over Notion Database <https://github.com/hwchase17/notion-qa>`_
**💬 Chatbots**
- `Documentation <./use_cases/chatbots.html>`_
- End-to-end Example: `Chat-LangChain <https://github.com/hwchase17/chat-langchain>`_
**🤖 Agents**
- `Documentation <./use_cases/agents.html>`_
- End-to-end Example: `GPT+WolframAlpha <https://huggingface.co/spaces/JavaFXpert/Chat-GPT-LangChain>`_
Getting Started
----------------
@@ -137,6 +152,8 @@ Additional Resources
Additional collection of resources we think may be useful as you develop your application!
- `LangChainHub <https://github.com/hwchase17/langchain-hub>`_: The LangChainHub is a place to share and explore other prompts, chains, and agents.
- `Glossary <./glossary.html>`_: A glossary of all related terms, papers, methods, etc. Whether implemented in LangChain or not!
- `Gallery <./gallery.html>`_: A collection of our favorite projects that use LangChain. Useful for finding inspiration or seeing how things were done in other applications.
@@ -145,6 +162,10 @@ Additional collection of resources we think may be useful as you develop your ap
- `Discord <https://discord.gg/6adMQxSpJS>`_: Join us on our Discord to discuss all things LangChain!
- `Tracing <./tracing.html>`_: A guide on using tracing in LangChain to visualize the execution of chains and agents.
- `Production Support <https://forms.gle/57d8AmXBYp8PP8tZA>`_: As you move your LangChains into production, we'd love to offer more comprehensive support. Please fill out this form and we'll set up a dedicated support Slack channel.
.. toctree::
:maxdepth: 1
@@ -152,6 +173,10 @@ Additional collection of resources we think may be useful as you develop your ap
:name: resources
:hidden:
LangChainHub <https://github.com/hwchase17/langchain-hub>
./glossary.md
./gallery.rst
./deployments.md
./tracing.md
Discord <https://discord.gg/6adMQxSpJS>
Production Support <https://forms.gle/57d8AmXBYp8PP8tZA>

View File

@@ -53,7 +53,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 2,
"id": "becda2a1",
"metadata": {},
"outputs": [],
@@ -70,7 +70,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 3,
"id": "339b1bb8",
"metadata": {},
"outputs": [],
@@ -99,7 +99,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 4,
"id": "e21d2098",
"metadata": {},
"outputs": [
@@ -134,7 +134,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "5e028e6d",
"metadata": {},
@@ -146,7 +145,7 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 5,
"id": "9b1cc2a2",
"metadata": {},
"outputs": [],
@@ -156,17 +155,18 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 7,
"id": "e4f5092f",
"metadata": {},
"outputs": [],
"source": [
"agent = ZeroShotAgent(llm_chain=llm_chain, tools=tools)"
"tool_names = [tool.name for tool in tools]\n",
"agent = ZeroShotAgent(llm_chain=llm_chain, allowed_tools=tool_names)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"execution_count": 8,
"id": "490604e9",
"metadata": {},
"outputs": [],
@@ -176,7 +176,7 @@
},
{
"cell_type": "code",
"execution_count": 19,
"execution_count": 9,
"id": "653b1617",
"metadata": {},
"outputs": [
@@ -191,22 +191,23 @@
"Action: Search\n",
"Action Input: Population of Canada\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mCanada is a country in North America. Its ten provinces and three territories extend from the Atlantic Ocean to the Pacific Ocean and northward into the Arctic Ocean, covering over 9.98 million square kilometres, making it the world's second-largest country by total area.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out the exact population of Canada\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out the population of Canada\n",
"Action: Search\n",
"Action Input: Population of Canada 2020\u001b[0m\n",
"Action Input: Population of Canada\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mCanada is a country in North America. Its ten provinces and three territories extend from the Atlantic Ocean to the Pacific Ocean and northward into the Arctic Ocean, covering over 9.98 million square kilometres, making it the world's second-largest country by total area.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the population of Canada\n",
"Final Answer: Arrr, Canada be home to 37.59 million people!\u001b[0m\n",
"\u001b[1m> Finished AgentExecutor chain.\u001b[0m\n"
"Final Answer: Arrr, Canada be home to over 37 million people!\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'Arrr, Canada be home to 37.59 million people!'"
"'Arrr, Canada be home to over 37 million people!'"
]
},
"execution_count": 19,
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
@@ -361,7 +362,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -375,7 +376,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.12 (default, Feb 15 2022, 17:41:09) \n[Clang 12.0.5 (clang-1205.0.22.11)]"
"version": "3.10.9"
},
"vscode": {
"interpreter": {

View File

@@ -10,15 +10,17 @@
"When constructing your own agent, you will need to provide it with a list of Tools that it can use. A Tool is defined as below.\n",
"\n",
"```python\n",
"class Tool(NamedTuple):\n",
"@dataclass \n",
"class Tool:\n",
" \"\"\"Interface for tools.\"\"\"\n",
"\n",
" name: str\n",
" func: Callable[[str], str]\n",
" description: Optional[str] = None\n",
" return_direct: bool = True\n",
"```\n",
"\n",
"The two required components of a Tool are the name and then the tool itself. A tool description is optional, as it is needed for some agents but not all."
"The two required components of a Tool are the name and then the tool itself. A tool description is optional, as it is needed for some agents but not all. You can create these tools directly, but we also provide a decorator to easily convert any function into a tool."
]
},
{
@@ -151,6 +153,94 @@
"agent.run(\"Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?\")"
]
},
{
"cell_type": "markdown",
"id": "824eaf74",
"metadata": {},
"source": [
"## Using the `tool` decorator\n",
"\n",
"To make it easier to define custom tools, a `@tool` decorator is provided. This decorator can be used to quickly create a `Tool` from a simple function. The decorator uses the function name as the tool name by default, but this can be overridden by passing a string as the first argument. Additionally, the decorator will use the function's docstring as the tool's description."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "8f15307d",
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import tool\n",
"\n",
"@tool\n",
"def search_api(query: str) -> str:\n",
" \"\"\"Searches the API for the query.\"\"\"\n",
" return \"Results\""
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "0a23b91b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Tool(name='search_api', func=<function search_api at 0x10dad7d90>, description='search_api(query: str) -> str - Searches the API for the query.', return_direct=False)"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"search_api"
]
},
{
"cell_type": "markdown",
"id": "cc6ee8c1",
"metadata": {},
"source": [
"You can also provide arguments like the tool name and whether to return directly."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "28cdf04d",
"metadata": {},
"outputs": [],
"source": [
"@tool(\"search\", return_direct=True)\n",
"def search_api(query: str) -> str:\n",
" \"\"\"Searches the API for the query.\"\"\"\n",
" return \"Results\""
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "1085a4bd",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Tool(name='search', func=<function search_api at 0x112301bd0>, description='search(query: str) -> str - Searches the API for the query.', return_direct=True)"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"search_api"
]
},
{
"cell_type": "markdown",
"id": "1d0430d6",
@@ -432,7 +522,7 @@
},
"vscode": {
"interpreter": {
"hash": "cb23c3a7a387ab03496baa08507270f8e0861b23170e79d5edc545893cdca840"
"hash": "e90c8aa204a57276aa905271aff2d11799d0acb3547adabc5892e639a5e45e34"
}
}
},
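The decorator behavior described in this notebook (function name as the default tool name, an optional name override as the first positional argument, and the docstring used for the description) can be illustrated with a standard-library-only sketch. `SketchTool` and `sketch_tool` are hypothetical names made up for this illustration; this is not LangChain's implementation, and the description format is simply copied from the notebook output above.

```python
from dataclasses import dataclass
from inspect import signature
from typing import Callable, Optional, Union


@dataclass
class SketchTool:
    """Hypothetical stand-in for the Tool interface shown in the notebook."""

    name: str
    func: Callable[[str], str]
    description: Optional[str] = None
    return_direct: bool = False


def sketch_tool(*args: Union[str, Callable[[str], str]], return_direct: bool = False):
    """Hypothetical decorator: usable as @sketch_tool or @sketch_tool("name", ...)."""

    def make(name: str) -> Callable[[Callable[[str], str]], SketchTool]:
        def wrap(func: Callable[[str], str]) -> SketchTool:
            # Description format mirrors the notebook output:
            # "<name><signature> - <docstring>"
            description = f"{name}{signature(func)} - {(func.__doc__ or '').strip()}"
            return SketchTool(name, func, description, return_direct)

        return wrap

    if len(args) == 1 and callable(args[0]):
        # Bare usage: @sketch_tool
        return make(args[0].__name__)(args[0])
    if len(args) == 1 and isinstance(args[0], str):
        # Named usage: @sketch_tool("search", return_direct=True)
        return make(args[0])
    raise ValueError("expected a function or a tool name")


@sketch_tool
def search_api(query: str) -> str:
    """Searches the API for the query."""
    return "Results"


@sketch_tool("search", return_direct=True)
def search_api_named(query: str) -> str:
    """Searches the API for the query."""
    return "Results"
```

Handling both call styles in one function is what lets the same decorator be written with or without parentheses.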

View File

@@ -0,0 +1,108 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "991b1cc1",
"metadata": {},
"source": [
"# Loading from LangChainHub\n",
"\n",
"This notebook covers how to load agents from [LangChainHub](https://github.com/hwchase17/langchain-hub)."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "bd4450a2",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m Yes.\n",
"Follow up: Who is the reigning men's U.S. Open champion?\u001b[0m\n",
"Intermediate answer: \u001b[36;1m\u001b[1;3m2016 · SUI · Stan Wawrinka ; 2017 · ESP · Rafael Nadal ; 2018 · SRB · Novak Djokovic ; 2019 · ESP · Rafael Nadal.\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mSo the reigning men's U.S. Open champion is Rafael Nadal.\n",
"Follow up: What is Rafael Nadal's hometown?\u001b[0m\n",
"Intermediate answer: \u001b[36;1m\u001b[1;3mIn 2016, he once again showed his deep ties to Mallorca and opened the Rafa Nadal Academy in his hometown of Manacor.\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mSo the final answer is: Manacor, Mallorca, Spain.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'Manacor, Mallorca, Spain.'"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain import OpenAI, SerpAPIWrapper\n",
"from langchain.agents import initialize_agent, Tool\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"search = SerpAPIWrapper()\n",
"tools = [\n",
" Tool(\n",
" name=\"Intermediate Answer\",\n",
" func=search.run\n",
" )\n",
"]\n",
"\n",
"self_ask_with_search = initialize_agent(tools, llm, agent_path=\"lc://agents/self-ask-with-search/agent.json\", verbose=True)\n",
"self_ask_with_search.run(\"What is the hometown of the reigning men's U.S. Open champion?\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "3aede965",
"metadata": {},
"source": [
"# Pinning Dependencies\n",
"\n",
"Specific versions of LangChainHub agents can be pinned with the `lc@<ref>://` syntax."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e679f7b6",
"metadata": {},
"outputs": [],
"source": [
"self_ask_with_search = initialize_agent(tools, llm, agent_path=\"lc@2826ef9e8acdf88465e1e5fc8a7bf59e0f9d0a85://agents/self-ask-with-search/agent.json\", verbose=True)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
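The two path forms used in this notebook, `lc://...` and `lc@<ref>://...`, can be told apart with a small parser. The sketch below is an assumption inferred only from the two example paths above; `parse_hub_path` and `HUB_PATH` are hypothetical names, and LangChain's own loader may work differently.

```python
import re
from typing import Optional, Tuple

# Assumed pattern, inferred from the two example paths in this notebook.
HUB_PATH = re.compile(r"^lc(?:@(?P<ref>[^:]+))?://(?P<path>.+)$")


def parse_hub_path(path: str) -> Optional[Tuple[Optional[str], str]]:
    """Return (ref, path) for a hub-style path, or None for anything else."""
    match = HUB_PATH.match(path)
    if match is None:
        return None
    return match.group("ref"), match.group("path")


# Unpinned path: no ref, so the hub's default version would apply.
unpinned = parse_hub_path("lc://agents/self-ask-with-search/agent.json")

# Pinned path: the ref is the commit hash between "@" and "://".
pinned = parse_hub_path(
    "lc@2826ef9e8acdf88465e1e5fc8a7bf59e0f9d0a85://agents/self-ask-with-search/agent.json"
)

# A plain local file name is not a hub path at all.
local = parse_hub_path("agent.json")
```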

View File

@@ -0,0 +1,148 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "bfe18e28",
"metadata": {},
"source": [
"# Serialization\n",
"\n",
"This notebook goes over how to serialize agents. For this notebook, it is important to understand the distinction we draw between `agents` and `tools`. An agent is the LLM powered decision maker that decides which actions to take and in which order. Tools are various instruments (functions) an agent has access to, through which an agent can interact with the outside world. When people generally use agents, they primarily talk about using an agent WITH tools. However, when we talk about serialization of agents, we are talking about the agent by itself. We plan to add support for serializing an agent WITH tools sometime in the future.\n",
"\n",
"Let's start by creating an agent with tools as we normally do:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "eb729f16",
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import load_tools\n",
"from langchain.agents import initialize_agent\n",
"from langchain.llms import OpenAI\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm)\n",
"agent = initialize_agent(tools, llm, agent=\"zero-shot-react-description\", verbose=True)"
]
},
{
"cell_type": "markdown",
"id": "0578f566",
"metadata": {},
"source": [
"Let's now serialize the agent. To be explicit that we are serializing ONLY the agent, we will call the `save_agent` method."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "dc544de6",
"metadata": {},
"outputs": [],
"source": [
"agent.save_agent('agent.json')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "62dd45bf",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\r\n",
" \"llm_chain\": {\r\n",
" \"memory\": null,\r\n",
" \"verbose\": false,\r\n",
" \"prompt\": {\r\n",
" \"input_variables\": [\r\n",
" \"input\",\r\n",
" \"agent_scratchpad\"\r\n",
" ],\r\n",
" \"output_parser\": null,\r\n",
" \"template\": \"Answer the following questions as best you can. You have access to the following tools:\\n\\nSearch: A search engine. Useful for when you need to answer questions about current events. Input should be a search query.\\nCalculator: Useful for when you need to answer questions about math.\\n\\nUse the following format:\\n\\nQuestion: the input question you must answer\\nThought: you should always think about what to do\\nAction: the action to take, should be one of [Search, Calculator]\\nAction Input: the input to the action\\nObservation: the result of the action\\n... (this Thought/Action/Action Input/Observation can repeat N times)\\nThought: I now know the final answer\\nFinal Answer: the final answer to the original input question\\n\\nBegin!\\n\\nQuestion: {input}\\nThought:{agent_scratchpad}\",\r\n",
" \"template_format\": \"f-string\"\r\n",
" },\r\n",
" \"llm\": {\r\n",
" \"model_name\": \"text-davinci-003\",\r\n",
" \"temperature\": 0.0,\r\n",
" \"max_tokens\": 256,\r\n",
" \"top_p\": 1,\r\n",
" \"frequency_penalty\": 0,\r\n",
" \"presence_penalty\": 0,\r\n",
" \"n\": 1,\r\n",
" \"best_of\": 1,\r\n",
" \"request_timeout\": null,\r\n",
" \"logit_bias\": {},\r\n",
" \"_type\": \"openai\"\r\n",
" },\r\n",
" \"output_key\": \"text\",\r\n",
" \"_type\": \"llm_chain\"\r\n",
" },\r\n",
" \"return_values\": [\r\n",
" \"output\"\r\n",
" ],\r\n",
" \"_type\": \"zero-shot-react-description\"\r\n",
"}"
]
}
],
"source": [
"!cat agent.json"
]
},
{
"cell_type": "markdown",
"id": "0eb72510",
"metadata": {},
"source": [
"We can now load the agent back in"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "eb660b76",
"metadata": {},
"outputs": [],
"source": [
"agent = initialize_agent(tools, llm, agent_path=\"agent.json\", verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aa624ea5",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
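Because the saved agent is plain JSON, it can be inspected with the standard library before it is loaded again. The sketch below writes an abridged, hand-written copy of the structure printed in this notebook to a temporary file and reads it back; the abridged dictionary and the temporary path are stand-ins, not output produced by `save_agent`.

```python
import json
import tempfile
from pathlib import Path

# Abridged, hand-written stand-in for the agent.json contents shown above.
saved_agent = {
    "llm_chain": {
        "prompt": {"input_variables": ["input", "agent_scratchpad"]},
        "llm": {"model_name": "text-davinci-003", "temperature": 0.0, "_type": "openai"},
        "_type": "llm_chain",
    },
    "return_values": ["output"],
    "_type": "zero-shot-react-description",
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "agent.json"
    path.write_text(json.dumps(saved_agent, indent=4))
    loaded = json.loads(path.read_text())

# The top-level "_type" names the agent; the nested one names the LLM wrapper.
agent_type = loaded["_type"]
llm_type = loaded["llm_chain"]["llm"]["_type"]

# Tools are not part of the serialized agent, so no "tools" key is present.
has_tools = "tools" in loaded
```

The missing `tools` key matches the point made at the top of the notebook: only the agent is serialized, so tools still have to be supplied when the agent is loaded.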

View File

@@ -152,7 +152,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.0 64-bit ('llm-env')",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},

View File

@@ -3,6 +3,8 @@ How-To Guides
The first category of how-to guides here covers specific parts of working with agents.
`Load From Hub <./examples/load_from_hub.html>`_: This notebook covers how to load agents from `LangChainHub <https://github.com/hwchase17/langchain-hub>`_.
`Custom Tools <./examples/custom_tools.html>`_: How to create custom tools that an agent can use.
`Intermediate Steps <./examples/intermediate_steps.html>`_: How to access and use intermediate steps to get more visibility into the internals of an agent.

View File

@@ -2,7 +2,7 @@
import time
from langchain.chains.natbot.base import NatBotChain
from langchain.chains.natbot.crawler import Crawler # type: ignore
from langchain.chains.natbot.crawler import Crawler
def run_cmd(cmd: str, _crawler: Crawler) -> None:

View File

@@ -22,6 +22,7 @@ tools = load_tools(tool_names, llm=llm)
```
Below is a list of all supported tools and relevant information:
- Tool Name: The name the LLM refers to the tool by.
- Tool Description: The description of the tool that is passed to the LLM.
- Notes: Notes about the tool that are NOT passed to the LLM.
@@ -31,61 +32,71 @@ Below is a list of all supported tools and relevant information:
## List of Tools
**python_repl**
- Tool Name: Python REPL
- Tool Description: A Python shell. Use this to execute python commands. Input should be a valid python command. If you expect output it should be printed out.
- Notes: Maintains state.
- Requires LLM: No
**serpapi**
- Tool Name: Search
- Tool Description: A search engine. Useful for when you need to answer questions about current events. Input should be a search query.
- Notes: Calls the Serp API and then parses results.
- Requires LLM: No
**wolfram-alpha**
- Tool Name: Wolfram Alpha
- Tool Description: A wolfram alpha search engine. Useful for when you need to answer questions about Math, Science, Technology, Culture, Society and Everyday Life. Input should be a search query.
- Notes: Calls the Wolfram Alpha API and then parses results.
- Requires LLM: No
- Extra Parameters: `wolfram_alpha_appid`: The Wolfram Alpha app id.
**requests**
- Tool Name: Requests
- Tool Description: A portal to the internet. Use this when you need to get specific content from a site. Input should be a specific url, and the output will be all the text on that page.
- Notes: Uses the Python requests module.
- Requires LLM: No
**terminal**
- Tool Name: Terminal
- Tool Description: Executes commands in a terminal. Input should be valid commands, and the output will be any output from running that command.
- Notes: Executes commands with subprocess.
- Requires LLM: No
**pal-math**
- Tool Name: PAL-MATH
- Tool Description: A language model that is excellent at solving complex word math problems. Input should be a fully worded hard word math problem.
- Notes: Based on [this paper](https://arxiv.org/pdf/2211.10435.pdf).
- Requires LLM: Yes
**pal-colored-objects**
- Tool Name: PAL-COLOR-OBJ
- Tool Description: A language model that is wonderful at reasoning about position and the color attributes of objects. Input should be a fully worded hard reasoning problem. Make sure to include all information about the objects AND the final question you want to answer.
- Notes: Based on [this paper](https://arxiv.org/pdf/2211.10435.pdf).
- Requires LLM: Yes
**llm-math**
- Tool Name: Calculator
- Tool Description: Useful for when you need to answer questions about math.
- Notes: An instance of the `LLMMath` chain.
- Requires LLM: Yes
**open-meteo-api**
- Tool Name: Open Meteo API
- Tool Description: Useful for when you want to get weather information from the OpenMeteo API. The input should be a question in natural language that this API can answer.
- Notes: A natural language connection to the Open Meteo API (`https://api.open-meteo.com/`), specifically the `/v1/forecast` endpoint.
- Requires LLM: Yes
**news-api**
- Tool Name: News API
- Tool Description: Use this when you want to get information about the top headlines of current news stories. The input should be a question in natural language that this API can answer.
- Notes: A natural language connection to the News API (`https://newsapi.org`), specifically the `/v2/top-headlines` endpoint.
@@ -93,8 +104,18 @@ Below is a list of all supported tools and relevant information:
- Extra Parameters: `news_api_key` (your API key to access this endpoint)
**tmdb-api**
- Tool Name: TMDB API
- Tool Description: Useful for when you want to get information from The Movie Database. The input should be a question in natural language that this API can answer.
- Notes: A natural language connection to the TMDB API (`https://api.themoviedb.org/3`), specifically the `/search/movie` endpoint.
- Requires LLM: Yes
- Extra Parameters: `tmdb_bearer_token` (your Bearer Token to access this endpoint - note that this is different from the API key)
**google-search**
- Tool Name: Search
- Tool Description: A wrapper around Google Search. Useful for when you need to answer questions about current events. Input should be a search query.
- Notes: Uses the Google Custom Search API.
- Requires LLM: No
- Extra Parameters: `google_api_key`, `google_cse_id`
- For more information on this, see [this page](../../ecosystem/google_search.md)

View File

@@ -187,7 +187,7 @@
}
],
"source": [
"chain({\"question\": \"What did the president say about Justice Breyer\"}, return_only_outputs=True)"
"qa({\"question\": \"What did the president say about Justice Breyer\"}, return_only_outputs=True)"
]
}
],

View File

@@ -0,0 +1,199 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Vector DB Text Generation\n",
"\n",
"This notebook walks through how to use LangChain for text generation over a vector index. This is useful if we want to generate text that is able to draw from a large body of custom text, for example, generating blog posts that have an understanding of previous blog posts written, or product tutorials that can refer to product documentation."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prepare Data\n",
"\n",
"First, we prepare the data. For this example, we fetch a documentation site that consists of markdown files hosted on Github and split them into small enough Documents."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import OpenAI\n",
"from langchain.docstore.document import Document\n",
"import requests\n",
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.vectorstores.faiss import FAISS\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.prompts import PromptTemplate\n",
"import pathlib\n",
"import subprocess\n",
"import tempfile"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Cloning into '.'...\n"
]
}
],
"source": [
"def get_github_docs(repo_owner, repo_name):\n",
" with tempfile.TemporaryDirectory() as d:\n",
" subprocess.check_call(\n",
" f\"git clone --depth 1 https://github.com/{repo_owner}/{repo_name}.git .\",\n",
" cwd=d,\n",
" shell=True,\n",
" )\n",
" git_sha = (\n",
" subprocess.check_output(\"git rev-parse HEAD\", shell=True, cwd=d)\n",
" .decode(\"utf-8\")\n",
" .strip()\n",
" )\n",
" repo_path = pathlib.Path(d)\n",
" markdown_files = list(repo_path.glob(\"*/*.md\")) + list(\n",
" repo_path.glob(\"*/*.mdx\")\n",
" )\n",
" for markdown_file in markdown_files:\n",
" with open(markdown_file, \"r\") as f:\n",
" relative_path = markdown_file.relative_to(repo_path)\n",
" github_url = f\"https://github.com/{repo_owner}/{repo_name}/blob/{git_sha}/{relative_path}\"\n",
" yield Document(page_content=f.read(), metadata={\"source\": github_url})\n",
"\n",
"sources = get_github_docs(\"yirenlu92\", \"deno-manual-forked\")\n",
"\n",
"source_chunks = []\n",
"splitter = CharacterTextSplitter(separator=\" \", chunk_size=1024, chunk_overlap=0)\n",
"for source in sources:\n",
" for chunk in splitter.split_text(source.page_content):\n",
" source_chunks.append(Document(page_content=chunk, metadata=source.metadata))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set Up Vector DB\n",
"\n",
"Now that we have the documentation content in chunks, let's put all this information in a vector index for easy retrieval."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"search_index = FAISS.from_documents(source_chunks, OpenAIEmbeddings())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set Up LLM Chain with Custom Prompt\n",
"\n",
"Next, let's set up a simple LLM chain but give it a custom prompt for blog post generation. Note that the custom prompt is parameterized and takes two inputs: `context`, which will be the documents fetched from the vector search, and `topic`, which is given by the user."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains import LLMChain\n",
"prompt_template = \"\"\"Use the context below to write a 400 word blog post about the topic below:\n",
" Context: {context}\n",
" Topic: {topic}\n",
" Blog post:\"\"\"\n",
"\n",
"PROMPT = PromptTemplate(\n",
" template=prompt_template, input_variables=[\"context\", \"topic\"]\n",
")\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"\n",
"chain = LLMChain(llm=llm, prompt=PROMPT)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Generate Text\n",
"\n",
"Finally, we write a function to apply our inputs to the chain. The function takes an input parameter `topic`. We find the documents in the vector index that correspond to that `topic`, and use them as additional context in our simple LLM chain."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"def generate_blog_post(topic):\n",
" docs = search_index.similarity_search(topic, k=4)\n",
" inputs = [{\"context\": doc.page_content, \"topic\": topic} for doc in docs]\n",
" print(chain.apply(inputs))"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[{'text': '\\n\\nEnvironment variables are a great way to store and access sensitive information in your Deno applications. Deno offers built-in support for environment variables with `Deno.env`, and you can also use a `.env` file to store and access environment variables.\\n\\nUsing `Deno.env` is simple. It has getter and setter methods, so you can easily set and retrieve environment variables. For example, you can set the `FIREBASE_API_KEY` and `FIREBASE_AUTH_DOMAIN` environment variables like this:\\n\\n```ts\\nDeno.env.set(\"FIREBASE_API_KEY\", \"examplekey123\");\\nDeno.env.set(\"FIREBASE_AUTH_DOMAIN\", \"firebasedomain.com\");\\n\\nconsole.log(Deno.env.get(\"FIREBASE_API_KEY\")); // examplekey123\\nconsole.log(Deno.env.get(\"FIREBASE_AUTH_DOMAIN\")); // firebasedomain.com\\n```\\n\\nYou can also store environment variables in a `.env` file. This is a great'}, {'text': '\\n\\nEnvironment variables are a powerful tool for managing configuration settings in a program. They allow us to set values that can be used by the program, without having to hard-code them into the code. This makes it easier to change settings without having to modify the code.\\n\\nIn Deno, environment variables can be set in a few different ways. The most common way is to use the `VAR=value` syntax. This will set the environment variable `VAR` to the value `value`. This can be used to set any number of environment variables before running a command. For example, if we wanted to set the environment variable `VAR` to `hello` before running a Deno command, we could do so like this:\\n\\n```\\nVAR=hello deno run main.ts\\n```\\n\\nThis will set the environment variable `VAR` to `hello` before running the command. We can then access this variable in our code using the `Deno.env.get()` function. 
For example, if we ran the following command:\\n\\n```\\nVAR=hello && deno eval \"console.log(\\'Deno: \\' + Deno.env.get(\\'VAR'}, {'text': '\\n\\nEnvironment variables are a powerful tool for developers, allowing them to store and access data without having to hard-code it into their applications. In Deno, you can access environment variables using the `Deno.env.get()` function.\\n\\nFor example, if you wanted to access the `HOME` environment variable, you could do so like this:\\n\\n```js\\n// env.js\\nDeno.env.get(\"HOME\");\\n```\\n\\nWhen running this code, you\\'ll need to grant the Deno process access to environment variables. This can be done by passing the `--allow-env` flag to the `deno run` command. You can also specify which environment variables you want to grant access to, like this:\\n\\n```shell\\n# Allow access to only the HOME env var\\ndeno run --allow-env=HOME env.js\\n```\\n\\nIt\\'s important to note that environment variables are case insensitive on Windows, so Deno also matches them case insensitively (on Windows only).\\n\\nAnother thing to be aware of when using environment variables is subprocess permissions. Subprocesses are powerful and can access system resources regardless of the permissions you granted to the Den'}, {'text': '\\n\\nEnvironment variables are an important part of any programming language, and Deno is no exception. Deno is a secure JavaScript and TypeScript runtime built on the V8 JavaScript engine, and it recently added support for environment variables. This feature was added in Deno version 1.6.0, and it is now available for use in Deno applications.\\n\\nEnvironment variables are used to store information that can be used by programs. They are typically used to store configuration information, such as the location of a database or the name of a user. In Deno, environment variables are stored in the `Deno.env` object. 
This object is similar to the `process.env` object in Node.js, and it allows you to access and set environment variables.\\n\\nThe `Deno.env` object is a read-only object, meaning that you cannot directly modify the environment variables. Instead, you must use the `Deno.env.set()` function to set environment variables. This function takes two arguments: the name of the environment variable and the value to set it to. For example, if you wanted to set the `FOO` environment variable to `bar`, you would use the following code:\\n\\n```'}]\n"
]
}
],
"source": [
"generate_blog_post(\"environment variables\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
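The retrieve-then-generate flow in this notebook (find the `k` chunks most similar to the topic, then build one prompt input per chunk) can be sketched without any external service. In the sketch below, toy word-count vectors and cosine similarity stand in for OpenAI embeddings and the FAISS index, and every function name is hypothetical; it only illustrates the data flow, not the quality of real embeddings.

```python
import math
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for real embeddings)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_search(chunks: List[str], query: str, k: int) -> List[str]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), query_vec), reverse=True)
    return ranked[:k]


def build_inputs(chunks: List[str], topic: str, k: int = 2) -> List[Dict[str, str]]:
    """One prompt input per retrieved chunk, mirroring the context/topic variables."""
    return [{"context": c, "topic": topic} for c in similarity_search(chunks, topic, k)]


chunks = [
    "environment variables can be read with Deno.env.get",
    "the standard library includes an http server",
    "use a .env file to store environment variables",
]
inputs = build_inputs(chunks, "environment variables", k=2)
```

Each dictionary in `inputs` has the same shape as the inputs passed to `chain.apply` in the notebook, which is why that call returns one generated text per retrieved document.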

View File

@@ -11,6 +11,8 @@ The examples here are all end-to-end chains for working with documents.
`Summarization <./combine_docs_examples/summarize.html>`_: A walkthrough of how to use LangChain for summarization over specific documents.
`Vector DB Text Generation <./combine_docs_examples/vector_db_text_generation.html>`_: A walkthrough of how to use LangChain for text generation over a vector database.
`Vector DB Question Answering <./combine_docs_examples/vector_db_qa.html>`_: A walkthrough of how to use LangChain for question answering over a vector database.
`Vector DB Question Answering with Sources <./combine_docs_examples/vector_db_qa_with_sources.html>`_: A walkthrough of how to use LangChain for question answering (with sources) over a vector database.

View File

@@ -21,6 +21,24 @@
"from langchain import OpenAI"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9a58e15e",
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(model_name='code-davinci-002', temperature=0, max_tokens=512)"
]
},
{
"cell_type": "markdown",
"id": "095adc76",
"metadata": {},
"source": [
"## Math Prompt"
]
},
{
"cell_type": "code",
"execution_count": 2,
@@ -28,7 +46,6 @@
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(model_name='code-davinci-002', temperature=0, max_tokens=512)\n",
"pal_chain = PALChain.from_math_prompt(llm, verbose=True)"
]
},
@@ -64,7 +81,7 @@
" result = total_pets\n",
" return result\u001b[0m\n",
"\n",
"\u001b[1m> Finished PALChain chain.\u001b[0m\n"
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
@@ -82,6 +99,14 @@
"pal_chain.run(question)"
]
},
{
"cell_type": "markdown",
"id": "0269d20a",
"metadata": {},
"source": [
"## Colored Objects"
]
},
{
"cell_type": "code",
"execution_count": 5,
@@ -89,7 +114,6 @@
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(model_name='code-davinci-002', temperature=0, max_tokens=512)\n",
"pal_chain = PALChain.from_colored_object_prompt(llm, verbose=True)"
]
},
@@ -147,10 +171,94 @@
"pal_chain.run(question)"
]
},
{
"cell_type": "markdown",
"id": "fc3d7f10",
"metadata": {},
"source": [
"## Intermediate Steps\n",
"You can also use the intermediate steps flag to return the code executed that generates the answer."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "9d2d9c61",
"metadata": {},
"outputs": [],
"source": [
"pal_chain = PALChain.from_colored_object_prompt(llm, verbose=True, return_intermediate_steps=True)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "b29b971b",
"metadata": {},
"outputs": [],
"source": [
"question = \"On the desk, you see two blue booklets, two purple booklets, and two yellow pairs of sunglasses. If I remove all the pairs of sunglasses from the desk, how many purple items remain on it?\""
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "a2c40c28",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new PALChain chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m# Put objects into a list to record ordering\n",
"objects = []\n",
"objects += [('booklet', 'blue')] * 2\n",
"objects += [('booklet', 'purple')] * 2\n",
"objects += [('sunglasses', 'yellow')] * 2\n",
"\n",
"# Remove all pairs of sunglasses\n",
"objects = [object for object in objects if object[0] != 'sunglasses']\n",
"\n",
"# Count number of purple objects\n",
"num_purple = len([object for object in objects if object[1] == 'purple'])\n",
"answer = num_purple\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
}
],
"source": [
"result = pal_chain({\"question\": question})"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "efddd033",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"# Put objects into a list to record ordering\\nobjects = []\\nobjects += [('booklet', 'blue')] * 2\\nobjects += [('booklet', 'purple')] * 2\\nobjects += [('sunglasses', 'yellow')] * 2\\n\\n# Remove all pairs of sunglasses\\nobjects = [object for object in objects if object[0] != 'sunglasses']\\n\\n# Count number of purple objects\\nnum_purple = len([object for object in objects if object[1] == 'purple'])\\nanswer = num_purple\""
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result['intermediate_steps']"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4ab20fec",
"id": "dfd88594",
"metadata": {},
"outputs": [],
"source": []

View File

@@ -58,7 +58,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 3,
"id": "a8fc8f23",
"metadata": {},
"outputs": [],
@@ -68,7 +68,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 4,
"id": "15ff81df",
"metadata": {
"pycharm": {
@@ -96,7 +96,7 @@
"' There are 9 employees.'"
]
},
"execution_count": 3,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
@@ -188,6 +188,62 @@
"db_chain.run(\"How many employees are there in the foobar table?\")"
]
},
{
"cell_type": "markdown",
"id": "88d8b969",
"metadata": {},
"source": [
"## Return Intermediate Steps\n",
"\n",
"You can also return the intermediate steps of the SQLDatabaseChain. This allows you to access the SQL statement that was generated, as well as the result of running that against the SQL Database."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "38559487",
"metadata": {},
"outputs": [],
"source": [
"db_chain = SQLDatabaseChain(llm=llm, database=db, prompt=PROMPT, verbose=True, return_intermediate_steps=True)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "78b6af4d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new SQLDatabaseChain chain...\u001b[0m\n",
"How many employees are there in the foobar table? \n",
"SQLQuery:\u001b[32;1m\u001b[1;3m SELECT COUNT(*) FROM Employee;\u001b[0m\n",
"SQLResult: \u001b[33;1m\u001b[1;3m[(9,)]\u001b[0m\n",
"Answer:\u001b[32;1m\u001b[1;3m There are 9 employees in the foobar table.\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"[' SELECT COUNT(*) FROM Employee;', '[(9,)]']"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result = db_chain(\"How many employees are there in the foobar table?\")\n",
"result[\"intermediate_steps\"]"
]
},
{
"cell_type": "markdown",
"id": "b408f800",
@@ -242,6 +298,74 @@
"db_chain.run(\"What are some example tracks by composer Johann Sebastian Bach?\")"
]
},
{
"cell_type": "markdown",
"id": "bcc5e936",
"metadata": {},
"source": [
"## Adding first row of each table\n",
"Sometimes, the format of the data is not obvious and it is optimal to include the first row of the table in the prompt to allow the LLM to understand the data before providing a final query. Here we will use this feature to let the LLM know that artists are saved with their full names."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "9a22ee47",
"metadata": {},
"outputs": [],
"source": [
"db = SQLDatabase.from_uri(\n",
" \"sqlite:///../../../../notebooks/Chinook.db\", \n",
" include_tables=['Track'], # we include only one table to save tokens in the prompt :)\n",
" sample_row_in_table_info=True)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "bcb7a489",
"metadata": {},
"outputs": [],
"source": [
"db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "81e05d82",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new SQLDatabaseChain chain...\u001b[0m\n",
"What are some example tracks by Bach? \n",
"SQLQuery:Table 'Track' has columns: TrackId (INTEGER), Name (NVARCHAR(200)), AlbumId (INTEGER), MediaTypeId (INTEGER), GenreId (INTEGER), Composer (NVARCHAR(220)), Milliseconds (INTEGER), Bytes (INTEGER), UnitPrice (NUMERIC(10, 2)). Here is an example row for this table (long strings are truncated): ['1', 'For Those About To Rock (We Salute You)', '1', '1', '1', 'Angus Young, Malcolm Young, Brian Johnson', '343719', '11170334', '0.99'].\n",
"\u001b[32;1m\u001b[1;3m SELECT TrackId, Name, Composer FROM Track WHERE Composer LIKE '%Bach%' ORDER BY Name LIMIT 5;\u001b[0m\n",
"SQLResult: \u001b[33;1m\u001b[1;3m[(1709, 'American Woman', 'B. Cummings/G. Peterson/M.J. Kale/R. Bachman'), (3408, 'Aria Mit 30 Veränderungen, BWV 988 \"Goldberg Variations\": Aria', 'Johann Sebastian Bach'), (3433, 'Concerto No.2 in F Major, BWV1047, I. Allegro', 'Johann Sebastian Bach'), (3407, 'Concerto for 2 Violins in D Minor, BWV 1043: I. Vivace', 'Johann Sebastian Bach'), (3490, 'Partita in E Major, BWV 1006A: I. Prelude', 'Johann Sebastian Bach')]\u001b[0m\n",
"Answer:\u001b[32;1m\u001b[1;3m Some example tracks by Bach are 'American Woman', 'Aria Mit 30 Veränderungen, BWV 988 \"Goldberg Variations\": Aria', 'Concerto No.2 in F Major, BWV1047, I. Allegro', 'Concerto for 2 Violins in D Minor, BWV 1043: I. Vivace', and 'Partita in E Major, BWV 1006A: I. Prelude'.\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"' Some example tracks by Bach are \\'American Woman\\', \\'Aria Mit 30 Veränderungen, BWV 988 \"Goldberg Variations\": Aria\\', \\'Concerto No.2 in F Major, BWV1047, I. Allegro\\', \\'Concerto for 2 Violins in D Minor, BWV 1043: I. Vivace\\', and \\'Partita in E Major, BWV 1006A: I. Prelude\\'.'"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"db_chain.run(\"What are some example tracks by Bach?\")"
]
},
{
"cell_type": "markdown",
"id": "c12ae15a",
@@ -319,14 +443,6 @@
"source": [
"chain.run(\"How many employees are also customers?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b2998b03",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {

View File

@@ -0,0 +1,157 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "25c90e9e",
"metadata": {},
"source": [
"# Loading from LangChainHub\n",
"\n",
"This notebook covers how to load chains from [LangChainHub](https://github.com/hwchase17/langchain-hub)."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "8b54479e",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains import load_chain\n",
"\n",
"chain = load_chain(\"lc://chains/llm-math/chain.json\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "4828f31f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new LLMMathChain chain...\u001b[0m\n",
"whats 2 raised to .12\u001b[32;1m\u001b[1;3m\n",
"Answer: 1.0791812460476249\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'Answer: 1.0791812460476249'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(\"whats 2 raised to .12\")"
]
},
{
"cell_type": "markdown",
"id": "8db72cda",
"metadata": {},
"source": [
"Sometimes chains will require extra arguments that were not serialized with the chain. For example, a chain that does question answering over a vector database will require a vector database."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "aab39528",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.vectorstores.faiss import FAISS\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain import OpenAI, VectorDBQA"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "16a85d5e",
"metadata": {},
"outputs": [],
"source": [
"with open('../../state_of_the_union.txt') as f:\n",
" state_of_the_union = f.read()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_text(state_of_the_union)\n",
"\n",
"embeddings = OpenAIEmbeddings()\n",
"vectorstore = FAISS.from_texts(texts, embeddings)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "6a82e91e",
"metadata": {},
"outputs": [],
"source": [
"chain = load_chain(\"lc://chains/vector-db-qa/stuff/chain.json\", vectorstore=vectorstore)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "efe9b25b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\" The president said that Jackson is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers, and that she has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans.\""
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"chain.run(query)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f910a32f",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,13 @@
{
"model_name": "text-davinci-003",
"temperature": 0.0,
"max_tokens": 256,
"top_p": 1,
"frequency_penalty": 0,
"presence_penalty": 0,
"n": 1,
"best_of": 1,
"request_timeout": null,
"logit_bias": {},
"_type": "openai"
}

View File

@@ -0,0 +1,27 @@
{
"memory": null,
"verbose": true,
"prompt": {
"input_variables": [
"question"
],
"output_parser": null,
"template": "Question: {question}\n\nAnswer: Let's think step by step.",
"template_format": "f-string"
},
"llm": {
"model_name": "text-davinci-003",
"temperature": 0.0,
"max_tokens": 256,
"top_p": 1,
"frequency_penalty": 0,
"presence_penalty": 0,
"n": 1,
"best_of": 1,
"request_timeout": null,
"logit_bias": {},
"_type": "openai"
},
"output_key": "text",
"_type": "llm_chain"
}

View File

@@ -0,0 +1,8 @@
{
"memory": null,
"verbose": true,
"prompt_path": "prompt.json",
"llm_path": "llm.json",
"output_key": "text",
"_type": "llm_chain"
}

View File

@@ -0,0 +1,8 @@
{
"input_variables": [
"question"
],
"output_parser": null,
"template": "Question: {question}\n\nAnswer: Let's think step by step.",
"template_format": "f-string"
}

View File

@@ -0,0 +1,376 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "cbe47c3a",
"metadata": {},
"source": [
"# Serialization\n",
"This notebook covers how to serialize chains to and from disk. The serialization format we use is json or yaml. Currently, only some chains support this type of serialization. We will grow the number of supported chains over time.\n"
]
},
{
"cell_type": "markdown",
"id": "e4a8a447",
"metadata": {},
"source": [
"## Saving a chain to disk\n",
"First, let's go over how to save a chain to disk. This can be done with the `.save` method, and specifying a file path with a json or yaml extension."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "26e28451",
"metadata": {},
"outputs": [],
"source": [
"from langchain import PromptTemplate, OpenAI, LLMChain\n",
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])\n",
"llm_chain = LLMChain(prompt=prompt, llm=OpenAI(temperature=0), verbose=True)\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "bfa18e1f",
"metadata": {},
"outputs": [],
"source": [
"llm_chain.save(\"llm_chain.json\")"
]
},
{
"cell_type": "markdown",
"id": "ea82665d",
"metadata": {},
"source": [
"Let's now take a look at what's inside this saved file"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "0fd33328",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\r\n",
" \"memory\": null,\r\n",
" \"verbose\": true,\r\n",
" \"prompt\": {\r\n",
" \"input_variables\": [\r\n",
" \"question\"\r\n",
" ],\r\n",
" \"output_parser\": null,\r\n",
" \"template\": \"Question: {question}\\n\\nAnswer: Let's think step by step.\",\r\n",
" \"template_format\": \"f-string\"\r\n",
" },\r\n",
" \"llm\": {\r\n",
" \"model_name\": \"text-davinci-003\",\r\n",
" \"temperature\": 0.0,\r\n",
" \"max_tokens\": 256,\r\n",
" \"top_p\": 1,\r\n",
" \"frequency_penalty\": 0,\r\n",
" \"presence_penalty\": 0,\r\n",
" \"n\": 1,\r\n",
" \"best_of\": 1,\r\n",
" \"request_timeout\": null,\r\n",
" \"logit_bias\": {},\r\n",
" \"_type\": \"openai\"\r\n",
" },\r\n",
" \"output_key\": \"text\",\r\n",
" \"_type\": \"llm_chain\"\r\n",
"}"
]
}
],
"source": [
"!cat llm_chain.json"
]
},
{
"cell_type": "markdown",
"id": "2012c724",
"metadata": {},
"source": [
"## Loading a chain from disk\n",
"We can load a chain from disk by using the `load_chain` method."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "342a1974",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains import load_chain"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "394b7da8",
"metadata": {},
"outputs": [],
"source": [
"chain = load_chain(\"llm_chain.json\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "20d99787",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mQuestion: whats 2 + 2\n",
"\n",
"Answer: Let's think step by step.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"' 2 + 2 = 4'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(\"whats 2 + 2\")"
]
},
{
"cell_type": "markdown",
"id": "14449679",
"metadata": {},
"source": [
"## Saving components separately\n",
"In the above example, we can see that the prompt and llm configuration information is saved in the same json as the overall chain. Alternatively, we can split them up and save them separately. This is often useful to make the saved components more modular. In order to do this, we just need to specify `llm_path` instead of the `llm` component, and `prompt_path` instead of the `prompt` component."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "50ec35ab",
"metadata": {},
"outputs": [],
"source": [
"llm_chain.prompt.save(\"prompt.json\")"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "c48b39aa",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\r\n",
" \"input_variables\": [\r\n",
" \"question\"\r\n",
" ],\r\n",
" \"output_parser\": null,\r\n",
" \"template\": \"Question: {question}\\n\\nAnswer: Let's think step by step.\",\r\n",
" \"template_format\": \"f-string\"\r\n",
"}"
]
}
],
"source": [
"!cat prompt.json"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "13c92944",
"metadata": {},
"outputs": [],
"source": [
"llm_chain.llm.save(\"llm.json\")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "1b815f89",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\r\n",
" \"model_name\": \"text-davinci-003\",\r\n",
" \"temperature\": 0.0,\r\n",
" \"max_tokens\": 256,\r\n",
" \"top_p\": 1,\r\n",
" \"frequency_penalty\": 0,\r\n",
" \"presence_penalty\": 0,\r\n",
" \"n\": 1,\r\n",
" \"best_of\": 1,\r\n",
" \"request_timeout\": null,\r\n",
" \"logit_bias\": {},\r\n",
" \"_type\": \"openai\"\r\n",
"}"
]
}
],
"source": [
"!cat llm.json"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "7e6aa9ab",
"metadata": {},
"outputs": [],
"source": [
"config = {\n",
" \"memory\": None,\n",
" \"verbose\": True,\n",
" \"prompt_path\": \"prompt.json\",\n",
" \"llm_path\": \"llm.json\",\n",
" \"output_key\": \"text\",\n",
" \"_type\": \"llm_chain\"\n",
"}\n",
"import json\n",
"with open(\"llm_chain_separate.json\", \"w\") as f:\n",
" json.dump(config, f, indent=2)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "8e959ca6",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\r\n",
" \"memory\": null,\r\n",
" \"verbose\": true,\r\n",
" \"prompt_path\": \"prompt.json\",\r\n",
" \"llm_path\": \"llm.json\",\r\n",
" \"output_key\": \"text\",\r\n",
" \"_type\": \"llm_chain\"\r\n",
"}"
]
}
],
"source": [
"!cat llm_chain_separate.json"
]
},
{
"cell_type": "markdown",
"id": "662731c0",
"metadata": {},
"source": [
"We can then load it in the same way"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "d69ceb93",
"metadata": {},
"outputs": [],
"source": [
"chain = load_chain(\"llm_chain_separate.json\")"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "a99d61b9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mQuestion: whats 2 + 2\n",
"\n",
"Answer: Let's think step by step.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"' 2 + 2 = 4'"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(\"whats 2 + 2\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "822b7c12",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
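
The split layout used in the notebook above (a chain file that points at `prompt.json` and `llm.json` through `prompt_path` and `llm_path`) can be illustrated with a small standard-library sketch. This is not LangChain's loader: `resolve_config` and `demo` are hypothetical helpers, and the rule that a `<name>_path` key is replaced by the parsed contents of the file it names is an assumption made only for illustration.

```python
import json
import tempfile
from pathlib import Path


def resolve_config(path):
    """Hypothetical helper: inline every `<name>_path` entry of a JSON config.

    Assumes each `<name>_path` value is a file name relative to the
    directory of the config file that mentions it.
    """
    path = Path(path)
    config = json.loads(path.read_text())
    resolved = {}
    for key, value in config.items():
        if key.endswith("_path"):
            # "prompt_path": "prompt.json" becomes "prompt": {...}
            resolved[key[: -len("_path")]] = resolve_config(path.parent / value)
        else:
            resolved[key] = value
    return resolved


def demo():
    """Write the three files from the notebook to a temp dir and merge them."""
    prompt = {
        "input_variables": ["question"],
        "output_parser": None,
        "template": "Question: {question}\n\nAnswer: Let's think step by step.",
        "template_format": "f-string",
    }
    llm = {"model_name": "text-davinci-003", "temperature": 0.0, "_type": "openai"}
    chain = {
        "memory": None,
        "verbose": True,
        "prompt_path": "prompt.json",
        "llm_path": "llm.json",
        "output_key": "text",
        "_type": "llm_chain",
    }
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        (tmp / "prompt.json").write_text(json.dumps(prompt, indent=2))
        (tmp / "llm.json").write_text(json.dumps(llm, indent=2))
        (tmp / "llm_chain_separate.json").write_text(json.dumps(chain, indent=2))
        return resolve_config(tmp / "llm_chain_separate.json")


merged = demo()
print(sorted(merged))
```

Under these assumptions, the merged dictionary has the same shape as the single-file `llm_chain.json` shown earlier, which is why both files can be passed to the same loading call.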

View File

@@ -18,3 +18,7 @@ They are broken up into three categories:
./generic_how_to.rst
./combine_docs_how_to.rst
./utility_how_to.rst
In addition to different types of chains, we also have the following how-to guides for working with chains in general:
`Load From Hub <./generic/from_hub.html>`_: This notebook covers how to load chains from `LangChainHub <https://github.com/hwchase17/langchain-hub>`_.

View File

@@ -0,0 +1,179 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "e5715368",
"metadata": {},
"source": [
"# Token Usage Tracking\n",
"\n",
"This notebook goes over how to track your token usage for specific calls. It is currently only implemented for the OpenAI API.\n",
"\n",
"Let's first look at an extremely simple example of tracking token usage for a single LLM call."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "9455db35",
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import OpenAI\n",
"from langchain.callbacks import get_openai_callback"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "d1c55cc9",
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(model_name=\"text-davinci-002\", n=2, best_of=2)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "31667d54",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"42\n"
]
}
],
"source": [
"with get_openai_callback() as cb:\n",
" result = llm(\"Tell me a joke\")\n",
" print(cb.total_tokens)"
]
},
{
"cell_type": "markdown",
"id": "c0ab6d27",
"metadata": {},
"source": [
"Anything inside the context manager will get tracked. Here's an example of using it to track multiple calls in sequence."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "e09420f4",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"83\n"
]
}
],
"source": [
"with get_openai_callback() as cb:\n",
" result = llm(\"Tell me a joke\")\n",
" result2 = llm(\"Tell me a joke\")\n",
" print(cb.total_tokens)"
]
},
{
"cell_type": "markdown",
"id": "d8186e7b",
"metadata": {},
"source": [
"If a chain or agent with multiple steps in it is used, it will track all those steps."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "5d1125c6",
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import load_tools\n",
"from langchain.agents import initialize_agent\n",
"from langchain.llms import OpenAI\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm)\n",
"agent = initialize_agent(tools, llm, agent=\"zero-shot-react-description\", verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "2f98c536",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.\n",
"Action: Search\n",
"Action Input: \"Olivia Wilde boyfriend\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mJason Sudeikis\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Jason Sudeikis' age\n",
"Action: Search\n",
"Action Input: \"Jason Sudeikis age\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m47 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 47 raised to the 0.23 power\n",
"Action: Calculator\n",
"Action Input: 47^0.23\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 2.4242784855673896\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Jason Sudeikis, Olivia Wilde's boyfriend, is 47 years old and his age raised to the 0.23 power is 2.4242784855673896.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"1465\n"
]
}
],
"source": [
"with get_openai_callback() as cb:\n",
" response = agent.run(\"Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?\")\n",
" print(cb.total_tokens)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "80ca77a3",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -9,6 +9,8 @@ The examples here all address certain "how-to" guides for working with LLMs.
`Custom LLM <./examples/custom_llm.html>`_: How to create and use a custom LLM class, in case you have an LLM not from one of the standard providers (including one that you host yourself).
`Token Usage Tracking <./examples/token_usage_tracking.html>`_: How to track the token usage of various chains/agents/LLM calls.
.. toctree::
:maxdepth: 1

View File

@@ -35,7 +35,7 @@
"Overall, Assistant is a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether you need help with a specific question or just want to have a conversation about a particular topic, Assistant is here to assist.\n",
"\n",
"\n",
"Human: I want you to act as a Linux terminal. I will type commands and you will reply with what the terminal should show. I want you to only reply wiht the terminal output inside one unique code block, and nothing else. Do not write explanations. Do not type commands unless I instruct you to do so. When I need to tell you something in English I will do so by putting text inside curly brackets {like this}. My first command is pwd.\n",
"Human: I want you to act as a Linux terminal. I will type commands and you will reply with what the terminal should show. I want you to only reply with the terminal output inside one unique code block, and nothing else. Do not write explanations. Do not type commands unless I instruct you to do so. When I need to tell you something in English I will do so by putting text inside curly brackets {like this}. My first command is pwd.\n",
"Assistant:\u001b[0m\n",
"\n",
"\u001b[1m> Finished LLMChain chain.\u001b[0m\n",

View File

@@ -0,0 +1,459 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "ff31084d",
"metadata": {},
"source": [
"# Entity Memory\n",
"This notebook shows how to work with a memory module that remembers things about specific entities. It extracts information on entities (using LLMs) and builds up its knowledge about that entity over time (also using LLMs)."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "13471fbd",
"metadata": {},
"outputs": [],
"source": [
"from langchain import OpenAI, ConversationChain\n",
"from langchain.chains.conversation.memory import ConversationEntityMemory\n",
"from langchain.chains.conversation.prompt import ENTITY_MEMORY_CONVERSATION_TEMPLATE\n",
"from pydantic import BaseModel\n",
"from typing import List, Dict, Any"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "183346e2",
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)\n",
"conversation = ConversationChain(\n",
" llm=llm, \n",
" verbose=True,\n",
" prompt=ENTITY_MEMORY_CONVERSATION_TEMPLATE,\n",
" memory=ConversationEntityMemory(llm=llm)\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "7eb1460a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new ConversationChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mYou are an assistant to a human, powered by a large language model trained by OpenAI.\n",
"\n",
"You are designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, you are able to generate human-like text based on the input you receive, allowing you to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.\n",
"\n",
"You are constantly learning and improving, and your capabilities are constantly evolving. You are able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. You have access to some personalized information provided by the human in the Context section below. Additionally, you are able to generate your own text based on the input you receive, allowing you to engage in discussions and provide explanations and descriptions on a wide range of topics.\n",
"\n",
"Overall, you are a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether the human needs help with a specific question or just wants to have a conversation about a particular topic, you are here to assist.\n",
"\n",
"Context:\n",
"{'Deven': '', 'Sam': ''}\n",
"\n",
"Current conversation:\n",
"\n",
"Last line:\n",
"Human: Deven & Sam are working on a hackathon project\n",
"You:\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"' That sounds like a great project! What kind of project are they working on?'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"conversation.predict(input=\"Deven & Sam are working on a hackathon project\")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "46324ca8",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new ConversationChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mYou are an assistant to a human, powered by a large language model trained by OpenAI.\n",
"\n",
"You are designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, you are able to generate human-like text based on the input you receive, allowing you to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.\n",
"\n",
"You are constantly learning and improving, and your capabilities are constantly evolving. You are able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. You have access to some personalized information provided by the human in the Context section below. Additionally, you are able to generate your own text based on the input you receive, allowing you to engage in discussions and provide explanations and descriptions on a wide range of topics.\n",
"\n",
"Overall, you are a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether the human needs help with a specific question or just wants to have a conversation about a particular topic, you are here to assist.\n",
"\n",
"Context:\n",
"{'Deven': 'Deven is working on a hackathon project with Sam.', 'Sam': 'Sam is working on a hackathon project with Deven.', 'Langchain': ''}\n",
"\n",
"Current conversation:\n",
"Human: Deven & Sam are working on a hackathon project\n",
"AI: That sounds like a great project! What kind of project are they working on?\n",
"Last line:\n",
"Human: They are trying to add more complex memory structures to Langchain\n",
"You:\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"' That sounds like an interesting project! What kind of memory structures are they trying to add?'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"conversation.predict(input=\"They are trying to add more complex memory structures to Langchain\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "ff2ebf6b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new ConversationChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mYou are an assistant to a human, powered by a large language model trained by OpenAI.\n",
"\n",
"You are designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, you are able to generate human-like text based on the input you receive, allowing you to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.\n",
"\n",
"You are constantly learning and improving, and your capabilities are constantly evolving. You are able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. You have access to some personalized information provided by the human in the Context section below. Additionally, you are able to generate your own text based on the input you receive, allowing you to engage in discussions and provide explanations and descriptions on a wide range of topics.\n",
"\n",
"Overall, you are a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether the human needs help with a specific question or just wants to have a conversation about a particular topic, you are here to assist.\n",
"\n",
"Context:\n",
"{'Deven': 'Deven is working on a hackathon project with Sam to add more complex memory structures to Langchain.', 'Sam': 'Sam is working on a hackathon project with Deven to add more complex memory structures to Langchain.', 'Langchain': 'Langchain is a project that seeks to add more complex memory structures.', 'Key-Value Store': ''}\n",
"\n",
"Current conversation:\n",
"Human: Deven & Sam are working on a hackathon project\n",
"AI: That sounds like a great project! What kind of project are they working on?\n",
"Human: They are trying to add more complex memory structures to Langchain\n",
"AI: That sounds like an interesting project! What kind of memory structures are they trying to add?\n",
"Last line:\n",
"Human: They are adding in a key-value store for entities mentioned so far in the conversation.\n",
"You:\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"' That sounds like a great idea! How will the key-value store work?'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"conversation.predict(input=\"They are adding in a key-value store for entities mentioned so far in the conversation.\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "56cfd4ba",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new ConversationChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mYou are an assistant to a human, powered by a large language model trained by OpenAI.\n",
"\n",
"You are designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, you are able to generate human-like text based on the input you receive, allowing you to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.\n",
"\n",
"You are constantly learning and improving, and your capabilities are constantly evolving. You are able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. You have access to some personalized information provided by the human in the Context section below. Additionally, you are able to generate your own text based on the input you receive, allowing you to engage in discussions and provide explanations and descriptions on a wide range of topics.\n",
"\n",
"Overall, you are a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether the human needs help with a specific question or just wants to have a conversation about a particular topic, you are here to assist.\n",
"\n",
"Context:\n",
"{'Deven': 'Deven is working on a hackathon project with Sam to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation.', 'Sam': 'Sam is working on a hackathon project with Deven to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation.'}\n",
"\n",
"Current conversation:\n",
"Human: Deven & Sam are working on a hackathon project\n",
"AI: That sounds like a great project! What kind of project are they working on?\n",
"Human: They are trying to add more complex memory structures to Langchain\n",
"AI: That sounds like an interesting project! What kind of memory structures are they trying to add?\n",
"Human: They are adding in a key-value store for entities mentioned so far in the conversation.\n",
"AI: That sounds like a great idea! How will the key-value store work?\n",
"Last line:\n",
"Human: What do you know about Deven & Sam?\n",
"You:\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"' Deven and Sam are working on a hackathon project to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation. They seem to be very motivated and passionate about their project, and are working hard to make it a success.'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"conversation.predict(input=\"What do you know about Deven & Sam?\")"
]
},
{
"cell_type": "markdown",
"id": "4e6df549",
"metadata": {},
"source": [
"## Inspecting the memory store\n",
"We can also inspect the memory store directly. In the following examaples, we look at it directly, and then go through some examples of adding information and watch how it changes."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "038b4d3f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'Deven': 'Deven is working on a hackathon project with Sam to add more '\n",
" 'complex memory structures to Langchain, including a key-value store '\n",
" 'for entities mentioned so far in the conversation.',\n",
" 'Key-Value Store': 'Key-Value Store: A data structure that stores values '\n",
" 'associated with a unique key, allowing for efficient '\n",
" 'retrieval of values. Deven and Sam are adding a key-value '\n",
" 'store for entities mentioned so far in the conversation.',\n",
" 'Langchain': 'Langchain is a project that seeks to add more complex memory '\n",
" 'structures, including a key-value store for entities mentioned '\n",
" 'so far in the conversation.',\n",
" 'Sam': 'Sam is working on a hackathon project with Deven to add more complex '\n",
" 'memory structures to Langchain, including a key-value store for '\n",
" 'entities mentioned so far in the conversation.'}\n"
]
}
],
"source": [
"from pprint import pprint\n",
"pprint(conversation.memory.store)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "2df4800e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new ConversationChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mYou are an assistant to a human, powered by a large language model trained by OpenAI.\n",
"\n",
"You are designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, you are able to generate human-like text based on the input you receive, allowing you to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.\n",
"\n",
"You are constantly learning and improving, and your capabilities are constantly evolving. You are able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. You have access to some personalized information provided by the human in the Context section below. Additionally, you are able to generate your own text based on the input you receive, allowing you to engage in discussions and provide explanations and descriptions on a wide range of topics.\n",
"\n",
"Overall, you are a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether the human needs help with a specific question or just wants to have a conversation about a particular topic, you are here to assist.\n",
"\n",
"Context:\n",
"{'Daimon': '', 'Sam': 'Sam is working on a hackathon project with Deven to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation.'}\n",
"\n",
"Current conversation:\n",
"Human: They are trying to add more complex memory structures to Langchain\n",
"AI: That sounds like an interesting project! What kind of memory structures are they trying to add?\n",
"Human: They are adding in a key-value store for entities mentioned so far in the conversation.\n",
"AI: That sounds like a great idea! How will the key-value store work?\n",
"Human: What do you know about Deven & Sam?\n",
"AI: Deven and Sam are working on a hackathon project to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation. They seem to be very motivated and passionate about their project, and are working hard to make it a success.\n",
"Last line:\n",
"Human: Sam is the founder of a company called Daimon.\n",
"You:\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"\"\\nThat's impressive! It sounds like Sam is a very successful entrepreneur. What kind of company is Daimon?\""
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"conversation.predict(input=\"Sam is the founder of a company called Daimon.\")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "ebe9e36f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'Daimon': 'Daimon is a company founded by Sam.',\n",
" 'Deven': 'Deven is working on a hackathon project with Sam to add more '\n",
" 'complex memory structures to Langchain, including a key-value store '\n",
" 'for entities mentioned so far in the conversation.',\n",
" 'Key-Value Store': 'Key-Value Store: A data structure that stores values '\n",
" 'associated with a unique key, allowing for efficient '\n",
" 'retrieval of values. Deven and Sam are adding a key-value '\n",
" 'store for entities mentioned so far in the conversation.',\n",
" 'Langchain': 'Langchain is a project that seeks to add more complex memory '\n",
" 'structures, including a key-value store for entities mentioned '\n",
" 'so far in the conversation.',\n",
" 'Sam': 'Sam is working on a hackathon project with Deven to add more complex '\n",
" 'memory structures to Langchain, including a key-value store for '\n",
" 'entities mentioned so far in the conversation. He is also the founder '\n",
" 'of a company called Daimon.'}\n"
]
}
],
"source": [
"from pprint import pprint\n",
"pprint(conversation.memory.store)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "dd547144",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new ConversationChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mYou are an assistant to a human, powered by a large language model trained by OpenAI.\n",
"\n",
"You are designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, you are able to generate human-like text based on the input you receive, allowing you to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.\n",
"\n",
"You are constantly learning and improving, and your capabilities are constantly evolving. You are able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. You have access to some personalized information provided by the human in the Context section below. Additionally, you are able to generate your own text based on the input you receive, allowing you to engage in discussions and provide explanations and descriptions on a wide range of topics.\n",
"\n",
"Overall, you are a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether the human needs help with a specific question or just wants to have a conversation about a particular topic, you are here to assist.\n",
"\n",
"Context:\n",
"{'Sam': 'Sam is working on a hackathon project with Deven to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation. He is also the founder of a company called Daimon.', 'Daimon': 'Daimon is a company founded by Sam.'}\n",
"\n",
"Current conversation:\n",
"Human: They are adding in a key-value store for entities mentioned so far in the conversation.\n",
"AI: That sounds like a great idea! How will the key-value store work?\n",
"Human: What do you know about Deven & Sam?\n",
"AI: Deven and Sam are working on a hackathon project to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation. They seem to be very motivated and passionate about their project, and are working hard to make it a success.\n",
"Human: Sam is the founder of a company called Daimon.\n",
"AI: \n",
"That's impressive! It sounds like Sam is a very successful entrepreneur. What kind of company is Daimon?\n",
"Last line:\n",
"Human: What do you know about Sam?\n",
"You:\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"' Sam is the founder of a company called Daimon. He is also working on a hackathon project with Deven to add more complex memory structures to Langchain, including a key-value store for entities mentioned so far in the conversation. He seems to be very motivated and passionate about his project, and is working hard to make it a success.'"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"conversation.predict(input=\"What do you know about Sam?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e00463b5",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -7,6 +7,9 @@ The examples here all highlight how to use memory in different ways.
`ChatGPT Clone <./examples/chatgpt_clone.html>`_: How to recreate ChatGPT with LangChain prompting + memory components.
`Entity Memory <./examples/entity_summary_memory.html>`_: How to use a type of memory that organizes information by entity.
`Adding Memory to Multi-Input Chain <./examples/adding_memory_chain_multiple_inputs.html>`_: How to add a memory component to any multiple input chain.
`Conversational Memory Customization <./examples/conversational_customization.html>`_: How to customize existing conversation memory components.

View File

@@ -12,3 +12,8 @@ There are a few different ways to accomplish this:
- Summary: This involves summarizing previous conversations and passing that summary in, instead of the raw dialogue itself. Compared to `Buffer`, this compresses information, meaning it is more lossy but also less likely to run into context length limits.
- Combination: A combination of the above two approaches, where you compute a summary but also pass in some previous interactions directly!
## Entity Memory
A more complex form of memory is remembering information about specific entities in the conversation.
This is a more direct and organized way of remembering information over time.
Putting it in a more structured form also has the benefit of allowing easy inspection of what is known about specific entities.
For a guide on how to use this type of memory, see [this notebook](./examples/entity_summary_memory.ipynb).
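
To make the idea concrete, here is a minimal sketch of entity memory as a plain key-value store. It is an illustration only: `ToyEntityStore`, `remember`, and `context_for` are hypothetical names, not part of LangChain, and the real entity memory relies on an LLM to extract entities and update their summaries, whereas this sketch stores facts by hand and matches entity names by substring.

```python
from collections import defaultdict


class ToyEntityStore:
    """Hypothetical illustration only; not LangChain's entity memory class."""

    def __init__(self):
        # Maps an entity name to the list of facts recorded about it.
        self.store = defaultdict(list)

    def remember(self, entity, fact):
        # Skip exact duplicates so repeated mentions do not bloat the store.
        if fact not in self.store[entity]:
            self.store[entity].append(fact)

    def context_for(self, text):
        # Return what is known about every stored entity named in the text.
        return {
            entity: " ".join(facts)
            for entity, facts in self.store.items()
            if entity in text
        }


memory = ToyEntityStore()
memory.remember("Sam", "Sam is working on a hackathon project with Deven.")
memory.remember("Sam", "Sam is the founder of a company called Daimon.")
memory.remember("Daimon", "Daimon is a company founded by Sam.")

# Only entities mentioned in the input are returned as context.
print(memory.context_for("What do you know about Sam?"))
```

Because the store is an ordinary dictionary, it can be printed and inspected at any point, which is the "easy inspection" benefit described above.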

View File

@@ -7,7 +7,7 @@ Let's suppose we want the LLM to generate English language explanations of a fun
LangChain provides a set of default prompt templates that can be used to generate prompts for a variety of tasks. However, there may be cases where the default prompt templates do not meet your needs. For example, you may want to create a prompt template with specific dynamic instructions for your language model. In such cases, you can create a custom prompt template.
:::{note}
Take a look at the current set of default prompt templates [here](../prompt_templates.md).
Take a look at the current set of default prompt templates [here](../getting_started.md).
:::
<!-- TODO(shreya): Add correct link here. -->
@@ -34,7 +34,7 @@ Next, we'll create a custom prompt template that takes in the function name as i
```python
from langchain.prompts import BasePromptTemplate
from pydantic import BaseModel
from pydantic import BaseModel, validator
class FunctionExplainerPromptTemplate(BasePromptTemplate, BaseModel):
@@ -54,7 +54,7 @@ class FunctionExplainerPromptTemplate(BasePromptTemplate, BaseModel):
# Generate the prompt to be sent to the language model
prompt = f"""
Given the function name and source code, generate an English language explanation of the function.
Function Name: {kwargs["function_name"]}
Function Name: {kwargs["function_name"].__name__}
Source Code:
{source_code}
Explanation:

View File

@@ -23,7 +23,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 1,
"id": "8244ff60",
"metadata": {},
"outputs": [],
@@ -48,6 +48,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain.prompts.example_selector import LengthBasedExampleSelector"
]
},
@@ -75,8 +76,12 @@
"metadata": {},
"outputs": [],
"source": [
"example_prompt = PromptTemplate(\n",
" input_variables=[\"input\", \"output\"],\n",
" template=\"Input: {input}\\nOutput: {output}\",\n",
")\n",
"example_selector = LengthBasedExampleSelector(\n",
" # These are the examples is has available to choose from.\n",
" # These are the examples it has available to choose from.\n",
" examples=examples, \n",
" # This is the PromptTemplate being used to format the examples.\n",
" example_prompt=example_prompt, \n",
@@ -434,10 +439,242 @@
"print(similar_prompt.format(adjective=\"worried\"))"
]
},
{
"cell_type": "markdown",
"id": "4aaeed2f",
"metadata": {},
"source": [
"## NGram Overlap ExampleSelector\n",
"\n",
"The NGramOverlapExampleSelector selects and orders examples based on which examples are most similar to the input, according to an ngram overlap score. The ngram overlap score is a float between 0.0 and 1.0, inclusive. \n",
"\n",
"The selector allows for a threshold score to be set. Examples with an ngram overlap score less than or equal to the threshold are excluded. The threshold is set to -1.0, by default, so will not exclude any examples, only reorder them. Setting the threshold to 0.0 will exclude examples that have no ngram overlaps with the input.\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "9cbc0acc",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain.prompts.example_selector.ngram_overlap import NGramOverlapExampleSelector"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "4f318f4b",
"metadata": {},
"outputs": [],
"source": [
"# These are examples of a fictional translation task.\n",
"examples = [\n",
" {\"input\": \"See Spot run.\", \"output\": \"Ver correr a Spot.\"},\n",
" {\"input\": \"My dog barks.\", \"output\": \"Mi perro ladra.\"},\n",
" {\"input\": \"Spot can run.\", \"output\": \"Spot puede correr.\"},\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "bf75e0fe",
"metadata": {},
"outputs": [],
"source": [
"example_prompt = PromptTemplate(\n",
" input_variables=[\"input\", \"output\"],\n",
" template=\"Input: {input}\\nOutput: {output}\",\n",
")\n",
"example_selector = NGramOverlapExampleSelector(\n",
" # These are the examples it has available to choose from.\n",
" examples=examples, \n",
" # This is the PromptTemplate being used to format the examples.\n",
" example_prompt=example_prompt, \n",
" # This is the threshold, at which selector stops.\n",
" # It is set to -1.0 by default.\n",
" threshold=-1.0,\n",
" # For negative threshold:\n",
" # Selector sorts examples by ngram overlap score, and excludes none.\n",
" # For threshold greater than 1.0:\n",
" # Selector excludes all examples, and returns an empty list.\n",
" # For threshold equal to 0.0:\n",
" # Selector sorts examples by ngram overlap score,\n",
" # and excludes those with no ngram overlap with input.\n",
")\n",
"dynamic_prompt = FewShotPromptTemplate(\n",
" # We provide an ExampleSelector instead of examples.\n",
" example_selector=example_selector,\n",
" example_prompt=example_prompt,\n",
" prefix=\"Give the Spanish translation of every input\",\n",
" suffix=\"Input: {sentence}\\nOutput:\", \n",
" input_variables=[\"sentence\"],\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "83fb218a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Give the Spanish translation of every input\n",
"\n",
"Input: Spot can run.\n",
"Output: Spot puede correr.\n",
"\n",
"Input: See Spot run.\n",
"Output: Ver correr a Spot.\n",
"\n",
"Input: My dog barks.\n",
"Output: Mi perro ladra.\n",
"\n",
"Input: Spot can run fast.\n",
"Output:\n"
]
}
],
"source": [
"# An example input with large ngram overlap with \"Spot can run.\"\n",
"# and no overlap with \"My dog barks.\"\n",
"print(dynamic_prompt.format(sentence=\"Spot can run fast.\"))"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "485f5307",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Give the Spanish translation of every input\n",
"\n",
"Input: Spot can run.\n",
"Output: Spot puede correr.\n",
"\n",
"Input: See Spot run.\n",
"Output: Ver correr a Spot.\n",
"\n",
"Input: Spot plays fetch.\n",
"Output: Spot juega a buscar.\n",
"\n",
"Input: My dog barks.\n",
"Output: Mi perro ladra.\n",
"\n",
"Input: Spot can run fast.\n",
"Output:\n"
]
}
],
"source": [
"# You can add examples to NGramOverlapExampleSelector as well.\n",
"new_example = {\"input\": \"Spot plays fetch.\", \"output\": \"Spot juega a buscar.\"}\n",
"\n",
"example_selector.add_example(new_example)\n",
"print(dynamic_prompt.format(sentence=\"Spot can run fast.\"))"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "606ce697",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Give the Spanish translation of every input\n",
"\n",
"Input: Spot can run.\n",
"Output: Spot puede correr.\n",
"\n",
"Input: See Spot run.\n",
"Output: Ver correr a Spot.\n",
"\n",
"Input: Spot plays fetch.\n",
"Output: Spot juega a buscar.\n",
"\n",
"Input: Spot can run fast.\n",
"Output:\n"
]
}
],
"source": [
"# You can set a threshold at which examples are excluded.\n",
"# For example, setting threshold equal to 0.0\n",
"# excludes examples with no ngram overlaps with input.\n",
"# Since \"My dog barks.\" has no ngram overlaps with \"Spot can run fast.\"\n",
"# it is excluded.\n",
"example_selector.threshold=0.0\n",
"print(dynamic_prompt.format(sentence=\"Spot can run fast.\"))"
]
},
{
"cell_type": "code",
"execution_count": 87,
"id": "7f8d72f7",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Give the Spanish translation of every input\n",
"\n",
"Input: Spot can run.\n",
"Output: Spot puede correr.\n",
"\n",
"Input: Spot plays fetch.\n",
"Output: Spot juega a buscar.\n",
"\n",
"Input: Spot can play fetch.\n",
"Output:\n"
]
}
],
"source": [
"# Setting small nonzero threshold\n",
"example_selector.threshold=0.09\n",
"print(dynamic_prompt.format(sentence=\"Spot can play fetch.\"))"
]
},
{
"cell_type": "code",
"execution_count": 88,
"id": "09633aa8",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Give the Spanish translation of every input\n",
"\n",
"Input: Spot can play fetch.\n",
"Output:\n"
]
}
],
"source": [
"# Setting threshold greater than 1.0\n",
"example_selector.threshold=1.0+1e-9\n",
"print(dynamic_prompt.format(sentence=\"Spot can play fetch.\"))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c746d6f4",
"id": "39f30097",
"metadata": {},
"outputs": [],
"source": []

View File

@@ -0,0 +1,4 @@
- input: happy
output: sad
- input: tall
output: short

View File

@@ -0,0 +1,14 @@
_type: few_shot
input_variables:
["adjective"]
prefix:
Write antonyms for the following words.
example_prompt:
input_variables:
["input", "output"]
template:
"Input: {input}\nOutput: {output}"
examples:
examples.yaml
suffix:
"Input: {adjective}\nOutput:"

View File

@@ -225,6 +225,35 @@
"!cat examples.json"
]
},
{
"cell_type": "markdown",
"id": "d3052850",
"metadata": {},
"source": [
"And here is what the same examples stored as yaml might look like."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "901385d1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"- input: happy\r\n",
" output: sad\r\n",
"- input: tall\r\n",
" output: short\r\n"
]
}
],
"source": [
"!cat examples.yaml"
]
},
{
"cell_type": "markdown",
"id": "8e300335",
@@ -236,7 +265,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 10,
"id": "e2bec0fc",
"metadata": {},
"outputs": [
@@ -267,7 +296,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 11,
"id": "98c8f356",
"metadata": {},
"outputs": [
@@ -293,6 +322,73 @@
"print(prompt.format(adjective=\"funny\"))"
]
},
{
"cell_type": "markdown",
"id": "13620324",
"metadata": {},
"source": [
"The same would work if you loaded examples from the yaml file."
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "831e5e4a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"_type: few_shot\r\n",
"input_variables:\r\n",
" [\"adjective\"]\r\n",
"prefix: \r\n",
" Write antonyms for the following words.\r\n",
"example_prompt:\r\n",
" input_variables:\r\n",
" [\"input\", \"output\"]\r\n",
" template:\r\n",
" \"Input: {input}\\nOutput: {output}\"\r\n",
"examples:\r\n",
" examples.yaml\r\n",
"suffix:\r\n",
" \"Input: {adjective}\\nOutput:\"\r\n"
]
}
],
"source": [
"!cat few_shot_prompt_yaml_examples.yaml"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "6f0a7eaa",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Write antonyms for the following words.\n",
"\n",
"Input: happy\n",
"Output: sad\n",
"\n",
"Input: tall\n",
"Output: short\n",
"\n",
"Input: funny\n",
"Output:\n"
]
}
],
"source": [
"prompt = load_prompt(\"few_shot_prompt_yaml_examples.yaml\")\n",
"print(prompt.format(adjective=\"funny\"))"
]
},
{
"cell_type": "markdown",
"id": "4870aa9d",
@@ -304,7 +400,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 14,
"id": "9d996a86",
"metadata": {},
"outputs": [
@@ -332,7 +428,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 15,
"id": "dd2c10bb",
"metadata": {},
"outputs": [
@@ -369,7 +465,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 16,
"id": "6cd781ef",
"metadata": {},
"outputs": [
@@ -400,7 +496,7 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 17,
"id": "533ab8a7",
"metadata": {},
"outputs": [
@@ -437,7 +533,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 18,
"id": "0b6dd7b8",
"metadata": {},
"outputs": [
@@ -458,7 +554,7 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 19,
"id": "76a1065d",
"metadata": {},
"outputs": [
@@ -483,7 +579,7 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 20,
"id": "744d275d",
"metadata": {},
"outputs": [
@@ -530,7 +626,7 @@
},
"vscode": {
"interpreter": {
"hash": "b1677b440931f40d89ef8be7bf03acb108ce003de0ac9b18e8d43753ea2e7103"
"hash": "8eb71adebe840dca1185e9603533462bc47eb1b1a73bf7dab2d0a8a4c932882e"
}
}
},

View File

@@ -80,6 +80,20 @@ Currently, the template should be formatted as a Python f-string. We also suppor
:::
## Load a prompt template from LangChainHub
LangChainHub contains a collection of prompts which can be loaded directly via LangChain.
```python
from langchain.prompts import load_prompt
prompt = load_prompt("lc://prompts/conversation/prompt.json")
prompt.format(history="", input="What is 1 + 1?")
```
You can read more about LangChainHub and the prompts available with it [here](https://github.com/hwchase17/langchain-hub).
## Pass few shot examples to a prompt template
Few shot examples are a set of examples that can be used to help the language model generate a better response.
@@ -155,11 +169,11 @@ from langchain.prompts.example_selector import LengthBasedExampleSelector
# These are a lot of examples of a pretend task of creating antonyms.
examples = [
{"input": "happy", "output": "sad"},
{"input": "tall", "output": "short"},
{"input": "energetic", "output": "lethargic"},
{"input": "sunny", "output": "gloomy"},
{"input": "windy", "output": "calm"},
{"word": "happy", "antonym": "sad"},
{"word": "tall", "antonym": "short"},
{"word": "energetic", "antonym": "lethargic"},
{"word": "sunny", "antonym": "gloomy"},
{"word": "windy", "antonym": "calm"},
]
# We'll use the `LengthBasedExampleSelector` to select the examples.
@@ -174,7 +188,7 @@ example_selector = LengthBasedExampleSelector(
)
# We can now use the `example_selector` to create a `FewShotPromptTemplate`.
few_shot_prompt = FewShotPromptTemplate(
dynamic_prompt = FewShotPromptTemplate(
# We provide an ExampleSelector instead of examples.
example_selector=example_selector,
example_prompt=example_prompt,
@@ -185,7 +199,7 @@ few_shot_prompt = FewShotPromptTemplate(
)
# We can now generate a prompt using the `format` method.
print(few_shot_prompt.format(input="big"))
print(dynamic_prompt.format(input="big"))
# -> Give the antonym of every input
# ->
# -> Word: happy
@@ -211,7 +225,7 @@ In contrast, if we provide a very long input, the `LengthBasedExampleSelector` w
```python
long_string = "big and huge and massive and large and gigantic and tall and much much much much much bigger than everything else"
print(dynamic_prompt.format(adjective=long_string))
print(dynamic_prompt.format(input=long_string))
# -> Give the antonym of every input
# -> Word: happy
@@ -224,4 +238,4 @@ print(dynamic_prompt.format(adjective=long_string))
<!-- TODO(shreya): Add correct link here. -->
LangChain comes with a few example selectors that you can use. For more details on how to use them, see [Example Selectors](./examples/example_selectors.ipynb).
You can create custom example selectors that select examples based on any criteria you want. For more details on how to do this, see [Creating a custom example selector](examples/custom_example_selector.ipynb).
You can create custom example selectors that select examples based on any criteria you want. For more details on how to do this, see [Creating a custom example selector](examples/custom_example_selector.ipynb).
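
As a rough illustration of how length-based selection can work, the sketch below keeps examples in order until a word budget is used up. This is an assumption-laden toy, not the library's implementation: `select_by_length`, `format_example`, and the `max_words` budget are hypothetical names, and the real `LengthBasedExampleSelector` may measure length differently.

```python
# Hypothetical sketch of length-based example selection; not LangChain code.
examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
    {"word": "energetic", "antonym": "lethargic"},
    {"word": "sunny", "antonym": "gloomy"},
    {"word": "windy", "antonym": "calm"},
]


def format_example(example):
    # Mirrors the "Word: ... / Antonym: ..." layout used in the prompt.
    return f"Word: {example['word']}\nAntonym: {example['antonym']}"


def select_by_length(examples, user_input, max_words=25):
    # Words left for examples after accounting for the user's input.
    remaining = max_words - len(user_input.split())
    selected = []
    for example in examples:
        # The cost of an example is the word count of its formatted form.
        cost = len(format_example(example).split())
        if cost > remaining:
            break
        selected.append(example)
        remaining -= cost
    return selected


long_string = (
    "big and huge and massive and large and gigantic and tall "
    "and much much much much much bigger than everything else"
)

# A short input leaves room for many examples; a long one leaves room for few.
print(len(select_by_length(examples, "big")))
print(len(select_by_length(examples, long_string)))
```

The same shape, a function that takes the candidate examples and the input and returns a subset, is a reasonable starting point for any custom selection criterion.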

View File

@@ -77,7 +77,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "42f76e43",
"metadata": {},
@@ -138,7 +137,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "ed47bb62",
"metadata": {},
@@ -196,11 +194,137 @@
"source": [
"doc_result = embeddings.embed_documents([text])"
]
},
{
"cell_type": "markdown",
"id": "fff4734f",
"metadata": {},
"source": [
"## TensorflowHub\n",
"Let's load the TensorflowHub Embedding class."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "f822104b",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings import TensorflowHubEmbeddings"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "bac84e46",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2023-01-30 23:53:01.652176: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA\n",
"To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
"2023-01-30 23:53:34.362802: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA\n",
"To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n"
]
}
],
"source": [
"embeddings = TensorflowHubEmbeddings()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "4790d770",
"metadata": {},
"outputs": [],
"source": [
"text = \"This is a test document.\""
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "f556dcdb",
"metadata": {},
"outputs": [],
"source": [
"query_result = embeddings.embed_query(text)"
]
},
{
"cell_type": "markdown",
"id": "59428e05",
"metadata": {},
"source": [
"## InstructEmbeddings\n",
"Let's load the HuggingFace instruct Embeddings class."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "92c5b61e",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings import HuggingFaceInstructEmbeddings"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "062547b9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"load INSTRUCTOR_Transformer\n",
"max_seq_length 512\n"
]
}
],
"source": [
"embeddings = HuggingFaceInstructEmbeddings(query_instruction=\"Represent the query for retrieval: \")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "e1dcc4bd",
"metadata": {},
"outputs": [],
"source": [
"text = \"This is a test document.\""
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "90f0db94",
"metadata": {},
"outputs": [],
"source": [
"query_result = embeddings.embed_query(text)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a961cdb5",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "cohere",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -214,7 +338,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
"version": "3.10.9"
},
"vscode": {
"interpreter": {

View File

@@ -10,7 +10,7 @@
"\n",
"At a high level, HyDE is an embedding technique that takes queries, generates a hypothetical answer, and then embeds that generated document and uses that as the final example. \n",
"\n",
"In order to use HyDE, we therefor need to provide a base embedding model, as well as an LLMChain that can be used to generate those documents. By default, the HyDE class comes with some default prompts to use (see the paper for more details on them), but we can also create our own."
"In order to use HyDE, we therefore need to provide a base embedding model, as well as an LLMChain that can be used to generate those documents. By default, the HyDE class comes with some default prompts to use (see the paper for more details on them), but we can also create our own."
]
},
{
@@ -21,8 +21,8 @@
"outputs": [],
"source": [
"from langchain.llms import OpenAI\n",
"from langchain.embeddings import OpenAIEmbeddings, HypotheticalDocumentEmbedder\n",
"from langchain.chains import LLMChain\n",
"from langchain.embeddings import OpenAIEmbeddings\n",
"from langchain.chains import LLMChain, HypotheticalDocumentEmbedder\n",
"from langchain.prompts import PromptTemplate"
]
},
@@ -220,7 +220,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "llm-env",
"language": "python",
"name": "python3"
},
@@ -234,7 +234,12 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.9.0 (default, Nov 15 2020, 06:25:35) \n[Clang 10.0.0 ]"
},
"vscode": {
"interpreter": {
"hash": "9dd01537e9ab68cf47cb0398488d182358f774f73101197b3bd1b5502c6ec7f9"
}
}
},
"nbformat": 4,

View File

@@ -1,13 +1,14 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "b118c9dc",
"metadata": {},
"source": [
"# Text Splitter\n",
"\n",
"When you want to deal wit long pieces of text, it is necessary to split up that text into chunks.\n",
"When you want to deal with long pieces of text, it is necessary to split up that text into chunks.\n",
"This notebook showcases several ways to do that.\n",
"\n",
"At a high level, text splitters work as follows:\n",
@@ -151,7 +152,7 @@
"metadata": {},
"source": [
"## Document creation\n",
"We can also use the text splitter to create \"Documents\" directly. Documents a way of bundling pieces of text with associated metadata so that chains can interact with them. We can also create documents with empty metadata though!\n",
"We can also use the text splitter to create \"Documents\" directly. Documents are a way of bundling pieces of text with associated metadata so that chains can interact with them. We can also create documents with empty metadata though!\n",
"\n",
"In the below example, we pass two pieces of text to get split up (we pass two just to show off the interface of splitting multiple pieces of text)."
]
@@ -486,7 +487,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
@@ -500,7 +501,12 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.9.12 (main, Mar 26 2022, 15:51:15) \n[Clang 13.1.6 (clang-1316.0.21.2)]"
},
"vscode": {
"interpreter": {
"hash": "aee8b7b246df8f9039afb4144a1f6fd8d2ca17a180786b69acc140d282b71a49"
}
}
},
"nbformat": 4,

View File

@@ -16,7 +16,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 1,
"id": "965eecee",
"metadata": {
"pycharm": {
@@ -27,12 +27,12 @@
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import ElasticVectorSearch, Pinecone, Weaviate, FAISS"
"from langchain.vectorstores import ElasticVectorSearch, Pinecone, Weaviate, FAISS, Qdrant"
]
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 2,
"id": "68481687",
"metadata": {
"pycharm": {
@@ -514,10 +514,62 @@
"docs[0]"
]
},
{
"cell_type": "markdown",
"id": "9b852079",
"metadata": {},
"source": [
"## Qdrant"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e7d74bd2",
"id": "e5ec70ce",
"metadata": {},
"outputs": [],
"source": [
"host = \"<---host name here --->\"\n",
"api_key = \"<---api key here--->\"\n",
"qdrant = Qdrant.from_texts(texts, embeddings, host=host, prefer_grpc=True, api_key=api_key)\n",
"query = \"What did the president say about Ketanji Brown Jackson\""
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "9805ad1f",
"metadata": {},
"outputs": [],
"source": [
"docs = qdrant.similarity_search(query)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "bd097a0e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Document(page_content='In state after state, new laws have been passed, not only to suppress the vote, but to subvert entire elections. \\n\\nWe cannot let this happen. \\n\\nTonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', lookup_str='', metadata={}, lookup_index=0)"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8ffd66e2",
"metadata": {},
"outputs": [],
"source": []

View File

@@ -0,0 +1,192 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Bing Search"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook goes over how to use the Bing search component.\n",
"\n",
"First, you need to set up the proper API keys and environment variables. To set it up, follow the instructions found [here](https://levelup.gitconnected.com/api-tutorial-how-to-use-bing-web-search-api-in-python-4165d5592a7e).\n",
"\n",
"Then we will need to set some environment variables."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"BING_SUBSCRIPTION_KEY\"] = \"\"\n",
"os.environ[\"BING_SEARCH_URL\"] = \"\""
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"from langchain.utilities import BingSearchAPIWrapper"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"search = BingSearchAPIWrapper()"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Thanks to the flexibility of <b>Python</b> and the powerful ecosystem of packages, the Azure CLI supports features such as autocompletion (in shells that support it), persistent credentials, JMESPath result parsing, lazy initialization, network-less unit tests, and more. Building an open-source and cross-platform Azure CLI with <b>Python</b> by Dan Taylor. <b>Python</b> releases by version number: Release version Release date Click for more. <b>Python</b> 3.11.1 Dec. 6, 2022 Download Release Notes. <b>Python</b> 3.10.9 Dec. 6, 2022 Download Release Notes. <b>Python</b> 3.9.16 Dec. 6, 2022 Download Release Notes. <b>Python</b> 3.8.16 Dec. 6, 2022 Download Release Notes. <b>Python</b> 3.7.16 Dec. 6, 2022 Download Release Notes. In this lesson, we will look at the += operator in <b>Python</b> and see how it works with several simple examples.. The operator += is a shorthand for the addition assignment operator.It adds two values and assigns the sum to a variable (left operand). W3Schools offers free online tutorials, references and exercises in all the major languages of the web. Covering popular subjects like HTML, CSS, JavaScript, <b>Python</b>, SQL, Java, and many, many more. This tutorial introduces the reader informally to the basic concepts and features of the <b>Python</b> language and system. It helps to have a <b>Python</b> interpreter handy for hands-on experience, but all examples are self-contained, so the tutorial can be read off-line as well. For a description of standard objects and modules, see The <b>Python</b> Standard ... <b>Python</b> is a general-purpose, versatile, and powerful programming language. It&#39;s a great first language because <b>Python</b> code is concise and easy to read. Whatever you want to do, <b>python</b> can do it. From web development to machine learning to data science, <b>Python</b> is the language for you. 
To install <b>Python</b> using the Microsoft Store: Go to your Start menu (lower left Windows icon), type &quot;Microsoft Store&quot;, select the link to open the store. Once the store is open, select Search from the upper-right menu and enter &quot;<b>Python</b>&quot;. Select which version of <b>Python</b> you would like to use from the results under Apps. Under the “<b>Python</b> Releases for Mac OS X” heading, click the link for the Latest <b>Python</b> 3 Release - <b>Python</b> 3.x.x. As of this writing, the latest version was <b>Python</b> 3.8.4. Scroll to the bottom and click macOS 64-bit installer to start the download. When the installer is finished downloading, move on to the next step. Step 2: Run the Installer'"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"search.run(\"python\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Number of results\n",
"You can use the `k` parameter to set the number of results"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"search = BingSearchAPIWrapper(k=1)"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Thanks to the flexibility of <b>Python</b> and the powerful ecosystem of packages, the Azure CLI supports features such as autocompletion (in shells that support it), persistent credentials, JMESPath result parsing, lazy initialization, network-less unit tests, and more. Building an open-source and cross-platform Azure CLI with <b>Python</b> by Dan Taylor.'"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"search.run(\"python\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Metadata Results"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Run query through BingSearch and return snippet, title, and link metadata.\n",
"\n",
"- Snippet: The description of the result.\n",
"- Title: The title of the result.\n",
"- Link: The link to the result."
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"search = BingSearchAPIWrapper()"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'snippet': 'Lady Alice. Pink Lady <b>apples</b> arent the only lady in the apple family. Lady Alice <b>apples</b> were discovered growing, thanks to bees pollinating, in Washington. They are smaller and slightly more stout in appearance than other varieties. Their skin color appears to have red and yellow stripes running from stem to butt.',\n",
" 'title': '25 Types of Apples - Jessica Gavin',\n",
" 'link': 'https://www.jessicagavin.com/types-of-apples/'},\n",
" {'snippet': '<b>Apples</b> can do a lot for you, thanks to plant chemicals called flavonoids. And they have pectin, a fiber that breaks down in your gut. If you take off the apples skin before eating it, you won ...',\n",
" 'title': 'Apples: Nutrition &amp; Health Benefits - WebMD',\n",
" 'link': 'https://www.webmd.com/food-recipes/benefits-apples'},\n",
" {'snippet': '<b>Apples</b> boast many vitamins and minerals, though not in high amounts. However, <b>apples</b> are usually a good source of vitamin C. Vitamin C. Also called ascorbic acid, this vitamin is a common ...',\n",
" 'title': 'Apples 101: Nutrition Facts and Health Benefits',\n",
" 'link': 'https://www.healthline.com/nutrition/foods/apples'},\n",
" {'snippet': 'Weight management. The fibers in <b>apples</b> can slow digestion, helping one to feel greater satisfaction after eating. After following three large prospective cohorts of 133,468 men and women for 24 years, researchers found that higher intakes of fiber-rich fruits with a low glycemic load, particularly <b>apples</b> and pears, were associated with the least amount of weight gain over time.',\n",
" 'title': 'Apples | The Nutrition Source | Harvard T.H. Chan School of Public Health',\n",
" 'link': 'https://www.hsph.harvard.edu/nutritionsource/food-features/apples/'}]"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"search.results(\"apples\", 5)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
},
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -16,19 +16,19 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 1,
"id": "34bb5968",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"GOOGLE_CSE_ID\"] = \n",
"os.environ[\"GOOGLE_API_KEY\"] = "
"os.environ[\"GOOGLE_CSE_ID\"] = \"\"\n",
"os.environ[\"GOOGLE_API_KEY\"] = \"\""
]
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 2,
"id": "ac4910f8",
"metadata": {},
"outputs": [],
@@ -38,7 +38,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 3,
"id": "84b8f773",
"metadata": {},
"outputs": [],
@@ -48,17 +48,17 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 4,
"id": "068991a6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'STATE OF HAWAII. 1 Child\\'s First Name. (Type or print). 2. Sex. BARACK. 3. This Birth. CERTIFICATE OF LIVE BIRTH. FILE. NUMBER 151 le. lb. Middle Name. Barack Hussein Obama II is an American politician who served as the 44th president of the United States from 2009 to 2017. A member of the Democratic Party,\\xa0... First Lady Michelle LaVaughn Robinson Obama is a lawyer, writer, and the wife of the 44th President, Barack Obama. She is the first African-American First\\xa0... Barack Obama, in full Barack Hussein Obama II, (born August 4, 1961, Honolulu, Hawaii, U.S.), 44th president of the United States (200917) and the first\\xa0... Aug 18, 2017 ... It took him several seconds and multiple clues to remember former President Barack Obama\\'s first name. Miller knew that every answer had to\\xa0... Feb 9, 2015 ... Michael Jordan misspelled Barack Obama\\'s first name on 50th-birthday gift ... Knowing Obama is a Chicagoan and huge basketball fan,\\xa0... His full name is Barack Hussein Obama II. Since the “II” is simply because he was named for his father, his last name is Obama. Jan 16, 2007 ... 4, 1961, in Honolulu. His first name means \"one who is blessed\" in Swahili. While Obama\\'s father, Barack Hussein Obama Sr., was from Kenya, his\\xa0... Jan 19, 2017 ... Hopeful parents named their sons for the first Black president, whose name is a variation of the Hebrew name Baruch, which means “blessed”\\xa0... Feb 27, 2020 ... President Barack Obama was born Barack Hussein Obama, II, as shown here on his birth certificate here . As reported by Reuters here , his\\xa0...'"
"'1 Child\\'s First Name. 2. 6. 7d. Street Address. 71. (Type or print). BARACK. Sex. 3. This Birth. 4. If Twin or Triplet,. Was Child Born. Barack Hussein Obama II is an American retired politician who served as the 44th president of the United States from 2009 to 2017. His full name is Barack Hussein Obama II. Since the “II” is simply because he was named for his father, his last name is Obama. Feb 9, 2015 ... Michael Jordan misspelled Barack Obama\\'s first name on 50th-birthday gift ... Knowing Obama is a Chicagoan and huge basketball fan,\\xa0... Aug 18, 2017 ... It took him several seconds and multiple clues to remember former President Barack Obama\\'s first name. Miller knew that every answer had to end\\xa0... First Lady Michelle LaVaughn Robinson Obama is a lawyer, writer, and the wife of the 44th President, Barack Obama. She is the first African-American First\\xa0... Barack Obama, in full Barack Hussein Obama II, (born August 4, 1961, Honolulu, Hawaii, U.S.), 44th president of the United States (200917) and the first\\xa0... When Barack Obama was elected president in 2008, he became the first African American to hold ... The Middle East remained a key foreign policy challenge. Feb 27, 2020 ... President Barack Obama was born Barack Hussein Obama, II, as shown here on his birth certificate here . As reported by Reuters here , his\\xa0... Jan 16, 2007 ... 4, 1961, in Honolulu. His first name means \"one who is blessed\" in Swahili. While Obama\\'s father, Barack Hussein Obama Sr., was from Kenya, his\\xa0...'"
]
},
"execution_count": 7,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
@@ -67,13 +67,118 @@
"search.run(\"Obama's first name?\")"
]
},
{
"cell_type": "markdown",
"id": "074b7f07",
"metadata": {},
"source": [
"## Number of Results\n",
"You can use the `k` parameter to set the number of results"
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 5,
"id": "5083fbdd",
"metadata": {},
"outputs": [],
"source": [
"search = GoogleSearchAPIWrapper(k=1)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "77aaa857",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'The official home of the Python Programming Language.'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"search.run(\"python\")"
]
},
{
"cell_type": "markdown",
"id": "11c8d94f",
"metadata": {},
"source": [
"'The official home of the Python Programming Language.'"
]
},
{
"cell_type": "markdown",
"id": "73473110",
"metadata": {},
"source": [
"## Metadata Results"
]
},
{
"cell_type": "markdown",
"id": "109fe796",
"metadata": {},
"source": [
"Run query through GoogleSearch and return snippet, title, and link metadata.\n",
"\n",
"- Snippet: The description of the result.\n",
"- Title: The title of the result.\n",
"- Link: The link to the result."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "028f4cba",
"metadata": {},
"outputs": [],
"source": []
"source": [
"search = GoogleSearchAPIWrapper()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "4d8f734f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'snippet': 'Discover the innovative world of Apple and shop everything iPhone, iPad, Apple Watch, Mac, and Apple TV, plus explore accessories, entertainment,\\xa0...',\n",
" 'title': 'Apple',\n",
" 'link': 'https://www.apple.com/'},\n",
" {'snippet': \"Jul 10, 2022 ... Whether or not you're up on your apple trivia, no doubt you know how delicious this popular fruit is, and how nutritious. Apples are rich in\\xa0...\",\n",
" 'title': '25 Types of Apples and What to Make With Them - Parade ...',\n",
" 'link': 'https://parade.com/1330308/bethlipton/types-of-apples/'},\n",
" {'snippet': 'An apple is an edible fruit produced by an apple tree (Malus domestica). Apple trees are cultivated worldwide and are the most widely grown species in the\\xa0...',\n",
" 'title': 'Apple - Wikipedia',\n",
" 'link': 'https://en.wikipedia.org/wiki/Apple'},\n",
" {'snippet': 'Apples are a popular fruit. They contain antioxidants, vitamins, dietary fiber, and a range of other nutrients. Due to their varied nutrient content,\\xa0...',\n",
" 'title': 'Apples: Benefits, nutrition, and tips',\n",
" 'link': 'https://www.medicalnewstoday.com/articles/267290'},\n",
" {'snippet': \"An apple is a crunchy, bright-colored fruit, one of the most popular in the United States. You've probably heard the age-old saying, “An apple a day keeps\\xa0...\",\n",
" 'title': 'Apples: Nutrition & Health Benefits',\n",
" 'link': 'https://www.webmd.com/food-recipes/benefits-apples'}]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"search.results(\"apples\", 5)"
]
}
],
"metadata": {
@@ -93,6 +198,11 @@
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
},
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
}
}
},
"nbformat": 4,

57
docs/tracing.md Normal file
View File

@@ -0,0 +1,57 @@
# Tracing
By enabling tracing in your LangChain runs, you'll be able to more effectively visualize, step through, and debug your chains and agents.
First, you should install tracing and set up your environment properly.
You can use either a locally hosted version of this (uses Docker) or a cloud hosted version (in closed alpha).
If you're interested in using the hosted platform, please fill out the form [here](https://forms.gle/tRCEMSeopZf6TE3b6).
- [Locally Hosted Setup](./tracing/local_installation.md)
- [Cloud Hosted Setup](./tracing/hosted_installation.md)
## Tracing Walkthrough
When you first access the UI, you should see a page with your tracing sessions.
An initial one "default" should already be created for you.
A session is just a way to group traces together.
If you click on a session, it will take you to a page with no recorded traces that says "No Runs."
You can create a new session with the new session form.
![](tracing/homepage.png)
If we click on the `default` session, we can see that to start we have no traces stored.
![](tracing/default_empty.png)
If we now start running chains and agents with tracing enabled, we will see data show up here.
To do so, we can run [this notebook](tracing/agent_with_tracing.ipynb) as an example.
After running it, we will see an initial trace show up.
![](tracing/first_trace.png)
From here we can explore the trace at a high level by clicking on the arrow to show nested runs.
We can keep on clicking further and further down to explore deeper and deeper.
![](tracing/explore.png)
We can also click on the "Explore" button of the top level run to dive even deeper.
Here, we can see the inputs and outputs in full, as well as all the nested traces.
![](tracing/explore_trace.png)
We can keep on exploring each of these nested traces in more detail.
For example, here is the lowest level trace with the exact inputs/outputs to the LLM.
![](tracing/explore_llm.png)
## Changing Sessions
1. To initially record traces to a session other than `"default"`, you can set the `LANGCHAIN_SESSION` environment variable to the name of the session you want to record to:
```python
import os
os.environ["LANGCHAIN_HANDLER"] = "langchain"
os.environ["LANGCHAIN_SESSION"] = "my_session" # Make sure this session actually exists. You can create a new session in the UI.
```
2. To switch sessions mid-script or mid-notebook, do NOT set the `LANGCHAIN_SESSION` environment variable. Instead: `langchain.set_tracing_callback_manager(session_name="my_session")`
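
The two options above can be sketched as follows. This is a minimal illustration, not part of the library: `select_initial_session` is a hypothetical helper name introduced here, and the mid-script call is left commented out because it needs `langchain` installed and a running tracing backend.

```python
import os


def select_initial_session(session_name: str = "default") -> None:
    """Hypothetical helper: pick the session before anything from langchain is imported."""
    os.environ["LANGCHAIN_HANDLER"] = "langchain"
    if session_name == "default":
        # No override needed; make sure a stale value does not linger.
        os.environ.pop("LANGCHAIN_SESSION", None)
    else:
        # The session must already exist; it can be created in the UI.
        os.environ["LANGCHAIN_SESSION"] = session_name


select_initial_session("my_session")

# Option 2, switching later in the same script or notebook
# (call taken from the text above; do not set LANGCHAIN_SESSION in this case):
# import langchain
# langchain.set_tracing_callback_manager(session_name="my_session")
```

Keeping the environment-variable route and the callback-manager route separate mirrors the guidance above: the variable is for the initial session, the function call is for changes after startup.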

View File

@@ -0,0 +1,116 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "5371a9bb",
"metadata": {},
"source": [
"# Tracing Walkthrough"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "17c04cc6-c93d-4b6c-a033-e897577f4ed1",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"LANGCHAIN_HANDLER\"] = \"langchain\"\n",
"\n",
"## Uncomment this if using hosted setup.\n",
"\n",
"# os.environ[\"LANGCHAIN_ENDPOINT\"] = \"https://langchain-api-gateway-57eoxz8z.uc.gateway.dev\" \n",
"\n",
"## Uncomment this if you want traces to be recorded to \"my_session\" instead of default.\n",
"\n",
"# os.environ[\"LANGCHAIN_SESSION\"] = \"my_session\" \n",
"\n",
"## Better to set this environment variable in the terminal\n",
"## Uncomment this if using hosted version. Replace \"my_api_key\" with your actual API Key.\n",
"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = \"my_api_key\" \n",
"\n",
"import langchain\n",
"from langchain.agents import Tool, initialize_agent, load_tools\n",
"from langchain.llms import OpenAI"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "bfa16b79-aa4b-4d41-a067-70d1f593f667",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to use a calculator to solve this.\n",
"Action: Calculator\n",
"Action Input: 2^.123243\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mAnswer: 1.0891804557407723\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: 1.0891804557407723\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'1.0891804557407723'"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Agent run with tracing. Ensure that OPENAI_API_KEY is set appropriately to run this example.\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"tools = load_tools([\"llm-math\"], llm=llm)\n",
"agent = initialize_agent(\n",
" tools, llm, agent=\"zero-shot-react-description\", verbose=True\n",
")\n",
"\n",
"agent.run(\"What is 2 raised to .123243 power?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "25addd7f",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

Binary file not shown (73 KiB).

docs/tracing/explore.png Normal file
Binary file not shown (348 KiB).

Binary file not shown (239 KiB).

Binary file not shown (253 KiB).

Binary file not shown (117 KiB).

docs/tracing/homepage.png Normal file
Binary file not shown (94 KiB).

View File

@@ -0,0 +1,36 @@
# Cloud Hosted Setup
We offer a hosted version of tracing at [langchainplus.vercel.app](https://langchainplus.vercel.app/). You can use this to view traces from your run without having to run the server locally.
Note: we are currently only offering this to a limited number of users. The hosted platform is VERY alpha, in active development, and data might be dropped at any time. Don't depend on data being persisted in the system long term and don't log traces that may contain sensitive information. If you're interested in using the hosted platform, please fill out the form [here](https://forms.gle/tRCEMSeopZf6TE3b6).
## Installation
1. Log in to the system and click "API Key" in the top right corner. Generate a new key and keep it safe. You will need it to authenticate with the system.
## Environment Setup
After installation, you must now set up your environment to use tracing.
This can be done by setting an environment variable in your terminal by running `export LANGCHAIN_HANDLER=langchain`.
You can also do this by adding the below snippet to the top of every script. **IMPORTANT:** this must go at the VERY TOP of your script, before you import anything from `langchain`.
```python
import os
os.environ["LANGCHAIN_HANDLER"] = "langchain"
```
You will also need to set two more environment variables to specify the endpoint and your API key:
1. `LANGCHAIN_ENDPOINT` = "https://langchain-api-gateway-57eoxz8z.uc.gateway.dev"
2. `LANGCHAIN_API_KEY` - set this to the API key you generated during installation.
An example of adding all relevant environment variables is below:
```python
import os
os.environ["LANGCHAIN_HANDLER"] = "langchain"
os.environ["LANGCHAIN_ENDPOINT"] = "https://langchain-api-gateway-57eoxz8z.uc.gateway.dev"
os.environ["LANGCHAIN_API_KEY"] = "my_api_key" # Don't commit this to your repo! Better to set it in your terminal.
```
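
Since the API key is better set in the terminal than in the script, it can help to check that all three variables are present before starting a run. The sketch below is one way to do that with the standard library only; `missing_hosted_vars` and `REQUIRED_HOSTED_VARS` are hypothetical names introduced here, not part of `langchain`.

```python
import os

# Variables named in the hosted setup instructions above.
REQUIRED_HOSTED_VARS = (
    "LANGCHAIN_HANDLER",
    "LANGCHAIN_ENDPOINT",
    "LANGCHAIN_API_KEY",
)


def missing_hosted_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_HOSTED_VARS if not env.get(name)]


# Example with an explicit mapping, so nothing secret is hard-coded:
example_env = {
    "LANGCHAIN_HANDLER": "langchain",
    "LANGCHAIN_ENDPOINT": "https://langchain-api-gateway-57eoxz8z.uc.gateway.dev",
}
print(missing_hosted_vars(example_env))  # → ['LANGCHAIN_API_KEY']
```

Calling `missing_hosted_vars()` with no argument checks the real process environment instead of the example mapping.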

View File

@@ -0,0 +1,35 @@
# Locally Hosted Setup
This page contains instructions for installing and then setting up the environment to use the locally hosted version of tracing.
## Installation
1. Ensure you have Docker installed (see [Get Docker](https://docs.docker.com/get-docker/)) and that it's running.
2. Install the latest version of `langchain`: `pip install langchain` or `pip install langchain -U` to upgrade your
existing version.
3. Run `langchain-server`
    1. This will spin up the server in the terminal.
    2. Once you see the terminal
       output `langchain-langchain-frontend-1 | ➜ Local: [http://localhost:4173/](http://localhost:4173/)`, navigate
       to [http://localhost:4173/](http://localhost:4173/)
4. You should see a page with your tracing sessions. See the overview page for a walkthrough of the UI.
5. Currently, trace data is not guaranteed to be persisted between runs of `langchain-server`. If you want to
persist your data, you can mount a volume to the Docker container. See the [Docker docs](https://docs.docker.com/storage/volumes/) for more info.
6. To stop the server, press `Ctrl+C` in the terminal where you ran `langchain-server`.
## Environment Setup
After installation, you must now set up your environment to use tracing.
This can be done by setting an environment variable in your terminal by running `export LANGCHAIN_HANDLER=langchain`.
You can also do this by adding the below snippet to the top of every script. **IMPORTANT:** this must go at the VERY TOP of your script, before you import anything from `langchain`.
```python
import os
os.environ["LANGCHAIN_HANDLER"] = "langchain"
```
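
Because the ordering requirement is easy to violate in notebooks, a small guard can flag the case where `langchain` was already imported. This is a sketch under stated assumptions: `enable_tracing` is a hypothetical helper introduced here, and it only inspects `sys.modules`; it does not change how `langchain` reads the variable.

```python
import os
import sys


def enable_tracing(loaded_modules=None):
    """Set LANGCHAIN_HANDLER and report whether it was set early enough.

    Returns True if no `langchain` module had been imported yet, False otherwise.
    """
    modules = sys.modules if loaded_modules is None else loaded_modules
    imported_too_early = any(
        name == "langchain" or name.startswith("langchain.") for name in modules
    )
    os.environ["LANGCHAIN_HANDLER"] = "langchain"
    return not imported_too_early


if not enable_tracing():
    print("Warning: langchain was imported before LANGCHAIN_HANDLER was set.")
```

The optional `loaded_modules` argument exists only so the check can be exercised without importing anything; in normal use it is omitted.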

View File

@@ -4,7 +4,11 @@ from typing import Optional
from langchain.agents import MRKLChain, ReActChain, SelfAskWithSearchChain
from langchain.cache import BaseCache
from langchain.callbacks import set_default_callback_manager, set_handler
from langchain.callbacks import (
set_default_callback_manager,
set_handler,
set_tracing_callback_manager,
)
from langchain.chains import (
ConversationChain,
LLMBashChain,
@@ -68,4 +72,5 @@ __all__ = [
"QAWithSourcesChain",
"PALChain",
"set_handler",
"set_tracing_callback_manager",
]

View File

@@ -1,12 +1,13 @@
"""Interface for agents."""
from langchain.agents.agent import Agent, AgentExecutor
from langchain.agents.conversational.base import ConversationalAgent
from langchain.agents.initialize import initialize_agent
from langchain.agents.load_tools import get_all_tool_names, load_tools
from langchain.agents.loading import initialize_agent
from langchain.agents.loading import load_agent
from langchain.agents.mrkl.base import MRKLChain, ZeroShotAgent
from langchain.agents.react.base import ReActChain, ReActTextWorldAgent
from langchain.agents.self_ask_with_search.base import SelfAskWithSearchChain
from langchain.agents.tools import Tool
from langchain.agents.tools import Tool, tool
__all__ = [
"MRKLChain",
@@ -15,10 +16,12 @@ __all__ = [
"AgentExecutor",
"Agent",
"Tool",
"tool",
"initialize_agent",
"ZeroShotAgent",
"ReActTextWorldAgent",
"load_tools",
"get_all_tool_names",
"ConversationalAgent",
"load_agent",
]

View File

@@ -1,10 +1,13 @@
"""Chain that takes in an input and produces an action and action input."""
from __future__ import annotations
import json
import logging
from abc import abstractmethod
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple, Union
import yaml
from pydantic import BaseModel, root_validator
from langchain.agents.tools import Tool
@@ -30,6 +33,7 @@ class Agent(BaseModel):
"""
llm_chain: LLMChain
allowed_tools: Optional[List[str]] = None
return_values: List[str] = ["output"]
@abstractmethod
@@ -44,6 +48,29 @@ class Agent(BaseModel):
def _stop(self) -> List[str]:
return [f"\n{self.observation_prefix}"]
def _construct_scratchpad(
self, intermediate_steps: List[Tuple[AgentAction, str]]
) -> str:
"""Construct the scratchpad that lets the agent continue its thought process."""
thoughts = ""
for action, observation in intermediate_steps:
thoughts += action.log
thoughts += f"\n{self.observation_prefix}{observation}\n{self.llm_prefix}"
return thoughts
def _get_next_action(self, full_inputs: Dict[str, str]) -> AgentAction:
full_output = self.llm_chain.predict(**full_inputs)
parsed_output = self._extract_tool_and_input(full_output)
while parsed_output is None:
full_output = self._fix_text(full_output)
full_inputs["agent_scratchpad"] += full_output
output = self.llm_chain.predict(**full_inputs)
full_output += output
parsed_output = self._extract_tool_and_input(full_output)
return AgentAction(
tool=parsed_output[0], tool_input=parsed_output[1], log=full_output
)
def plan(
self, intermediate_steps: List[Tuple[AgentAction, str]], **kwargs: Any
) -> Union[AgentAction, AgentFinish]:
@@ -57,24 +84,14 @@ class Agent(BaseModel):
Returns:
Action specifying what tool to use.
"""
thoughts = ""
for action, observation in intermediate_steps:
thoughts += action.log
thoughts += f"\n{self.observation_prefix}{observation}\n{self.llm_prefix}"
thoughts = self._construct_scratchpad(intermediate_steps)
new_inputs = {"agent_scratchpad": thoughts, "stop": self._stop}
full_inputs = {**kwargs, **new_inputs}
full_output = self.llm_chain.predict(**full_inputs)
parsed_output = self._extract_tool_and_input(full_output)
while parsed_output is None:
full_output = self._fix_text(full_output)
full_inputs["agent_scratchpad"] += full_output
output = self.llm_chain.predict(**full_inputs)
full_output += output
parsed_output = self._extract_tool_and_input(full_output)
tool, tool_input = parsed_output
if tool == self.finish_tool_name:
return AgentFinish({"output": tool_input}, full_output)
return AgentAction(tool, tool_input, full_output)
action = self._get_next_action(full_inputs)
if action.tool == self.finish_tool_name:
return AgentFinish({"output": action.tool_input}, action.log)
return action
def prepare_for_new_call(self) -> None:
"""Prepare the agent for new call, if needed."""
@@ -146,7 +163,8 @@ class Agent(BaseModel):
prompt=cls.create_prompt(tools),
callback_manager=callback_manager,
)
return cls(llm_chain=llm_chain, **kwargs)
tool_names = [tool.name for tool in tools]
return cls(llm_chain=llm_chain, allowed_tools=tool_names, **kwargs)
def return_stopped_response(
self,
@@ -192,6 +210,50 @@ class Agent(BaseModel):
f"got {early_stopping_method}"
)
@property
@abstractmethod
def _agent_type(self) -> str:
"""Return Identifier of agent type."""
def dict(self, **kwargs: Any) -> Dict:
"""Return dictionary representation of agent."""
_dict = super().dict()
_dict["_type"] = self._agent_type
return _dict
def save(self, file_path: Union[Path, str]) -> None:
"""Save the agent.
Args:
file_path: Path to file to save the agent to.
Example:
.. code-block:: python
# If working with agent executor
agent.agent.save(file_path="path/agent.yaml")
"""
# Convert file to Path object.
if isinstance(file_path, str):
save_path = Path(file_path)
else:
save_path = file_path
directory_path = save_path.parent
directory_path.mkdir(parents=True, exist_ok=True)
# Fetch dictionary to save
agent_dict = self.dict()
if save_path.suffix == ".json":
with open(file_path, "w") as f:
json.dump(agent_dict, f, indent=4)
elif save_path.suffix == ".yaml":
with open(file_path, "w") as f:
yaml.dump(agent_dict, f, default_flow_style=False)
else:
raise ValueError(f"{save_path} must be json or yaml")
class AgentExecutor(Chain, BaseModel):
"""Consists of an agent using tools."""
@@ -199,7 +261,7 @@ class AgentExecutor(Chain, BaseModel):
agent: Agent
tools: List[Tool]
return_intermediate_steps: bool = False
max_iterations: Optional[int] = None
max_iterations: Optional[int] = 15
early_stopping_method: str = "force"
@classmethod
@@ -215,6 +277,31 @@ class AgentExecutor(Chain, BaseModel):
agent=agent, tools=tools, callback_manager=callback_manager, **kwargs
)
@root_validator()
def validate_tools(cls, values: Dict) -> Dict:
"""Validate that tools are compatible with agent."""
agent = values["agent"]
tools = values["tools"]
if agent.allowed_tools is not None:
if set(agent.allowed_tools) != set([tool.name for tool in tools]):
raise ValueError(
f"Allowed tools ({agent.allowed_tools}) different than "
f"provided tools ({[tool.name for tool in tools]})"
)
return values
def save(self, file_path: Union[Path, str]) -> None:
"""Raise error - saving not supported for Agent Executors."""
raise ValueError(
"Saving not supported for agent executors. "
"If you are trying to save the agent, please use the "
"`.save_agent(...)`"
)
def save_agent(self, file_path: Union[Path, str]) -> None:
"""Save the underlying agent."""
return self.agent.save(file_path)
@property
def input_keys(self) -> List[str]:
"""Return the input keys.
@@ -241,8 +328,9 @@ class AgentExecutor(Chain, BaseModel):
return iterations < self.max_iterations
def _return(self, output: AgentFinish, intermediate_steps: list) -> Dict[str, Any]:
if self.verbose:
self.callback_manager.on_agent_finish(output, color="green")
self.callback_manager.on_agent_finish(
output, color="green", verbose=self.verbose
)
final_output = output.return_values
if self.return_intermediate_steps:
final_output["intermediate_steps"] = intermediate_steps
@@ -272,35 +360,35 @@ class AgentExecutor(Chain, BaseModel):
# Otherwise we lookup the tool
if output.tool in name_to_tool_map:
tool = name_to_tool_map[output.tool]
if self.verbose:
self.callback_manager.on_tool_start(
{"name": str(tool.func)[:60] + "..."}, output, color="green"
)
self.callback_manager.on_tool_start(
{"name": str(tool.func)[:60] + "..."},
output,
color="green",
verbose=self.verbose,
)
try:
# We then call the tool on the tool input to get an observation
observation = tool.func(output.tool_input)
color = color_mapping[output.tool]
return_direct = tool.return_direct
except Exception as e:
if self.verbose:
self.callback_manager.on_tool_error(e)
except (KeyboardInterrupt, Exception) as e:
self.callback_manager.on_tool_error(e, verbose=self.verbose)
raise e
else:
if self.verbose:
self.callback_manager.on_tool_start(
{"name": "N/A"}, output, color="green"
)
self.callback_manager.on_tool_start(
{"name": "N/A"}, output, color="green", verbose=self.verbose
)
observation = f"{output.tool} is not a valid tool, try another one."
color = None
return_direct = False
if self.verbose:
llm_prefix = "" if return_direct else self.agent.llm_prefix
self.callback_manager.on_tool_end(
observation,
color=color,
observation_prefix=self.agent.observation_prefix,
llm_prefix=llm_prefix,
)
llm_prefix = "" if return_direct else self.agent.llm_prefix
self.callback_manager.on_tool_end(
observation,
color=color,
observation_prefix=self.agent.observation_prefix,
llm_prefix=llm_prefix,
verbose=self.verbose,
)
intermediate_steps.append((output, observation))
if return_direct:
# Set the log to "" because we do not want to log it.
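
For reviewers, the `_construct_scratchpad` helper factored out in this file can be exercised in isolation. The sketch below is a standalone stand-in, not the library class: `AgentAction` is re-declared as a plain `NamedTuple`, and the prefix values `"Observation: "` and `"Thought:"` are assumptions chosen for illustration.

```python
from typing import List, NamedTuple, Tuple


class AgentAction(NamedTuple):
    """Stand-in for the schema type: the tool, its input, and the raw LLM log."""

    tool: str
    tool_input: str
    log: str


def construct_scratchpad(
    intermediate_steps: List[Tuple[AgentAction, str]],
    observation_prefix: str = "Observation: ",  # assumed value
    llm_prefix: str = "Thought:",  # assumed value
) -> str:
    """Concatenate each step's log and observation, ending on the LLM prefix."""
    thoughts = ""
    for action, observation in intermediate_steps:
        thoughts += action.log
        thoughts += f"\n{observation_prefix}{observation}\n{llm_prefix}"
    return thoughts


steps = [
    (
        AgentAction(
            tool="Search",
            tool_input="weather in Paris",
            log=" I should search.\nAction: Search\nAction Input: weather in Paris",
        ),
        "Sunny, 21C",
    )
]
scratchpad = construct_scratchpad(steps)
```

With no intermediate steps the scratchpad is the empty string, so the first LLM call sees only the prompt.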

View File

@@ -18,6 +18,11 @@ class ConversationalAgent(Agent):
ai_prefix: str = "AI"
@property
def _agent_type(self) -> str:
"""Return Identifier of agent type."""
return "conversational-react-description"
@property
def observation_prefix(self) -> str:
"""Prefix to append the observation with."""
@@ -70,15 +75,15 @@ class ConversationalAgent(Agent):
return self.ai_prefix
def _extract_tool_and_input(self, llm_output: str) -> Optional[Tuple[str, str]]:
if f"{self.ai_prefix}: " in llm_output:
return self.ai_prefix, llm_output.split(f"{self.ai_prefix}: ")[-1]
if f"{self.ai_prefix}:" in llm_output:
return self.ai_prefix, llm_output.split(f"{self.ai_prefix}:")[-1].strip()
regex = r"Action: (.*?)\nAction Input: (.*)"
match = re.search(regex, llm_output)
if not match:
raise ValueError(f"Could not parse LLM output: `{llm_output}`")
action = match.group(1)
action_input = match.group(2)
return action, action_input.strip(" ").strip('"')
return action.strip(), action_input.strip(" ").strip('"')
@classmethod
def from_llm_and_tools(
@@ -86,18 +91,29 @@ class ConversationalAgent(Agent):
llm: BaseLLM,
tools: List[Tool],
callback_manager: Optional[BaseCallbackManager] = None,
prefix: str = PREFIX,
suffix: str = SUFFIX,
ai_prefix: str = "AI",
human_prefix: str = "Human",
input_variables: Optional[List[str]] = None,
**kwargs: Any,
) -> Agent:
"""Construct an agent from an LLM and tools."""
cls._validate_tools(tools)
prompt = cls.create_prompt(
tools, ai_prefix=ai_prefix, human_prefix=human_prefix
tools,
ai_prefix=ai_prefix,
human_prefix=human_prefix,
prefix=prefix,
suffix=suffix,
input_variables=input_variables,
)
llm_chain = LLMChain(
llm=llm,
prompt=prompt,
callback_manager=callback_manager,
)
return cls(llm_chain=llm_chain, ai_prefix=ai_prefix, **kwargs)
tool_names = [tool.name for tool in tools]
return cls(
llm_chain=llm_chain, allowed_tools=tool_names, ai_prefix=ai_prefix, **kwargs
)
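
The parsing change in `_extract_tool_and_input` (matching `AI:` without a trailing space and stripping whitespace from the result) can be checked with a standalone copy of the logic. This is a sketch written as a free function with `ai_prefix` as a parameter; it is not the method itself.

```python
import re
from typing import Tuple


def extract_tool_and_input(llm_output: str, ai_prefix: str = "AI") -> Tuple[str, str]:
    """Return (tool, tool_input) parsed from raw LLM output."""
    # A direct reply: everything after the last "<prefix>:" is the answer.
    if f"{ai_prefix}:" in llm_output:
        return ai_prefix, llm_output.split(f"{ai_prefix}:")[-1].strip()
    # Otherwise expect an "Action: ... / Action Input: ..." pair.
    regex = r"Action: (.*?)\nAction Input: (.*)"
    match = re.search(regex, llm_output)
    if not match:
        raise ValueError(f"Could not parse LLM output: `{llm_output}`")
    action = match.group(1)
    action_input = match.group(2)
    return action.strip(), action_input.strip(" ").strip('"')


# No space after the colon: the old `"AI: "` check would have missed this.
direct = extract_tool_and_input("AI:Hello there")

# Trailing space after the tool name is now stripped.
tool_call = extract_tool_and_input('Action: Search \nAction Input: "llamas"')
```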

View File

@@ -0,0 +1,72 @@
"""Load agent."""
from typing import Any, List, Optional
from langchain.agents.agent import AgentExecutor
from langchain.agents.loading import AGENT_TO_CLASS, load_agent
from langchain.agents.tools import Tool
from langchain.callbacks.base import BaseCallbackManager
from langchain.llms.base import BaseLLM
def initialize_agent(
tools: List[Tool],
llm: BaseLLM,
agent: Optional[str] = None,
callback_manager: Optional[BaseCallbackManager] = None,
agent_path: Optional[str] = None,
agent_kwargs: Optional[dict] = None,
**kwargs: Any,
) -> AgentExecutor:
"""Load agent given tools and LLM.
Args:
tools: List of tools this agent has access to.
llm: Language model to use as the agent.
agent: The agent to use. Valid options are:
`zero-shot-react-description`
`react-docstore`
`self-ask-with-search`
`conversational-react-description`
If None and agent_path is also None, will default to
`zero-shot-react-description`.
callback_manager: CallbackManager to use. Global callback manager is used if
not provided. Defaults to None.
agent_path: Path to serialized agent to use.
agent_kwargs: Additional keyword arguments to pass to the underlying agent.
**kwargs: Additional keyword arguments to pass to the agent executor.
Returns:
An agent.
"""
if agent is None and agent_path is None:
agent = "zero-shot-react-description"
if agent is not None and agent_path is not None:
raise ValueError(
"Both `agent` and `agent_path` are specified, "
"but at most one should be specified."
)
if agent is not None:
if agent not in AGENT_TO_CLASS:
raise ValueError(
f"Got unknown agent type: {agent}. "
f"Valid types are: {AGENT_TO_CLASS.keys()}."
)
agent_cls = AGENT_TO_CLASS[agent]
agent_kwargs = agent_kwargs or {}
agent_obj = agent_cls.from_llm_and_tools(
llm, tools, callback_manager=callback_manager, **agent_kwargs
)
elif agent_path is not None:
agent_obj = load_agent(
agent_path, llm=llm, tools=tools, callback_manager=callback_manager
)
else:
raise ValueError(
"Somehow both `agent` and `agent_path` are None, "
"this should never happen."
)
return AgentExecutor.from_agent_and_tools(
agent=agent_obj,
tools=tools,
callback_manager=callback_manager,
**kwargs,
)
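
The argument handling in the new `initialize_agent` reduces to a small decision: default to `zero-shot-react-description` when neither `agent` nor `agent_path` is given, reject the case where both are given, and otherwise use whichever one is set. A minimal sketch of just that decision follows; `resolve_agent_source` and `KNOWN_AGENTS` are hypothetical names for illustration, and no agent is actually constructed.

```python
from typing import Optional, Tuple

# Stand-in for the keys of AGENT_TO_CLASS listed in the docstring above.
KNOWN_AGENTS = {
    "zero-shot-react-description",
    "react-docstore",
    "self-ask-with-search",
    "conversational-react-description",
}


def resolve_agent_source(
    agent: Optional[str] = None, agent_path: Optional[str] = None
) -> Tuple[str, str]:
    """Return ("name", agent) or ("path", agent_path) following the diff's rules."""
    if agent is None and agent_path is None:
        agent = "zero-shot-react-description"
    if agent is not None and agent_path is not None:
        raise ValueError("Both `agent` and `agent_path` are specified.")
    if agent is not None:
        if agent not in KNOWN_AGENTS:
            raise ValueError(f"Got unknown agent type: {agent}.")
        return ("name", agent)
    # Only agent_path can be set at this point.
    return ("path", str(agent_path))


default_choice = resolve_agent_source()
from_path = resolve_agent_source(agent_path="agents/my_agent.yaml")
```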

View File

@@ -24,30 +24,6 @@ def _get_python_repl() -> Tool:
)
def _get_serpapi() -> Tool:
return Tool(
"Search",
SerpAPIWrapper().run,
"A search engine. Useful for when you need to answer questions about current events. Input should be a search query.",
)
def _get_google_search() -> Tool:
return Tool(
"Google Search",
GoogleSearchAPIWrapper().run,
"A wrapper around Google Search. Useful for when you need to answer questions about current events. Input should be a search query.",
)
def _get_wolfram_alpha() -> Tool:
return Tool(
"Wolfram Alpha",
WolframAlphaAPIWrapper().run,
"A wrapper around Wolfram Alpha. Useful for when you need to answer questions about Math, Science, Technology, Culture, Society and Everyday Life. Input should be a search query.",
)
def _get_requests() -> Tool:
return Tool(
"Requests",
@@ -66,11 +42,8 @@ def _get_terminal() -> Tool:
_BASE_TOOLS = {
"python_repl": _get_python_repl,
"serpapi": _get_serpapi,
"requests": _get_requests,
"terminal": _get_terminal,
"google-search": _get_google_search,
"wolfram-alpha": _get_wolfram_alpha,
}
@@ -141,10 +114,39 @@ def _get_tmdb_api(llm: BaseLLM, **kwargs: Any) -> Tool:
)
_EXTRA_TOOLS = {
def _get_wolfram_alpha(**kwargs: Any) -> Tool:
return Tool(
"Wolfram Alpha",
WolframAlphaAPIWrapper(**kwargs).run,
"A wrapper around Wolfram Alpha. Useful for when you need to answer questions about Math, Science, Technology, Culture, Society and Everyday Life. Input should be a search query.",
)
def _get_google_search(**kwargs: Any) -> Tool:
return Tool(
"Google Search",
GoogleSearchAPIWrapper(**kwargs).run,
"A wrapper around Google Search. Useful for when you need to answer questions about current events. Input should be a search query.",
)
def _get_serpapi(**kwargs: Any) -> Tool:
return Tool(
"Search",
SerpAPIWrapper(**kwargs).run,
"A search engine. Useful for when you need to answer questions about current events. Input should be a search query.",
)
_EXTRA_LLM_TOOLS = {
"news-api": (_get_news_api, ["news_api_key"]),
"tmdb-api": (_get_tmdb_api, ["tmdb_bearer_token"]),
}
_EXTRA_OPTIONAL_TOOLS = {
"wolfram-alpha": (_get_wolfram_alpha, ["wolfram_alpha_appid"]),
"google-search": (_get_google_search, ["google_api_key", "google_cse_id"]),
"serpapi": (_get_serpapi, ["serpapi_api_key"]),
}
def load_tools(
@@ -167,10 +169,10 @@ def load_tools(
if llm is None:
raise ValueError(f"Tool {name} requires an LLM to be provided")
tools.append(_LLM_TOOLS[name](llm))
elif name in _EXTRA_TOOLS:
elif name in _EXTRA_LLM_TOOLS:
if llm is None:
raise ValueError(f"Tool {name} requires an LLM to be provided")
_get_tool_func, extra_keys = _EXTRA_TOOLS[name]
_get_llm_tool_func, extra_keys = _EXTRA_LLM_TOOLS[name]
missing_keys = set(extra_keys).difference(kwargs)
if missing_keys:
raise ValueError(
@@ -178,7 +180,12 @@ def load_tools(
f"provided: {missing_keys}"
)
sub_kwargs = {k: kwargs[k] for k in extra_keys}
tools.append(_get_tool_func(llm=llm, **sub_kwargs))
tools.append(_get_llm_tool_func(llm=llm, **sub_kwargs))
elif name in _EXTRA_OPTIONAL_TOOLS:
_get_tool_func, extra_keys = _EXTRA_OPTIONAL_TOOLS[name]
sub_kwargs = {k: kwargs[k] for k in extra_keys if k in kwargs}
tools.append(_get_tool_func(**sub_kwargs))
else:
raise ValueError(f"Got unknown tool {name}")
return tools
@@ -186,4 +193,9 @@ def load_tools(
def get_all_tool_names() -> List[str]:
"""Get a list of all possible tool names."""
return list(_BASE_TOOLS) + list(_EXTRA_TOOLS) + list(_LLM_TOOLS)
return (
list(_BASE_TOOLS)
+ list(_EXTRA_OPTIONAL_TOOLS)
+ list(_EXTRA_LLM_TOOLS)
+ list(_LLM_TOOLS)
)
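
The difference between `_EXTRA_LLM_TOOLS` and the new `_EXTRA_OPTIONAL_TOOLS` branch is how missing keys are treated: the LLM tools raise when a listed key is absent, while the optional tools forward only the keys the caller supplied. A standalone sketch of the optional branch, using a hypothetical factory that returns a plain dict instead of a `Tool`:

```python
from typing import Any, Callable, Dict, List, Tuple


def make_search(**kwargs: Any) -> Dict[str, Any]:
    """Hypothetical factory; records which keyword arguments it received."""
    return {"name": "Search", "wrapper_kwargs": kwargs}


# Same shape as _EXTRA_OPTIONAL_TOOLS: name -> (factory, accepted keys).
OPTIONAL_TOOLS: Dict[str, Tuple[Callable[..., Dict[str, Any]], List[str]]] = {
    "serpapi": (make_search, ["serpapi_api_key"]),
}


def build_optional_tool(name: str, **kwargs: Any) -> Dict[str, Any]:
    """Forward only the accepted keys that were actually passed in."""
    factory, extra_keys = OPTIONAL_TOOLS[name]
    sub_kwargs = {k: kwargs[k] for k in extra_keys if k in kwargs}
    return factory(**sub_kwargs)


without_key = build_optional_tool("serpapi")
with_key = build_optional_tool("serpapi", serpapi_api_key="dummy", unrelated=1)
```

Unrelated keyword arguments are dropped silently, so a single `load_tools(..., **kwargs)` call can carry keys for several tools at once.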

View File

@@ -1,14 +1,19 @@
"""Load agent."""
from typing import Any, List, Optional
"""Functionality for loading agents."""
import json
from pathlib import Path
from typing import Any, List, Optional, Union
from langchain.agents.agent import AgentExecutor
import yaml
from langchain.agents.agent import Agent
from langchain.agents.conversational.base import ConversationalAgent
from langchain.agents.mrkl.base import ZeroShotAgent
from langchain.agents.react.base import ReActDocstoreAgent
from langchain.agents.self_ask_with_search.base import SelfAskWithSearchAgent
from langchain.agents.tools import Tool
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains.loading import load_chain, load_chain_from_config
from langchain.llms.base import BaseLLM
from langchain.utilities.loading import try_load_from_hub
AGENT_TO_CLASS = {
"zero-shot-react-description": ZeroShotAgent,
@@ -17,43 +22,86 @@ AGENT_TO_CLASS = {
"conversational-react-description": ConversationalAgent,
}
URL_BASE = "https://raw.githubusercontent.com/hwchase17/langchain-hub/master/agents/"
def initialize_agent(
tools: List[Tool],
llm: BaseLLM,
agent: str = "zero-shot-react-description",
callback_manager: Optional[BaseCallbackManager] = None,
def _load_agent_from_tools(
config: dict, llm: BaseLLM, tools: List[Tool], **kwargs: Any
) -> Agent:
config_type = config.pop("_type")
if config_type not in AGENT_TO_CLASS:
raise ValueError(f"Loading {config_type} agent not supported")
if config_type not in AGENT_TO_CLASS:
raise ValueError(f"Loading {config_type} agent not supported")
agent_cls = AGENT_TO_CLASS[config_type]
combined_config = {**config, **kwargs}
return agent_cls.from_llm_and_tools(llm, tools, **combined_config)
def load_agent_from_config(
config: dict,
llm: Optional[BaseLLM] = None,
tools: Optional[List[Tool]] = None,
**kwargs: Any,
) -> AgentExecutor:
"""Load agent given tools and LLM.
) -> Agent:
"""Load agent from Config Dict."""
if "_type" not in config:
raise ValueError("Must specify an agent Type in config")
load_from_tools = config.pop("load_from_llm_and_tools", False)
if load_from_tools:
if llm is None:
raise ValueError(
"If `load_from_llm_and_tools` is set to True, "
"then LLM must be provided"
)
if tools is None:
raise ValueError(
"If `load_from_llm_and_tools` is set to True, "
"then tools must be provided"
)
return _load_agent_from_tools(config, llm, tools, **kwargs)
config_type = config.pop("_type")
Args:
tools: List of tools this agent has access to.
llm: Language model to use as the agent.
agent: The agent to use. Valid options are:
`zero-shot-react-description`
`react-docstore`
`self-ask-with-search`
`conversational-react-description`.
callback_manager: CallbackManager to use. Global callback manager is used if
not provided. Defaults to None.
**kwargs: Additional key word arguments to pass to the agent.
if config_type not in AGENT_TO_CLASS:
raise ValueError(f"Loading {config_type} agent not supported")
Returns:
An agent.
"""
if agent not in AGENT_TO_CLASS:
raise ValueError(
f"Got unknown agent type: {agent}. "
f"Valid types are: {AGENT_TO_CLASS.keys()}."
)
agent_cls = AGENT_TO_CLASS[agent]
agent_obj = agent_cls.from_llm_and_tools(
llm, tools, callback_manager=callback_manager
)
return AgentExecutor.from_agent_and_tools(
agent=agent_obj,
tools=tools,
callback_manager=callback_manager,
**kwargs,
)
agent_cls = AGENT_TO_CLASS[config_type]
if "llm_chain" in config:
config["llm_chain"] = load_chain_from_config(config.pop("llm_chain"))
elif "llm_chain_path" in config:
config["llm_chain"] = load_chain(config.pop("llm_chain_path"))
else:
raise ValueError("One of `llm_chain` and `llm_chain_path` should be specified.")
combined_config = {**config, **kwargs}
return agent_cls(**combined_config) # type: ignore
def load_agent(path: Union[str, Path], **kwargs: Any) -> Agent:
"""Unified method for loading an agent from LangChainHub or local fs."""
if hub_result := try_load_from_hub(
path, _load_agent_from_file, "agents", {"json", "yaml"}
):
return hub_result
else:
return _load_agent_from_file(path, **kwargs)
def _load_agent_from_file(file: Union[str, Path], **kwargs: Any) -> Agent:
"""Load agent from file."""
# Convert file to Path object.
if isinstance(file, str):
file_path = Path(file)
else:
file_path = file
# Load from either json or yaml.
if file_path.suffix == ".json":
with open(file_path) as f:
config = json.load(f)
elif file_path.suffix == ".yaml":
with open(file_path, "r") as f:
config = yaml.safe_load(f)
else:
raise ValueError("File type must be json or yaml")
# Load the agent from the config now.
return load_agent_from_config(config, **kwargs)
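
`Agent.save` and `_load_agent_from_file` both dispatch on the file suffix and rely on a `_type` key in the serialized dictionary. The round trip can be sketched with the standard library alone. The YAML branch is deliberately left out here to avoid the PyYAML dependency, and `save_config` and `load_config` are hypothetical helper names, not functions from this diff.

```python
import json
import tempfile
from pathlib import Path
from typing import Union


def save_config(config: dict, file_path: Union[Path, str]) -> None:
    """Write a config dict to a .json file, creating parent directories."""
    save_path = Path(file_path)
    save_path.parent.mkdir(parents=True, exist_ok=True)
    if save_path.suffix == ".json":
        with open(save_path, "w") as f:
            json.dump(config, f, indent=4)
    else:
        raise ValueError(f"{save_path} must be json in this sketch")


def load_config(file_path: Union[Path, str]) -> dict:
    """Read a .json config and check that it names an agent type."""
    path = Path(file_path)
    if path.suffix == ".json":
        with open(path) as f:
            config = json.load(f)
    else:
        raise ValueError("File type must be json in this sketch")
    if "_type" not in config:
        raise ValueError("Must specify an agent Type in config")
    return config


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "nested" / "agent.json"
    save_config(
        {"_type": "zero-shot-react-description", "allowed_tools": ["Search"]},
        target,
    )
    loaded = load_config(target)
```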

View File

@@ -7,6 +7,8 @@ from typing import Any, Callable, List, NamedTuple, Optional, Tuple
from langchain.agents.agent import Agent, AgentExecutor
from langchain.agents.mrkl.prompt import FORMAT_INSTRUCTIONS, PREFIX, SUFFIX
from langchain.agents.tools import Tool
from langchain.callbacks.base import BaseCallbackManager
from langchain.chains import LLMChain
from langchain.llms.base import BaseLLM
from langchain.prompts import PromptTemplate
@@ -41,7 +43,7 @@ def get_action_and_input(llm_output: str) -> Tuple[str, str]:
match = re.search(regex, llm_output)
if not match:
raise ValueError(f"Could not parse LLM output: `{llm_output}`")
action = match.group(1)
action = match.group(1).strip()
action_input = match.group(2)
return action, action_input.strip(" ").strip('"')
@@ -49,6 +51,11 @@ def get_action_and_input(llm_output: str) -> Tuple[str, str]:
class ZeroShotAgent(Agent):
"""Agent for the MRKL chain."""
@property
def _agent_type(self) -> str:
"""Return Identifier of agent type."""
return "zero-shot-react-description"
@property
def observation_prefix(self) -> str:
"""Prefix to append the observation with."""
@@ -87,6 +94,30 @@ class ZeroShotAgent(Agent):
input_variables = ["input", "agent_scratchpad"]
return PromptTemplate(template=template, input_variables=input_variables)
@classmethod
def from_llm_and_tools(
cls,
llm: BaseLLM,
tools: List[Tool],
callback_manager: Optional[BaseCallbackManager] = None,
prefix: str = PREFIX,
suffix: str = SUFFIX,
input_variables: Optional[List[str]] = None,
**kwargs: Any,
) -> Agent:
"""Construct an agent from an LLM and tools."""
cls._validate_tools(tools)
prompt = cls.create_prompt(
tools, prefix=prefix, suffix=suffix, input_variables=input_variables
)
llm_chain = LLMChain(
llm=llm,
prompt=prompt,
callback_manager=callback_manager,
)
tool_names = [tool.name for tool in tools]
return cls(llm_chain=llm_chain, allowed_tools=tool_names, **kwargs)
@classmethod
def _validate_tools(cls, tools: List[Tool]) -> None:
for tool in tools:

View File

@@ -15,7 +15,12 @@ from langchain.prompts.base import BasePromptTemplate
class ReActDocstoreAgent(Agent, BaseModel):
"""Agent for the ReAct chin."""
"""Agent for the ReAct chain."""
@property
def _agent_type(self) -> str:
"""Return Identifier of agent type."""
return "react-docstore"
@classmethod
def create_prompt(cls, tools: List[Tool]) -> BasePromptTemplate:

View File

@@ -12,6 +12,11 @@ from langchain.serpapi import SerpAPIWrapper
class SelfAskWithSearchAgent(Agent):
"""Agent for the self-ask-with-search paper."""
@property
def _agent_type(self) -> str:
"""Return Identifier of agent type."""
return "self-ask-with-search"
@classmethod
def create_prompt(cls, tools: List[Tool]) -> BasePromptTemplate:
"""Prompt does not depend on tools."""

View File

@@ -1,6 +1,7 @@
"""Interface for tools."""
from dataclasses import dataclass
from typing import Callable, Optional
from inspect import signature
from typing import Any, Callable, Optional, Union
@dataclass
@@ -11,3 +12,65 @@ class Tool:
func: Callable[[str], str]
description: Optional[str] = None
return_direct: bool = False
def __call__(self, *args: Any, **kwargs: Any) -> str:
"""Make tools callable by piping through to `func`."""
return self.func(*args, **kwargs)
def tool(
*args: Union[str, Callable], return_direct: bool = False
) -> Union[Callable, Tool]:
"""Make tools out of functions, can be used with or without arguments.
Requires:
- Function must be of type (str) -> str
- Function must have a docstring
Examples:
.. code-block:: python
@tool
def search_api(query: str) -> str:
# Searches the API for the query.
return
@tool("search", return_direct=True)
def search_api(query: str) -> str:
# Searches the API for the query.
return
"""
def _make_with_name(tool_name: str) -> Callable:
def _make_tool(func: Callable[[str], str]) -> Tool:
assert func.__doc__, "Function must have a docstring"
# Description example:
# search_api(query: str) -> str - Searches the API for the query.
description = f"{tool_name}{signature(func)} - {func.__doc__.strip()}"
tool = Tool(
name=tool_name,
func=func,
description=description,
return_direct=return_direct,
)
return tool
return _make_tool
if len(args) == 1 and isinstance(args[0], str):
# if the argument is a string, then we use the string as the tool name
# Example usage: @tool("search", return_direct=True)
return _make_with_name(args[0])
elif len(args) == 1 and callable(args[0]):
# if the argument is a function, then we use the function name as the tool name
# Example usage: @tool
return _make_with_name(args[0].__name__)(args[0])
elif len(args) == 0:
# if there are no arguments, then we use the function name as the tool name
# Example usage: @tool(return_direct=True)
def _partial(func: Callable[[str], str]) -> Tool:
return _make_with_name(func.__name__)(func)
return _partial
else:
raise ValueError("Too many arguments for tool decorator")
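
The three call forms of the new `tool` decorator can be tried without installing the package. The sketch below re-declares a compact stand-in for `Tool` and `tool` that follows the logic in this diff; it is an illustration for review, not the library code.

```python
from dataclasses import dataclass
from inspect import signature
from typing import Any, Callable, Optional, Union


@dataclass
class Tool:
    """Stand-in for the Tool dataclass in this diff."""

    name: str
    func: Callable[[str], str]
    description: Optional[str] = None
    return_direct: bool = False

    def __call__(self, *args: Any, **kwargs: Any) -> str:
        return self.func(*args, **kwargs)


def tool(
    *args: Union[str, Callable], return_direct: bool = False
) -> Union[Callable, Tool]:
    """Build a Tool from a function; usable bare, with a name, or with keywords."""

    def _make_with_name(tool_name: str) -> Callable:
        def _make_tool(func: Callable[[str], str]) -> Tool:
            assert func.__doc__, "Function must have a docstring"
            description = f"{tool_name}{signature(func)} - {func.__doc__.strip()}"
            return Tool(tool_name, func, description, return_direct)

        return _make_tool

    if len(args) == 1 and isinstance(args[0], str):
        return _make_with_name(args[0])  # @tool("search")
    if len(args) == 1 and callable(args[0]):
        return _make_with_name(args[0].__name__)(args[0])  # @tool
    if len(args) == 0:
        return lambda func: _make_with_name(func.__name__)(func)  # @tool(...)
    raise ValueError("Too many arguments for tool decorator")


@tool
def search_api(query: str) -> str:
    """Searches the API for the query."""
    return f"results for {query}"


@tool("search", return_direct=True)
def search_named(query: str) -> str:
    """Searches the API for the query."""
    return f"results for {query}"


@tool(return_direct=True)
def lookup(term: str) -> str:
    """Looks up a term."""
    return term.upper()
```

The decorated function needs a real docstring: a `#` comment in its place leaves `__doc__` empty and trips the assertion.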

View File

@@ -4,8 +4,7 @@ from typing import Any, Dict, List, Optional, Tuple
from sqlalchemy import Column, Integer, String, create_engine, select
from sqlalchemy.engine.base import Engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session
from sqlalchemy.orm import Session, declarative_base
from langchain.schema import Generation
@@ -60,7 +59,7 @@ class SQLAlchemyCache(BaseCache):
"""Initialize by creating all tables."""
self.engine = engine
self.cache_schema = cache_schema
Base.metadata.create_all(self.engine)
self.cache_schema.metadata.create_all(self.engine)
def lookup(self, prompt: str, llm_string: str) -> Optional[RETURN_VAL_TYPE]:
"""Look up based on prompt and llm_string."""

View File

@@ -1,7 +1,13 @@
"""Callback handlers that allow listening to events in LangChain."""
import os
from contextlib import contextmanager
from typing import Generator, Optional
from langchain.callbacks.base import BaseCallbackHandler, BaseCallbackManager
from langchain.callbacks.openai_info import OpenAICallbackHandler
from langchain.callbacks.shared import SharedCallbackManager
from langchain.callbacks.stdout import StdOutCallbackHandler
from langchain.callbacks.tracers import SharedLangChainTracer
def get_callback_manager() -> BaseCallbackManager:
@@ -17,4 +23,38 @@ def set_handler(handler: BaseCallbackHandler) -> None:
def set_default_callback_manager() -> None:
"""Set default callback manager."""
set_handler(StdOutCallbackHandler())
default_handler = os.environ.get("LANGCHAIN_HANDLER", "stdout")
if default_handler == "stdout":
set_handler(StdOutCallbackHandler())
elif default_handler == "langchain":
session = os.environ.get("LANGCHAIN_SESSION")
set_tracing_callback_manager(session)
else:
raise ValueError(
f"LANGCHAIN_HANDLER should be one of `stdout` "
f"or `langchain`, got {default_handler}"
)
def set_tracing_callback_manager(session_name: Optional[str] = None) -> None:
"""Set tracing callback manager."""
handler = SharedLangChainTracer()
callback = get_callback_manager()
callback.set_handlers([handler, StdOutCallbackHandler()])
if session_name is None:
handler.load_default_session()
else:
try:
handler.load_session(session_name)
except Exception:
raise ValueError(f"session {session_name} not found")
@contextmanager
def get_openai_callback() -> Generator[OpenAICallbackHandler, None, None]:
"""Get OpenAI callback handler in a context manager."""
handler = OpenAICallbackHandler()
manager = get_callback_manager()
manager.add_handler(handler)
yield handler
manager.remove_handler(handler)
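
`get_openai_callback` follows the usual `contextlib.contextmanager` pattern: register a handler, hand it to the `with` body, then unregister it. A minimal sketch with hypothetical stand-ins (`CountingHandler`, `SimpleManager`, `get_counting_callback`) in place of the real handler and shared manager:

```python
from contextlib import contextmanager
from typing import Generator, List


class CountingHandler:
    """Hypothetical handler that accumulates a token count."""

    def __init__(self) -> None:
        self.total_tokens = 0


class SimpleManager:
    """Hypothetical manager that only tracks registered handlers."""

    def __init__(self) -> None:
        self.handlers: List[CountingHandler] = []

    def add_handler(self, handler: CountingHandler) -> None:
        self.handlers.append(handler)

    def remove_handler(self, handler: CountingHandler) -> None:
        self.handlers.remove(handler)


@contextmanager
def get_counting_callback(
    manager: SimpleManager,
) -> Generator[CountingHandler, None, None]:
    """Register a fresh handler for the duration of the with-block."""
    handler = CountingHandler()
    manager.add_handler(handler)
    yield handler
    manager.remove_handler(handler)


manager = SimpleManager()
with get_counting_callback(manager) as cb:
    registered_inside = cb in manager.handlers
    cb.total_tokens += 42  # a real handler would update this from LLM results
registered_after = cb in manager.handlers
```

As in the diff, the code after `yield` does not run if the `with` body raises, so the handler stays registered in that case. Wrapping the `yield` in `try`/`finally` would be one way to guarantee removal.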

View File

@@ -1,19 +1,33 @@
"""Base callback handler that can be used to handle callbacks from langchain."""
from abc import ABC, abstractmethod
from typing import Any, Dict, List
from pydantic import BaseModel
from typing import Any, Dict, List, Union
from langchain.schema import AgentAction, AgentFinish, LLMResult
class BaseCallbackHandler(BaseModel, ABC):
class BaseCallbackHandler(ABC):
"""Base callback handler that can be used to handle callbacks from langchain."""
ignore_llm: bool = False
ignore_chain: bool = False
ignore_agent: bool = False
@property
def always_verbose(self) -> bool:
"""Whether to call verbose callbacks even if verbose is False."""
return False
@property
def ignore_llm(self) -> bool:
"""Whether to ignore LLM callbacks."""
return False
@property
def ignore_chain(self) -> bool:
"""Whether to ignore chain callbacks."""
return False
@property
def ignore_agent(self) -> bool:
"""Whether to ignore agent callbacks."""
return False
@abstractmethod
def on_llm_start(
@@ -22,14 +36,13 @@ class BaseCallbackHandler(BaseModel, ABC):
"""Run when LLM starts running."""
@abstractmethod
def on_llm_end(
self,
response: LLMResult,
) -> None:
def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
"""Run when LLM ends running."""
@abstractmethod
def on_llm_error(self, error: Exception) -> None:
def on_llm_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Run when LLM errors."""
@abstractmethod
@@ -39,11 +52,13 @@ class BaseCallbackHandler(BaseModel, ABC):
"""Run when chain starts running."""
@abstractmethod
def on_chain_end(self, outputs: Dict[str, Any]) -> None:
def on_chain_end(self, outputs: Dict[str, Any], **kwargs: Any) -> None:
"""Run when chain ends running."""
@abstractmethod
def on_chain_error(self, error: Exception) -> None:
def on_chain_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Run when chain errors."""
@abstractmethod
@@ -57,7 +72,9 @@ class BaseCallbackHandler(BaseModel, ABC):
"""Run when tool ends running."""
@abstractmethod
def on_tool_error(self, error: Exception) -> None:
def on_tool_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Run when tool errors."""
@abstractmethod
@@ -80,89 +97,136 @@ class BaseCallbackManager(BaseCallbackHandler, ABC):
def remove_handler(self, handler: BaseCallbackHandler) -> None:
"""Remove a handler from the callback manager."""
@abstractmethod
def set_handler(self, handler: BaseCallbackHandler) -> None:
"""Set handler as the only handler on the callback manager."""
self.set_handlers([handler])
@abstractmethod
def set_handlers(self, handlers: List[BaseCallbackHandler]) -> None:
"""Set handlers as the only handlers on the callback manager."""
class CallbackManager(BaseCallbackManager):
"""Callback manager that can be used to handle callbacks from langchain."""
handlers: List[BaseCallbackHandler]
def __init__(self, handlers: List[BaseCallbackHandler]) -> None:
"""Initialize callback manager."""
self.handlers: List[BaseCallbackHandler] = handlers
def on_llm_start(
self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
self,
serialized: Dict[str, Any],
prompts: List[str],
verbose: bool = False,
**kwargs: Any
) -> None:
"""Run when LLM starts running."""
for handler in self.handlers:
if not handler.ignore_llm:
handler.on_llm_start(serialized, prompts, **kwargs)
if verbose or handler.always_verbose:
handler.on_llm_start(serialized, prompts, **kwargs)
def on_llm_end(
self,
response: LLMResult,
self, response: LLMResult, verbose: bool = False, **kwargs: Any
) -> None:
"""Run when LLM ends running."""
for handler in self.handlers:
if not handler.ignore_llm:
handler.on_llm_end(response)
if verbose or handler.always_verbose:
handler.on_llm_end(response)
def on_llm_error(self, error: Exception) -> None:
def on_llm_error(
self,
error: Union[Exception, KeyboardInterrupt],
verbose: bool = False,
**kwargs: Any
) -> None:
"""Run when LLM errors."""
for handler in self.handlers:
if not handler.ignore_llm:
handler.on_llm_error(error)
if verbose or handler.always_verbose:
handler.on_llm_error(error)
def on_chain_start(
self, serialized: Dict[str, Any], inputs: Dict[str, Any], **kwargs: Any
self,
serialized: Dict[str, Any],
inputs: Dict[str, Any],
verbose: bool = False,
**kwargs: Any
) -> None:
"""Run when chain starts running."""
for handler in self.handlers:
if not handler.ignore_chain:
handler.on_chain_start(serialized, inputs, **kwargs)
if verbose or handler.always_verbose:
handler.on_chain_start(serialized, inputs, **kwargs)
def on_chain_end(self, outputs: Dict[str, Any]) -> None:
def on_chain_end(
self, outputs: Dict[str, Any], verbose: bool = False, **kwargs: Any
) -> None:
"""Run when chain ends running."""
for handler in self.handlers:
if not handler.ignore_chain:
handler.on_chain_end(outputs)
if verbose or handler.always_verbose:
handler.on_chain_end(outputs)
def on_chain_error(self, error: Exception) -> None:
def on_chain_error(
self,
error: Union[Exception, KeyboardInterrupt],
verbose: bool = False,
**kwargs: Any
) -> None:
"""Run when chain errors."""
for handler in self.handlers:
if not handler.ignore_chain:
handler.on_chain_error(error)
if verbose or handler.always_verbose:
handler.on_chain_error(error)
def on_tool_start(
self, serialized: Dict[str, Any], action: AgentAction, **kwargs: Any
self,
serialized: Dict[str, Any],
action: AgentAction,
verbose: bool = False,
**kwargs: Any
) -> None:
"""Run when tool starts running."""
for handler in self.handlers:
if not handler.ignore_agent:
handler.on_tool_start(serialized, action, **kwargs)
if verbose or handler.always_verbose:
handler.on_tool_start(serialized, action, **kwargs)
def on_tool_end(self, output: str, **kwargs: Any) -> None:
def on_tool_end(self, output: str, verbose: bool = False, **kwargs: Any) -> None:
"""Run when tool ends running."""
for handler in self.handlers:
if not handler.ignore_agent:
handler.on_tool_end(output, **kwargs)
if verbose or handler.always_verbose:
handler.on_tool_end(output, **kwargs)
def on_tool_error(self, error: Exception) -> None:
def on_tool_error(
self,
error: Union[Exception, KeyboardInterrupt],
verbose: bool = False,
**kwargs: Any
) -> None:
"""Run when tool errors."""
for handler in self.handlers:
if not handler.ignore_agent:
handler.on_tool_error(error)
if verbose or handler.always_verbose:
handler.on_tool_error(error)
def on_text(self, text: str, **kwargs: Any) -> None:
def on_text(self, text: str, verbose: bool = False, **kwargs: Any) -> None:
"""Run on additional input from chains and agents."""
for handler in self.handlers:
handler.on_text(text, **kwargs)
if verbose or handler.always_verbose:
handler.on_text(text, **kwargs)
def on_agent_finish(self, finish: AgentFinish, **kwargs: Any) -> None:
def on_agent_finish(
self, finish: AgentFinish, verbose: bool = False, **kwargs: Any
) -> None:
"""Run on agent end."""
for handler in self.handlers:
if not handler.ignore_agent:
handler.on_agent_finish(finish, **kwargs)
if verbose or handler.always_verbose:
handler.on_agent_finish(finish, **kwargs)
def add_handler(self, handler: BaseCallbackHandler) -> None:
"""Add a handler to the callback manager."""
@@ -172,6 +236,6 @@ class CallbackManager(BaseCallbackManager):
"""Remove a handler from the callback manager."""
self.handlers.remove(handler)
def set_handler(self, handler: BaseCallbackHandler) -> None:
"""Set handler as the only handler on the callback manager."""
self.handlers = [handler]
def set_handlers(self, handlers: List[BaseCallbackHandler]) -> None:
"""Set handlers as the only handlers on the callback manager."""
self.handlers = handlers
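
The gating rule introduced in this file (a handler fires only when the caller passes `verbose=True` or the handler sets `always_verbose`) can be sketched in isolation. This is a minimal illustration under stated assumptions: `RecordingHandler` and `MiniManager` are hypothetical names used only here, not LangChain classes, and only `on_text` is modelled.

```python
from typing import Any, List


class RecordingHandler:
    """Hypothetical handler that records the text it receives."""

    def __init__(self, always_verbose: bool = False) -> None:
        self.always_verbose = always_verbose
        self.seen: List[str] = []

    def on_text(self, text: str, **kwargs: Any) -> None:
        self.seen.append(text)


class MiniManager:
    """Hypothetical manager mirroring the verbose gating shown in the diff."""

    def __init__(self, handlers: List[RecordingHandler]) -> None:
        self.handlers = handlers

    def on_text(self, text: str, verbose: bool = False, **kwargs: Any) -> None:
        for handler in self.handlers:
            # Fire only if the caller is verbose or the handler opts in.
            if verbose or handler.always_verbose:
                handler.on_text(text, **kwargs)


quiet = RecordingHandler()
tracker = RecordingHandler(always_verbose=True)
manager = MiniManager([quiet, tracker])
manager.on_text("first")                 # only the always_verbose handler fires
manager.on_text("second", verbose=True)  # both handlers fire
```

Moving the `verbose` check from the call site into the manager is what lets handlers such as tracers or token counters run even when the chain itself is not verbose.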


@@ -0,0 +1,95 @@
"""Callback Handler that tracks OpenAI token usage."""
from typing import Any, Dict, List, Optional, Union
from langchain.callbacks.base import BaseCallbackHandler
from langchain.schema import AgentAction, AgentFinish, LLMResult
class OpenAICallbackHandler(BaseCallbackHandler):
"""Callback Handler that tracks OpenAI info."""
total_tokens: int = 0
@property
def always_verbose(self) -> bool:
"""Whether to call verbose callbacks even if verbose is False."""
return True
def on_llm_start(
self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
) -> None:
"""Do nothing."""
pass
def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
"""Accumulate the total token count reported by the LLM, if any."""
if response.llm_output is not None:
if "token_usage" in response.llm_output:
token_usage = response.llm_output["token_usage"]
if "total_tokens" in token_usage:
self.total_tokens += token_usage["total_tokens"]
def on_llm_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Do nothing."""
pass
def on_chain_start(
self, serialized: Dict[str, Any], inputs: Dict[str, Any], **kwargs: Any
) -> None:
"""Do nothing."""
pass
def on_chain_end(self, outputs: Dict[str, Any], **kwargs: Any) -> None:
"""Do nothing."""
pass
def on_chain_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Do nothing."""
pass
def on_tool_start(
self,
serialized: Dict[str, Any],
action: AgentAction,
color: Optional[str] = None,
**kwargs: Any,
) -> None:
"""Do nothing."""
pass
def on_tool_end(
self,
output: str,
color: Optional[str] = None,
observation_prefix: Optional[str] = None,
llm_prefix: Optional[str] = None,
**kwargs: Any,
) -> None:
"""Do nothing."""
pass
def on_tool_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Do nothing."""
pass
def on_text(
self,
text: str,
color: Optional[str] = None,
end: str = "",
**kwargs: Optional[str],
) -> None:
"""Do nothing."""
pass
def on_agent_finish(
self, finish: AgentFinish, color: Optional[str] = None, **kwargs: Any
) -> None:
"""Run on agent end."""
pass
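
The only callback in this handler that does any work is `on_llm_end`, which adds up `total_tokens` from the response metadata. A minimal sketch of that accumulation follows, assuming the metadata is a plain dictionary shaped like `{"token_usage": {"total_tokens": ...}}`; `TokenCounter` and `record` are hypothetical names, not part of the library.

```python
from typing import Any, Dict, Optional


class TokenCounter:
    """Hypothetical stand-in for the token-tracking logic in on_llm_end."""

    def __init__(self) -> None:
        self.total_tokens = 0

    def record(self, llm_output: Optional[Dict[str, Any]]) -> None:
        # Skip responses that carry no provider metadata at all.
        if llm_output is None:
            return
        # A missing usage block or missing total contributes zero.
        token_usage = llm_output.get("token_usage", {})
        self.total_tokens += token_usage.get("total_tokens", 0)


counter = TokenCounter()
counter.record({"token_usage": {"total_tokens": 12}})
counter.record(None)                 # no metadata: ignored
counter.record({"token_usage": {}})  # usage block without a total: ignored
counter.record({"token_usage": {"total_tokens": 30}})
```

Because the handler in the diff sets `always_verbose` to `True`, this counting happens regardless of the chain's own verbosity setting.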


@@ -1,7 +1,7 @@
"""A shared CallbackManager."""
import threading
from typing import Any, Dict, List
from typing import Any, Dict, List, Union
from langchain.callbacks.base import (
BaseCallbackHandler,
@@ -41,18 +41,17 @@ class SharedCallbackManager(Singleton, BaseCallbackManager):
with self._lock:
self._callback_manager.on_llm_start(serialized, prompts, **kwargs)
def on_llm_end(
self,
response: LLMResult,
) -> None:
def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
"""Run when LLM ends running."""
with self._lock:
self._callback_manager.on_llm_end(response)
self._callback_manager.on_llm_end(response, **kwargs)
def on_llm_error(self, error: Exception) -> None:
def on_llm_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Run when LLM errors."""
with self._lock:
self._callback_manager.on_llm_error(error)
self._callback_manager.on_llm_error(error, **kwargs)
def on_chain_start(
self, serialized: Dict[str, Any], inputs: Dict[str, Any], **kwargs: Any
@@ -61,15 +60,17 @@ class SharedCallbackManager(Singleton, BaseCallbackManager):
with self._lock:
self._callback_manager.on_chain_start(serialized, inputs, **kwargs)
def on_chain_end(self, outputs: Dict[str, Any]) -> None:
def on_chain_end(self, outputs: Dict[str, Any], **kwargs: Any) -> None:
"""Run when chain ends running."""
with self._lock:
self._callback_manager.on_chain_end(outputs)
self._callback_manager.on_chain_end(outputs, **kwargs)
def on_chain_error(self, error: Exception) -> None:
def on_chain_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Run when chain errors."""
with self._lock:
self._callback_manager.on_chain_error(error)
self._callback_manager.on_chain_error(error, **kwargs)
def on_tool_start(
self, serialized: Dict[str, Any], action: AgentAction, **kwargs: Any
@@ -83,10 +84,12 @@ class SharedCallbackManager(Singleton, BaseCallbackManager):
with self._lock:
self._callback_manager.on_tool_end(output, **kwargs)
def on_tool_error(self, error: Exception) -> None:
def on_tool_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Run when tool errors."""
with self._lock:
self._callback_manager.on_tool_error(error)
self._callback_manager.on_tool_error(error, **kwargs)
def on_text(self, text: str, **kwargs: Any) -> None:
"""Run on arbitrary text."""
@@ -108,7 +111,7 @@ class SharedCallbackManager(Singleton, BaseCallbackManager):
with self._lock:
self._callback_manager.remove_handler(callback)
def set_handler(self, handler: BaseCallbackHandler) -> None:
"""Set handler as the only handler on the callback manager."""
def set_handlers(self, handlers: List[BaseCallbackHandler]) -> None:
"""Set handlers as the only handlers on the callback manager."""
with self._lock:
self._callback_manager.handlers = [handler]
self._callback_manager.handlers = handlers


@@ -1,5 +1,5 @@
"""Callback Handler that prints to std out."""
from typing import Any, Dict, List, Optional
from typing import Any, Dict, List, Optional, Union
from langchain.callbacks.base import BaseCallbackHandler
from langchain.input import print_text
@@ -15,11 +15,13 @@ class StdOutCallbackHandler(BaseCallbackHandler):
"""Print out the prompts."""
pass
def on_llm_end(self, response: LLMResult) -> None:
def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
"""Do nothing."""
pass
def on_llm_error(self, error: Exception) -> None:
def on_llm_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Do nothing."""
pass
@@ -30,11 +32,13 @@ class StdOutCallbackHandler(BaseCallbackHandler):
class_name = serialized["name"]
print(f"\n\n\033[1m> Entering new {class_name} chain...\033[0m")
def on_chain_end(self, outputs: Dict[str, Any]) -> None:
def on_chain_end(self, outputs: Dict[str, Any], **kwargs: Any) -> None:
"""Print out that we finished a chain."""
print("\n\033[1m> Finished chain.\033[0m")
def on_chain_error(self, error: Exception) -> None:
def on_chain_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Do nothing."""
pass
@@ -61,7 +65,9 @@ class StdOutCallbackHandler(BaseCallbackHandler):
print_text(output, color=color)
print_text(f"\n{llm_prefix}")
def on_tool_error(self, error: Exception) -> None:
def on_tool_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Do nothing."""
pass


@@ -1,5 +1,5 @@
"""Callback Handler that logs to streamlit."""
from typing import Any, Dict, List, Optional
from typing import Any, Dict, List, Optional, Union
import streamlit as st
@@ -18,11 +18,13 @@ class StreamlitCallbackHandler(BaseCallbackHandler):
for prompt in prompts:
st.write(prompt)
def on_llm_end(self, response: LLMResult) -> None:
def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
"""Do nothing."""
pass
def on_llm_error(self, error: Exception) -> None:
def on_llm_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Do nothing."""
pass
@@ -33,11 +35,13 @@ class StreamlitCallbackHandler(BaseCallbackHandler):
class_name = serialized["name"]
st.write(f"Entering new {class_name} chain...")
def on_chain_end(self, outputs: Dict[str, Any]) -> None:
def on_chain_end(self, outputs: Dict[str, Any], **kwargs: Any) -> None:
"""Print out that we finished a chain."""
st.write("Finished chain.")
def on_chain_error(self, error: Exception) -> None:
def on_chain_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Do nothing."""
pass
@@ -62,7 +66,9 @@ class StreamlitCallbackHandler(BaseCallbackHandler):
st.write(f"{observation_prefix}{output}")
st.write(llm_prefix)
def on_tool_error(self, error: Exception) -> None:
def on_tool_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Do nothing."""
pass


@@ -0,0 +1,12 @@
"""Tracers that record execution of LangChain runs."""
from langchain.callbacks.tracers.base import SharedTracer, Tracer
from langchain.callbacks.tracers.langchain import BaseLangChainTracer
class SharedLangChainTracer(SharedTracer, BaseLangChainTracer):
"""Shared tracer that records LangChain execution to LangChain endpoint."""
class LangChainTracer(Tracer, BaseLangChainTracer):
"""Tracer that records LangChain execution to LangChain endpoint."""


@@ -0,0 +1,334 @@
"""Base interfaces for tracing runs."""
from __future__ import annotations
import threading
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Optional, Union
from langchain.callbacks.base import BaseCallbackHandler
from langchain.callbacks.shared import Singleton
from langchain.callbacks.tracers.schemas import (
ChainRun,
LLMRun,
ToolRun,
TracerSession,
TracerSessionCreate,
)
from langchain.schema import AgentAction, AgentFinish, LLMResult
class TracerException(Exception):
"""Base class for exceptions in tracers module."""
class BaseTracer(BaseCallbackHandler, ABC):
"""Base interface for tracers."""
@abstractmethod
def _add_child_run(
self,
parent_run: Union[ChainRun, ToolRun],
child_run: Union[LLMRun, ChainRun, ToolRun],
) -> None:
"""Add child run to a chain run or tool run."""
@abstractmethod
def _persist_run(self, run: Union[LLMRun, ChainRun, ToolRun]) -> None:
"""Persist a run."""
@abstractmethod
def _persist_session(self, session: TracerSessionCreate) -> TracerSession:
"""Persist a tracing session."""
@abstractmethod
def _generate_id(self) -> Optional[Union[int, str]]:
"""Generate an id for a run."""
def new_session(self, name: Optional[str] = None, **kwargs: Any) -> TracerSession:
"""NOT thread safe, do not call this method from multiple threads."""
session_create = TracerSessionCreate(name=name, extra=kwargs)
session = self._persist_session(session_create)
self._session = session
return session
@abstractmethod
def load_session(self, session_name: str) -> TracerSession:
"""Load a tracing session and set it as the Tracer's session."""
@abstractmethod
def load_default_session(self) -> TracerSession:
"""Load the default tracing session and set it as the Tracer's session."""
@property
@abstractmethod
def _stack(self) -> List[Union[LLMRun, ChainRun, ToolRun]]:
"""Get the tracer stack."""
@property
@abstractmethod
def _execution_order(self) -> int:
"""Get the execution order for a run."""
@_execution_order.setter
@abstractmethod
def _execution_order(self, value: int) -> None:
"""Set the execution order for a run."""
@property
@abstractmethod
def _session(self) -> Optional[TracerSession]:
"""Get the tracing session."""
@_session.setter
@abstractmethod
def _session(self, value: TracerSession) -> None:
"""Set the tracing session."""
def _start_trace(self, run: Union[LLMRun, ChainRun, ToolRun]) -> None:
"""Start a trace for a run."""
self._execution_order += 1
if self._stack:
if not (
isinstance(self._stack[-1], ChainRun)
or isinstance(self._stack[-1], ToolRun)
):
raise TracerException(
f"Nested {run.__class__.__name__} can only be"
f" logged inside a ChainRun or ToolRun"
)
self._add_child_run(self._stack[-1], run)
self._stack.append(run)
def _end_trace(self) -> None:
"""End a trace for a run."""
run = self._stack.pop()
if not self._stack:
self._execution_order = 1
self._persist_run(run)
def on_llm_start(
self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
) -> None:
"""Start a trace for an LLM run."""
if self._session is None:
raise TracerException(
"Initialize a session with `new_session()` before starting a trace."
)
llm_run = LLMRun(
serialized=serialized,
prompts=prompts,
extra=kwargs,
start_time=datetime.utcnow(),
execution_order=self._execution_order,
session_id=self._session.id,
id=self._generate_id(),
)
self._start_trace(llm_run)
def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
"""End a trace for an LLM run."""
if not self._stack or not isinstance(self._stack[-1], LLMRun):
raise TracerException("No LLMRun found to be traced")
self._stack[-1].end_time = datetime.utcnow()
self._stack[-1].response = response
self._end_trace()
def on_llm_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Handle an error for an LLM run."""
if not self._stack or not isinstance(self._stack[-1], LLMRun):
raise TracerException("No LLMRun found to be traced")
self._stack[-1].error = repr(error)
self._stack[-1].end_time = datetime.utcnow()
self._end_trace()
def on_chain_start(
self, serialized: Dict[str, Any], inputs: Dict[str, Any], **kwargs: Any
) -> None:
"""Start a trace for a chain run."""
if self._session is None:
raise TracerException(
"Initialize a session with `new_session()` before starting a trace."
)
chain_run = ChainRun(
serialized=serialized,
inputs=inputs,
extra=kwargs,
start_time=datetime.utcnow(),
execution_order=self._execution_order,
child_runs=[],
session_id=self._session.id,
id=self._generate_id(),
)
self._start_trace(chain_run)
def on_chain_end(self, outputs: Dict[str, Any], **kwargs: Any) -> None:
"""End a trace for a chain run."""
if not self._stack or not isinstance(self._stack[-1], ChainRun):
raise TracerException("No ChainRun found to be traced")
self._stack[-1].end_time = datetime.utcnow()
self._stack[-1].outputs = outputs
self._end_trace()
def on_chain_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Handle an error for a chain run."""
if not self._stack or not isinstance(self._stack[-1], ChainRun):
raise TracerException("No ChainRun found to be traced")
self._stack[-1].end_time = datetime.utcnow()
self._stack[-1].error = repr(error)
self._end_trace()
def on_tool_start(
self, serialized: Dict[str, Any], action: AgentAction, **kwargs: Any
) -> None:
"""Start a trace for a tool run."""
if self._session is None:
raise TracerException(
"Initialize a session with `new_session()` before starting a trace."
)
tool_run = ToolRun(
serialized=serialized,
action=action.tool,
tool_input=action.tool_input,
extra=kwargs,
start_time=datetime.utcnow(),
execution_order=self._execution_order,
child_runs=[],
session_id=self._session.id,
id=self._generate_id(),
)
self._start_trace(tool_run)
def on_tool_end(self, output: str, **kwargs: Any) -> None:
"""End a trace for a tool run."""
if not self._stack or not isinstance(self._stack[-1], ToolRun):
raise TracerException("No ToolRun found to be traced")
self._stack[-1].end_time = datetime.utcnow()
self._stack[-1].output = output
self._end_trace()
def on_tool_error(
self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
) -> None:
"""Handle an error for a tool run."""
if not self._stack or not isinstance(self._stack[-1], ToolRun):
raise TracerException("No ToolRun found to be traced")
self._stack[-1].end_time = datetime.utcnow()
self._stack[-1].error = repr(error)
self._end_trace()
def on_text(self, text: str, **kwargs: Any) -> None:
"""Handle a text message."""
pass
def on_agent_finish(self, finish: AgentFinish, **kwargs: Any) -> None:
"""Handle an agent finish message."""
pass
class Tracer(BaseTracer, ABC):
"""A non-thread safe implementation of the BaseTracer interface."""
def __init__(self) -> None:
"""Initialize a tracer."""
self._tracer_stack: List[Union[LLMRun, ChainRun, ToolRun]] = []
self._tracer_execution_order = 1
self._tracer_session: Optional[TracerSession] = None
@property
def _stack(self) -> List[Union[LLMRun, ChainRun, ToolRun]]:
"""Get the tracer stack."""
return self._tracer_stack
@property
def _execution_order(self) -> int:
"""Get the execution order for a run."""
return self._tracer_execution_order
@_execution_order.setter
def _execution_order(self, value: int) -> None:
"""Set the execution order for a run."""
self._tracer_execution_order = value
@property
def _session(self) -> Optional[TracerSession]:
"""Get the tracing session."""
return self._tracer_session
@_session.setter
def _session(self, value: TracerSession) -> None:
"""Set the tracing session."""
if self._stack:
raise TracerException(
"Cannot set a session while a trace is being recorded"
)
self._tracer_session = value
@dataclass
class TracerStack(threading.local):
"""A stack of runs used for logging."""
stack: List[Union[LLMRun, ChainRun, ToolRun]] = field(default_factory=list)
execution_order: int = 1
class SharedTracer(Singleton, BaseTracer, ABC):
"""A thread-safe Singleton implementation of BaseTracer."""
_tracer_stack = TracerStack()
_tracer_session = None
@property
def _stack(self) -> List[Union[LLMRun, ChainRun, ToolRun]]:
"""Get the tracer stack."""
return self._tracer_stack.stack
@property
def _execution_order(self) -> int:
"""Get the execution order for a run."""
return self._tracer_stack.execution_order
@_execution_order.setter
def _execution_order(self, value: int) -> None:
"""Set the execution order for a run."""
self._tracer_stack.execution_order = value
@property
def _session(self) -> Optional[TracerSession]:
"""Get the tracing session."""
return self._tracer_session
@_session.setter
def _session(self, value: TracerSession) -> None:
"""Set the tracing session."""
with self._lock:
# TODO: currently, we are only checking current thread's stack.
# Need to make sure that we are not in the middle of a trace
# in any thread.
if self._stack:
raise TracerException(
"Cannot set a session while a trace is being recorded"
)
self._tracer_session = value
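
The stack discipline in `_start_trace` and `_end_trace` (attach each new run to the run on top of the stack, persist on pop, reset the execution order once the stack empties) can be illustrated with a stripped-down model. This is a sketch under simplifying assumptions: `Run` and `MiniTracer` are hypothetical, there is a single run type, and "persisting" just appends to a list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Run:
    """Hypothetical simplified run record (not the LangChain schema)."""

    name: str
    execution_order: int
    children: List["Run"] = field(default_factory=list)


class MiniTracer:
    """Hypothetical tracer keeping only the stack bookkeeping."""

    def __init__(self) -> None:
        self.stack: List[Run] = []
        self.execution_order = 1
        self.persisted: List[Run] = []

    def start(self, name: str) -> None:
        # The run records the current order; the counter then advances.
        run = Run(name=name, execution_order=self.execution_order)
        self.execution_order += 1
        if self.stack:
            # Nested runs become children of the run currently on top.
            self.stack[-1].children.append(run)
        self.stack.append(run)

    def end(self) -> None:
        run = self.stack.pop()
        if not self.stack:
            # The outermost run finished, so the next trace starts at 1 again.
            self.execution_order = 1
        self.persisted.append(run)


tracer = MiniTracer()
tracer.start("chain")
tracer.start("llm")
tracer.end()
tracer.start("tool")
tracer.end()
tracer.end()
```

Inner runs are persisted before the run that contains them, which is why the outermost chain appears last in `tracer.persisted`.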


@@ -0,0 +1,112 @@
"""A Tracer implementation that records to LangChain endpoint."""
from __future__ import annotations
import logging
import os
from abc import ABC
from typing import Any, Dict, Optional, Union
import requests
from langchain.callbacks.tracers.base import BaseTracer
from langchain.callbacks.tracers.schemas import (
ChainRun,
LLMRun,
ToolRun,
TracerSession,
TracerSessionCreate,
)
class BaseLangChainTracer(BaseTracer, ABC):
"""An implementation of BaseTracer that POSTs to the LangChain endpoint."""
always_verbose: bool = True
_endpoint: str = os.getenv("LANGCHAIN_ENDPOINT", "http://localhost:8000")
_headers: Dict[str, Any] = {"Content-Type": "application/json"}
if os.getenv("LANGCHAIN_API_KEY"):
_headers["x-api-key"] = os.getenv("LANGCHAIN_API_KEY")
def _persist_run(self, run: Union[LLMRun, ChainRun, ToolRun]) -> None:
"""Persist a run."""
if isinstance(run, LLMRun):
endpoint = f"{self._endpoint}/llm-runs"
elif isinstance(run, ChainRun):
endpoint = f"{self._endpoint}/chain-runs"
else:
endpoint = f"{self._endpoint}/tool-runs"
try:
requests.post(
endpoint,
data=run.json(),
headers=self._headers,
)
except Exception as e:
logging.warning(f"Failed to persist run: {e}")
def _persist_session(self, session_create: TracerSessionCreate) -> TracerSession:
"""Persist a session."""
try:
r = requests.post(
f"{self._endpoint}/sessions",
data=session_create.json(),
headers=self._headers,
)
session = TracerSession(id=r.json()["id"], **session_create.dict())
except Exception as e:
logging.warning(f"Failed to create session, using default session: {e}")
session = TracerSession(id=1, **session_create.dict())
return session
def load_session(self, session_name: str) -> TracerSession:
"""Load a session from the tracer."""
try:
r = requests.get(
f"{self._endpoint}/sessions?name={session_name}",
headers=self._headers,
)
tracer_session = TracerSession(**r.json()[0])
self._session = tracer_session
return tracer_session
except Exception as e:
logging.warning(
f"Failed to load session {session_name}, using empty session: {e}"
)
tracer_session = TracerSession(id=1)
self._session = tracer_session
return tracer_session
def load_default_session(self) -> TracerSession:
"""Load the default tracing session and set it as the Tracer's session."""
try:
r = requests.get(
f"{self._endpoint}/sessions",
headers=self._headers,
)
# Use the first session result
tracer_session = TracerSession(**r.json()[0])
self._session = tracer_session
return tracer_session
except Exception as e:
logging.warning(f"Failed to load default session, using empty session: {e}")
tracer_session = TracerSession(id=1)
self._session = tracer_session
return tracer_session
def _add_child_run(
self,
parent_run: Union[ChainRun, ToolRun],
child_run: Union[LLMRun, ChainRun, ToolRun],
) -> None:
"""Add child run to a chain run or tool run."""
if isinstance(child_run, LLMRun):
parent_run.child_llm_runs.append(child_run)
elif isinstance(child_run, ChainRun):
parent_run.child_chain_runs.append(child_run)
else:
parent_run.child_tool_runs.append(child_run)
def _generate_id(self) -> Optional[Union[int, str]]:
"""Generate an id for a run."""
return None


@@ -0,0 +1,76 @@
"""Schemas for tracers."""
from __future__ import annotations
import datetime
from typing import Any, Dict, List, Optional, Union
from pydantic import BaseModel, Field
from langchain.schema import LLMResult
class TracerSessionBase(BaseModel):
"""Base class for TracerSession."""
start_time: datetime.datetime = Field(default_factory=datetime.datetime.utcnow)
name: Optional[str] = None
extra: Optional[Dict[str, Any]] = None
class TracerSessionCreate(TracerSessionBase):
"""Create class for TracerSession."""
pass
class TracerSession(TracerSessionBase):
"""TracerSession schema."""
id: int
class BaseRun(BaseModel):
"""Base class for Run."""
id: Optional[Union[int, str]] = None
start_time: datetime.datetime = Field(default_factory=datetime.datetime.utcnow)
end_time: datetime.datetime = Field(default_factory=datetime.datetime.utcnow)
extra: Optional[Dict[str, Any]] = None
execution_order: int
serialized: Dict[str, Any]
session_id: int
error: Optional[str] = None
class LLMRun(BaseRun):
"""Class for LLMRun."""
prompts: List[str]
response: Optional[LLMResult] = None
class ChainRun(BaseRun):
"""Class for ChainRun."""
inputs: Dict[str, Any]
outputs: Optional[Dict[str, Any]] = None
child_llm_runs: List[LLMRun] = Field(default_factory=list)
child_chain_runs: List[ChainRun] = Field(default_factory=list)
child_tool_runs: List[ToolRun] = Field(default_factory=list)
child_runs: List[Union[LLMRun, ChainRun, ToolRun]] = Field(default_factory=list)
class ToolRun(BaseRun):
"""Class for ToolRun."""
tool_input: str
output: Optional[str] = None
action: str
child_llm_runs: List[LLMRun] = Field(default_factory=list)
child_chain_runs: List[ChainRun] = Field(default_factory=list)
child_tool_runs: List[ToolRun] = Field(default_factory=list)
child_runs: List[Union[LLMRun, ChainRun, ToolRun]] = Field(default_factory=list)
ChainRun.update_forward_refs()
ToolRun.update_forward_refs()


@@ -1,11 +1,13 @@
"""Chains are easily reusable components which can be linked together."""
from langchain.chains.api.base import APIChain
from langchain.chains.conversation.base import ConversationChain
from langchain.chains.hyde.base import HypotheticalDocumentEmbedder
from langchain.chains.llm import LLMChain
from langchain.chains.llm_bash.base import LLMBashChain
from langchain.chains.llm_checker.base import LLMCheckerChain
from langchain.chains.llm_math.base import LLMMathChain
from langchain.chains.llm_requests import LLMRequestsChain
from langchain.chains.loading import load_chain
from langchain.chains.mapreduce import MapReduceChain
from langchain.chains.moderation import OpenAIModerationChain
from langchain.chains.pal.base import PALChain
@@ -20,7 +22,6 @@ from langchain.chains.transform import TransformChain
from langchain.chains.vector_db_qa.base import VectorDBQA
__all__ = [
"APIChain",
"ConversationChain",
"LLMChain",
"LLMBashChain",
@@ -39,4 +40,6 @@ __all__ = [
"MapReduceChain",
"OpenAIModerationChain",
"SQLDatabaseSequentialChain",
"load_chain",
"HypotheticalDocumentEmbedder",
]


@@ -3,7 +3,7 @@ from __future__ import annotations
from typing import Any, Dict, List, Optional
from pydantic import BaseModel, root_validator
from pydantic import BaseModel, Field, root_validator
from langchain.chains.api.prompt import API_RESPONSE_PROMPT, API_URL_PROMPT
from langchain.chains.base import Chain
@@ -18,7 +18,7 @@ class APIChain(Chain, BaseModel):
api_request_chain: LLMChain
api_answer_chain: LLMChain
requests_wrapper: RequestsWrapper
requests_wrapper: RequestsWrapper = Field(exclude=True)
api_docs: str
question_key: str = "question" #: :meta private:
output_key: str = "output" #: :meta private:
@@ -66,11 +66,13 @@ class APIChain(Chain, BaseModel):
api_url = self.api_request_chain.predict(
question=question, api_docs=self.api_docs
)
if self.verbose:
self.callback_manager.on_text(api_url, color="green", end="\n")
self.callback_manager.on_text(
api_url, color="green", end="\n", verbose=self.verbose
)
api_response = self.requests_wrapper.run(api_url)
if self.verbose:
self.callback_manager.on_text(api_response, color="yellow", end="\n")
self.callback_manager.on_text(
api_response, color="yellow", end="\n", verbose=self.verbose
)
answer = self.api_answer_chain.predict(
question=question,
api_docs=self.api_docs,
@@ -100,3 +102,7 @@ class APIChain(Chain, BaseModel):
api_docs=api_docs,
**kwargs,
)
@property
def _chain_type(self) -> str:
return "api_chain"


@@ -1,7 +1,10 @@
"""Base interface that all chains should implement."""
import json
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict, List, Optional, Union
import yaml
from pydantic import BaseModel, Extra, Field, validator
import langchain
@@ -44,7 +47,9 @@ class Chain(BaseModel, ABC):
"""Base interface that all chains should implement."""
memory: Optional[Memory] = None
callback_manager: BaseCallbackManager = Field(default_factory=get_callback_manager)
callback_manager: BaseCallbackManager = Field(
default_factory=get_callback_manager, exclude=True
)
verbose: bool = Field(
default_factory=_get_verbosity
) # Whether to print the response text
@@ -54,6 +59,10 @@ class Chain(BaseModel, ABC):
arbitrary_types_allowed = True
@property
def _chain_type(self) -> str:
raise NotImplementedError("Saving not supported for this chain type.")
@validator("callback_manager", pre=True, always=True)
def set_callback_manager(
cls, callback_manager: Optional[BaseCallbackManager]
@@ -134,18 +143,17 @@ class Chain(BaseModel, ABC):
external_context = self.memory.load_memory_variables(inputs)
inputs = dict(inputs, **external_context)
self._validate_inputs(inputs)
if self.verbose:
self.callback_manager.on_chain_start(
{"name": self.__class__.__name__}, inputs
)
self.callback_manager.on_chain_start(
{"name": self.__class__.__name__},
inputs,
verbose=self.verbose,
)
try:
outputs = self._call(inputs)
except Exception as e:
if self.verbose:
self.callback_manager.on_chain_error(e)
except (KeyboardInterrupt, Exception) as e:
self.callback_manager.on_chain_error(e, verbose=self.verbose)
raise e
if self.verbose:
self.callback_manager.on_chain_end(outputs)
self.callback_manager.on_chain_end(outputs, verbose=self.verbose)
self._validate_outputs(outputs)
if self.memory is not None:
self.memory.save_context(inputs, outputs)
@@ -178,3 +186,43 @@ class Chain(BaseModel, ABC):
f"`run` supported with either positional arguments or keyword arguments"
f" but not both. Got args: {args} and kwargs: {kwargs}."
)
def dict(self, **kwargs: Any) -> Dict:
"""Return dictionary representation of chain."""
if self.memory is not None:
raise ValueError("Saving of memory is not yet supported.")
_dict = super().dict()
_dict["_type"] = self._chain_type
return _dict
def save(self, file_path: Union[Path, str]) -> None:
"""Save the chain.
Args:
file_path: Path to file to save the chain to.
Example:
.. code-block:: python
chain.save(file_path="path/chain.yaml")
"""
# Convert file to Path object.
if isinstance(file_path, str):
save_path = Path(file_path)
else:
save_path = file_path
directory_path = save_path.parent
directory_path.mkdir(parents=True, exist_ok=True)
# Fetch dictionary to save
chain_dict = self.dict()
if save_path.suffix == ".json":
with open(file_path, "w") as f:
json.dump(chain_dict, f, indent=4)
elif save_path.suffix == ".yaml":
with open(file_path, "w") as f:
yaml.dump(chain_dict, f, default_flow_style=False)
else:
raise ValueError(f"{save_path} must be json or yaml")
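
The suffix-based dispatch in `save` can be sketched with the standard library alone. The helper below is hypothetical (`save_config` is not a LangChain function) and handles only `.json`; the `.yaml` branch in the diff relies on PyYAML and is left out here to keep the example dependency-free.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Union


def save_config(config: Dict[str, Any], file_path: Union[Path, str]) -> None:
    """Hypothetical helper: write a config dict, choosing the format by suffix."""
    save_path = Path(file_path)
    # Create missing parent directories, as the diff does before writing.
    save_path.parent.mkdir(parents=True, exist_ok=True)
    if save_path.suffix == ".json":
        with open(save_path, "w") as f:
            json.dump(config, f, indent=4)
    else:
        # The diff also accepts ".yaml"; this sketch rejects everything else.
        raise ValueError(f"{save_path} must be json")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "nested" / "chain.json"
    save_config({"_type": "api_chain", "verbose": False}, target)
    loaded = json.loads(target.read_text())
```

The `_type` key in the example mirrors what `dict()` adds in the diff so that a loader can later pick the right chain class.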


@@ -168,3 +168,7 @@ class MapReduceDocumentsChain(BaseCombineDocumentsChain, BaseModel):
extra_return_dict = {}
output, _ = self.combine_document_chain.combine_docs(result_docs, **kwargs)
return output, extra_return_dict
@property
def _chain_type(self) -> str:
return "map_reduce_documents_chain"


@@ -111,3 +111,7 @@ class MapRerankDocumentsChain(BaseCombineDocumentsChain, BaseModel):
if self.return_intermediate_steps:
extra_info["intermediate_steps"] = results
return output[self.answer_key], extra_info
@property
def _chain_type(self) -> str:
return "map_rerank_documents_chain"


@@ -113,3 +113,7 @@ class RefineDocumentsChain(BaseCombineDocumentsChain, BaseModel):
else:
extra_return_dict = {}
return res, extra_return_dict
@property
def _chain_type(self) -> str:
return "refine_documents_chain"


@@ -83,3 +83,7 @@ class StuffDocumentsChain(BaseCombineDocumentsChain, BaseModel):
inputs = self._get_inputs(docs, **kwargs)
# Call predict on the LLM.
return self.llm_chain.predict(**inputs), {}
@property
def _chain_type(self) -> str:
return "stuff_documents_chain"


@@ -4,7 +4,11 @@ from typing import Any, Dict, List, Optional
from pydantic import BaseModel, Field, root_validator
from langchain.chains.base import Memory
from langchain.chains.conversation.prompt import SUMMARY_PROMPT
from langchain.chains.conversation.prompt import (
ENTITY_EXTRACTION_PROMPT,
ENTITY_SUMMARIZATION_PROMPT,
SUMMARY_PROMPT,
)
from langchain.chains.llm import LLMChain
from langchain.llms.base import BaseLLM
from langchain.prompts.base import BasePromptTemplate
@@ -216,6 +220,89 @@ class ConversationSummaryMemory(Memory, BaseModel):
self.buffer = ""
class ConversationEntityMemory(Memory, BaseModel):
"""Entity extractor & summarizer to memory."""
buffer: List[str] = []
human_prefix: str = "Human"
ai_prefix: str = "AI"
"""Prefix to use for AI generated responses."""
llm: BaseLLM
entity_extraction_prompt: BasePromptTemplate = ENTITY_EXTRACTION_PROMPT
entity_summarization_prompt: BasePromptTemplate = ENTITY_SUMMARIZATION_PROMPT
output_key: Optional[str] = None
input_key: Optional[str] = None
store: Dict[str, Optional[str]] = {}
entity_cache: List[str] = []
k: int = 3
chat_history_key: str = "history"
@property
def memory_variables(self) -> List[str]:
"""Will always return list of memory variables.
:meta private:
"""
return ["entities", self.chat_history_key]
def load_memory_variables(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
"""Return history buffer."""
chain = LLMChain(llm=self.llm, prompt=self.entity_extraction_prompt)
if self.input_key is None:
prompt_input_key = _get_prompt_input_key(inputs, self.memory_variables)
else:
prompt_input_key = self.input_key
output = chain.predict(
history="\n".join(self.buffer[-self.k :]),
input=inputs[prompt_input_key],
)
if output.strip() == "NONE":
entities = []
else:
entities = [w.strip() for w in output.split(",")]
entity_summaries = {}
for entity in entities:
entity_summaries[entity] = self.store.get(entity, "")
self.entity_cache = entities
return {
self.chat_history_key: "\n".join(self.buffer[-self.k :]),
"entities": entity_summaries,
}
def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None:
"""Save context from this conversation to buffer."""
if self.input_key is None:
prompt_input_key = _get_prompt_input_key(inputs, self.memory_variables)
else:
prompt_input_key = self.input_key
if self.output_key is None:
if len(outputs) != 1:
raise ValueError(f"One output key expected, got {outputs.keys()}")
output_key = list(outputs.keys())[0]
else:
output_key = self.output_key
human = f"{self.human_prefix}: " + inputs[prompt_input_key]
ai = f"{self.ai_prefix}: " + outputs[output_key]
for entity in self.entity_cache:
chain = LLMChain(llm=self.llm, prompt=self.entity_summarization_prompt)
# key value store for entity
existing_summary = self.store.get(entity, "")
output = chain.predict(
summary=existing_summary,
history="\n".join(self.buffer[-self.k :]),
input=inputs[prompt_input_key],
entity=entity,
)
self.store[entity] = output.strip()
new_lines = "\n".join([human, ai])
self.buffer.append(new_lines)
def clear(self) -> None:
"""Clear memory contents."""
self.buffer = []
self.store = {}
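
The post-processing that `load_memory_variables` applies to the extraction reply can be sketched without an LLM. The helper names `parse_entities` and `lookup_summaries` are hypothetical and the sample strings are illustrative; only the `NONE` check, the comma split, and the `store.get(entity, "")` lookup come from the diff.

```python
from typing import Dict, List, Optional


def parse_entities(llm_output: str) -> List[str]:
    """Split the extraction reply the way load_memory_variables does."""
    if llm_output.strip() == "NONE":
        return []
    return [w.strip() for w in llm_output.split(",")]


def lookup_summaries(
    entities: List[str], store: Dict[str, Optional[str]]
) -> Dict[str, Optional[str]]:
    # Entities not yet in the store map to an empty summary.
    return {entity: store.get(entity, "") for entity in entities}


store = {"Langchain": "Langchain is a project the human is working on."}
entities = parse_entities(" Langchain, Person #2 ")
summaries = lookup_summaries(entities, store)
```

Note that a reply such as `"NONE."` would not match the exact comparison and would be treated as an entity name, so the approach relies on the model following the prompt's output format.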
class ConversationSummaryBufferMemory(Memory, BaseModel):
"""Buffer with summarizer for storing conversation memory."""


@@ -11,6 +11,28 @@ PROMPT = PromptTemplate(
input_variables=["history", "input"], template=_DEFAULT_TEMPLATE
)
_DEFAULT_ENTITY_MEMORY_CONVERSATION_TEMPLATE = """You are an assistant to a human, powered by a large language model trained by OpenAI.
You are designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, you are able to generate human-like text based on the input you receive, allowing you to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.
You are constantly learning and improving, and your capabilities are constantly evolving. You are able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. You have access to some personalized information provided by the human in the Context section below. Additionally, you are able to generate your own text based on the input you receive, allowing you to engage in discussions and provide explanations and descriptions on a wide range of topics.
Overall, you are a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether the human needs help with a specific question or just wants to have a conversation about a particular topic, you are here to assist.
Context:
{entities}
Current conversation:
{history}
Last line:
Human: {input}
You:"""
ENTITY_MEMORY_CONVERSATION_TEMPLATE = PromptTemplate(
input_variables=["entities", "history", "input"],
template=_DEFAULT_ENTITY_MEMORY_CONVERSATION_TEMPLATE,
)
_DEFAULT_SUMMARIZER_TEMPLATE = """Progressively summarize the lines of conversation provided, adding onto the previous summary returning a new summary.
EXAMPLE
@@ -35,3 +57,64 @@ New summary:"""
SUMMARY_PROMPT = PromptTemplate(
input_variables=["summary", "new_lines"], template=_DEFAULT_SUMMARIZER_TEMPLATE
)
_DEFAULT_ENTITY_EXTRACTION_TEMPLATE = """You are an AI assistant reading the transcript of a conversation between an AI and a human. Extract all of the proper nouns from the last line of conversation. As a guideline, a proper noun is generally capitalized. You should definitely extract all names and places.
The conversation history is provided just in case of a coreference (e.g. "What do you know about him" where "him" is defined in a previous line) -- ignore items mentioned there that are not in the last line.
Return the output as a single comma-separated list, or NONE if there is nothing of note to return (e.g. the user is just issuing a greeting or having a simple conversation).
EXAMPLE
Conversation history:
Person #1: how's it going today?
AI: "It's going great! How about you?"
Person #1: good! busy working on Langchain. lots to do.
AI: "That sounds like a lot of work! What kind of things are you doing to make Langchain better?"
Last line:
Person #1: i'm trying to improve Langchain's interfaces, the UX, its integrations with various products the user might want ... a lot of stuff.
Output: Langchain
END OF EXAMPLE
EXAMPLE
Conversation history:
Person #1: how's it going today?
AI: "It's going great! How about you?"
Person #1: good! busy working on Langchain. lots to do.
AI: "That sounds like a lot of work! What kind of things are you doing to make Langchain better?"
Last line:
Person #1: i'm trying to improve Langchain's interfaces, the UX, its integrations with various products the user might want ... a lot of stuff. I'm working with Person #2.
Output: Langchain, Person #2
END OF EXAMPLE
Conversation history (for reference only):
{history}
Last line of conversation (for extraction):
Human: {input}
Output:"""
ENTITY_EXTRACTION_PROMPT = PromptTemplate(
input_variables=["history", "input"], template=_DEFAULT_ENTITY_EXTRACTION_TEMPLATE
)
_DEFAULT_ENTITY_SUMMARIZATION_TEMPLATE = """You are an AI assistant helping a human keep track of facts about relevant people, places, and concepts in their life. Update the summary of the provided entity in the "Entity" section based on the last line of your conversation with the human. If you are writing the summary for the first time, return a single sentence.
The update should only include facts that are relayed in the last line of conversation about the provided entity, and should only contain facts about the provided entity.
If there is no new information about the provided entity or the information is not worth noting (not an important or relevant fact to remember long-term), return the existing summary unchanged.
Full conversation history (for context):
{history}
Entity to summarize:
{entity}
Existing summary of {entity}:
{summary}
Last line of conversation:
Human: {input}
Updated summary:"""
ENTITY_SUMMARIZATION_PROMPT = PromptTemplate(
input_variables=["entity", "summary", "history", "input"],
template=_DEFAULT_ENTITY_SUMMARIZATION_TEMPLATE,
)
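
To see how the four input variables land in the summarization prompt, the sketch below fills a shortened stand-in for `_DEFAULT_ENTITY_SUMMARIZATION_TEMPLATE` with plain `str.format`. The shortened text and the sample values are assumptions for illustration, and `str.format` is used in place of `PromptTemplate.format`.

```python
# Shortened stand-in for the summarization template (not the full prompt).
TEMPLATE = (
    "Full conversation history (for context):\n{history}\n\n"
    "Entity to summarize:\n{entity}\n\n"
    "Existing summary of {entity}:\n{summary}\n\n"
    "Last line of conversation:\nHuman: {input}\nUpdated summary:"
)

prompt_text = TEMPLATE.format(
    entity="Langchain",
    summary="",
    history="Human: hi\nAI: hello",
    input="I'm working on Langchain with Sam.",
)
```

The `{entity}` placeholder appears twice in the template, so the entity name is substituted in both the heading and the "Existing summary of" line.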


@@ -4,18 +4,19 @@ https://arxiv.org/abs/2212.10496
"""
from __future__ import annotations
from typing import List
from typing import Dict, List
import numpy as np
from pydantic import BaseModel, Extra
from langchain.chains.base import Chain
from langchain.chains.hyde.prompts import PROMPT_MAP
from langchain.chains.llm import LLMChain
from langchain.embeddings.base import Embeddings
from langchain.embeddings.hyde.prompts import PROMPT_MAP
from langchain.llms.base import BaseLLM
class HypotheticalDocumentEmbedder(Embeddings, BaseModel):
class HypotheticalDocumentEmbedder(Chain, Embeddings, BaseModel):
"""Generate hypothetical document for query, and then embed that.
Based on https://arxiv.org/abs/2212.10496
@@ -30,10 +31,24 @@ class HypotheticalDocumentEmbedder(Embeddings, BaseModel):
extra = Extra.forbid
arbitrary_types_allowed = True
@property
def input_keys(self) -> List[str]:
"""Input keys for Hyde's LLM chain."""
return self.llm_chain.input_keys
@property
def output_keys(self) -> List[str]:
"""Output keys for Hyde's LLM chain."""
return self.llm_chain.output_keys
def embed_documents(self, texts: List[str]) -> List[List[float]]:
"""Call the base embeddings."""
return self.base_embeddings.embed_documents(texts)
def combine_embeddings(self, embeddings: List[List[float]]) -> List[float]:
"""Combine embeddings into final embeddings."""
return list(np.array(embeddings).mean(axis=0))
def embed_query(self, text: str) -> List[float]:
"""Generate a hypothetical document and embed it."""
var_name = self.llm_chain.input_keys[0]
@@ -42,9 +57,9 @@ class HypotheticalDocumentEmbedder(Embeddings, BaseModel):
embeddings = self.embed_documents(documents)
return self.combine_embeddings(embeddings)
def combine_embeddings(self, embeddings: List[List[float]]) -> List[float]:
"""Combine embeddings into final embeddings."""
return list(np.array(embeddings).mean(axis=0))
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
"""Call the internal llm chain."""
return self.llm_chain._call(inputs)
@classmethod
def from_llm(
@@ -54,3 +69,7 @@ class HypotheticalDocumentEmbedder(Embeddings, BaseModel):
prompt = PROMPT_MAP[prompt_key]
llm_chain = LLMChain(llm=llm, prompt=prompt)
return cls(base_embeddings=base_embeddings, llm_chain=llm_chain)
@property
def _chain_type(self) -> str:
return "hyde_chain"
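
`combine_embeddings` averages the embeddings of the generated documents component by component. The following is a pure-Python sketch of that element-wise mean, written without NumPy so it runs on the standard library. It is a stand-in for `np.array(embeddings).mean(axis=0)`, not the code from the diff, and the empty-input check is an added assumption.

```python
from typing import List


def combine_embeddings(embeddings: List[List[float]]) -> List[float]:
    """Element-wise mean of equally sized vectors."""
    if not embeddings:
        raise ValueError("need at least one embedding")
    count = len(embeddings)
    # zip(*embeddings) groups the i-th component of every vector together.
    return [sum(component) / count for component in zip(*embeddings)]


combined = combine_embeddings([[1.0, 2.0], [3.0, 6.0]])
```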


@@ -1,4 +1,5 @@
"""Chain that just formats a prompt and calls an LLM."""
from string import Formatter
from typing import Any, Dict, List, Sequence, Union
from pydantic import BaseModel, Extra
@@ -7,6 +8,7 @@ from langchain.chains.base import Chain
from langchain.input import get_colored_text
from langchain.llms.base import BaseLLM
from langchain.prompts.base import BasePromptTemplate
from langchain.prompts.prompt import PromptTemplate
from langchain.schema import LLMResult
@@ -61,10 +63,9 @@ class LLMChain(Chain, BaseModel):
for inputs in input_list:
selected_inputs = {k: inputs[k] for k in self.prompt.input_variables}
prompt = self.prompt.format(**selected_inputs)
if self.verbose:
_colored_text = get_colored_text(prompt, "green")
_text = "Prompt after formatting:\n" + _colored_text
self.callback_manager.on_text(_text, end="\n")
_colored_text = get_colored_text(prompt, "green")
_text = "Prompt after formatting:\n" + _colored_text
self.callback_manager.on_text(_text, end="\n", verbose=self.verbose)
if "stop" in inputs and inputs["stop"] != stop:
raise ValueError(
"If `stop` is present in any inputs, should be present in all."
@@ -123,3 +124,18 @@ class LLMChain(Chain, BaseModel):
return new_result
else:
return result
@property
def _chain_type(self) -> str:
return "llm_chain"
@classmethod
def from_string(cls, llm: BaseLLM, template: str) -> Chain:
"""Create LLMChain from LLM and template."""
input_variables = {
v for _, v, _, _ in Formatter().parse(template) if v is not None
}
prompt_template = PromptTemplate(
input_variables=list(input_variables), template=template
)
return cls(llm=llm, prompt=prompt_template)
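
`from_string` infers the prompt's input variables by parsing the template with `string.Formatter`. The sketch below isolates that step. `template_variables` is a hypothetical helper name, and it sorts the result for a deterministic order, whereas the diff converts the set to a list directly.

```python
from string import Formatter
from typing import List


def template_variables(template: str) -> List[str]:
    """Return the sorted field names found in a str.format-style template."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for trailing literal text that has no placeholder.
    names = {name for _, name, _, _ in Formatter().parse(template) if name is not None}
    return sorted(names)


variables = template_variables("Tell me a {adjective} joke about {topic}. {adjective}!")
```

Collecting the names in a set means a placeholder that is repeated in the template is reported only once.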


@@ -0,0 +1 @@
"""Chain that interprets a prompt and executes bash code to perform bash operations."""


@@ -52,12 +52,10 @@ class LLMBashChain(Chain, BaseModel):
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
llm_executor = LLMChain(prompt=self.prompt, llm=self.llm)
bash_executor = BashProcess()
if self.verbose:
self.callback_manager.on_text(inputs[self.input_key])
self.callback_manager.on_text(inputs[self.input_key], verbose=self.verbose)
t = llm_executor.predict(question=inputs[self.input_key])
if self.verbose:
self.callback_manager.on_text(t, color="green")
self.callback_manager.on_text(t, color="green", verbose=self.verbose)
t = t.strip()
if t.startswith("```bash"):
@@ -69,10 +67,13 @@ class LLMBashChain(Chain, BaseModel):
command_list = [s for s in command_list[1:-1]]
output = bash_executor.run(command_list)
if self.verbose:
self.callback_manager.on_text("\nAnswer: ")
self.callback_manager.on_text(output, color="yellow")
self.callback_manager.on_text("\nAnswer: ", verbose=self.verbose)
self.callback_manager.on_text(output, color="yellow", verbose=self.verbose)
else:
raise ValueError(f"unknown format from LLM: {t}")
return {self.output_key: output}
@property
def _chain_type(self) -> str:
return "llm_bash_chain"


@@ -97,3 +97,7 @@ class LLMCheckerChain(Chain, BaseModel):
)
output = question_to_checked_assertions_chain({"question": question})
return {self.output_key: output["revised_statement"]}
@property
def _chain_type(self) -> str:
return "llm_checker_chain"


@@ -53,21 +53,22 @@ class LLMMathChain(Chain, BaseModel):
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
llm_executor = LLMChain(prompt=self.prompt, llm=self.llm)
python_executor = PythonREPL()
if self.verbose:
self.callback_manager.on_text(inputs[self.input_key])
self.callback_manager.on_text(inputs[self.input_key], verbose=self.verbose)
t = llm_executor.predict(question=inputs[self.input_key], stop=["```output"])
if self.verbose:
self.callback_manager.on_text(t, color="green")
self.callback_manager.on_text(t, color="green", verbose=self.verbose)
t = t.strip()
if t.startswith("```python"):
code = t[9:-4]
output = python_executor.run(code)
if self.verbose:
self.callback_manager.on_text("\nAnswer: ")
self.callback_manager.on_text(output, color="yellow")
self.callback_manager.on_text("\nAnswer: ", verbose=self.verbose)
self.callback_manager.on_text(output, color="yellow", verbose=self.verbose)
answer = "Answer: " + output
elif t.startswith("Answer:"):
answer = t
else:
raise ValueError(f"unknown format from LLM: {t}")
return {self.output_key: answer}
@property
def _chain_type(self) -> str:
return "llm_math_chain"
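
The slice `t[9:-4]` in `LLMMathChain._call` depends on fixed offsets: nine characters for the opening fence plus the word `python`, and four for the closing fence with the newline before it. The sketch below reproduces only that branch. `extract_python` is a hypothetical helper, and the fence string is built at runtime purely so the example contains no literal fence.

```python
FENCE = "`" * 3  # three backticks, built at runtime


def extract_python(llm_text: str) -> str:
    """Apply the same strip-and-slice steps as the python branch in the diff."""
    t = llm_text.strip()
    if not t.startswith(FENCE + "python"):
        raise ValueError(f"unknown format from LLM: {t}")
    # 9 == len(FENCE + "python") drops the opening marker; -4 drops the
    # closing fence plus the newline in front of it.
    return t[9:-4]


reply = FENCE + "python\nprint(2 + 2)\n" + FENCE + "\n"
code = extract_python(reply)
```

The extracted string keeps the newline that followed the opening marker, which is harmless when the code is passed on for execution.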


@@ -18,7 +18,9 @@ class LLMRequestsChain(Chain, BaseModel):
"""Chain that hits a URL and then uses an LLM to parse results."""
llm_chain: LLMChain
requests_wrapper: RequestsWrapper = Field(default_factory=RequestsWrapper)
requests_wrapper: RequestsWrapper = Field(
default_factory=RequestsWrapper, exclude=True
)
text_length: int = 8000
requests_key: str = "requests_result" #: :meta private:
input_key: str = "url" #: :meta private:
@@ -71,3 +73,7 @@ class LLMRequestsChain(Chain, BaseModel):
other_keys[self.requests_key] = soup.get_text()[: self.text_length]
result = self.llm_chain.predict(**other_keys)
return {self.output_key: result}
@property
def _chain_type(self) -> str:
return "llm_requests_chain"

langchain/chains/loading.py (new file, 467 lines)

@@ -0,0 +1,467 @@
"""Functionality for loading chains."""
import json
from pathlib import Path
from typing import Any, Union
import yaml
from langchain.chains.api.base import APIChain
from langchain.chains.base import Chain
from langchain.chains.combine_documents.map_reduce import MapReduceDocumentsChain
from langchain.chains.combine_documents.map_rerank import MapRerankDocumentsChain
from langchain.chains.combine_documents.refine import RefineDocumentsChain
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chains.hyde.base import HypotheticalDocumentEmbedder
from langchain.chains.llm import LLMChain
from langchain.chains.llm_bash.base import LLMBashChain
from langchain.chains.llm_checker.base import LLMCheckerChain
from langchain.chains.llm_math.base import LLMMathChain
from langchain.chains.llm_requests import LLMRequestsChain
from langchain.chains.pal.base import PALChain
from langchain.chains.qa_with_sources.base import QAWithSourcesChain
from langchain.chains.qa_with_sources.vector_db import VectorDBQAWithSourcesChain
from langchain.chains.sql_database.base import SQLDatabaseChain
from langchain.chains.vector_db_qa.base import VectorDBQA
from langchain.llms.loading import load_llm, load_llm_from_config
from langchain.prompts.loading import load_prompt, load_prompt_from_config
from langchain.utilities.loading import try_load_from_hub
URL_BASE = "https://raw.githubusercontent.com/hwchase17/langchain-hub/master/chains/"
def _load_llm_chain(config: dict, **kwargs: Any) -> LLMChain:
"""Load LLM chain from config dict."""
if "llm" in config:
llm_config = config.pop("llm")
llm = load_llm_from_config(llm_config)
elif "llm_path" in config:
llm = load_llm(config.pop("llm_path"))
else:
raise ValueError("One of `llm` or `llm_path` must be present.")
if "prompt" in config:
prompt_config = config.pop("prompt")
prompt = load_prompt_from_config(prompt_config)
elif "prompt_path" in config:
prompt = load_prompt(config.pop("prompt_path"))
else:
raise ValueError("One of `prompt` or `prompt_path` must be present.")
return LLMChain(llm=llm, prompt=prompt, **config)
def _load_hyde_chain(config: dict, **kwargs: Any) -> HypotheticalDocumentEmbedder:
"""Load hypothetical document embedder chain from config dict."""
if "llm_chain" in config:
llm_chain_config = config.pop("llm_chain")
llm_chain = load_chain_from_config(llm_chain_config)
elif "llm_chain_path" in config:
llm_chain = load_chain(config.pop("llm_chain_path"))
else:
raise ValueError("One of `llm_chain` or `llm_chain_path` must be present.")
if "embeddings" in kwargs:
embeddings = kwargs.pop("embeddings")
else:
raise ValueError("`embeddings` must be present.")
return HypotheticalDocumentEmbedder(
llm_chain=llm_chain, base_embeddings=embeddings, **config
)
def _load_stuff_documents_chain(config: dict, **kwargs: Any) -> StuffDocumentsChain:
if "llm_chain" in config:
llm_chain_config = config.pop("llm_chain")
llm_chain = load_chain_from_config(llm_chain_config)
elif "llm_chain_path" in config:
llm_chain = load_chain(config.pop("llm_chain_path"))
else:
raise ValueError("One of `llm_chain` or `llm_chain_path` must be present.")
if not isinstance(llm_chain, LLMChain):
raise ValueError(f"Expected LLMChain, got {llm_chain}")
if "document_prompt" in config:
prompt_config = config.pop("document_prompt")
document_prompt = load_prompt_from_config(prompt_config)
elif "document_prompt_path" in config:
document_prompt = load_prompt(config.pop("document_prompt_path"))
else:
raise ValueError(
"One of `document_prompt` or `document_prompt_path` must be present."
)
return StuffDocumentsChain(
llm_chain=llm_chain, document_prompt=document_prompt, **config
)
def _load_map_reduce_documents_chain(
    config: dict, **kwargs: Any
) -> MapReduceDocumentsChain:
    if "llm_chain" in config:
        llm_chain_config = config.pop("llm_chain")
        llm_chain = load_chain_from_config(llm_chain_config)
    elif "llm_chain_path" in config:
        llm_chain = load_chain(config.pop("llm_chain_path"))
    else:
        raise ValueError("One of `llm_chain` or `llm_chain_path` must be present.")
    if not isinstance(llm_chain, LLMChain):
        raise ValueError(f"Expected LLMChain, got {llm_chain}")
    if "combine_document_chain" in config:
        combine_document_chain_config = config.pop("combine_document_chain")
        combine_document_chain = load_chain_from_config(combine_document_chain_config)
    elif "combine_document_chain_path" in config:
        combine_document_chain = load_chain(config.pop("combine_document_chain_path"))
    else:
        raise ValueError(
            "One of `combine_document_chain` or "
            "`combine_document_chain_path` must be present."
        )
    if "collapse_document_chain" in config:
        collapse_document_chain_config = config.pop("collapse_document_chain")
        if collapse_document_chain_config is None:
            collapse_document_chain = None
        else:
            collapse_document_chain = load_chain_from_config(
                collapse_document_chain_config
            )
    elif "collapse_document_chain_path" in config:
        collapse_document_chain = load_chain(config.pop("collapse_document_chain_path"))
    else:
        # Neither key was given: use no collapse chain rather than leaving
        # the name unbound.
        collapse_document_chain = None
    return MapReduceDocumentsChain(
        llm_chain=llm_chain,
        combine_document_chain=combine_document_chain,
        collapse_document_chain=collapse_document_chain,
        **config,
    )
def _load_llm_bash_chain(config: dict, **kwargs: Any) -> LLMBashChain:
if "llm" in config:
llm_config = config.pop("llm")
llm = load_llm_from_config(llm_config)
elif "llm_path" in config:
llm = load_llm(config.pop("llm_path"))
else:
raise ValueError("One of `llm` or `llm_path` must be present.")
if "prompt" in config:
prompt_config = config.pop("prompt")
prompt = load_prompt_from_config(prompt_config)
elif "prompt_path" in config:
prompt = load_prompt(config.pop("prompt_path"))
return LLMBashChain(llm=llm, prompt=prompt, **config)
def _load_llm_checker_chain(config: dict, **kwargs: Any) -> LLMCheckerChain:
if "llm" in config:
llm_config = config.pop("llm")
llm = load_llm_from_config(llm_config)
elif "llm_path" in config:
llm = load_llm(config.pop("llm_path"))
else:
raise ValueError("One of `llm` or `llm_path` must be present.")
if "create_draft_answer_prompt" in config:
create_draft_answer_prompt_config = config.pop("create_draft_answer_prompt")
create_draft_answer_prompt = load_prompt_from_config(
create_draft_answer_prompt_config
)
elif "create_draft_answer_prompt_path" in config:
create_draft_answer_prompt = load_prompt(
config.pop("create_draft_answer_prompt_path")
)
if "list_assertions_prompt" in config:
list_assertions_prompt_config = config.pop("list_assertions_prompt")
list_assertions_prompt = load_prompt_from_config(list_assertions_prompt_config)
elif "list_assertions_prompt_path" in config:
list_assertions_prompt = load_prompt(config.pop("list_assertions_prompt_path"))
if "check_assertions_prompt" in config:
check_assertions_prompt_config = config.pop("check_assertions_prompt")
check_assertions_prompt = load_prompt_from_config(
check_assertions_prompt_config
)
elif "check_assertions_prompt_path" in config:
check_assertions_prompt = load_prompt(
config.pop("check_assertions_prompt_path")
)
if "revised_answer_prompt" in config:
revised_answer_prompt_config = config.pop("revised_answer_prompt")
revised_answer_prompt = load_prompt_from_config(revised_answer_prompt_config)
elif "revised_answer_prompt_path" in config:
revised_answer_prompt = load_prompt(config.pop("revised_answer_prompt_path"))
return LLMCheckerChain(
llm=llm,
create_draft_answer_prompt=create_draft_answer_prompt,
list_assertions_prompt=list_assertions_prompt,
check_assertions_prompt=check_assertions_prompt,
revised_answer_prompt=revised_answer_prompt,
**config,
)
def _load_llm_math_chain(config: dict, **kwargs: Any) -> LLMMathChain:
if "llm" in config:
llm_config = config.pop("llm")
llm = load_llm_from_config(llm_config)
elif "llm_path" in config:
llm = load_llm(config.pop("llm_path"))
else:
raise ValueError("One of `llm` or `llm_path` must be present.")
if "prompt" in config:
prompt_config = config.pop("prompt")
prompt = load_prompt_from_config(prompt_config)
elif "prompt_path" in config:
prompt = load_prompt(config.pop("prompt_path"))
return LLMMathChain(llm=llm, prompt=prompt, **config)
def _load_map_rerank_documents_chain(
config: dict, **kwargs: Any
) -> MapRerankDocumentsChain:
if "llm_chain" in config:
llm_chain_config = config.pop("llm_chain")
llm_chain = load_chain_from_config(llm_chain_config)
elif "llm_chain_path" in config:
llm_chain = load_chain(config.pop("llm_chain_path"))
else:
raise ValueError("One of `llm_chain` or `llm_chain_path` must be present.")
return MapRerankDocumentsChain(llm_chain=llm_chain, **config)
def _load_pal_chain(config: dict, **kwargs: Any) -> PALChain:
if "llm" in config:
llm_config = config.pop("llm")
llm = load_llm_from_config(llm_config)
elif "llm_path" in config:
llm = load_llm(config.pop("llm_path"))
else:
raise ValueError("One of `llm` or `llm_path` must be present.")
if "prompt" in config:
prompt_config = config.pop("prompt")
prompt = load_prompt_from_config(prompt_config)
elif "prompt_path" in config:
prompt = load_prompt(config.pop("prompt_path"))
else:
raise ValueError("One of `prompt` or `prompt_path` must be present.")
return PALChain(llm=llm, prompt=prompt, **config)
def _load_refine_documents_chain(config: dict, **kwargs: Any) -> RefineDocumentsChain:
if "initial_llm_chain" in config:
initial_llm_chain_config = config.pop("initial_llm_chain")
initial_llm_chain = load_chain_from_config(initial_llm_chain_config)
elif "initial_llm_chain_path" in config:
initial_llm_chain = load_chain(config.pop("initial_llm_chain_path"))
else:
raise ValueError(
"One of `initial_llm_chain` or `initial_llm_chain_path` must be present."
)
if "refine_llm_chain" in config:
refine_llm_chain_config = config.pop("refine_llm_chain")
refine_llm_chain = load_chain_from_config(refine_llm_chain_config)
elif "refine_llm_chain_path" in config:
refine_llm_chain = load_chain(config.pop("refine_llm_chain_path"))
else:
raise ValueError(
"One of `refine_llm_chain` or `refine_llm_chain_path` must be present."
)
if "document_prompt" in config:
prompt_config = config.pop("document_prompt")
document_prompt = load_prompt_from_config(prompt_config)
elif "document_prompt_path" in config:
document_prompt = load_prompt(config.pop("document_prompt_path"))
return RefineDocumentsChain(
initial_llm_chain=initial_llm_chain,
refine_llm_chain=refine_llm_chain,
document_prompt=document_prompt,
**config,
)
def _load_qa_with_sources_chain(config: dict, **kwargs: Any) -> QAWithSourcesChain:
if "combine_documents_chain" in config:
combine_documents_chain_config = config.pop("combine_documents_chain")
combine_documents_chain = load_chain_from_config(combine_documents_chain_config)
elif "combine_documents_chain_path" in config:
combine_documents_chain = load_chain(config.pop("combine_documents_chain_path"))
else:
raise ValueError(
"One of `combine_documents_chain` or "
"`combine_documents_chain_path` must be present."
)
return QAWithSourcesChain(combine_documents_chain=combine_documents_chain, **config)
def _load_sql_database_chain(config: dict, **kwargs: Any) -> SQLDatabaseChain:
if "database" in kwargs:
database = kwargs.pop("database")
else:
raise ValueError("`database` must be present.")
if "llm" in config:
llm_config = config.pop("llm")
llm = load_llm_from_config(llm_config)
elif "llm_path" in config:
llm = load_llm(config.pop("llm_path"))
else:
raise ValueError("One of `llm` or `llm_path` must be present.")
if "prompt" in config:
prompt_config = config.pop("prompt")
prompt = load_prompt_from_config(prompt_config)
return SQLDatabaseChain(database=database, llm=llm, prompt=prompt, **config)
def _load_vector_db_qa_with_sources_chain(
config: dict, **kwargs: Any
) -> VectorDBQAWithSourcesChain:
if "vectorstore" in kwargs:
vectorstore = kwargs.pop("vectorstore")
else:
raise ValueError("`vectorstore` must be present.")
if "combine_documents_chain" in config:
combine_documents_chain_config = config.pop("combine_documents_chain")
combine_documents_chain = load_chain_from_config(combine_documents_chain_config)
elif "combine_documents_chain_path" in config:
combine_documents_chain = load_chain(config.pop("combine_documents_chain_path"))
else:
raise ValueError(
"One of `combine_documents_chain` or "
"`combine_documents_chain_path` must be present."
)
return VectorDBQAWithSourcesChain(
combine_documents_chain=combine_documents_chain,
vectorstore=vectorstore,
**config,
)
def _load_vector_db_qa(config: dict, **kwargs: Any) -> VectorDBQA:
if "vectorstore" in kwargs:
vectorstore = kwargs.pop("vectorstore")
else:
raise ValueError("`vectorstore` must be present.")
if "combine_documents_chain" in config:
combine_documents_chain_config = config.pop("combine_documents_chain")
combine_documents_chain = load_chain_from_config(combine_documents_chain_config)
elif "combine_documents_chain_path" in config:
combine_documents_chain = load_chain(config.pop("combine_documents_chain_path"))
else:
raise ValueError(
"One of `combine_documents_chain` or "
"`combine_documents_chain_path` must be present."
)
return VectorDBQA(
combine_documents_chain=combine_documents_chain,
vectorstore=vectorstore,
**config,
)
def _load_api_chain(config: dict, **kwargs: Any) -> APIChain:
if "api_request_chain" in config:
api_request_chain_config = config.pop("api_request_chain")
api_request_chain = load_chain_from_config(api_request_chain_config)
elif "api_request_chain_path" in config:
api_request_chain = load_chain(config.pop("api_request_chain_path"))
else:
raise ValueError(
"One of `api_request_chain` or `api_request_chain_path` must be present."
)
if "api_answer_chain" in config:
api_answer_chain_config = config.pop("api_answer_chain")
api_answer_chain = load_chain_from_config(api_answer_chain_config)
elif "api_answer_chain_path" in config:
api_answer_chain = load_chain(config.pop("api_answer_chain_path"))
else:
raise ValueError(
"One of `api_answer_chain` or `api_answer_chain_path` must be present."
)
if "requests_wrapper" in kwargs:
requests_wrapper = kwargs.pop("requests_wrapper")
else:
raise ValueError("`requests_wrapper` must be present.")
return APIChain(
api_request_chain=api_request_chain,
api_answer_chain=api_answer_chain,
requests_wrapper=requests_wrapper,
**config,
)
def _load_llm_requests_chain(config: dict, **kwargs: Any) -> LLMRequestsChain:
if "llm_chain" in config:
llm_chain_config = config.pop("llm_chain")
llm_chain = load_chain_from_config(llm_chain_config)
elif "llm_chain_path" in config:
llm_chain = load_chain(config.pop("llm_chain_path"))
else:
raise ValueError("One of `llm_chain` or `llm_chain_path` must be present.")
if "requests_wrapper" in kwargs:
requests_wrapper = kwargs.pop("requests_wrapper")
return LLMRequestsChain(
            llm_chain=llm_chain, requests_wrapper=requests_wrapper, **config
        )
    else:
        return LLMRequestsChain(llm_chain=llm_chain, **config)


type_to_loader_dict = {
    "api_chain": _load_api_chain,
    "hyde_chain": _load_hyde_chain,
    "llm_chain": _load_llm_chain,
    "llm_bash_chain": _load_llm_bash_chain,
    "llm_checker_chain": _load_llm_checker_chain,
    "llm_math_chain": _load_llm_math_chain,
    "llm_requests_chain": _load_llm_requests_chain,
    "pal_chain": _load_pal_chain,
    "qa_with_sources_chain": _load_qa_with_sources_chain,
    "stuff_documents_chain": _load_stuff_documents_chain,
    "map_reduce_documents_chain": _load_map_reduce_documents_chain,
    "map_rerank_documents_chain": _load_map_rerank_documents_chain,
    "refine_documents_chain": _load_refine_documents_chain,
    "sql_database_chain": _load_sql_database_chain,
    "vector_db_qa_with_sources_chain": _load_vector_db_qa_with_sources_chain,
    "vector_db_qa": _load_vector_db_qa,
}


def load_chain_from_config(config: dict, **kwargs: Any) -> Chain:
    """Load chain from Config Dict."""
    if "_type" not in config:
        raise ValueError("Must specify a chain Type in config")
    config_type = config.pop("_type")
    if config_type not in type_to_loader_dict:
        raise ValueError(f"Loading {config_type} chain not supported")
    chain_loader = type_to_loader_dict[config_type]
    return chain_loader(config, **kwargs)


def load_chain(path: Union[str, Path], **kwargs: Any) -> Chain:
    """Unified method for loading a chain from LangChainHub or local fs."""
    if hub_result := try_load_from_hub(
        path, _load_chain_from_file, "chains", {"json", "yaml"}
    ):
        return hub_result
    else:
        return _load_chain_from_file(path, **kwargs)


def _load_chain_from_file(file: Union[str, Path], **kwargs: Any) -> Chain:
    """Load chain from file."""
    # Convert file to Path object.
    if isinstance(file, str):
        file_path = Path(file)
    else:
        file_path = file
    # Load from either json or yaml.
    if file_path.suffix == ".json":
        with open(file_path) as f:
            config = json.load(f)
    elif file_path.suffix == ".yaml":
        with open(file_path, "r") as f:
            config = yaml.safe_load(f)
    else:
        raise ValueError("File type must be json or yaml")
    # Load the chain from the config now.
    return load_chain_from_config(config, **kwargs)
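
Review note: the `_type` dispatch in `load_chain_from_config` can be tried in isolation. The sketch below is not part of this change set. The loaders `_load_echo` and `_load_upper`, the registry `type_to_loader`, and the function `load_from_config` are hypothetical stand-ins that return strings instead of `Chain` objects. It also copies the config before popping `_type`, which is an assumption about what a caller might want; the code above pops from the caller's dict directly.

```python
from typing import Any, Callable, Dict


# Hypothetical stand-in loaders; the real loaders construct Chain objects.
def _load_echo(config: dict, **kwargs: Any) -> str:
    return f"echo:{config.get('text', '')}"


def _load_upper(config: dict, **kwargs: Any) -> str:
    return str(config.get("text", "")).upper()


# Registry mapping a "_type" string to the function that builds that type.
type_to_loader: Dict[str, Callable[..., str]] = {
    "echo": _load_echo,
    "upper": _load_upper,
}


def load_from_config(config: dict, **kwargs: Any) -> str:
    """Dispatch on the "_type" key, mirroring load_chain_from_config."""
    if "_type" not in config:
        raise ValueError("Must specify a type in config")
    # Assumption: work on a shallow copy so the caller's dict keeps "_type".
    config = dict(config)
    config_type = config.pop("_type")
    if config_type not in type_to_loader:
        raise ValueError(f"Loading {config_type} not supported")
    return type_to_loader[config_type](config, **kwargs)


if __name__ == "__main__":
    print(load_from_config({"_type": "upper", "text": "hi"}))
```

Registering a new type in this pattern only requires adding one entry to the dictionary, which is the same extension point `type_to_loader_dict` offers above.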


@@ -94,3 +94,7 @@ class NatBotChain(Chain, BaseModel):
            self.input_browser_content_key: browser_content,
        }
        return self(_inputs)[self.output_key]

    @property
    def _chain_type(self) -> str:
        return "nat_bot_chain"


@@ -1,9 +1,23 @@
# flake8: noqa
# type: ignore
import time
from sys import platform
from typing import (
TYPE_CHECKING,
Any,
Dict,
Iterable,
List,
Optional,
Set,
Tuple,
TypedDict,
Union,
)
black_listed_elements = {
if TYPE_CHECKING:
from playwright.sync_api import Browser, CDPSession, Page, sync_playwright
black_listed_elements: Set[str] = {
"html",
"head",
"title",
@@ -19,8 +33,21 @@ black_listed_elements = {
}
class ElementInViewPort(TypedDict):
node_index: str
backend_node_id: int
node_name: Optional[str]
node_value: Optional[str]
node_meta: List[str]
is_clickable: bool
origin_x: int
origin_y: int
center_x: int
center_y: int
class Crawler:
def __init__(self):
def __init__(self) -> None:
try:
from playwright.sync_api import sync_playwright
except ImportError:
@@ -28,16 +55,20 @@ class Crawler:
"Could not import playwright python package. "
"Please install it with `pip install playwright`."
)
self.browser = sync_playwright().start().chromium.launch(headless=False)
self.page = self.browser.new_page()
self.browser: Browser = (
sync_playwright().start().chromium.launch(headless=False)
)
self.page: Page = self.browser.new_page()
self.page.set_viewport_size({"width": 1280, "height": 1080})
self.page_element_buffer: Dict[int, ElementInViewPort]
self.client: CDPSession
def go_to_page(self, url):
def go_to_page(self, url: str) -> None:
self.page.goto(url=url if "://" in url else "http://" + url)
self.client = self.page.context.new_cdp_session(self.page)
self.page_element_buffer = {}
def scroll(self, direction):
def scroll(self, direction: str) -> None:
if direction == "up":
self.page.evaluate(
"(document.scrollingElement || document.body).scrollTop = (document.scrollingElement || document.body).scrollTop - window.innerHeight;"
@@ -47,7 +78,7 @@ class Crawler:
"(document.scrollingElement || document.body).scrollTop = (document.scrollingElement || document.body).scrollTop + window.innerHeight;"
)
def click(self, id):
def click(self, id: Union[str, int]) -> None:
# Inject javascript into the page which removes the target= attribute from all links
js = """
links = document.getElementsByTagName("a");
@@ -59,41 +90,37 @@ class Crawler:
element = self.page_element_buffer.get(int(id))
if element:
x = element.get("center_x")
y = element.get("center_y")
x: float = element["center_x"]
y: float = element["center_y"]
self.page.mouse.click(x, y)
else:
print("Could not find element")
def type(self, id, text):
def type(self, id: Union[str, int], text: str) -> None:
self.click(id)
self.page.keyboard.type(text)
def enter(self):
def enter(self) -> None:
self.page.keyboard.press("Enter")
def crawl(self):
def crawl(self) -> List[str]:
page = self.page
page_element_buffer = self.page_element_buffer
start = time.time()
page_state_as_text = []
device_pixel_ratio = page.evaluate("window.devicePixelRatio")
device_pixel_ratio: float = page.evaluate("window.devicePixelRatio")
if platform == "darwin" and device_pixel_ratio == 1: # lies
device_pixel_ratio = 2
win_scroll_x = page.evaluate("window.scrollX")
win_scroll_y = page.evaluate("window.scrollY")
win_upper_bound = page.evaluate("window.pageYOffset")
win_left_bound = page.evaluate("window.pageXOffset")
win_width = page.evaluate("window.screen.width")
win_height = page.evaluate("window.screen.height")
win_right_bound = win_left_bound + win_width
win_lower_bound = win_upper_bound + win_height
document_offset_height = page.evaluate("document.body.offsetHeight")
document_scroll_height = page.evaluate("document.body.scrollHeight")
win_upper_bound: float = page.evaluate("window.pageYOffset")
win_left_bound: float = page.evaluate("window.pageXOffset")
win_width: float = page.evaluate("window.screen.width")
win_height: float = page.evaluate("window.screen.height")
win_right_bound: float = win_left_bound + win_width
win_lower_bound: float = win_upper_bound + win_height
# percentage_progress_start = (win_upper_bound / document_scroll_height) * 100
# percentage_progress_end = (
@@ -116,40 +143,35 @@ class Crawler:
"DOMSnapshot.captureSnapshot",
{"computedStyles": [], "includeDOMRects": True, "includePaintOrder": True},
)
strings = tree["strings"]
document = tree["documents"][0]
nodes = document["nodes"]
backend_node_id = nodes["backendNodeId"]
attributes = nodes["attributes"]
node_value = nodes["nodeValue"]
parent = nodes["parentIndex"]
node_types = nodes["nodeType"]
node_names = nodes["nodeName"]
is_clickable = set(nodes["isClickable"]["index"])
strings: Dict[int, str] = tree["strings"]
document: Dict[str, Any] = tree["documents"][0]
nodes: Dict[str, Any] = document["nodes"]
backend_node_id: Dict[int, int] = nodes["backendNodeId"]
attributes: Dict[int, Dict[int, Any]] = nodes["attributes"]
node_value: Dict[int, int] = nodes["nodeValue"]
parent: Dict[int, int] = nodes["parentIndex"]
node_names: Dict[int, int] = nodes["nodeName"]
is_clickable: Set[int] = set(nodes["isClickable"]["index"])
text_value = nodes["textValue"]
text_value_index = text_value["index"]
text_value_values = text_value["value"]
input_value: Dict[str, Any] = nodes["inputValue"]
input_value_index: List[int] = input_value["index"]
input_value_values: List[int] = input_value["value"]
input_value = nodes["inputValue"]
input_value_index = input_value["index"]
input_value_values = input_value["value"]
layout: Dict[str, Any] = document["layout"]
layout_node_index: List[int] = layout["nodeIndex"]
bounds: Dict[int, List[float]] = layout["bounds"]
input_checked = nodes["inputChecked"]
layout = document["layout"]
layout_node_index = layout["nodeIndex"]
bounds = layout["bounds"]
cursor: int = 0
cursor = 0
html_elements_text = []
child_nodes: Dict[str, List[Dict[str, Any]]] = {}
elements_in_view_port: List[ElementInViewPort] = []
child_nodes = {}
elements_in_view_port = []
anchor_ancestry: Dict[str, Tuple[bool, Optional[int]]] = {"-1": (False, None)}
button_ancestry: Dict[str, Tuple[bool, Optional[int]]] = {"-1": (False, None)}
anchor_ancestry = {"-1": (False, None)}
button_ancestry = {"-1": (False, None)}
def convert_name(node_name, has_click_handler):
def convert_name(
node_name: Optional[str], has_click_handler: Optional[bool]
) -> str:
if node_name == "a":
return "link"
if node_name == "input":
@@ -163,7 +185,9 @@ class Crawler:
else:
return "text"
def find_attributes(attributes, keys):
def find_attributes(
attributes: Dict[int, Any], keys: List[str]
) -> Dict[str, str]:
values = {}
for [key_index, value_index] in zip(*(iter(attributes),) * 2):
@@ -181,7 +205,13 @@ class Crawler:
return values
def add_to_hash_tree(hash_tree, tag, node_id, node_name, parent_id):
def add_to_hash_tree(
hash_tree: Dict[str, Tuple[bool, Optional[int]]],
tag: str,
node_id: int,
node_name: Optional[str],
parent_id: int,
) -> Tuple[bool, Optional[int]]:
parent_id_str = str(parent_id)
if parent_id_str not in hash_tree:
parent_name = strings[node_names[parent_id]].lower()
@@ -195,7 +225,7 @@ class Crawler:
# even if the anchor is nested in another anchor, we set the "root" for all descendants to be ::Self
if node_name == tag:
value = (True, node_id)
value: Tuple[bool, Optional[int]] = (True, node_id)
elif (
is_parent_desc_anchor
): # reuse the parent's anchor_id (which could be much higher in the tree)
@@ -212,7 +242,7 @@ class Crawler:
for index, node_name_index in enumerate(node_names):
node_parent = parent[index]
node_name = strings[node_name_index].lower()
node_name: Optional[str] = strings[node_name_index].lower()
is_ancestor_of_anchor, anchor_id = add_to_hash_tree(
anchor_ancestry, "a", index, node_name, node_parent
@@ -253,7 +283,7 @@ class Crawler:
if not partially_is_in_viewport:
continue
meta_data = []
meta_data: List[str] = []
# inefficient to grab the same set of keys for kinds of objects, but it's fine for now
element_attributes = find_attributes(
@@ -274,7 +304,7 @@ class Crawler:
else child_nodes.setdefault(str(ancestor_node_key), [])
)
if node_name == "#text" and ancestor_exception:
if node_name == "#text" and ancestor_exception and ancestor_node:
text = strings[node_value[index]]
if text == "|" or text == "":
continue
@@ -289,7 +319,7 @@ class Crawler:
) # prevent [button ... (button)..]
for key in element_attributes:
if ancestor_exception:
if ancestor_exception and ancestor_node:
ancestor_node.append(
{
"type": "attribute",
@@ -306,7 +336,7 @@ class Crawler:
element_node_value = strings[node_value[index]]
if (
element_node_value == "|"
): # commonly used as a seperator, does not add much context - lets save ourselves some token space
): # commonly used as a separator, does not add much context - lets save ourselves some token space
continue
elif (
node_name == "input"
@@ -344,36 +374,32 @@ class Crawler:
for element in elements_in_view_port:
node_index = element.get("node_index")
node_name = element.get("node_name")
node_value = element.get("node_value")
is_clickable = element.get("is_clickable")
origin_x = element.get("origin_x")
origin_y = element.get("origin_y")
center_x = element.get("center_x")
center_y = element.get("center_y")
meta_data = element.get("node_meta")
element_node_value = element.get("node_value")
node_is_clickable = element.get("is_clickable")
node_meta_data: Optional[List[str]] = element.get("node_meta")
inner_text = f"{node_value} " if node_value else ""
inner_text = f"{element_node_value} " if element_node_value else ""
meta = ""
if node_index in child_nodes:
for child in child_nodes.get(node_index):
for child in child_nodes[node_index]:
entry_type = child.get("type")
entry_value = child.get("value")
if entry_type == "attribute":
if entry_type == "attribute" and node_meta_data:
entry_key = child.get("key")
meta_data.append(f'{entry_key}="{entry_value}"')
node_meta_data.append(f'{entry_key}="{entry_value}"')
else:
inner_text += f"{entry_value} "
if meta_data:
meta_string = " ".join(meta_data)
if node_meta_data:
meta_string = " ".join(node_meta_data)
meta = f" {meta_string}"
if inner_text != "":
inner_text = f"{inner_text.strip()}"
converted_node_name = convert_name(node_name, is_clickable)
converted_node_name = convert_name(node_name, node_is_clickable)
# not very elegant, more like a placeholder
if (

Some files were not shown because too many files have changed in this diff